Partner with data scientists to develop new code and scripts, refactoring them into maintainable, efficient, and reusable functions that prevent future bottlenecks
Work closely with internal teams and end-users to understand their needs, address technical challenges, and co-design scalable architectural solutions that serve the broader organization
Govern and manage large datasets across AWS environments, including versioning data and resolving data quality issues for internal users
Own and continuously improve data ingestion and transformation pipelines for large-scale climate and renewables datasets (weather data, renewable generation data), ensuring quality and timely delivery across the business
Facilitate cross-functional data exchange by ingesting and transforming datasets from other engineering teams and delivering them to stakeholders in their required formats
Create shared code frameworks, templates, and internal libraries that enforce best practices, contribute to company-wide tooling, and accelerate data science workflows
Define and implement comprehensive data quality assurance processes, including validity checks and proactive diagnosis and resolution of production issues
Design and implement optimization strategies for large-scale data processing and complex modeling tasks, using parallelization and distributed computing tools such as Dask to improve performance and efficiency
Mentor data scientists on code structure, effective testing practices, and engineering standards
Build and maintain robust unit and BDD (Behave) test suites that validate complex transformation and modeling logic
Build and deploy self-service infrastructure components (AWS Lambda functions, Glue/Athena tables, computing infrastructure) that make data access and preparation seamless for data scientists
Requirements
Python
AWS
PySpark
CI/CD
Data engineering
Containerization
Deep knowledge of software engineering best practices, including design patterns, refactoring, infrastructure as code, containerization, and CI/CD pipelines
Proven experience writing comprehensive test suites (unit, integration, and behaviour-driven tests with Behave)