Data Engineer - India GDC (Gurugram)

ERM

The Role

Overview

Engineer scalable Azure data pipelines for climate science datasets.

Key Responsibilities

  • Azure pipelines
  • PySpark
  • Distributed processing
  • Monitoring
  • Metadata management
  • Storage optimization

Tasks

- Develop, test, and maintain scalable data pipelines in Azure and Microsoft Fabric to process climate and environmental data.
- Collaborate with climate scientists and software engineers to translate prototype code into scalable, production-ready processes.
- Evaluate and configure Azure storage (Blob, Data Lake, etc.) for climate-scale workloads; ensure performance alignment with data access needs.
- Build and manage distributed processing pipelines to support large-scale data transformations using PySpark and parallel computing strategies.
- Implement logging, monitoring, and metadata management systems for traceability and auditability.
- Support the application of scientific algorithms to global climate model outputs and other multidimensional datasets (e.g., NetCDF, Zarr).
- Monitor and optimize I/O performance across workflows and storage tiers.
- Document data engineering workflows and support internal data governance efforts.
- Optimize data pipelines for performance, scalability, and reliability, including benchmarking storage read/write speeds and tuning Spark-based workflows.

Requirements

  • Azure services
  • Python
  • Spark
  • Zarr
  • Bachelor's degree
  • 2+ years of experience

What You Bring

- 2+ years of experience in data engineering, including pipeline development and workflow optimization (cloud-based preferred).
- Preferred: Azure Data Engineering certification, or demonstrable equivalent experience.
- Strong communication skills, with ability to collaborate across technical and non-technical teams.
- Experience implementing logging, monitoring, and metadata practices in data environments.
- Hands-on experience with Microsoft Azure data services and/or Microsoft Fabric.
- Familiarity with cloud storage performance tuning and benchmarking techniques.
- Bachelor’s degree in Computer Science, Data Science, Environmental/Climate Science, Oceanography, or a related field.
- Strong programming skills in Python; experience with Spark (PySpark or SparkSQL) in distributed environments.
- Demonstrated experience working with large, multidimensional datasets (Zarr, NetCDF, HDF5).

The Company

About ERM

- Grew through the merger of UK and US consultancies, building a global footprint.
- Has carried out major environmental assessments for projects such as High Speed 1, HS2, and the Keystone and Dakota Access pipelines.
- Works in sectors from mining, energy and chemicals to manufacturing, pharma, power, renewables, finance and TMT.
- Combines strategic advisory and technical delivery, using digital tools and proprietary methodologies.
- Has led environmental impact assessments for cross-border hydroelectric, rail and oil infrastructure.

Sector Specialisms

Chemical

Financial Services

Manufacturing

Mining & Metals

Oil & Gas

Pharmaceutical

Power

Renewables