
Data Engineer

Petrofac

The Role

Overview

Design and build scalable data pipelines and big-data platforms, and ensure data governance.

Key Responsibilities

  • Data pipelines
  • CI/CD
  • Data lake
  • Data modelling
  • OLAP cubes
  • Cluster management

Tasks

  • Architect and define data flows and build highly efficient, scalable data pipelines (see the sketch after this list).
  • Data management: modelling, normalisation, cleaning, and maintenance.
  • Coordinate with multiple business stakeholders to understand requirements and deliver.
  • Participate in Technical Design Authority forums.
  • Deploy and maintain highly efficient CI/CD DevOps pipelines across multiple environments such as dev, staging, and production.
  • Deliver master data cleansing and improvement efforts, including automated and cost-effective solutions for processing, cleansing, and verifying the integrity of data used for analysis.
  • Architect and define data flows for big data/data lake use cases.
  • Act as a coach and provide consultancy and advice to data engineers by offering technical guidance and ensuring architecture principles, design standards, and operational requirements are met.
  • Allocate tasks to team members, track their status, and report on activities to management.
  • Work in tandem with the Enterprise and Domain Architects to understand the business goals and vision, and contribute to the Enterprise Roadmaps.
  • Conduct continuous audits of data management system performance, refine whenever required, and immediately report any breach or loophole to stakeholders.
  • Collaborate with analytics and business stakeholders to improve data models that feed BI tools, increasing data accessibility and fostering data-driven decision making across the organisation.
  • Identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, and re-designing infrastructure for greater scalability.
  • Guide and build highly efficient OLAP cubes using data modelling techniques to cater for all required business cases and mitigate the limitations of Power BI in Analysis Services.
  • Work with a team of data engineers to deliver tasks and achieve weekly and monthly goals, and guide the team to follow best practices and improve deliverables.
  • Estimate cluster and core sizes, and monitor and troubleshoot the Databricks cluster and analysis server to provide optimal capacity for data ingestion and compute.
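The responsibilities above centre on building batch pipelines into a data lake on Spark/Databricks. As a rough, illustrative sketch only (the storage account, paths, and column names are invented placeholders, not Petrofac systems), a minimal PySpark job of that shape might look like this:

    # Hypothetical sketch: ingest raw JSON landed in a data lake, apply basic
    # cleaning/normalisation, and write a partitioned Parquet table for
    # downstream analytics. All names and paths are illustrative placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("example-ingestion-pipeline").getOrCreate()

    raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/sensors/")

    cleaned = (
        raw
        .dropDuplicates(["reading_id"])                          # remove duplicate events
        .withColumn("reading_ts", F.to_timestamp("reading_ts"))  # normalise timestamps
        .filter(F.col("value").isNotNull())                      # drop incomplete records
        .withColumn("ingest_date", F.current_date())             # partition key
    )

    (cleaned.write
        .mode("append")
        .partitionBy("ingest_date")
        .parquet("abfss://curated@examplelake.dfs.core.windows.net/sensors/"))

In practice a job like this would be parameterised per environment and deployed through the CI/CD pipelines described above.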

Requirements

  • Scala
  • Python
  • Azure
  • Hadoop
  • Spark
  • ETL

What You Bring

  • Experience with programming languages such as Scala, Java, Python, and shell scripting.
  • Bachelor's degree (master's preferred) in Computer Science, Engineering, or another technology-related field.
  • Strong troubleshooting and problem-solving skills for any issue blocking business progress.
  • Extensive background in data mining and statistical analysis.
  • Comprehensive knowledge of extracting, transforming, and loading data from sources such as Oracle, Hadoop HDFS, flat files, JSON, Avro, Parquet, and ORC.
  • Understanding of data architectures and data warehousing principles, with the ability to participate in the design and development of conventional data warehouse solutions.
  • Experience with data analytics platforms and hands-on experience with ETL and ELT transformations, backed by strong SQL programming knowledge (a minimal illustration follows this list).
  • Extensive experience onboarding data sources using real-time, batch, or scheduled loads; sources may be in the cloud or on premise, and may be SQL databases, NoSQL databases, or API-based.
  • Problem-solving skills to handle unexpected situations or challenges at work, such as the following:
  • Decision-making: the capability to assess available options, weigh their pros and cons, and make informed decisions that best address the problem at hand.
  • Collaboration: the aptitude to work effectively with others, seek input and feedback, and leverage collective knowledge and skills to solve problems as a team.
  • Analytical skills: the capacity to break down complex problems into smaller, manageable components and analyse root causes and potential solutions.
  • Proven experience pulling data through REST APIs, OData, XML, and web services.
  • Experience building robust and impactful data visualisation solutions and gaining adoption.
  • Expertise in extracting data through JSON, OData, REST APIs, web services, and XML.
  • Understanding of physical and logical execution plans and the ability to optimise data pipeline performance.
  • Strict adherence to a Scrum-based agile development approach, working on allocated stories.
  • Experience with databases including Azure SQL DB, Oracle, MySQL, Cosmos DB, and MongoDB.
  • Excellent knowledge of implementing the full life cycle of data management principles such as data governance, architecture, modelling, storage, security, master data, and quality.
  • Ability to work with ETL tools, with strong knowledge of ETL concepts.
  • Operational experience with big data technologies and engines, including Presto, Spark, Hive, and Hadoop environments.
  • Expertise in data ingestion platforms such as Apache Sqoop, Apache Flume, Amazon Kinesis, Fluentd, Logstash, etc.
  • Expertise in securing the big data environment, including encryption, tunnelling, access control, and secure isolation.
  • Experience in data modelling (data marts, snowflake/star schemas, normalisation, SCD2).
  • Ability to understand various data structures and common methods of data transformation.
  • Proficient knowledge of the Hadoop and Spark ecosystems, including HDFS, Hive, Sqoop, Oozie, Spark Core, and Spark Streaming.
  • Experience with Azure product offerings and data platform.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • Hands-on experience using Databricks, Pig, Scala, Hive, Azure Data Factory, Python, and R.
  • Hands-on experience in big data engineering, distributed storage, and processing massive volumes of data into a data lake using Scala or Python.
  • Strong focus on delivering outcomes.
  • Experience defining, implementing, and maintaining a global data platform.
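Several of the requirements above come down to combining Spark with plain SQL for ETL/ELT work over mixed file formats. A minimal, hypothetical illustration (paths, table names, and columns are made up for the example, not taken from any real system) might be:

    # Illustrative only: join two source formats mentioned above (Parquet and
    # JSON), express a simple transformation in SQL, and write a curated output.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example-etl-transform").getOrCreate()

    orders = spark.read.parquet("/landing/orders/")      # batch extract, Parquet
    customers = spark.read.json("/landing/customers/")   # batch extract, JSON

    orders.createOrReplaceTempView("orders")
    customers.createOrReplaceTempView("customers")

    # An ELT-style transformation in SQL, reflecting the emphasis on strong
    # SQL programming alongside Spark.
    summary = spark.sql("""
        SELECT c.region,
               COUNT(*)      AS order_count,
               SUM(o.amount) AS total_amount
        FROM orders o
        JOIN customers c ON o.customer_id = c.customer_id
        GROUP BY c.region
    """)

    summary.write.mode("overwrite").parquet("/curated/order_summary_by_region/")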

The Company

About Petrofac

  • Provides services to the oil, gas, and energy industries, focusing on engineering and construction.
  • Operates globally, working with major energy players on projects ranging from design to operations.
  • Services span engineering, procurement, construction, and maintenance across onshore and offshore facilities.
  • Has delivered a wide range of projects, including refineries, gas processing plants, and offshore platforms.
  • Recognised for expertise in complex, large-scale projects, with a particular focus on the energy sector.
  • Known for building long-term relationships with clients, contributing to the sustainable development of energy infrastructure.
  • Innovative solutions and a technology-driven approach set it apart in the competitive global market.

Sector Specialisms

  • Upstream Oil and Gas
  • Downstream Oil and Gas
  • Renewable Energy
  • Hydrogen
  • Ammonia
  • Carbon Capture
  • Energy Transition
  • Well Engineering
  • Asset Management
  • Operations and Maintenance
  • Decommissioning
  • Engineering Design
  • Training and Competence