Senior AI Ops Engineer

GE Vernova

The Role

Overview

Lead AI/ML solutions to automate and optimize enterprise IT operations.

Key Responsibilities

  • AI automation
  • AI integration
  • Data governance
  • Model monitoring
  • Security controls
  • LLM optimization

Tasks

- Automate critical processes, including anomaly detection, root cause analysis, and resolution workflows, leveraging advanced AI/ML and/or GenAI technology.
- Lead collaboration with IT and DevOps teams to integrate AI tools into cloud and on-premises solutions across multiple environments.
- Ensure compliance with data governance and regulatory requirements across cloud environments.
- Monitor and troubleshoot production deployments to ensure model accuracy, latency, and uptime requirements are met.
- Research, recommend, and implement the latest advancements in AI/ML technologies to maintain a cutting-edge IT infrastructure (e.g., newly developed Large Language Models, agentic frameworks, OCR tooling, advanced chunking and embedding methodologies).
- Support projects as a trusted technical advisor to team members to solve complex technical challenges.
- Implement robust security controls for AI/ML workflows, including data encryption, IAM policies, and secure API integrations.
- Leverage artificial intelligence (AI) and machine learning (ML) technologies and frameworks to drive greater observability and service operations automation.
- Align AI Ops initiatives with broader organizational goals and long-term IT strategies.
- Optimize LLM performance, scalability, and cost-efficiency using techniques like model pruning, quantization, or distributed inference.
- Drive the interpretation and translation of enterprise goals into technical specifications, delivering a point of view on cloud-agnostic technologies.
- Architect and deploy advanced AI/ML solutions to monitor, analyze, and optimize IT operations.
- Establish, maintain, and improve data pipelines to support the performance of AI and GenAI applications.
- Own, develop, and maintain processes to support IT Operations Management, Discovery, Monitoring, and AIOps solutions using current industry platforms.

Requirements

  • Python
  • Kubernetes
  • AWS
  • CI/CD
  • NLP
  • Master's

What You Bring

- Strong analytical, strategic thinking, and leadership skills.
- Excellent communication and collaboration abilities to work effectively with stakeholders across all levels.
- Strong knowledge of database systems (SQL and NoSQL).
- Experience building CI/CD pipelines for machine learning, and experience with tools like GitLab CI/CD or Jenkins for automating workflows.
- Experience with frameworks like Hugging Face Transformers, LangChain, or OpenAI APIs.
- Advanced skills in Natural Language Processing, including summarization, translation, and augmentation (experience with advanced prompting and/or model fine-tuning preferred).
- Experience with automated alerting and logging best practices for large-scale AI systems.
- Familiarity with DevOps principles, CI/CD pipelines, and ITIL best practices.
- Strong experience with programming/scripting languages (e.g., Python, PySpark), ETL pipelines, data lakes, and data warehousing.
- Advanced proficiency in scripting and programming languages (e.g., Python, Bash, PowerShell).
- Must be willing to work out of an office located in Bangalore, India.
- Experience with Infrastructure-as-Code (IaC) tools like Terraform, Ansible, or CloudFormation.
- Proven ability to lead and deliver AI solutions in large-scale IT environments.
- Experience working with BMC Observability and AIOps technologies for monitoring cloud-based environments (AWS, Azure, Google Cloud Platform) and their key technologies.
- Familiarity with tools like Kubeflow, MLflow, Airflow, or Argo Workflows.
- Proficiency in GPU/TPU acceleration and parallelization techniques.
- Familiarity with performance tuning, auto-scaling, and load balancing for high-throughput AI workloads.
- 7+ years of experience in IT operations, DevOps, or AI/ML systems implementation.
- Expertise in one or more of the following is desirable: DevOps, Serverless, Networking, Security, Storage, Databases, IoT, AI/ML, Cloud Migration, and IT Transformation.
- Experience orchestrating the entire AI/ML lifecycle (data ingestion, model training, validation, deployment, monitoring).
- Deep knowledge of AI/ML frameworks (e.g., TensorFlow, PyTorch, scikit-learn) and algorithms.
- Proven proficiency with tools like Apache Spark, Kafka, Snowflake, and Redshift.
- Proficiency in Kubernetes, Docker, and container orchestration.
- Expertise in IT monitoring tools (e.g., AWS CloudWatch, Azure Monitor, Splunk, Dynatrace, Prometheus, Datadog).
- Expertise in cloud platforms like AWS, Azure, or Google Cloud Platform (GCP).
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (Master’s degree preferred).

The Company

About GE Vernova

- Traces roots back to Edison and Alstom, merging power, renewable, digital & financial wings.
- Headquartered in Cambridge, MA, crafts large-scale gas turbines, SMRs, wind turbines, hydro and grid tech to fuel economies.
- On the nuclear front, advancing small modular reactors (like BWRX‑300) in partnership with utilities and supporting semiconductor projects.
- Wind prowess spans onshore, offshore and blade making, with key sites like Dogger Bank offshore and blade plants in Spain.
- Electrification arm tackles grid stability: HVDC, transformers, storage, conversion, plus GridOS software powering smarter infrastructure.
- Weaves finance and consulting through energy-infrastructure investments, funding solar farms to pipelines via GE Energy Financial Services.

Sector Specialisms

Power

Gas Power

Steam Power

Nuclear

Hydro Power

Wind

Onshore Wind

Offshore Wind

Electrification Systems

Power Conversion and Storage

Grid Solutions

Electrification Software