Get hired, get rewarded!
Land a job through Kablio and earn a 5% salary bonus.
Principal Cloud Engineer - Data & AI
Equinix
Global leader in data center and interconnection services, enabling digital transformation.
Lead design, build, and manage multi-cloud AI/Data platforms and pipelines.
13d ago
Expert & Leadership (13+ years)
Full Time
Singapore
Hybrid
Company Size
10,000 Employees
Service Specialisms
Data center colocation
Interconnection services
Smart Hands / remote support
Software-defined interconnection
Network Edge services
Equinix Metal (bare metal)
Equinix Fabric (cloud routing)
Managed services (integration & advisory)
Sector Specialisms
No specialisms available
Role
What you would be doing
Terraform, Kubernetes, CI/CD, multi-cloud, data pipelines, LLM integration
Build reusable frameworks and infrastructure-as-code (IaC) using Terraform, Kubernetes, and CI/CD to drive self-service and automation
Build and orchestrate multi-agent systems using frameworks like CrewAI, LangGraph, or AutoGen for use cases such as pipeline debugging, code generation, and MLOps
Architect and manage multi-cloud and hybrid cloud platforms (e.g., GCP, AWS, Azure) optimized for AI, ML, and real-time data processing workloads
Create extensible CLIs, SDKs, and blueprints to simplify onboarding, accelerate development, and standardize best practices
Foster a culture of ownership, continuous learning, and innovation
Collaborate across teams to shape the next generation of intelligent platforms in the enterprise
Drive technical leadership across AI-native data platforms, automation systems, and self-service tools
Integrate LLM APIs (OpenAI, Gemini, Claude, etc.) into platform workflows for intelligent automation and enhanced user experience (a minimal sketch follows this list)
Design and develop event-driven architectures using Apache Kafka, Google Pub/Sub, or equivalent messaging systems (sketched after this list)
Partner with engineering teams to introduce platform enhancements, observability, and cost-optimization techniques
Develop and maintain real-time and batch data pipelines using tools like Airflow, dbt, Dataform, and Dataflow/Spark (sketched after this list)
Build and expose high-performance data APIs and microservices to support downstream applications, ML workflows, and GenAI agents
Streamline onboarding, documentation, and platform implementation & support using GenAI and conversational interfaces
Ensure platform scalability, resilience, and cost efficiency through modern practices like GitOps, observability, and chaos engineering
Collaborate across teams to enforce cost, reliability, and security standards within platform blueprints.
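By way of illustration only, the LLM API integration responsibility above might look something like this minimal Python sketch using the OpenAI client; the model name, helper function, and prompt are assumptions for the example, not part of the role description.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize_incident(log_excerpt: str) -> str:
    """Ask the model for a short summary of a pipeline failure log (hypothetical use case)."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; substitute whatever the platform standardizes on
        messages=[
            {"role": "system", "content": "You summarize data-pipeline failure logs in two sentences."},
            {"role": "user", "content": log_excerpt},
        ],
    )
    return response.choices[0].message.content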
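The event-driven architecture work could, for example, involve publishing pipeline events to Kafka. A minimal sketch with the confluent-kafka Python client follows; the topic, broker address, and payload are placeholders.

import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})  # placeholder broker

def delivery_report(err, msg):
    # Called once per message to confirm delivery or surface an error.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [{msg.partition()}]")

event = {"pipeline": "orders_daily", "status": "completed", "rows": 120_000}
producer.produce("pipeline-events", value=json.dumps(event).encode("utf-8"), callback=delivery_report)
producer.flush()  # block until outstanding messages are delivered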
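Likewise, the batch-pipeline responsibility might translate into Airflow DAGs along these lines (Airflow 2.x TaskFlow API assumed; task bodies, table names, and schedule are stand-ins).

from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def orders_daily():
    @task
    def extract() -> list[dict]:
        # Stand-in for a real source query or API pull.
        return [{"order_id": 1, "amount": 42.0}]

    @task
    def load(rows: list[dict]) -> None:
        # Stand-in for a warehouse or lakehouse write.
        print(f"Loading {len(rows)} rows")

    load(extract())

orders_daily()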
What you bring
Terraform, LookML, Prometheus, Azure Architect, 15+ years, GenAI
Experience with incident management and building self-healing cloud architectures
Experience with Looker Modeler, LookML, or semantic modeling layers
Strong communication skills to engage stakeholders at various levels, from engineering to executives
Familiarity with observability tools (Prometheus, Grafana, OpenTelemetry) and strong debugging skills across the stack
Experience with enterprise-grade IAM, security controls, and compliance frameworks across cloud environments
Experience with RAG pipelines, vector databases, and embedding-based search (a retrieval sketch follows this list)
Ability to design for cost optimization, tagging strategies, and usage monitoring across cloud providers
Experience with multi-account / multi-subscription / multi-project governance models, including landing zones, service control policies, and organizational structures
Prior implementation of data mesh or data fabric in a large-scale enterprise
Ability to work cross-functionally with security, networking, application, and data teams to deliver integrated cloud solutions
Proven experience in architecting hybrid and multi-cloud solutions, including interconnectivity, security, workload placement, and DR strategies
Proven experience building scalable, efficient data pipelines for structured and unstructured data
Experience developing and integrating GenAI applications using MCP and orchestrating LLM-powered workflows (e.g., summarization, document Q&A, chatbot assistants, and intelligent data exploration)
Experience with GenAI/LLM frameworks and tools for orchestration and workflow automation
Experience leading cloud architecture reviews, defining standards, and mentoring engineering teams
Hands-on experience with CI/CD pipelines (e.g., GitHub Actions) for cloud resource deployments
Mastery of Infrastructure as Code (IaC) tools — especially Terraform, Terragrunt, and CloudFormation / ARM / Deployment Manager
Proficiency in designing and managing Kubernetes, serverless workloads, and streaming systems (Kafka, Pub/Sub, Flink, Spark)
15+ years of hands-on experience in Platform or Data Engineering, Cloud Architecture, Multi-Cloud Multi-Region Deployment & Architecture, or AI Engineering roles
Microsoft Certified: Azure Solutions Architect Expert
Hands-on expertise building and optimizing vector search and RAG pipelines using tools like Weaviate, Pinecone, or FAISS to support embedding-based retrieval and real-time semantic search across structured and unstructured datasets
Expertise in implementing policy-as-code and compliance-as-code (e.g., Open Policy Agent, Sentinel)
Deep knowledge of data modeling, distributed systems, and API design in production environments
Experience building and managing cloud automation frameworks (e.g., using Python, Go, or Bash for orchestration and tooling)
Strong programming background in Java, Python, SQL, and one or more general-purpose languages
Other relevant certifications (CKA, CKS, CISSP cloud concentration) are a plus.
HashiCorp Certified: Terraform Associate
Experience with ML Platforms (MLFlow, Vertex AI, Kubeflow) and AI/ML observability tools
Familiarity with cloud monitoring, logging, and observability tools (e.g., CloudWatch, Azure Monitor, GCP Operations Suite, Datadog, Prometheus)
Experience with metadata management, data catalogs, data quality enforcement, semantic modeling, and automated integration with the data platform
Strong background in implementing cloud security best practices (network segmentation, encryption, secrets management, key management, etc.).
Deep expertise in designing, implementing, and managing architectures across multiple cloud platforms (e.g., AWS, Azure, GCP)
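As an illustration of the embedding-based retrieval experience asked for above, here is a minimal Python sketch of the retrieval step of a RAG pipeline using FAISS; the documents are toy strings and the embeddings are random stand-ins where a real pipeline would call an embedding model.

import numpy as np
import faiss

dim = 384                       # embedding dimensionality (assumed)
docs = ["runbook: restart the ingestion job", "runbook: rotate service credentials"]
doc_vectors = np.random.rand(len(docs), dim).astype("float32")  # stand-in embeddings

index = faiss.IndexFlatIP(dim)   # exact inner-product search; swap for an ANN index at scale
faiss.normalize_L2(doc_vectors)  # normalize so inner product behaves like cosine similarity
index.add(doc_vectors)

query_vector = np.random.rand(1, dim).astype("float32")
faiss.normalize_L2(query_vector)
scores, ids = index.search(query_vector, 1)  # top-1 nearest document
print(docs[ids[0][0]], scores[0][0])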
Benefits
Work with a high-energy, mission-driven team that embraces innovation, open-source, and experimentation