
Equinix
Global leader in data center and interconnection services, enabling digital transformation.
Principal Engineer, Product Software
Lead architect for high‑scale network telemetry and AI‑driven observability platform.
Job Highlights
About the Role
The Principal Engineer for NPE Observability leads the architecture of distributed systems that ingest, store, and analyze the global network state. This role bridges network protocols with big‑data patterns, designing high‑performance ingestion engines capable of handling trillions of telemetry points and enabling real‑time anomaly detection across the infrastructure. The engineer will construct the high‑throughput, low‑latency data fabric that makes the network self‑aware.
In this position you will define a multi‑year roadmap for a unified on‑prem and telemetry ecosystem, evolve the network into an intelligent, self‑healing system, and drive the lifecycle of network data from gNMI/SNMP ingestion to structured storage optimized for AI modeling. You will integrate cutting‑edge LLMOps and agentic workflows to accelerate root‑cause analysis and automate remediation, while ensuring architectural integrity through SOLID and Clean Architecture principles.
• Define a multi‑year architecture for a unified on‑prem and telemetry ecosystem to enable a self‑healing intelligent network.
• Direct the lifecycle of network data from gNMI/SNMP ingestion to structured storage optimized for large‑scale AI modeling.
• Integrate LLMOps, tool‑use frameworks, and agentic workflows to accelerate incident root‑cause analysis and automated remediation.
• Enforce SOLID and Clean Architecture principles across telemetry pipelines, observability data stores, and AI orchestration layers.
• Bridge NetOps, DevOps, and Security to standardize the MELT (Metrics, Events, Logs, Traces) strategy globally.
• Mentor staff and senior engineers through deep‑dive design reviews and strategic coaching on network‑centric observability.
Key Responsibilities
- ▸telemetry architecture
- ▸data ingestion
- ▸AI modeling
- ▸LLMOps integration
- ▸anomaly detection
- ▸mentorship
What You Bring
• Architect high‑throughput, resilient pipelines that ingest trillions of telemetry events for predictive AIOps.
• 10+ years of experience designing distributed, high‑scale observability platforms or mission‑critical network software.
• Expert‑level proficiency in Java and Go, with deep fluency in gNMI, SNMP, and flow protocols.
• Extensive experience with Kubernetes, Jenkins, ArgoCD, and service‑provider networking technologies (BGP, IS‑IS, MPLS, EVPN, VXLAN).
• Mastery of OLAP and time‑series databases such as ClickHouse, Prometheus/Thanos, or InfluxDB, and schema design for high‑cardinality telemetry.
• Proven expertise with Apache Flink, Kafka, and Kappa/Lambda streaming patterns for stateful processing.
• Demonstrated ability to develop AI agents using LLMs, including function calling, retrieval‑augmented generation, and agentic orchestration.
Requirements
- ▸Java
- ▸Go
- ▸Kubernetes
- ▸Kafka
- ▸LLMs
- ▸10+ years
Benefits
Equinix offers a targeted base salary ranging from $177,000 to $319,000 annually, depending on location, with additional eligibility for bonuses and potential equity. Compensation reflects role level, experience, and education, and is reviewed periodically to stay competitive. Employees receive a comprehensive benefits package that includes health, life, disability, and voluntary insurance options, a 401(k) retirement plan with company contributions, paid time off and holidays, and an Employee Assistance Program. The company is committed to inclusive, sustainable, and connected benefits that support employees throughout their careers and lives.
• Competitive health, life, disability, and voluntary insurance plans.
• 401(k) retirement plan with company contributions.
• Accrued paid time off, holidays, and an Employee Assistance Program.
• Eligibility for bonuses and potential equity participation.
Work Environment
Hybrid