Design metadata extraction and enrichment pipelines that enhance discoverability of unstructured assets
Design security and access control models that work across legacy systems and modern AI platforms
Build architectures for multi-modal AI applications that combine structured and unstructured data sources
Design tiered storage architectures that balance performance needs with storage costs
Implement caching layers for frequently accessed embeddings and AI model inputs
Build architectures for continuous learning where RAG systems are updated with new data in near real-time
Design scalable architectures for processing and indexing unstructured data (PDFs, documents, emails, logs, images) for AI consumption
Implement chunking strategies and embedding pipelines for diverse document types and data sources
Implement intelligent document extraction using LLMs' native vision and context capabilities to handle complex layouts, tables, and mixed media
Architect document processing pipelines that leverage multi-modal LLMs (GPT-4V, Claude, Gemini) for direct document understanding without traditional OCR preprocessing
Design end-to-end RAG architectures that leverage existing data lakes and enterprise knowledge bases
Optimize storage strategies for cost-effective management of structured and unstructured data
Architect hybrid search systems combining traditional keyword search with semantic/vector search capabilities
Create data governance frameworks that ensure compliance while enabling AI innovation
Requirements
Python
SQL
LLMs
Data Architect
12+ years
Master's
Proficiency in Python and SQL, with experience in data processing libraries
12+ years of experience modernizing legacy data architectures for cloud and AI workloads
Must be 18 years or older.
Understanding of modern document processing using multi-modal LLMs and traditional extraction methods
Master's degree in Computer Science, Information Systems, or related field
Deep expertise in unstructured data processing using both multi-modal LLMs and traditional methods
Experience with multi-modal LLMs for document understanding and their cost/performance trade-offs
Experience with multi-modal AI architectures combining text, image, and structured data
Background in information retrieval, search engineering, or content management systems
5+ years of experience working with both structured (SQL databases, data warehouses) and unstructured data (documents, logs, multimedia)
Bachelor's degree in Computer Science, Information Systems, or related field
10+ years of experience as a Data Architect, Data Platform Engineer, or similar role with enterprise data systems
Legal authorization to work in the U.S. is required. We will not sponsor individuals at the Bachelor’s level for employment visas, now or in the future, for this job opening.
Visa Sponsorship
No visa sponsorship; must already have legal U.S. work authorization
Security clearance
Offer contingent on successful drug screening
Company
Overview
April 2024
Founded
The company emerged from the spin-off of GE's energy units in April 2024.
>$10B Quarterly Revenue
Revenue Growth
Achieves over $10 billion in quarterly revenue, driven by demand for power infrastructure and digital solutions.
$3B
Wind Turbine Backlog
Maintains a significant backlog in wind turbine orders, reflecting strong market demand.
25%
Global Electricity Supply
Contributes to generating 25% of the world’s electricity through its installed turbines and grids.
Traces its roots back to Edison and Alstom, and combines power, renewables, digital, and financial businesses.
Headquartered in Cambridge, MA, the company builds large-scale gas turbines, SMRs, wind turbines, hydro and grid technology that power economies.
On the nuclear front, advancing small modular reactors (like BWRX‑300) in partnership with utilities and supporting semiconductor projects.
Wind business spans onshore, offshore, and blade manufacturing, with key sites including Dogger Bank offshore and blade plants in Spain.
Electrification arm tackles grid stability: HVDC, transformers, storage, conversion, plus GridOS software powering smarter infrastructure.
Provides financing and consulting for energy-infrastructure investments, from solar farms to pipelines, through GE Energy Financial Services.
Culture + Values
Relentlessly focused on advancing the world’s transition to cleaner, more sustainable energy.
Believes in working with customers, partners, and communities to create innovative energy solutions that make a meaningful difference.
Prioritizes excellence, integrity, and accountability in everything they do.
Committed to driving real change through technology and partnerships that will transform the global energy landscape.
Environment + Sustainability
2050 target
Net zero commitment
Committed to achieving net zero carbon emissions by 2050.
Focused on reducing emissions through advanced energy technologies.
Maximizing use of renewable energy sources and leveraging digital solutions for energy efficiency.
Solutions aim to decarbonize industries and help customers meet their sustainability goals.