Description
etl pipelines
data modeling
data monitoring
automation
ml‑ready data
mentorship
The Data Engineer role is central to GreenLite’s mission of cutting municipal review time from months to days by delivering trustworthy, instantly accessible data for upcoming computer‑vision and large‑language‑model capabilities. As a key hire in the AI engineering organization, you will shape the long‑term data strategy and enable multiple squads to feed high‑quality data into ML pipelines, directly impacting company OKRs.
You will build and own robust batch and streaming pipelines, establish quality gates, and maintain metadata systems that make data ready for experimentation and production. Collaborating with ML engineers and domain experts, mentoring teammates, and continuously automating away inefficiencies are core parts of the role.
- Design and implement end‑to‑end ETL/ELT workflows on AWS.
- Model source‑of‑truth data for drawings, annotations, feedback, and regulatory metadata; enforce contracts via automated tests.
- Instrument and monitor data lineage, freshness, and cost; define SLOs for pipeline reliability.
- Collaborate with ML engineers and domain experts to prioritize data sources, labeling, and feature‑store needs.
- Mentor teammates and participate in hiring and onboarding.
- Identify inefficiencies, automate toil, and introduce tooling to accelerate model development and control costs.
- Own it: act as CEO of the data layer, ensuring no broken pipelines.
- Solve real problems: deliver ML‑ready data that shortens training cycles and improves model performance.
- Regular team‑building events.
Requirements
python
sql
terraform
soc2
orchestration
debate-deliver
Ideal candidates have 4–7 years of experience building data platforms at fast‑growing companies, are fluent in Python, SQL, and modern orchestration tools, and have shipped cloud‑native data stacks with an eye on cost and performance. Experience with Terraform/IaC and SOC 2 Type 2 compliance, along with a “debate, decide, deliver” mindset, are strong pluses.
- 4–7 years building data platforms at high‑growth companies or startups.
- Proficiency in Python, SQL, and at least one modern orchestration framework; TypeScript/full‑stack experience is a plus.
- Experience shipping cloud‑native data stacks (Terraform/IaC preferred) with cost/performance awareness.
- Familiarity with SOC 2 Type 2 compliance and building compliant systems.
- Ability to thrive in “debate, decide, deliver” environments, turning ambiguous goals into maintainable systems.
Benefits
GreenLite offers a competitive compensation package that includes a generous base salary, equity, performance bonuses, premium health coverage, 401(k), parental leave, wellness stipend, unlimited PTO, and a hybrid work environment with weekly team lunches and regular team‑building events. The company values transparency, alignment, and inclusion, hosting twice‑yearly all‑hands meetings and encouraging a collaborative culture.
- Competitive base salary and employee equity program.
- Performance‑based annual bonuses.
- Premium health, dental, and vision coverage for employees and families.
- 401(k) retirement plan.
- Generous parental leave.
- Monthly wellness stipend and access to Wellhub, Talkspace, and Teladoc.
- Weekly catered team lunches in NYC office.
- Twice‑yearly company‑wide all‑hands meetings.
- Unlimited PTO.
- Hybrid work: 4 days in office (3 days during summer).