Catalyst Labs · 1 month ago
Tech Lead, Data & Inference Engineer
Catalyst Labs is a leading talent agency specializing in Applied AI, Machine Learning, and Data Science. They are seeking a Tech Lead, Data & Inference Engineer to lead the design and development of a comprehensive data platform, ensuring data reliability and efficiency while mentoring engineers and promoting best practices across teams.
Management Consulting
Responsibilities
Lead the design, development and scaling of an end-to-end data platform from ingestion to insights, ensuring that data is fast, reliable and ready for business use
Build and maintain scalable batch and streaming pipelines, transforming diverse data sources and third-party application programming interfaces into trusted, low-latency systems
Take full ownership of reliability, cost and service level objectives. This includes achieving 99.9% uptime, maintaining minute-level latency and optimizing cost per terabyte. Conduct root cause analysis and provide long-lasting solutions
Operate inference pipelines for data enrichment, scoring and quality assurance using large language models and retrieval-augmented generation. Manage version control, caching and evaluation loops
Work across teams to deliver data as a product through the creation of clear data contracts, ownership models, lifecycle processes and usage-based decision making
Guide architectural decisions across the data lake and the entire pipeline stack. Document lineage, trade-offs and reversibility while making practical decisions on whether to build internally or buy externally
Scale integration with application programming interfaces and internal services while ensuring data consistency, high data quality and support for both real-time and batch-oriented use cases
Mentor engineers, review code and raise the overall technical standard across teams. Promote data-driven best practices throughout the organization
Qualifications
Required
Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or Mathematics
Excellent written and verbal communication; proactive and collaborative mindset
Comfortable in hybrid or distributed environments with strong ownership and accountability
A founder-level bias for action: able to identify bottlenecks, automate workflows, and iterate rapidly based on measurable outcomes
Demonstrated ability to teach, mentor, and document technical decisions and schemas clearly
6 to 12 years of experience building and scaling production-grade data systems, with deep expertise in data architecture, modeling, and pipeline design
Expert SQL (query optimization on large datasets) and Python skills
Hands-on experience with distributed data technologies (Spark, Flink, Kafka) and modern orchestration tools (Airflow, Dagster, Prefect)
Familiarity with dbt, DuckDB, and the modern data stack; experience with IaC, CI/CD, and observability
Exposure to Kubernetes and cloud infrastructure (AWS, GCP, or Azure)
Preferred
Strong Node.js skills for faster onboarding and system integration
Previous experience at a high-growth startup (10 to 200 people) or early-stage environment with a strong product mindset
Benefits
Above market base
Bonus
Equity
Company
Catalyst Labs
Welcome to Catalyst Labs – Powering Catalytic Growth
At Catalyst Labs, catalytic growth isn't just a concept, it's our driving force.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase