
Distributed Spectrum · 5 months ago

Machine Learning Infra Engineer

Distributed Spectrum is a company focused on creating systems for radio spectrum intelligence. The Machine Learning Infra Engineer will design and build core infrastructure to enable fast, scalable model training and facilitate data access for researchers.

Analytics · SaaS · Sensor

Responsibilities

Design and build core infrastructure from scratch using technologies that you actually want to use
Scale our distributed data storage and write Python APIs that make loading 30 GB datasets feel instantaneous
Set up the orchestration for model training on GPU clusters, versioning, and artifact deployment
Explore creative ways to combine relational and vector-based search queries, enabling researchers to discover the most relevant data for any modeling task

Qualifications

Data infrastructure design, AWS services, Python codebase management, Database schema design, Experiment tracking tools, Event streaming systems, Orchestration frameworks, Clear writing, Collaboration, Adaptability

Required

Experience designing and implementing data infrastructure from scratch, including databases, cloud storage, and cloud compute
Experience managing a production-grade Python codebase used by other people
Experience with AWS, including AWS networking, S3, SageMaker, RDS, ECS, Lambda, and related infrastructure-as-code tools
Experience designing database schemas, metadata states, and software abstractions that promote clarity and generalize well to new situations
Experience working directly with researchers and using infrastructure that supports experiment tracking, model versioning, and artifact deployment, such as MLflow or similar
You know how to deal with larger-than-memory data inexpensively, without setting up a cluster
You can write clearly
Extremely collaborative attitude and interest in helping define large areas of our engineering roadmap

Preferred

Understanding of database internals (indexes, query optimizers) and data storage formats, and the ability to apply that understanding to practical design decisions
Experience writing production Rust or C++
Experience with modern DataFrame libraries and database systems, including Polars, Ibis, DuckDB, or similar
Experience with maintaining a versioned Python package, related CI/CD best practices, and the Python packaging ecosystem
Experience with event streaming data systems like ZeroMQ, Kafka, Flink, or similar
Experience with orchestration frameworks like Airflow, Prefect, Dagster, or similar
Experience dealing with role-based access to AWS and permissioning
Experience running distributed jobs using Spark, Ray, Dask, or similar

Benefits

Above-market salary, equity, and benefits package
Early Series A equity
Excellent health, dental, and vision coverage
401(k) match - up to 4% of your salary
Unlimited PTO
Daily office lunches in NYC

Company

Distributed Spectrum

Radio signals are everywhere - we find the ones that matter.

Funding

Current Stage: Early Stage
Total Funding: $25.23M
Key Investors: National Science Foundation
2025-03-19 · Series A · $25M
2024-03-01 · Seed
2022-06-01 · Seed

Leadership Team

Isaac Struhl, Co-Founder and CTO
Company data provided by Crunchbase