Featherless AI · 1 day ago
Machine Learning Engineer — Multilingual Data
Featherless AI is seeking a Machine Learning Engineer to own and scale its multilingual data pipeline. The role involves designing and maintaining large-scale multilingual datasets, developing data pipelines, and ensuring model performance across languages and cultural contexts.
Artificial Intelligence (AI) · Cloud Computing · Database
Responsibilities
Design, build, and maintain large-scale multilingual datasets across high- and low-resource languages
Develop data pipelines for collection, cleaning, normalization, deduplication, and labeling
Implement quality filters using statistical, heuristic, and model-based methods
Work with researchers to define language coverage, benchmarks, and evaluation metrics
Analyze dataset bias, coverage gaps, and failure modes across regions and scripts
Support training, fine-tuning, and distillation workflows with high-quality multilingual data
Continuously iterate on datasets based on model performance and real-world usage
Qualifications
Required
3+ years of experience as an ML Engineer or Applied Scientist, or in a similar role
Strong experience working with multilingual or non-English datasets
Solid understanding of NLP fundamentals (tokenization, embeddings, language modeling)
Experience building scalable data pipelines (Python, Spark, Ray, or similar)
Familiarity with Unicode, scripts, tokenization challenges, and language-specific quirks
Comfort collaborating with researchers and translating research needs into production systems
Preferred
Experience with low-resource languages or multilingual benchmarks (e.g. FLORES, XTREME)
Exposure to LLM training, fine-tuning, or distillation
Linguistics background or experience working with native language experts
Contributions to open-source datasets or ML tooling
Experience with data quality evaluation at scale
Benefits
Competitive compensation + meaningful equity at Series A stage
Company
Featherless AI
We enable serverless inference via our GPU orchestration and model load-balancing system.
H1B Sponsorship
Featherless AI has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is presented below for
reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships: 2025 (1)
Funding
Current Stage: Early Stage
Total Funding: $5M
Key Investors: Airbus Ventures
2025-10-31 · Series A
2025-03-17 · Seed · $5M
Company data provided by Crunchbase