Featherless AI · 17 hours ago
AI Researcher – Multilingual Data
Featherless AI is seeking an AI Researcher focused on multilingual data to build and scale next-generation language models. The role involves designing research strategies for multilingual datasets, collaborating on training pipelines, and publishing high-quality research.
Artificial Intelligence (AI) · Cloud Computing · Database
Responsibilities
Design and execute research on multilingual datasets, including data collection, filtering, deduplication, and quality measurement
Develop strategies for low-resource and long-tail languages (sampling, augmentation, curriculum design)
Research and improve cross-lingual transfer, alignment, and robustness in large language models
Build and maintain evaluation benchmarks for multilingual performance
Collaborate with engineers and researchers on training pipelines and model architecture decisions
Publish research at top venues (e.g., ACL, EMNLP, NeurIPS, ICML, ICLR) and contribute to open-source when appropriate
Translate research insights into practical improvements in production models
Qualifications
Required
Strong background in NLP / ML research, with a focus on multilingual or cross-lingual modeling
Publication record at respected conferences or journals (ACL, EMNLP, NeurIPS, ICML, ICLR, etc.)
Experience working with large-scale text datasets across multiple languages
Solid understanding of: tokenization and vocabulary design for multilingual models; data quality metrics, filtering, and dataset bias; transfer learning and multilingual representation learning
Comfortable prototyping in Python with modern ML frameworks (PyTorch, JAX, etc.)
Ability to operate independently and ship research in a startup-paced environment
Preferred
Experience with low-resource languages or non-Latin scripts
Open-source contributions in NLP or data tooling
Experience training or evaluating large language models
Familiarity with multilingual benchmarks (e.g., XTREME, FLORES, TyDi QA)
Benefits
Competitive compensation
Meaningful equity at an early stage
Company
Featherless AI
We enable serverless inference via our GPU orchestration and model load-balancing system.
H1B Sponsorship
Featherless AI has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is provided below for your
reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart omitted)
Trends of Total Sponsorships: 2025 (1)
Funding
Current Stage: Early Stage
Total Funding: $5M
Key Investors: Airbus Ventures
2025-10-31: Series A
2025-03-17: Seed · $5M
Company data provided by Crunchbase