ML Safety Research Engineer jobs in United States

Apple · 3 days ago

ML Safety Research Engineer

Apple Services Engineering (ASE) powers many AI features across App Store, Music, Video and more. In this role, you will lead the design and continuous development of automated safety benchmarking methodologies, investigating media-related agents and developing evaluation frameworks to ensure responsible AI performance.

Apps, Artificial Intelligence (AI), Broadcasting, Digital Entertainment, Foundational AI, Media and Entertainment, Mobile Devices, Operating Systems, TV, Wearables
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Design scientifically grounded benchmarking methodologies covering multiple dimensions of responsibility and safety across several media and application marketplace use cases
Develop automated evaluation pipelines that collect, automatically judge, and analyze model outputs with respect to safety policies, at scale
Create and curate datasets, tasks, and feature usage scenarios that represent realistic and adversarial use cases across multiple languages, markets, and domains
Define and validate new metrics for complex phenomena such as multi-turn agentic interaction patterns
Apply statistical rigor and reproducibility to the above-mentioned objectives
Work closely with engineering and research teams to translate experimental findings into actionable model improvements and safety mitigations
Publish internal reports and external papers
Monitor evolving industry practices and academic work to ensure benchmarks remain relevant

Qualifications

Python, AI/ML model evaluation, Benchmarking methodologies, Statistical rigor, Automated testing frameworks, Analytical skills, Large datasets, RAG systems, Education in Data Science, Communication skills

Required

Advanced degree (MS or PhD) in Computer Science, Software Engineering, or equivalent research/work experience
1+ years of work experience, either as a postdoc or in industry
Strong research background in empirical evaluation, experimental design, or benchmarking
Strong proficiency in Python (pandas, NumPy, Jupyter, PyTorch, etc.)
Deep familiarity with software engineering workflows and developer tools
Experience working with or evaluating AI/ML models, preferably LLMs or program synthesis systems
Strong analytical and communication skills, including the ability to write clear reports
Experience working with large datasets, annotation tools, and model evaluation pipelines
Familiarity with evaluations specific to responsible AI and safety, hallucination detection, and/or model alignment concerns
Ability to design taxonomies, categorization schemes, and structured labeling frameworks
Analytical Strength: Ability to interpret unstructured data (text, transcripts, user sessions) and derive meaningful insights
Communication: Strong ability to stitch together qualitative and quantitative insights into actionable guidance; strong ability to communicate complex architectures and systems to a variety of stakeholders
Education in Data Science, Linguistics, Cognitive Science, HCI, Psychology, Social Science, or a related field

Preferred

Publications in AI/ML evaluation or related fields
Experience with automated testing frameworks
Experience constructing human-in-the-loop or multi-turn evaluation setups
Intermediate or Advanced Proficiency in Swift
Familiarity with RAG systems, reinforcement learning, agentic architectures, and model fine-tuning
Expertise in designing annotation guidelines and validation instruments and techniques
Background in human factors, social science, and/or safety assessment methodologies

Benefits

Comprehensive medical and dental coverage
Retirement benefits
A range of discounted products and free services
Reimbursement for certain educational expenses — including tuition
Discretionary bonuses or commission payments
Relocation

Company

Apple is a technology company that designs, manufactures, and markets consumer electronics, personal computers, and software.

H1B Sponsorship

Apple has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data provided by the US Department of Labor)
Trends of Total Sponsorships
2025 (6998)
2024 (3766)
2023 (3939)
2022 (4822)
2021 (4060)
2020 (3656)

Funding

Current Stage
Public Company
Total Funding
$5.67B
Key Investors
Berkshire Hathaway, Microsoft, Sequoia Capital
2025-05-05 · Post-IPO Debt · $4.5B
2025-01-16 · Post-IPO Debt · $0.31M
2021-04-30 · Post-IPO Equity

Leadership Team

Tim Cook
CEO
Craig Federighi
SVP, Software Engineering
Company data provided by Crunchbase