Apple · 3 days ago
ML Safety Research Engineer
Apple Services Engineering (ASE) powers many AI features across App Store, Music, Video and more. In this role, you will lead the design and continuous development of automated safety benchmarking methodologies, investigating media-related agents and developing evaluation frameworks to ensure responsible AI performance.
Apps, Artificial Intelligence (AI), Broadcasting, Digital Entertainment, Foundational AI, Media and Entertainment, Mobile Devices, Operating Systems, TV, Wearables
Responsibilities
Design scientifically grounded benchmarking methodologies covering multiple dimensions of responsibility and safety across several media and application marketplace use cases
Develop automated evaluation pipelines that collect, automatically judge, and analyze model outputs with respect to safety policies, at scale
Create and curate datasets, tasks, and feature usage scenarios that represent realistic and adversarial use cases across multiple languages, markets, and domains
Define and validate new metrics for complex phenomena such as multi-turn agentic interaction patterns
Apply statistical rigor and reproducibility to the above-mentioned objectives
Work closely with engineering and research teams to translate experimental findings into actionable model improvements and safety mitigations
Publish internal reports and external papers
Monitor evolving industry practices and academic work to ensure benchmarks remain relevant
Qualifications
Required
Advanced degree (MS or PhD) in Computer Science, Software Engineering, or equivalent research/work experience
1+ years of work experience, either as a postdoc or in industry
Strong research background in empirical evaluation, experimental design, or benchmarking
Strong proficiency in Python (pandas, NumPy, Jupyter, PyTorch, etc.)
Deep familiarity with software engineering workflows and developer tools
Experience working with or evaluating AI/ML models, preferably LLMs or program synthesis systems
Strong analytical and communication skills, including the ability to write clear reports
Experience working with large datasets, annotation tools, and model evaluation pipelines
Familiarity with evaluations specific to responsible AI and safety, hallucination detection, and/or model alignment concerns
Ability to design taxonomies, categorization schemes, and structured labeling frameworks
Analytical Strength: Ability to interpret unstructured data (text, transcripts, user sessions) and derive meaningful insights
Communication: Strong ability to stitch together qualitative and quantitative insights into actionable guidance; strong ability to communicate complex architectures and systems to a variety of stakeholders
Education in Data Science, Linguistics, Cognitive Science, HCI, Psychology, Social Science, or a related field
Preferred
Publications in AI/ML evaluation or related fields
Experience with automated testing frameworks
Experience constructing human-in-the-loop or multi-turn evaluation setups
Intermediate or Advanced Proficiency in Swift
Familiarity with RAG systems, reinforcement learning, agentic architectures, and model fine-tuning
Expertise in designing annotation guidelines and validation instruments and techniques
Background in human factors, social science, and/or safety assessment methodologies
Benefits
Comprehensive medical and dental coverage
Retirement benefits
A range of discounted products and free services
Reimbursement for certain educational expenses, including tuition
Discretionary bonuses or commission payments
Relocation
Company
Apple
Apple is a technology company that designs, manufactures, and markets consumer electronics, personal computers, and software.
H1B Sponsorship
Apple has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (6998)
2024 (3766)
2023 (3939)
2022 (4822)
2021 (4060)
2020 (3656)
Funding
Current Stage
Public Company
Total Funding
$5.67B
Key Investors
Berkshire Hathaway, Microsoft, Sequoia Capital
2025-05-05 Post-IPO Debt · $4.5B
2025-01-16 Post-IPO Debt · $0.31M
2021-04-30 Post-IPO Equity
Leadership Team
Tim Cook
CEO
Craig Federighi
SVP, Software Engineering
Company data provided by Crunchbase