Datadog · 1 day ago
AI Research Engineer – Datadog AI Research (DAIR)
Datadog is a global SaaS business focused on enabling digital transformation and cloud migration. They are seeking an AI Research Engineer to collaborate with research scientists to develop AI-powered solutions for cloud observability and security, building infrastructure and tools for rapid iteration and evaluation.
Analytics, Cloud Computing, Cloud Data Services, Cloud Infrastructure, Data Management, DevOps, Productivity Tools, SaaS
Responsibilities
Build and operate datasets, training and evaluation pipelines, benchmarks, and internal tooling
Implement models, run experiments at scale, and profile for reliability, performance, and cost
Orchestrate distributed training and distributed RL with Ray, including scheduling, scaling, and failure recovery
Make the research stack observable, reproducible, and easier to use
Establish rigorous automated benchmarks and regression tests for forecasting, anomaly detection, multi-modal analysis, agents, and code repair tasks
Collaborate with Research Scientists, Product, and Engineering to integrate advanced AI capabilities into Datadog’s product ecosystem and to harden prototypes into reliable services
Contribute high-quality code, documentation, and open-source artifacts that enable the community and internal teams to reproduce, extend, and evaluate results
Qualifications
Required
You have strong software engineering skills with experience in domains such as observability, SRE, or security
You have depth in distributed computing and ML systems for training and inference at scale; experience with Ray, Slurm, or similar frameworks is a plus
You are proficient in Python, familiar with a systems language (e.g., Rust, C++, or Go), and you are comfortable with modern cloud and data infrastructure
You have practical experience implementing and operating ML training and inference systems (e.g., PyTorch or JAX), including containerization, orchestration, and GPU acceleration
You are familiar with efficient training, fine-tuning, and inference techniques for large foundation models
You can explain design and performance trade-offs clearly to both technical and non-technical audiences
You have a strong interest in open-science and open-source contributions, including establishing rigorous benchmarks and sharing artifacts with the community
Preferred
You have a demonstrated ability to bridge cutting-edge research prototypes and real-world product applications, ideally with large foundation models, generative AI agents, or domain-specific LLM deployments
You are passionate about pushing the boundaries of AI while maintaining a strong focus on customer impact, scalability, and responsible deployment of new technologies
You have hands-on experience with GPU programming and optimization, including experience in CUDA
You have experience writing production data pipelines and applications
You have experience supporting or contributing to research publications
Benefits
Competitive global benefits
New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
Opportunity to collaborate closely with colleagues across the Datadog offices in New York City and Paris
Opportunity to attend and present at conferences and meetups
Intra-departmental mentor and buddy program for in-house networking
An inclusive company culture, with the ability to join our Community Guilds (Datadog employee resource groups)
Healthcare
Dental
Parental planning
Mental health benefits
A 401(k) plan and match
Paid time off
Fitness reimbursements
A discounted employee stock purchase plan
Company
Datadog
Datadog is an observability and security platform that offers infrastructure, applications, software development, and monitoring services.
H1B Sponsorship
Datadog has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (123)
2024 (66)
2023 (45)
2022 (53)
2021 (31)
2020 (29)
Funding
Current Stage
Public Company
Total Funding
$1.02B
Key Investors
ICONIQ Growth, Index Ventures, OpenView
2024-12-09 Post-IPO Debt · $870M
2020-05-28 Post-IPO Debt
2019-09-19 IPO
Company data provided by Crunchbase