
DataRobot · 2 months ago

Staff Software Engineer

DataRobot delivers AI solutions to maximize impact and minimize business risk. The company is seeking a Staff Software Engineer focused on Application Scalability & Performance to lead the design, implementation, and operation of backend systems for high-throughput AI applications, ensuring performance, reliability, and cost-effectiveness.

Enterprise Software · Machine Learning

Responsibilities

Architect, build, and lead backend services that scale to handle large workloads, high concurrency, and low latency requirements
Design and implement autoscaling strategies (horizontal/vertical), dynamic resource allocation, and load balancing to ensure responsive, cost-efficient service
Improve end-to-end request pipelines, optimizing for latency, throughput, reliability, and correctness
Instrument, monitor, and profile systems in production; identify bottlenecks, troubleshoot performance issues, and proactively tune services
Collaborate with ML/AI teams to ensure model serving pipelines maintain accuracy, consistency, and performance under load
Drive best practices in systems reliability, observability, error handling, capacity planning, resilience, and failover
Mentor and coach other engineers; provide technical leadership and influence across teams
Contribute to defining architecture, coding standards, performance benchmarks, and technical roadmap items related to scalability and performance

Qualifications

Backend engineering · Performance optimization · Autoscaling strategies · Programming languages · Monitoring · Cloud providers experience · Observability · AI/ML model deployment · Docker experience · Open source contributions

Required

7+ years of backend engineering experience building scalable, high-performance distributed systems and services
Strong experience with performance optimization (e.g. profiling, latency tuning, concurrency, caching strategies)
Deep experience with autoscaling, resource management, load balancing, throughput/latency SLAs
Solid programming skills in one or more backend languages (e.g. Python, Java, Go, C++, or equivalent)
Strong understanding of observability and monitoring (metrics, tracing, logging) and of service instrumentation
Experience designing and architecting scalable AI-backed services and applications, integrating AI models into production systems with high performance, reliability, and low latency
Ability to solve ambiguous challenges and influence technical direction across teams, balancing performance, accuracy, and cost
Experience operating across multiple cloud providers (AWS, GCP, Azure) and/or hybrid environments

Preferred

Experience with AI/ML model deployment, serving, inference, and production integration
Experience with Gen AI (serving LLMs, embeddings, etc.)
Exposure to on-prem delivery models or regulated environments
Experience with Docker and building containerized applications
Open source software development experience or contributions

Benefits

Medical, Dental & Vision Insurance
Flexible Time Off Program
Paid Holidays
Paid Parental Leave
Global Employee Assistance Program (EAP)

Company

DataRobot

DataRobot provides AI technology and ROI enablement services to global enterprises.

Funding

Current Stage: Late Stage
Total Funding: $1.05B
Key Investors: Snowflake Ventures, Altimeter Capital, Sapphire Ventures
2021-06-27 · Series G · $300M
2020-12-09 · Series F · $50M
2020-11-17 · Series F · $270M

Leadership Team

Debanjan Saha
Chief Executive Officer
Brian Brown
Chief Financial and Legal Officer
Company data provided by Crunchbase