DataRobot · 2 months ago
Staff Software Engineer
DataRobot delivers AI solutions that maximize impact and minimize business risk. The company is seeking a Staff Software Engineer focused on Application Scalability & Performance to lead the design, implementation, and operation of backend systems for high-throughput AI applications, ensuring performance, reliability, and cost-effectiveness.
Enterprise Software · Machine Learning
Responsibilities
Architect, build, and lead backend services that scale to handle large workloads, high concurrency, and low latency requirements
Design and implement autoscaling strategies (horizontal/vertical), dynamic resource allocation, and load balancing to ensure responsive, cost-efficient service
Improve end-to-end request pipelines, optimizing for latency, throughput, reliability, and correctness
Instrument, monitor, and profile systems in production; identify bottlenecks, troubleshoot performance issues, and proactively tune services
Collaborate with ML/AI teams to ensure model serving pipelines uphold accuracy, consistency, and performance under load
Drive best practices in systems reliability, observability, error handling, capacity planning, resilience, and failover
Mentor and coach other engineers; provide technical leadership and influence across teams
Contribute to defining architecture, coding standards, performance benchmarks, and technical roadmap items related to scalability and performance
Qualifications
Required
7+ years of backend engineering experience building scalable, high-performance distributed systems and services
Strong experience with performance optimization: e.g. profiling, latency tuning, concurrency, caching strategies
Deep experience with autoscaling, resource management, load balancing, throughput/latency SLAs
Solid programming skills in one or more backend languages (e.g. Python, Java, Go, C++, or equivalent)
Strong understanding of observability and monitoring: metrics, tracing, logging; and instrumentation of services
Experience designing and architecting scalable AI-backed services and applications, integrating AI models into production systems with high performance, reliability, and low latency
Ability to solve ambiguous challenges and influence technical direction across teams, balancing performance, accuracy, and cost
Experience operating across multiple cloud providers (AWS, GCP, Azure) and/or hybrid environments
Preferred
Experience with AI/ML model deployment, serving, inference, and production integration
Experience with generative AI: serving LLMs, embeddings, etc.
Exposure to on-prem delivery models or regulated environments
Experience with Docker and building containerized applications
Open source software development experience or contributions
Benefits
Medical, Dental & Vision Insurance
Flexible Time Off Program
Paid Holidays
Paid Parental Leave
Global Employee Assistance Program (EAP)
Company
DataRobot
DataRobot provides AI technology and ROI enablement services to global enterprises.
Funding
Current Stage: Late Stage
Total Funding: $1.05B
Key Investors: Snowflake Ventures, Altimeter Capital, Sapphire Ventures
2021-06-27 · Series G · $300M
2020-12-09 · Series F · $50M
2020-11-17 · Series F · $270M
Company data provided by Crunchbase