MixMode · 19 hours ago
Sr. Software Reliability Engineer for AI
MixMode is a leading provider of AI-powered cybersecurity solutions at scale, pioneering a patented third-wave, context-aware AI approach. They are looking for a Senior Software Reliability Engineer for AI to improve the reliability, performance, and scalability of their production AI systems, working closely with ML researchers to enhance existing distributed services.
Artificial Intelligence (AI), Big Data, Cloud Security, Cyber Security, Intrusion Detection, Network Security, Security
Responsibilities
Own the reliability, performance, and operational health of production AI systems, focusing on improving complex, existing services
Lead efforts to refactor and harden the AI codebase to improve observability, maintainability, and resilience
Diagnose and resolve issues across distributed systems, including latency, throughput, data pipelines, and resource utilization
Design and build monitoring, alerting, and debugging tools for high-availability services
Partner with researchers and ML engineers to productionize models at scale
Establish best practices for testing, deployment, capacity planning, and incident response
Serve as a technical leader during on-call rotations, driving incident response, postmortems, and continuous system improvements
Qualifications
Required
7+ years of professional software engineering experience
Strong proficiency in Python and at least one JVM language (Java, Scala, or Kotlin preferred)
Proven experience designing, building, and operating distributed systems in production
Strong understanding of service architecture, concurrency, resource management, and distributed failure modes
Prior experience with streaming data pipelines (e.g., Spark Streaming, Flink, Kafka)
Hands-on experience running production services on Kubernetes, including pod lifecycle management and fault tolerance
Strong experience with relational databases (e.g., PostgreSQL, MySQL), including query performance analysis, indexing, and connection management
Demonstrated ability to diagnose and resolve performance, scalability, and reliability issues across application, database, and infrastructure layers
Experience implementing automated testing (unit, integration, end-to-end) and production observability (logging, metrics, tracing)
Experience collaborating with ML or data science teams to productionize predictive systems. (Note: ML expertise is not required.)
Ability to improve system architecture and engineering practices over time through design, code review, and mentorship
Benefits
Remote-First Work Culture
Healthcare (Medical, Dental, Vision, Accident)
Basic & Voluntary Life and AD&D
Flexible Spending Account (FSA)
401(k) with Employer Match
Paid Holidays & Flexible Paid Time Off (PTO)
Company
MixMode
MixMode is a self-supervised AI platform to defend against cyber attacks.
H1B Sponsorship
MixMode has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
[Chart: Trends of Total Sponsorships; 2025: 1]
Funding
Current Stage: Growth Stage
Total Funding: $62.33M
Key Investors: PSG Equity, Entrada Ventures, Keshif Ventures
2022-03-23: Private Equity · $45M
2020-04-07: Series A · $4M
2019-02-25: Series Unknown · $1.83M
Company data provided by Crunchbase