Staff Machine Learning Engineer - AI Tech Lead jobs in United States

Sumo Logic · 1 day ago

Staff Machine Learning Engineer - AI Tech Lead

Sumo Logic, Inc. empowers the people who power modern, digital business. As a Staff Machine Learning Engineer - AI Tech Lead, you will lead the design and delivery of advanced AI systems for the Security Operations Center (SOC), focusing on building scalable multi-agent architectures and evaluating state-of-the-art AI technologies.

Analytics · Big Data · Cloud Data Services · Enterprise Software · SaaS
No H1B

Responsibilities

Partner with fellow leaders and their teams to lead the technical evaluation and adoption of cutting-edge agentic AI platforms, including Anthropic (Claude), LangChain/LangGraph, AWS Bedrock, and other emerging agent frameworks
Architect, prototype, and productionize multi-agent AI systems for Agentic SOC use cases, including detection, triage, investigation, and response workflows
Own the design of core agent architecture components, including planning, execution, tool orchestration, memory, context engineering, and long-running agent workflows
Lead AI agent evaluation systems, including offline and online evaluation pipelines, golden datasets, synthetic data generation, human- and LLM-based judging, and continuous quality monitoring
Drive LLM fine-tuning and alignment efforts to improve domain-specific reasoning, accuracy, and reliability for security and observability use cases
Design scalable LLMOps and AI agent infrastructure, including inference routing, latency optimization, cost control, and production observability for agent systems
Partner with product, security, and data platform leadership and teams to deliver end-to-end AI agent capabilities from prototype to customer-facing production systems
Lead and collaborate on technical direction and mentorship for AI engineers working on agentic AI and LLM systems
Define and implement best practices for AI safety, reliability, evaluation, and monitoring in production agentic systems
Operate as a senior technical owner in ambiguous problem spaces: setting technical direction, breaking down complex problems, and driving delivery across teams

Qualifications

Machine Learning · Large-scale System Design · Python · LLM Fine-tuning · Agentic AI Design Patterns · Evaluation Frameworks · ML Infrastructure · Communication Skills · Technical Leadership · Mentorship

Required

B.Tech, M.Tech, or Ph.D. in Computer Science, Machine Learning, Data Science, or a related technical field
5+ years of hands-on industry experience building, operating, and leading production ML/AI systems, with demonstrated technical leadership and ownership
Strong foundation in machine learning, distributed systems, data pipelines, and large-scale system design
Deep industry understanding of LLMs, prompt engineering, context engineering, agentic AI design patterns, and reasoning workflows
Strong proficiency in Python and modern ML/AI ecosystems
Experience designing and operating evaluation frameworks for ML/LLM systems (offline + online)
Proven ability to lead complex technical initiatives across teams and influence architecture decisions
Excellent communication skills and ability to translate complex AI systems into business impact

Preferred

Hands-on experience building and scaling agentic AI systems or multi-agent architectures in production
Experience with modern agent frameworks such as LangGraph, LangChain, CrewAI, or similar
Experience with major foundation model platforms such as Anthropic, OpenAI, AWS Bedrock, or Vertex AI
Experience with LLM fine-tuning pipelines (SFT, RLHF/RLAIF, preference learning, domain adaptation)
Strong background in LLMOps, including inference optimization, latency/cost management, observability, and production monitoring
Experience with ML infrastructure and tooling such as PyTorch, MLflow, Airflow, Docker, Kubernetes, and cloud platforms (AWS/GCP/Azure)
Experience applying AI/ML to security, observability, or large-scale log/telemetry data is a strong plus

Benefits

Bonus or commission plans
Benefits offerings
Equity awards

Company

Sumo Logic

Sumo Logic is a provider of cloud-based machine data analytics that enables reliable and secure cloud-native applications.

Funding

Current Stage: Public Company
Total Funding: $340M
Key Investors: Battery Ventures, Sapphire Ventures, DFJ Growth
2023-02-09 · Acquired
2020-09-16 · IPO
2019-05-08 · Series G · $110M

Leadership Team

Stewart Grierson
Chief Financial Officer
Aaron Feigin
Chief Communications & Brand Officer
Company data provided by Crunchbase