Lead Machine Learning Engineer-MLOps jobs in United States

hackajob · 7 hours ago

Lead Machine Learning Engineer-MLOps

hackajob is collaborating with J.P. Morgan to connect exceptional tech professionals with the role of Lead Machine Learning Engineer. The successful candidate will work closely with Data Scientists to build and deploy ML models, focusing on maintaining pipelines for distributed model training and ensuring optimal performance for real-time and batch inference.

Artificial Intelligence (AI), Generative AI, Human Resources, Recruiting, Software

Responsibilities

Build, deploy, and maintain robust pipelines for distributed training on GPU-enabled clusters to support scalable machine learning workflows
Develop and manage pipelines for high-throughput, real-time inference as well as batch inference, ensuring optimal performance and reliability
Implement quantization techniques and deploy large language models (LLMs) to maximize efficiency and resource utilization
Oversee the management and optimization of vector databases to support advanced AI and machine learning applications
Establish and maintain comprehensive monitoring and observability pipelines to ensure system health, performance, and rapid issue resolution
Collaborate with cross-functional teams to integrate new technologies and continuously improve existing infrastructure
Partner with product, architecture, and other engineering teams to define scalable and performant technical solutions

Qualifications

Python, AWS, MLOps, Big Data Tools, Monitoring Tools, Containers, Analytical Mindset, Action Oriented, Collaboration

Required

BS in Computer Science or related Engineering field with 6+ years of experience, or MS in Computer Science or related Engineering field with 4+ years of experience
Solid knowledge and extensive experience in Python
Solid fundamentals in cloud computing, preferably AWS
Deep knowledge of and passion for data science fundamentals, including training and deploying models
Experience with monitoring and observability tools for tracking model inputs/outputs and feature statistics
Operational experience in big data/ML tools such as Ray, DuckDB, Spark
Solid grounding in engineering fundamentals and analytical mindset
Action-oriented, with an iterative approach to development

Preferred

Experience with recommendation and personalization systems is a plus
Solid fundamentals and experience in containers (Docker ecosystem), container orchestration systems (Kubernetes, ECS), and DAG orchestration (Airflow, Kubeflow, etc.)
Good knowledge of databases

Benefits

Comprehensive health care coverage
On-site health and wellness centers
A retirement savings plan
Backup childcare
Tuition reimbursement
Mental health support
Financial coaching

Company

hackajob

The AI-native tech hiring platform trusted by enterprises, scale-ups, and 1M+ tech professionals worldwide.

Funding

Current Stage: Growth Stage
Total Funding: $33M
Key Investors: Volition Capital, AVP, Downing Ventures
2023-05-03 · Series B · $25M
2018-10-25 · Series A · $6.7M
2017-03-31 · Seed · $0.58M

Leadership Team

Mark Chaffey, CEO
Phil Kell, VP - Marketplace
Company data provided by crunchbase