Raft · 9 hours ago
ML Ops Engineer
Raft is a customer-obsessed small business focused on Distributed Data Systems and Complex Application Development. As an ML Ops Engineer, you will collaborate with a cross-functional data team to design and maintain infrastructure for Machine Learning, ensuring efficient resource management and deployment.
Computer Software
Responsibilities
Design, build, and maintain the infrastructure and pipelines that enable Machine Learning model training, deployment, and scaling
Manage distributed workloads across GPU-enabled Kubernetes clusters
Ensure efficient resource orchestration between training and inference operations
Qualifications
Required
3+ years of relevant hands-on experience
Experience building and maintaining machine learning pipelines
Strong Python skills for defining and maintaining ML pipelines
Practical experience with PyTorch (TensorFlow experience acceptable)
Experience with Airflow for job orchestration, particularly managing resources between training and inference workloads
Strong Kubernetes experience, including managing local clusters, running different Kubernetes flavors, and managing custom resource definitions (CRDs)
Istio networking experience in Kubernetes environments
Experience working with MinIO object storage
Must have hands-on experience running GPU workloads on Kubernetes
Fast learner and analytical thinker who is creative and hands-on, with strong communication skills
Able to work both independently and as part of a team
Excellent problem-solving skills and attention to detail
Active Top Secret (TS) clearance with the ability to obtain and maintain SCI eligibility
Preferred
CENTCOM or DoD experience
Experience with time slicing GPUs on Kubernetes
Exposure to computer vision and/or large imagery formats such as NITF
Publications or GitHub repos showcasing your skills
Experience with Docker and container orchestration best practices
Benefits
Fully covered healthcare, dental, and vision
401(k) and company match
Take-as-you-need PTO plus 11 paid holidays
Education & training benefits
Generous referral bonuses
Company
Raft
A niche consulting organization focused on Cloud Native, DevSecOps, and Modern Application Development for mission-focused enterprises
Funding
Current Stage
Growth Stage
Total Funding
$60M
Key Investors
Washington Harbour Partners
2024-04-10 · Private Equity · $60M
Company data provided by crunchbase