Lavendo · 3 months ago
Senior AI/ML Specialist Solutions Architect (AI Infra & Cloud)
Lavendo is a publicly traded company at the forefront of the AI revolution, offering an AI-centric cloud platform. They are seeking a Senior AI/ML Specialist Solutions Architect to design and implement scalable AI solutions for AI-focused customers, working with advanced technologies and contributing to powerful supercomputing resources.
Recruiting, Sales, Sales Automation, Virtual Workforce
Responsibilities
Architect and optimize distributed training and inference systems for large-scale AI models
Design and deliver customer-focused solutions that maximize performance and business value
Lead the transition of ML pipelines from POC to scalable production systems
Build long-term customer relationships, ensuring satisfaction and alignment with strategic goals
Create whitepapers, deliver technical presentations, and host webinars to share insights and best practices
Provide technical leadership and mentor teams on AI infrastructure and deployment strategies
Collaborate with engineering and product teams to prioritize customer feedback and influence product roadmaps
Qualifications
Required
5+ years of experience with cloud technologies and infrastructure, ideally in senior MLOps or Solutions Architect roles
Proven expertise in scaling and optimizing AI workloads across multi-node and multi-GPU environments
Demonstrated success delivering ML products, scaling from POC to production
Deep knowledge of ML frameworks like PyTorch and JAX
Strong background in the NVIDIA HPC ecosystem (CUDA, NCCL, InfiniBand)
Exceptional communication skills to engage both technical teams and business stakeholders
Legal authorization to work in the United States on a full-time basis without sponsorship
Preferred
Programming Languages: Python, Go, Java, C++
Infrastructure as Code (IaC): Terraform, Ansible
Orchestration: Kubernetes (K8s), Slurm
DevOps Tools: Git, Docker, Helm
Big Data Frameworks: Spark, Kafka, Hadoop
Databases: SQL, NoSQL, and vector databases
ML Frameworks: PyTorch, TensorFlow, JAX, Hugging Face, scikit-learn
Benefits
Full medical benefits: 100% company-paid medical, dental, and vision coverage for employees and families
401(k) plan with a 4% match program
Stock options plan
Flexible remote work environment
Company-paid short-term disability, long-term disability, and life insurance coverage
20 weeks of paid parental leave for primary caregivers, 12 weeks for secondary caregivers
Up to $85/month for mobile and internet