Hippocratic AI · 10 hours ago
Technical Lead Manager (TLM) – Platform Engineering
Hippocratic AI is the leading generative AI company in healthcare, focused on creating safe and autonomous clinical conversations with patients. They are seeking a Technical Lead Manager to oversee the design, implementation, and operation of their cloud infrastructure and observability systems, while leading a multidisciplinary engineering team to support cutting-edge LLM workloads.
Artificial Intelligence (AI) · Foundational AI · Generative AI · Health Care · Information Technology
Responsibilities
Foster a culture of innovation, accountability, and technical excellence
Mentor and coach engineers to achieve high performance and career growth
Build and scale a high-performing team responsible for all infrastructure operations and systems reliability
Define and execute the long-term infrastructure roadmap for a multi-cloud, multi-region GPU and compute environment
Drive excellence in cloud cost optimization, capacity planning, and service reliability
Architect and manage Hippocratic AI’s global GPU control plane, enabling dynamic provisioning, scheduling, and monitoring of inference workloads across regions and providers
Lead the design and automation of deployments (AWS, GCP, Azure, on-prem) using infrastructure-as-code and CI/CD best practices
Ensure strong security posture and compliance across all environments, aligned with HIPAA, SOC 2, and other healthcare data standards
Develop and scale comprehensive observability systems—covering telemetry, tracing, logging, and alerting—to ensure full visibility into production systems and AI workloads
Establish SLOs, SLIs, and SLAs for all mission-critical services and infrastructure
Implement robust incident management, root cause analysis, and continuous improvement processes
Partner with AI and product teams to anticipate infrastructure needs and design scalable architectures for rapid experimentation and deployment
Contribute to the design of internal developer platforms that improve productivity and standardization
Evaluate emerging technologies (e.g., new GPU hardware, orchestration frameworks, data center partnerships) to advance our capabilities
Qualifications
Required
8+ years of engineering experience, including 5+ years leading infrastructure, SRE, or platform teams at scale
Proven success in managing large-scale distributed systems and global cloud infrastructure
Deep experience with high-performance computing or large-scale AI workloads
Strong background in cloud platforms (AWS, GCP, Azure) and infrastructure-as-code (Terraform, Pulumi, etc.)
Expertise in observability stacks (Prometheus, Grafana, OpenTelemetry, Datadog, etc.) and operational excellence
Experience with security and compliance frameworks relevant to healthcare (HIPAA, SOC 2)
Exceptional communication skills and the ability to partner across product, AI research, and operations
Preferred
Experience designing or operating GPU control planes or schedulers (e.g., Kubernetes, Ray, Slurm, custom orchestration frameworks)
Prior work with ML infrastructure, data pipelines, or model-serving platforms
Background in cost optimization and sustainability of GPU/compute operations
Familiarity with edge or hybrid-cloud deployments for low-latency AI systems
Company
Hippocratic AI
Hippocratic AI is a healthcare technology company that develops safety-focused large language models for medical applications.
H1B Sponsorship
Hippocratic AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (9)
2024 (1)
Funding
Current Stage: Growth Stage
Total Funding: $402M
Key Investors: Avenir, Kleiner Perkins, NVentures
2025-11-03 · Series C · $126M
2025-01-09 · Series B · $141M
2024-09-19 · Series A · $17M
Company data provided by Crunchbase