Site Reliability Engineer jobs in United States

Archetype AI · 1 day ago

Site Reliability Engineer

Archetype AI is developing an innovative AI platform aimed at transforming real-world data into valuable insights. As a Site Reliability Engineer, you will design, scale, and maintain the infrastructure that supports AI-driven products, ensuring high availability and performance through collaboration with engineering and ML teams.

Artificial Intelligence (AI) · Information Technology · Software

Responsibilities

Design, build, and operate highly available distributed systems
Collaborate with engineering and ML teams to ensure reliable deployment of backend services (in Rust, C++ or similar)
Implement monitoring, alerting, and observability solutions across infrastructure
Automate deployments, scaling, and infrastructure provisioning using infrastructure-as-code
Diagnose and resolve performance bottlenecks, system outages, and production incidents
Support AI/ML infrastructure for training and serving models at scale, including GPU clusters, pipelines, and inference services
Contribute to infrastructure architecture, standards, and operational best practices

Qualifications

Distributed systems · Rust · Kubernetes · Monitoring tools · AI/ML infrastructure · Cloud platforms · Automation mindset · Problem-solving · Collaboration · Communication

Required

5+ years of experience as an SRE, DevOps engineer, or Systems Engineer
Strong expertise in distributed systems, fault-tolerant architectures, and large-scale production environments
Proficiency in Rust, C++, or another backend language, with a willingness to learn
Solid experience with Kubernetes, containers, and cloud platforms (AWS, GCP, Azure)
Hands-on experience with monitoring and observability tools (Prometheus, Grafana, ELK, OpenTelemetry)
Experience with data pipelines, messaging systems, and streaming technologies (Kafka, Pulsar, etc.)
Familiarity with AI/ML infrastructure (training pipelines, GPU clusters, inference systems)
Strong debugging and problem-solving skills, with an automation mindset (Terraform, Ansible, Pulumi, scripting)
Excellent communication and collaboration skills

Preferred

Experience with real-time or low-latency systems
Open-source contributions to distributed systems or infrastructure projects
Knowledge of security best practices for distributed environments
Experience with edge or embedded systems and sensor-based infrastructure
Background in multimodal data fusion or physical-world perception systems

Company

Archetype AI

Archetype AI develops Physical AI agents that harness real-world sensor data to enhance decision-making and automate processes.

Funding

Current Stage: Early Stage
Total Funding: $48M
Key Investors: Comcast NBCUniversal LIFT Labs · Venrock · 5G Open Innovation Lab
2025-11-20 · Series A · $35M
2024-10-20 · Non Equity Assistance
2024-04-05 · Seed · $13M

Leadership Team

Jaime Lien
Co-Founder & Chief Scientist
Nick Gillian
Founder, CTO
Company data provided by Crunchbase