Replit · 2 months ago
Staff Infrastructure Engineer
Replit is a software creation platform that democratizes application development for millions of users globally. They are looking for a Staff Infrastructure Engineer to ensure the reliability and scalability of their infrastructure, implementing automation and best practices to enhance performance and availability.
Artificial Intelligence (AI) · Cloud Computing · Developer Tools · Information Technology · Software
Responsibilities
Drive Automation and Infrastructure as Code: Architect, build, and improve automation to eliminate toil and operational work. Design and maintain CI/CD pipelines and infrastructure automation using tools like Terraform or Pulumi. Create self-healing systems that can automatically respond to common failure scenarios
Optimize Performance and Infrastructure: Collaborate with core infrastructure and product teams to performance tune and optimize our cloud deployments (Kubernetes, Docker, GCP). Identify and resolve performance bottlenecks, implement capacity planning strategies, and reduce latency across global regions
Elevate Developer Experience: Design and implement improvements to our build, test, and deployment systems to make software delivery faster, safer, and more reliable for all engineers
Drive Cross-Company Improvements: Partner directly with service owners across Replit to understand their pain points, and collaborate on implementing build/test/deploy enhancements within their specific services
Build Shared Tooling: Create and maintain centralized tooling and automation that improves the entire engineering lifecycle, from local development to production monitoring
Debug and Harden Systems: Dive deep into debugging extremely difficult technical problems, making our systems and products more robust, operable, and easier to diagnose
Provide Staff-Level Guidance: Review feature and system designs, acting as an owner for the security, scale, and operational integrity of those designs
Educate and Mentor: Educate, mentor, and hold accountable the engineering team to improve the reliability of our systems, making reliability a core value of the Replit engineering culture
Build and Integrate: Write high-quality, well-tested code to meet the needs of your customers, including building pipelines to integrate with 3rd party vendors
Qualifications
Required
8-10 years of experience in Infrastructure Engineering or similar roles (DevOps, Systems Engineering, Site Reliability Engineering)
Strong programming skills in languages like Python or Go
You write high-quality, well-tested code
Deep understanding of distributed systems. You've designed, built, scaled, and maintained production services and know how to compose a service-oriented architecture
Experience with container orchestration platforms (Kubernetes) and cloud-native technologies
Proven track record of implementing and maintaining monitoring/observability solutions, with strong skills in debugging and performance tuning
Strong incident management skills with experience leading incident response and demonstrated critical thinking under pressure
Experience with infrastructure as code (e.g., Terraform) and configuration management tools
Excellent written and verbal communication skills, with an ability to explain technical concepts clearly and simply and a bias toward open, transparent cultural practices
Strong interpersonal skills, with experience working with engineers from junior to principal levels
A willingness to dive into understanding, debugging, and improving any layer of the stack
You're passionate about making software creation accessible and empowering the next generation of builders
Preferred
Deep experience with Google Cloud Platform (GCP) services and tools
Knowledge of modern observability platforms (Prometheus, Grafana, Datadog, etc.)
Experience designing and building reliable systems capable of handling high throughput and low latency
Experience with Go and Terraform
Familiarity with working in rapid-growth environments
Experience writing company-facing blog posts and training materials
Benefits
401(k) Program
Health, Dental, Vision and Life Insurance
Short Term and Long Term Disability
Paid Parental, Medical, Caregiver Leave
Commuter Benefits
Monthly Wellness Stipend
Autonomous Work Environment
In-Office Setup Reimbursement
Flexible Time Off (FTO) + Holidays
Quarterly Team Gatherings
In-Office Amenities
Company
Replit
Replit is the most secure agentic platform for production-ready apps.
H1B Sponsorship
Replit has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (8)
2024 (5)
2023 (2)
2022 (2)
Funding
Current Stage: Growth Stage
Total Funding: $472.02M
Key Investors: Prysm Capital, Craft Ventures, Andreessen Horowitz
2025-07-30 · Series C · $250M
2023-11-06 · Series B · $20M
2023-04-25 · Series B · $97.4M
Company data provided by Crunchbase