GitLab · 2 months ago
Intermediate Site Reliability Engineer, Environment Automation
GitLab is an open-core software company that develops a comprehensive AI-powered DevSecOps Platform. The Site Reliability Engineer will focus on operating and automating hundreds of GitLab environments, ensuring they remain secure, consistent, and reliable at scale while debugging production issues and contributing to infrastructure automation.
Cloud Security, Developer Tools, DevOps, Open Source, SaaS
Responsibilities
Support Environment Automation at Scale: Contribute to automating the provisioning, configuration, and management of GitLab environments using Terraform, Ansible, and Kubernetes. Follow best practices to support infrastructure across many tenants with guidance from senior team members
Assist in Debugging Production Issues: Investigate and troubleshoot issues in Kubernetes clusters and GitLab services. Help resolve common problems such as failed deployments, pod crashes, and scheduling conflicts using tools like kubectl
Contribute to IaC and CI/CD Workflows: Write and maintain Terraform modules and scripts to automate routine operations. Participate in improving CI/CD pipelines for safe and repeatable infrastructure changes
Participate in Monitoring and Maintenance: Help monitor environment health using tools like Prometheus, ELK, and Grafana. Assist in improving observability and capacity tracking for tenant environments
Respond to Incidents and Alerts: Take part in the incident response process, helping triage alerts, document issues, and support resolution efforts under the guidance of senior engineers
Collaborate Across Teams: Work with Infrastructure and Development teams to contribute to solutions that improve platform reliability and operational efficiency
Qualifications
Required
Experience with Infrastructure as Code: Familiarity with Terraform and Ansible to manage cloud infrastructure. Able to work with modules and understand the basics of state and variable use
Kubernetes Fundamentals: Experience using kubectl, Helm, or Kustomize to interact with Kubernetes clusters. Understands core concepts such as pods, deployments, and rollouts
Basic Programming Skills: Able to read and modify infrastructure tooling written in Go, Ruby, or similar languages
Exposure to Multi-Environment Operations: Experience working with multiple environments or customer setups, even if not at full scale. Understands the challenges of managing consistency and isolation
Monitoring and Troubleshooting Skills: Familiar with basic observability tools and logs. Can identify service issues using dashboards or metrics and escalate appropriately
Collaborative Mindset: Works well in cross-functional teams. Eager to learn from others, share knowledge, and contribute to team success
On-Call Experience: Has participated in on-call rotations for production systems and is comfortable responding to alerts, triaging incidents, and collaborating during recovery efforts
Benefits
Flexible Paid Time Off
Team Member Resource Groups
Equity Compensation & Employee Stock Purchase Plan
Growth and Development Fund
Parental leave
Home office support
Company
GitLab
GitLab is a web-based Git repository manager that offers a variety of features for software development teams.
Funding
Current Stage: Public Company
Total Funding: $413.5M
Key Investors: ICONIQ Growth, Google Ventures, August Capital
2021-10-14: IPO
2019-09-17: Series E · $268M
2018-09-19: Series D · $100M
Company data provided by crunchbase