Intermediate Site Reliability Engineer, Database Operations jobs in United States

GitLab · 3 months ago

Intermediate Site Reliability Engineer, Database Operations

GitLab is an open-core software company developing an AI-powered DevSecOps Platform. They are seeking an Intermediate Site Reliability Engineer to manage and ensure the reliability and performance of their PostgreSQL database engine while automating operational tasks and collaborating with engineering teams.

Cloud Security · Developer Tools · DevOps · Open Source · SaaS

Responsibilities

Automate every operational task, which is a core requirement for this role: for example, package updates, configuration changes across all environments, and tools for automatic provisioning of user-facing services
Responding to platform emergencies, alerts, and escalations from Customer Support
Ensure systems exist to manage software lifecycles (e.g., operating systems) with a minimum of manual effort
Develop a fully automated multi-environment observability stack based on the existing SaaS system, and extend it to predict capacity needs based on the usage patterns
Plan for new service roll-outs, expansion and capacity management of existing services, and work with users to optimize their resource consumption
Work on database reliability and performance aspects for GitLab.com from within the SRE team as well as work on shipping solutions with the product
Analyze solutions and implement best practices for our PostgreSQL database clusters and their components
Work on observability of relevant database metrics and make sure we reach our database objectives
Work with peer SREs to roll out changes to our production environment and help mitigate database-related production incidents
Provide on-call support in rotation with the team
Provide database expertise to engineering teams (for example through reviews of database migrations, queries and performance optimizations)
Work on automation of database infrastructure and help engineering succeed by providing self-service tools
Use the GitLab product to run GitLab.com as a first resort and improve the product as much as possible
Plan the growth of GitLab's database infrastructure
Design, build and maintain core database infrastructure components that allow GitLab to scale to support hundreds of thousands of concurrent users
Support and debug database production issues across services and levels of the stack
Make monitoring and alerting trigger on symptoms rather than on outages
Document every action so your learnings turn into repeatable actions and then into automation

Qualifications

PostgreSQL · Infrastructure automation · SQL · PL/pgSQL · SaaS distributed systems · Data modeling · Programming skills · Monitoring · Alerting · Documentation · Collaboration · Communication skills

Required

Have primary experience running PostgreSQL in high-growth, large production environments using both self-managed deployments (VMs, Kubernetes with modern PostgreSQL Operators) and DBaaS services
Have hands-on experience using data from PostgreSQL internals to design, build and troubleshoot systems
Have primary experience with infrastructure automation, orchestration and configuration management (Chef, Ansible, Puppet, Terraform)
Have solid understanding of SQL and PL/pgSQL
Have significant experience working in a large SaaS distributed-systems production environment
Share our values, and work in accordance with those values
Have excellent written and verbal English communication skills, with an urge to collaborate and communicate asynchronously
Have an urge to document all the things so you don't need to learn the same thing twice, and an urge for delivering quickly and iterating fast
Have a proactive, go-for-it attitude. When you see something broken, you can't help but fix it
Solid data modeling and data structure design skills

Preferred

Solid programming skills as a (former) backend engineer, preferably with Ruby and/or Go
Experience with ClickHouse or another modern OLAP database

Company

GitLab is a web-based Git repository manager that offers a variety of features for software development teams.

Funding

Current Stage: Public Company
Total Funding: $413.5M
Key Investors: ICONIQ Growth · Google Ventures · August Capital
2021-10-14 · IPO
2019-09-17 · Series E · $268M
2018-09-19 · Series D · $100M

Leadership Team

Bill Staples · Chief Executive Officer
Sytse Sijbrandij · Co-Founder and Executive Chair
Company data provided by Crunchbase