Senior Site Reliability Engineer jobs in United States

Gotham Technology Group · 4 hours ago

Senior Site Reliability Engineer

Gotham Technology Group is seeking a hands-on Senior Site Reliability Engineer (SRE) to support the reliability, scalability, and performance of production systems. This role focuses on observability, automation, and operational excellence, partnering closely with development, operations, and product teams to improve system health and ensure resilient services.
Information Services, Information Technology
Diversity & Inclusion
Hiring Manager
Finlay Walker

Responsibilities

Design, implement, and support automated deployment, monitoring, and alerting solutions
Build, manage, and maintain scalable infrastructure using Infrastructure as Code (IaC) tools
Own and support Grafana dashboards, visualizations, and data sources for reliability and performance monitoring
Maintain and extend existing Python automation and data ingestion jobs integrating data from multiple systems into an infrastructure data warehouse
Improve observability through enhanced logging, metrics, monitoring, and alerting strategies
Diagnose and resolve production issues quickly while minimizing downtime and business impact
Partner with development and operations teams to improve system reliability, scalability, and efficiency
Create and maintain operational documentation, runbooks, and best practices
Support knowledge transfer and continuity for existing automation and monitoring solutions

Qualifications

Grafana, Python, Infrastructure as Code, CI/CD pipelines, Cloud environments, Troubleshooting, Networking best practices, Security best practices, Communication, Collaboration, Fast-paced environment

Required

Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
Proven experience in an SRE, Platform Engineer, or similar production-focused role
Strong hands-on experience with Grafana for monitoring, visualization, and operational analysis
Strong programming or scripting skills, particularly in Python
Experience with Infrastructure as Code tools such as Terraform, Ansible, or equivalent
Experience designing or supporting automated deployment and monitoring solutions
Familiarity with CI/CD pipelines and related tooling
Experience working in cloud environments such as AWS, Azure, or Google Cloud
Strong troubleshooting skills and ability to operate calmly in production environments
Excellent communication skills and ability to collaborate across technical teams
Comfortable working in a fast-paced, evolving environment

Preferred

Experience with Prometheus or other time-series databases
Knowledge of containerization and orchestration technologies such as Docker and Kubernetes
Understanding of networking and security best practices
Experience with database administration, performance tuning, or optimization
Exposure to data pipelines or operational reporting platforms

Company

Gotham Technology Group

Gotham Technology Group is a provider of guidance and direction to IT professionals.

Funding

Current Stage: Growth Stage

Leadership Team

Ira Silverman
CEO
Company data provided by Crunchbase