
hackajob · 1 week ago

Mid-Level DevOps / SRE 3 - Chicago, IL OR Denver, CO - Onsite

hackajob is collaborating with Comcast to connect exceptional tech professionals for a Site Reliability Engineer (SRE) 3 role. The successful candidate will be responsible for ensuring system reliability, scalability, and performance while collaborating with various engineering and operations teams to manage infrastructure and resolve technical issues.

Artificial Intelligence (AI), Generative AI, Human Resources, Recruiting, Software

Responsibilities

System Monitoring and Optimization: Design and implement monitoring and alerting systems to ensure the stability, reliability, and performance of data platforms. Join the on-call rotation to respond to and resolve issues quickly
Automation and Tool Development: Develop and maintain automation tools and scripts for deployment, monitoring, backup and disaster recovery
Performance Optimization: Analyze and optimize data storage, query performance, and data flows to process large-scale datasets efficiently, reduce latency, and improve processing speed
Incident Response and Troubleshooting: Respond quickly to platform failures, perform troubleshooting, and coordinate cross-team efforts to resolve issues and ensure high availability and reliability
Capacity Planning and Scaling: Work with engineering teams to analyze and forecast capacity requirements and scale infrastructure so the system can handle traffic growth. Support Freewheel-powered live events
Documentation and Knowledge Sharing: Document the architecture, configurations, and operational procedures for platforms, ensuring knowledge is shared across the team and providing relevant training
Security and Compliance: Ensure platforms meet security standards and compliance requirements to prevent breaches or misuse
Cross-Team Collaboration: Collaborate with the engineering, product, and project management teams to support product design and implementation and to solve reliability-related issues

Qualifications

Cloud platforms, Terraform, Automation tools, Database management, Programming skills, System monitoring, Troubleshooting, Proactive learner, Team collaboration

Required

At least 3 years of experience as an SRE, DevOps or Operations Engineer
Experience with cloud platforms (e.g. AWS, OCI, GCP, Azure)
Hands-on experience with Terraform and infrastructure-as-code principles
Proficiency in automation tools and frameworks (e.g. Ansible, Terraform, Kubernetes, Docker) for automating system deployment and maintenance
Familiarity with modern data architectures and technologies, including big data platforms (e.g. Kafka, Hadoop, Spark) and distributed storage (e.g. Cassandra, HDFS, AWS S3)
Extensive experience in database management (e.g. NoSQL databases, MySQL, PostgreSQL)
Programming Skills: Proficient in at least one programming language, such as Python, Go, Java, or Scala, with the ability to write efficient scripts and automation tools
System Monitoring and Log Management: Familiar with using monitoring and log management tools such as Prometheus, Grafana, ELK Stack, or other similar tools
Troubleshooting and Debugging: Strong debugging and troubleshooting skills, with the ability to quickly identify and resolve production issues
Team Collaboration and Communication: Excellent communication skills with the ability to convey technical information clearly and concisely to both technical and non-technical stakeholders
Proactive learner eager to grow in operations and governance
Education: Bachelor's degree or higher in Computer Science, Software Engineering, or a related field

Benefits

Comcast provides best-in-class benefits to eligible employees.
We believe that benefits should connect you to the support you need when it matters most, and should help you care for those who matter most.
That's why we provide an array of options, expert guidance, and always-on tools that are personalized to meet the needs of your reality, to help support you physically, financially, and emotionally through the big milestones and in your everyday life.

Company

hackajob

The AI-native tech hiring platform trusted by enterprises, scale-ups, and 1M+ tech professionals worldwide.

Funding

Current Stage: Growth Stage
Total Funding: $33M
Key Investors: Volition Capital, Downing Ventures, Techstars
2023-05-03 · Series B · $25M
2018-10-25 · Series A · $6.7M
2017-03-31 · Seed · $0.58M

Leadership Team

Mark Chaffey, CEO
Phil Kell, VP - Marketplace
Company data provided by crunchbase