Ooma, Inc. · 7 hours ago
Site Reliability Engineer
Ooma empowers people to connect through its cloud-based communication platform. The Site Reliability Engineer will ensure the stability and efficiency of systems by leveraging expertise in Linux systems, virtualization, and CI/CD pipelines, while collaborating with teams to implement best practices for infrastructure management and application performance monitoring.
Telecommunications
Responsibilities
Monitor and troubleshoot system performance, reliability, and availability issues using modern observability tools and techniques, with a strong emphasis on diagnosing and resolving issues in operating systems and bare metal environments
Design, implement, and maintain scalable and reliable infrastructure using containers, Kubernetes, and microservices architecture
Manage CI/CD pipelines to facilitate efficient software development and deployment processes
Implement GitOps workflows using ArgoCD or Flux, and manage Helm charts and Kustomize configurations for declarative application deployment and version control
Oversee configuration management to ensure consistent and reliable software releases across environments, using Ansible for system configuration, patch management, and provisioning across datacenter infrastructure
Design and operate high-throughput Kafka clusters for event streaming, managing topics, partitions, replication, consumer lag monitoring, and disaster recovery strategies across datacenter infrastructure
Collaborate with development teams to influence system design choices and operational policies
Provide expert guidance on managing large data centers, including hundreds of bare metal servers and virtual machines (VMs), ensuring optimal configuration and performance
Implement name services and server management practices to support our infrastructure needs
Continuously evaluate and integrate new technologies to enhance operational efficiency and reliability
Participate in on-call rotations to provide support for production systems as necessary, conduct blameless post-mortems with root cause analysis, and maintain incident response runbooks and procedures
Create comprehensive technical documentation, runbooks, architectural diagrams, and network topology maps, and maintain knowledge bases for operational procedures and best practices
Qualifications
Required
Bachelor's degree in Computer Science, Engineering, or a related field; advanced degree preferred
5+ years of experience as an SRE or in a related role, with a strong focus on production systems, containers, microservices, and service delivery
Extensive experience managing and maintaining CI/CD pipelines and the tooling that supports them (GitOps workflows, ArgoCD, Helm charts)
Comprehensive knowledge of observability tools such as Prometheus, the ELK Stack, log collectors, and Grafana for visualization
Extensive on-premises experience managing large data centers with hundreds of bare metal servers and VMs
Deep knowledge of Linux operating systems, their configuration, performance tuning, and troubleshooting
Experience with configuration management tools
Familiarity with networking concepts and protocols in the context of Linux operating systems
Proven ability to analyze complex systems, identify bottlenecks, and implement solutions with strong troubleshooting skills
Excellent communication skills, with the ability to collaborate effectively with cross-functional teams
Preferred
Experience with containers and orchestration technologies, particularly Kubernetes
Benefits
Comprehensive Medical/Dental/Vision insurance for you and eligible dependents
HMO, PPOs, or a PPO with an HDHP (including an HSA, which Ooma helps fund)
Employer Paid Income Protection Benefits (Basic Life and AD&D, Short- and Long-term disability)
FSA Healthcare & Dependent Care
Commuter Benefits
Voluntary Accident, Critical Illness, Hospital Indemnity and Legal
401(k), including employer match, and Roth
Employee Stock Purchase Plan (ESPP)
Paid time off, sick time, and observed corporate holidays
Employee Assistance Program
Life Balance benefits with Travel Assistance Services and Identity Theft
Additional Benefits include a Discount Program, Credit Union, Medicare Assistance, etc.
Company
Ooma, Inc.
Ooma delivers phone, messaging, video and advanced communications services that are easy to implement and provide great value.
H1B Sponsorship
Ooma, Inc. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (3)
2024 (14)
2023 (5)
2022 (14)
2021 (18)
2020 (13)
Funding
Current Stage
Late Stage
Company data provided by crunchbase