Phenom · 1 week ago
Site Reliability Engineer (Onsite/Hybrid)
Phenom is an AI-powered talent experience platform that is redefining the HR tech space. As a Site Reliability Engineer (SRE), you will ensure the reliability, performance, and operational excellence of the platform while collaborating closely with Engineering, Product, and Platform teams.
Artificial Intelligence (AI), Human Resources, Machine Learning, Recruiting, SaaS, Software
Responsibilities
Ensure high availability and performance across Phenom’s cloud-native infrastructure, services, and databases, with ownership or working knowledge of SLIs/SLOs/SLAs, error budgets, and reporting
Integrate security into CI/CD pipelines, scan for vulnerabilities, and handle incidents involving security breaches
Apply auto-scaling, load balancing, and capacity planning methodologies
Use GitOps workflows for automated infrastructure and configuration management
Collaborate with platform engineering on developer enablement and toolchain automation
Implement and support infrastructure, application, and database changes following governance policies and ServiceNow-based change workflows
Serve as a key technical responder during Major Incidents, collaborating with cross-functional teams to rapidly restore service, communicate status, and drive post-incident actions
Contribute to Root Cause Analysis (RCA) efforts and provide technical input for corrective and preventive actions (CAPAs)
Actively support and execute production deployments, ensuring readiness, rollback planning, and validation during releases and patches, including database schema/version changes
Qualifications
Required
5+ years of experience in Cloud Ops/DevOps/SRE/Software engineering with hands-on responsibility for production systems
Proficient in one or more programming/scripting languages (e.g., Python, JavaScript/TypeScript, Java)
Hands-on experience with cloud compute, networking, and storage
Tooling expertise in Kubernetes, ArgoCD, Helm, and Linkerd/Istio/Nginx
Experience with public cloud platforms (AWS, GCP, or Azure)
Experience with Kafka, Redis, MongoDB, and relational databases (e.g., PostgreSQL, MySQL, or Aurora)
Strong understanding of production Change Management processes and use of ServiceNow for change execution and tracking
Proven experience supporting and executing production deployments in structured release environments, including database updates
Familiarity with observability tooling and best practices for monitoring and diagnostics of both applications and databases
Experience with CI/CD, container orchestration, and Infrastructure as Code
Solid Linux system administration and troubleshooting skills
US 'Secret' clearance will be required (Must be a US Citizen)
Preferred
Familiarity with SaaS platforms
Prior experience handling federal compliance requirements such as FIPS, FedRAMP, and FISMA
Experience participating in release readiness reviews, Go/No-Go meetings, and Early Life Support (ELS)
Exposure to structured RCA methodologies and configuration item tracking in a CMDB
Understanding of ITSM practices and service lifecycle principles
Prior experience as a Database Reliability Engineer (DBRE) or supporting mission-critical databases at scale
Company
Phenom
Phenom is an HR technology company that uses AI to help businesses hire, develop, and retain employees.
Funding
Current Stage: Late Stage
Total Funding: $161.42M
Key Investors: B Capital, WestBridge Capital, AVP
2021-04-07 · Series D · $100M
2020-01-16 · Series C · $30M
2018-05-24 · Series B · $22M
Company data provided by Crunchbase