Altana · 4 hours ago
Staff Site Reliability Engineer
Altana is the network for trusted trade, empowering governments and businesses to build a more resilient and secure global economy. As a Staff Site Reliability Engineer, you will ensure the availability, performance, and scalability of Altana’s critical production services, focusing on cloud-native environments and data pipelines.
Data Integration · Logistics · Software · Supply Chain Management
Responsibilities
Reliability Engineering: Champion and implement SRE principles, including establishing and monitoring Service Level Objectives (SLOs) and error budgets for critical services. Drive initiatives to improve system reliability, availability, performance, and efficiency
Observability & Monitoring: Design, implement, and maintain advanced monitoring, logging, and tracing solutions for our cloud-native applications and infrastructure (e.g., Kubernetes, microservices). Develop dashboards, alerts, and runbooks that provide deep insights into system health and behavior
Automation & Toil Reduction: Identify and automate repetitive operational tasks and manual processes across our production environment. Develop tools and scripts to enhance system operations, deployment pipelines, and incident response
Incident Management & Postmortems: Actively participate in the incident response lifecycle, including detection, triage, mitigation, and resolution of production issues. Lead thorough blameless postmortems to identify root causes and implement preventative measures and lasting improvements
System Design & Optimization: Collaborate closely with development teams to influence the design of new services, ensuring they are built for operability, reliability, and cost-efficiency. Proactively identify and address performance bottlenecks and architectural weaknesses
On-Call Rotation: Participate in a periodic on-call rotation, responding to critical alerts and ensuring rapid resolution of production incidents
Data Reliability: Implement and maintain reliability and observability for critical data pipelines and data infrastructure, ensuring data integrity, availability, and timely processing
Qualifications
Required
5+ years of hands-on experience in a Site Reliability Engineering (SRE), DevOps, or equivalent role focusing on production system reliability and operations
Strong understanding and practical application of Site Reliability Engineering (SRE) principles, including SLOs, error budgets, toil reduction, and blameless culture
Expertise in designing, implementing, and managing observability platforms for cloud-native environments (e.g., Prometheus, Grafana, Datadog, ELK stack, OpenTelemetry, Jaeger)
Proficiency in at least one programming/scripting language (e.g., Python, Go) for automation and tool development
Extensive hands-on experience with cloud platforms (AWS, Azure, or GCP), including their compute, networking, and database services
Demonstrated experience with containerization technologies (Docker) and container orchestration platforms (Kubernetes)
Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, OpenTofu, CloudFormation) for managing cloud resources
Proven experience participating in and improving incident management processes for critical systems
Knowledge of modern software delivery paradigms, including microservices architectures and CI/CD pipelines
Excellent problem-solving, analytical, and troubleshooting skills in complex distributed systems
Strong communication and collaboration skills, with the ability to work effectively across engineering teams
Experience with data engineering concepts, including building or operating reliable data pipelines, data streaming technologies, or managing large-scale data infrastructure
Benefits
Flexible Time Off
Paid Parental Leave
Health Benefits
Supplemental Benefits
401(k) Savings
Commuter Benefits
Wellness
Pet Insurance
Employee Assistance Program
Dependent Care FSA
Company
Altana
Altana is the only Product Network connecting buyers, suppliers, logistics providers & government agencies across the global supply chain.
H1B Sponsorship
Altana has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (8)
2024 (4)
2023 (3)
2022 (1)
Funding
Current Stage: Growth Stage
Total Funding: $322M
Key Investors: US Innovative Technology Fund, Activate Capital Partners, Google Ventures
2024-07-29 · Series C · $200M
2022-10-03 · Series B · $100M
2021-09-20 · Series A · $15M
Company data provided by Crunchbase