Autonomai Recruitment · 3 hours ago
Site Reliability Engineer
Autonomai Recruitment is seeking a Site Reliability Engineer to operate at the core of high-performance infrastructure powering global trading environments. The role involves deep systems engineering across distributed infrastructure and production trading systems, focusing on performance, reliability, and automation.
Responsibilities
Designing and building tooling that maintains large-scale production infrastructure
Debugging complex distributed systems and application stacks
Root cause analysis and deep troubleshooting across global environments
Linux systems engineering for performance-critical platforms
Observability, automation, and operational tooling
On-prem Kubernetes cluster engineering
Infrastructure configuration management and orchestration
Supporting globally distributed compute and trading environments
Cross-functional collaboration on platform engineering projects
Qualifications
Required
Deep Linux expertise
Strong debugging and profiling capability
Experience designing and maintaining complex systems
Solid scripting or programming proficiency (Python/Go/Rust)
Experience operating production infrastructure at scale
Root-cause-driven engineering mindset
Curiosity for new technologies and difficult technical challenges
Preferred
Linux systems engineering
Networking fundamentals
Distributed systems
Containerisation / Kubernetes
Observability stacks (Prometheus / Grafana / ClickHouse etc.)
Configuration management (Salt, Puppet, Ansible)
Scripting & development (Go, Python, Rust)
Automation & infrastructure tooling
HPC / high-scale compute environments
Company
Autonomai Recruitment
Autonomai Recruitment is a boutique search agency specializing in tailored recruitment solutions for FinTech, Crypto, and AI.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase