MA CAPITAL U.S. LLC · 3 hours ago
Site Reliability Engineer
MA Capital US LLC is a proprietary trading firm specializing in systematic and high-performing discretionary strategies across multiple asset classes. They are seeking a Site Reliability Engineer to support and evolve their production trading environment with a strong focus on Linux performance, automation, and reliability.
Financial Services · Risk Management · Trading Platform
Responsibilities
Own the availability, stability, and performance of Linux-based trading systems (Red Hat, Rocky, Ubuntu)
Lead and participate in incident response, on-call rotations, and post-incident reviews, producing clear and blameless post-mortems while implementing automation or process improvements to prevent repeat failures
Develop and maintain runbooks, documentation, and operational standards to ensure consistent, repeatable production support
Partner with developers and traders to ensure systems are designed and deployed with reliability, performance, and operational readiness in mind
Perform OS- and system-level tuning (CPU topology, IRQ affinity, memory, networking) to support deterministic, latency-sensitive workloads
Diagnose complex performance issues using perf, ftrace, tcpdump, and eBPF
Treat infrastructure and system configuration as version-controlled, reproducible code using Ansible, Terraform, Python, and shell scripting, ensuring systems are built and optimized consistently
Design and improve CI/CD pipelines that incorporate automated testing and performance validation prior to production release
Support and automate core infrastructure services including DNS, NFS, LDAP/Active Directory, and multicast networking
Build and evolve monitoring, alerting, and logging for trading-critical systems, improving alert quality, response times, and operational visibility
Reduce operational toil through automation, tooling, and standardization, implementing automated remediation for known failure scenarios
Identify gaps in operational practices and help drive the organization toward proactive reliability engineering through scalable processes and tooling
Qualifications
Required
4–8+ years of experience in Site Reliability Engineering, Linux engineering, DevOps, or infrastructure-focused roles
Hands-on experience supporting highly available, performance-sensitive systems in production environments
Deep understanding of Linux internals, including scheduling, memory management, interrupts, filesystems, and storage behavior
Strong knowledge of TCP/IP, UDP, multicast, and networked services
Proficiency with Ansible, Terraform, Python, shell scripting, YAML/JSON, and Git-based workflows
Experience with Docker (or similar platforms) and familiarity with observability stacks such as Prometheus, Grafana, ELK, or comparable tooling
Experience supporting production databases and integrating system and application logs with centralized logging or SIEM platforms
Strong documentation skills and a solid understanding of incident management and on-call best practices
Benefits
Comprehensive Health Coverage: Medical, dental, and vision insurance.
401(k) Retirement Plan: Helping you and your family plan for a secure financial future.
Company
MA CAPITAL U.S. LLC
We are a nimble trading firm focused on maximizing the opportunities of alpha generation through innovation in data science, technology and risk management, brought together with rigorous discipline and teamwork.
Funding
Current Stage
Early Stage