MA CAPITAL U.S. LLC · 3 hours ago

Site Reliability Engineer

Chicago, IL

Full-time

Onsite

Mid, Senior Level

4+ years exp

MA Capital U.S. LLC is a proprietary trading firm specializing in systematic and high-performing discretionary strategies across multiple asset classes. The firm is seeking a Site Reliability Engineer to support and evolve its production trading environment, with a strong focus on Linux performance, automation, and reliability.

Financial Services · Risk Management · Trading Platform

Responsibilities

Own the availability, stability, and performance of Linux-based trading systems (RedHat, Rocky, Ubuntu)

Lead and participate in incident response, on-call rotations, and post-incident reviews, producing clear and blameless post-mortems while implementing automation or process improvements to prevent repeat failures

Develop and maintain runbooks, documentation, and operational standards to ensure consistent, repeatable production support

Partner with developers and traders to ensure systems are designed and deployed with reliability, performance, and operational readiness in mind

Perform OS- and system-level tuning (CPU topology, IRQ affinity, memory, networking) to support deterministic, latency-sensitive workloads

Diagnose complex performance issues using perf, ftrace, tcpdump, and eBPF

Treat infrastructure and system configuration as version-controlled, reproducible code using Ansible, Terraform, Python, and shell scripting, ensuring systems are consistently built and optimized

Design and improve CI/CD pipelines that incorporate automated testing and performance validation prior to production release

Support and automate core infrastructure services including DNS, NFS, LDAP/Active Directory, and multicast networking

Build and evolve monitoring, alerting, and logging for trading-critical systems, improving alert quality, response times, and operational visibility

Reduce operational toil through automation, tooling, and standardization, implementing automated remediation for known failure scenarios

Identify gaps in operational practices and help drive the organization toward proactive reliability engineering through scalable processes and tooling

Qualifications

Linux performance · Automation · Reliability engineering · Ansible · Terraform · Python · Docker · TCP/IP · Incident management · Documentation skills

Required

4–8+ years of experience in Site Reliability Engineering, Linux engineering, DevOps, or infrastructure-focused roles

Hands-on experience supporting highly available, performance-sensitive systems in production environments

Deep understanding of Linux internals, including scheduling, memory management, interrupts, filesystems, and storage behavior

Strong knowledge of TCP/IP, UDP, multicast, and networked services

Proficiency with Ansible, Terraform, Python, shell scripting, YAML/JSON, and Git-based workflows

Experience with Docker (or similar platforms) and familiarity with observability stacks such as Prometheus, Grafana, ELK, or comparable tooling

Experience supporting production databases and integrating system and application logs with centralized logging or SIEM platforms

Strong documentation skills and a solid understanding of incident management and on-call best practices

Benefits

Comprehensive Health Coverage: Medical, dental, and vision insurance.

401(k) Retirement Plan: Helping you and your family plan for a secure financial future.

Company

MA CAPITAL U.S. LLC

We are a nimble trading firm focused on maximizing the opportunities of alpha generation through innovation in data science, technology and risk management, brought together with rigorous discipline and teamwork.

Founded in 2022

Dubai, United Arab Emirates

11-50 employees

https://ma-capital.com

Funding

Current Stage

Early Stage

Company data provided by Crunchbase