Magic · 1 hour ago
Distributed Systems Engineer: Secure Sandboxes
Magic is on a mission to build safe AGI to accelerate humanity's progress on important problems. As a Distributed Systems Engineer, you will develop systems for large-scale AI research, focusing on sandboxed execution environments and distributed systems orchestration while collaborating with ML and infrastructure teams.
Artificial Intelligence (AI) · Information Technology · Machine Learning
Responsibilities
Build highly scalable, high-performance software that supports arbitrary code execution with strong isolation guarantees
Design and build systems that allow our AI models to interface with machines in various modes, such as interactive terminals and GUI applications
Provision and operate high-density compute and storage nodes (NVMe, high-IOPS SSDs, high-bandwidth networks), and build software that performs efficient load balancing and resource utilization across them
Instrument and optimize end-to-end performance, including storage IO, network bandwidth, CPU, memory, and endurance constraints
Develop APIs, self-service platforms, automation, and tools so researchers and engineers can deploy and monitor workloads at scale
Troubleshoot complex infrastructure issues across OS, drivers, hardware, storage systems (local NVMe, block storage, NFS), networking, namespace isolation, and cloud or hybrid environments
Produce clean, documented code and developer workflows, and collaborate with SRE and security teams to ensure safe, reliable, self-service compute offerings
Qualifications
Required
Strong software engineering background (C, C++, Go, Rust, or similar systems languages)
Experience designing or operating sandboxed or isolated execution environments (namespaces, cgroups, container runtime internals), or strong interest in this area
Experience building or operating distributed systems or parallel processing frameworks (scatter-aggregate processing, worker pools, multi-threaded and multi-process coordination, shared memory, atomics, merging strategies)
Solid understanding of storage and IO subsystems (NVMe, SSD endurance, write amplification), network performance, and CPU and memory resource constraints in high-performance compute clusters
Comfortable working on low-level systems (OS, threading, memory management, synchronization) as well as higher-level orchestration or automation
Experience with cloud infrastructure (GCP, AWS, Azure, etc.) including IaC tools such as OpenTofu, Terraform, Pulumi, or CDK is a plus
Intellectual curiosity, strong ownership, and the ability to make tradeoffs, such as latency versus throughput and isolation versus performance, in ambiguous environments
Preferred
Prior experience with GPU scheduling, RDMA networking, or bare-metal HPC clusters
Contributions to open-source container runtimes or sandboxing frameworks
Experience with kernel internals, device drivers, or SSD and NVMe endurance modeling
Familiarity with Rust for systems programming or Go for infrastructure orchestration
Benefits
Significant equity component
401(k) with matching
Comprehensive health, dental, and vision insurance
Unlimited paid time off
Visa sponsorship and relocation support
Company
Magic
Magic is an AI coding startup that enables developers to work with AI to find code for building apps.
Funding
Current Stage: Growth Stage
Total Funding: $465.93M
Key Investors: Flat Capital, NFDG Ventures, CapitalG
2025-01-14 · Series Unknown · $0.81M
2024-08-29 · Series Unknown · $320M
2024-02-16 · Series B · $117M
Company data provided by Crunchbase