
Etched · 1 hour ago

Inference Software Engineer - Collectives

Etched is developing an AI inference system designed specifically for transformers. The Inference Software Engineer will optimize collectives and improve runtime performance to meet the demands of frontier inference applications.

Artificial Intelligence (AI), Computer, Hardware, Semiconductor
H1B Sponsor Likely

Responsibilities

Formalize and optimize our collectives (e.g., Send/Receive, AllReduce, Broadcast)
Collaborate across systems and research teams to bring MoE architectures to Sohu’s runtime
Optimize expert routing and communication layers using Sohu’s collectives
Contribute to scaling and enhancing Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling
Develop tools for performance profiling and debugging, identifying bottlenecks and correctness issues

Qualifications

Rust, C++, Collectives optimization, Distributed systems, Linux internals, Performance profiling, User-facing interfaces, PyTorch, JAX, Large language models, Network simulation, Low-latency applications, SIMD optimizations

Required

Strong proficiency in Rust and/or C++; familiarity with PyTorch and/or JAX
Experience designing or optimizing collectives (e.g., NCCL, MPI collectives, XLA collectives)
Strong systems knowledge, including Linux internals, accelerator architectures (e.g., GPUs, TPUs), high-speed interconnects (e.g., NVLink, InfiniBand) and RDMA
Solid understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols, consistency models, and communication patterns
Experience analyzing performance traces and logs from distributed systems and ML workloads
A knack for designing user-facing interfaces and libraries, and enjoyment in finding that elusive optimum between performance and usability

Preferred

Familiarity with large language model architectures, particularly Mixture-of-Experts (MoE)
Familiarity with network simulation techniques
Developed low-latency, high-performance applications using both kernel-level and user-space networking stacks
Ported applications to non-standard or accelerator hardware platforms
Contributed to runtime systems with complex, well-documented interfaces, such as distributed storage systems or machine learning runtimes
Built applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths

Benefits

Full medical, dental, and vision packages, with generous premium coverage
Housing subsidy of $2,000/month for those living within walking distance of the office
Daily lunch and dinner in our office
Relocation support for those moving to West San Jose

Company

Etched

Building the hardware for superintelligence

H1B Sponsorship

Etched has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (9)
2024 (11)
2023 (1)

Funding

Current Stage: Growth Stage
Total Funding: $125.36M
Key Investors: Primary Venture Partners
2024-06-25 · Series A · $120M
2023-05-16 · Seed · $5.36M

Leadership Team

Chris Zhu, Co-Founder
Robert Wachen, Co-Founder and President
Company data provided by Crunchbase