ElastixAI · 1 month ago
AI Compiler and Performance Engineer
ElastixAI is an early-stage startup on a mission to reinvent AI inference infrastructure from the ground up. They are seeking a deeply technical AI Compiler & Performance Engineer who will design and optimize AI compute stacks, collaborating with hardware and ML teams to improve inference efficiency.
Artificial Intelligence (AI) · Generative AI · Machine Learning
Responsibilities
Break down LLM and transformer workloads into fine-grained primitives tailored to our proprietary compute hardware
Design and implement IR transformations, graph optimizations, kernel lowering, and code generation for novel hardware architectures
Collaborate with ML researchers to co-design algorithmic optimizations that yield real end-to-end performance gains
Work closely with hardware architects to refine microarchitectural features, instruction sets, memory hierarchies, and execution models
Build performance models, profiling tools, and benchmarking frameworks to identify bottlenecks and guide design decisions
Prototype and validate improvements across the entire stack — from PyTorch/XLA-level passes to custom kernel implementations
Contribute to shaping the overall system architecture of a first-of-its-kind inference engine
Qualifications
Required
BS/MS/PhD in Computer Science, Software Engineering, or a related field
Deep experience building compilers, optimizing kernels, or working with ML frameworks at a systems level
Strong proficiency in one or more programming languages, such as Python or C++
Strong understanding of one or more of the following: LLM architectures and transformer internals; MLIR, LLVM, XLA, TVM, Triton, or similar compiler infrastructures; GPU/TPU/FPGA/ASIC compute models, memory hierarchies, and parallel execution; quantization, sparsity, or algorithmic optimization for deep learning
Deep expertise in ML frameworks (e.g., PyTorch, TensorFlow, JAX) and an understanding of ML model deployment challenges
Solid understanding of software engineering best practices, including data structures, algorithms, and testing
A habit of thinking in terms of latency, cycles, memory bandwidth, and arithmetic intensity, not just algorithms
Excellent problem-solving abilities and a knack for tackling complex technical challenges
Excitement about collaborating across ML, hardware, and software boundaries to invent something fundamentally new
Strong communication skills and a proven ability to collaborate effectively in a cross-functional team environment
Ability to thrive in a fast-paced, dynamic startup environment
Preferred
PhD in Computer Science, Software Engineering, or a related field
Experience with custom hardware accelerators for ML inference
Contributions to open-source compiler or ML systems projects
Prior startup experience or background building first-generation systems
Benefits
Comprehensive medical, dental, and vision coverage (100% paid by employer)
Life insurance and AD&D
Flexible Time Off (FTO)
12 paid holidays
Paid parental leave
Gym or fitness benefit
Commuter benefit
Weekly catered lunches in the office
Investment in employee learning & development
Company
ElastixAI
ElastixAI is developing an AI inference platform designed to optimize how large language models are run.
H1B Sponsorship
ElastixAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart)
Trends of Total Sponsorships: 2025 (1)
Funding
Current Stage: Early Stage
Total Funding: $16M
Key Investors: FUSE
2025-05-14 · Series Unknown · $16M
Company data provided by Crunchbase