Senior Compiler Backend Engineer jobs in United States

Oxmiq Labs · 1 day ago

Senior Compiler Backend Engineer

OXMIQ Labs is re-architecting the GPU stack to build a licensable GPU hardware and software platform for next-generation AI and graphics workloads. The Senior Compiler Backend Engineer will own the compiler backend for OxCore hardware IP, designing and implementing the lowering pipeline and optimizing performance for Python and CUDA workloads.

Computer Software
H1B Sponsor Likely
Hiring Manager: Mark Hirsch

Responsibilities

Own the OxCore compiler backend
Design and implement the OxCore codegen backend (likely on top of LLVM/MLIR or similar) from high-level IR down to OxCore’s instruction set / micro-ops
Define and evolve OxCore-specific IR dialects, calling conventions, and ABI details across scalar, vector, and tensor engines
Implement lowering passes that map Python/CUDA-like kernels and ML operators to OxCore execution units and memory hierarchy
Build OxCore-aware optimization passes: instruction selection and scheduling across heterogeneous units; register allocation tuned for OxCore’s register files; memory-access shaping for near-/in-memory compute, coalescing, tiling, and locality; and warp/SIMD-style utilization given OxCore’s SIMT/SIMD/CUDA-compatible execution model
Develop cost models and auto-tuning hooks that understand different OxCore/OxQuilt configurations (ratios of compute, memory, and interconnect)
Collaborate with the OXPython team to ensure Python-based CUDA workloads lower efficiently onto OxCore, preserving semantics while exploiting OxCore features
Work with Capsule/runtime engineers on kernel launch strategies and stream/queue design; heterogeneous dispatch across OxCore and other accelerators; and profiling hooks and debug interfaces
Work closely with OxCore architecture and OxQuilt teams to capture hardware capabilities and constraints into compiler models, and to co-design micro-architectural features that unlock compiler-driven performance
Use pre-silicon models, simulators, and FPGA/emulation platforms to validate correctness and drive performance prior to customer silicon
Provide technical leadership for OxCore backend architecture, coding standards, and design reviews
Mentor other engineers on compiler/backend internals, GPU/accelerator performance, and RISC-V nuances

Qualifications

Compiler backend experience, LLVM/MLIR knowledge, GPU architecture intuition, Modern C++ proficiency, RISC-V exposure, Performance profiling tools, Pre-silicon environments, Technical leadership, Mentoring skills, Collaboration skills

Required

7+ years of experience in compiler backend / codegen / low‑level performance engineering (title flexible for exceptional candidates)
Have shipped or led substantial work on compiler backends (LLVM, GCC, MLIR, custom) targeting GPUs or accelerators
Understand deeply: SSA/IR design, CFGs, dataflow, instruction selection & scheduling, register allocation strategies, loop transforms, tiling, vectorization
Have strong GPU/accelerator architecture intuition: SIMD/SIMT, warps/wavefronts, occupancy, memory hierarchies (local/shared, HBM/DRAM, scratchpad), throughput vs latency trade‑offs
Are fluent in modern C++ (and/or Rust) for large systems codebases
Have meaningful exposure to RISC‑V or other ISA‑level work (writing backends, intrinsics, or hand‑tuned assembly is a plus)
Know how to profile and optimize: you've used tools like Nsight, perf, VTune, ROCm tools, or custom profilers to chase down performance wins
Are comfortable operating in pre‑silicon environments (simulators, emulators, performance modeling)
Enjoy working across boundaries: hardware, compilers, runtimes, and ML/graphics workloads

Preferred

Experience with ML compilers / DSLs (e.g., MLIR, TVM, XLA, Triton, Halide, IREE)
Background in GPU IP or licensable core design flows (ARM, IP providers, or custom accelerators)
Familiarity with Python‑first or CUDA‑centric toolchains, and porting CUDA workloads to new backends
Experience with chiplet / heterogeneous SoC design constraints or HW/SW co‑design
Contributions to open‑source compilers or runtimes
Prior work in AI, graphics, or multimodal workloads (rendering, path tracing, transformer models, etc.)

Company

Oxmiq Labs

OXMIQ delivers a comprehensive GPU hardware IP stack scalable from edge AI devices.

H1B Sponsorship

Oxmiq Labs has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships: 2025 (1)

Funding

Current Stage: Early Stage
Total Funding: $20M
Key Investors: MediaTek
2025-08-04 · Seed · $20M
Company data provided by Crunchbase