Oxmiq Labs · 1 day ago

Senior Compiler Backend Engineer

Campbell, CA

Full-time

Onsite

Senior Level

7+ years exp

OXMIQ Labs is re-architecting the GPU stack to build a licensable GPU hardware and software platform for next-generation AI and graphics workloads. The Senior Compiler Backend Engineer will own the compiler backend for OxCore hardware IP, designing and implementing the lowering pipeline and optimizing performance for Python and CUDA workloads.

Computer Software

H1B Sponsor Likely

Hiring Manager

Mark Hirsch

Responsibilities

Own the OxCore compiler backend

Design and implement the OxCore codegen backend (likely on top of LLVM/MLIR or similar) from high-level IR down to OxCore’s instruction set / micro-ops

Define and evolve OxCore-specific IR dialects, calling conventions, and ABI details across scalar, vector, and tensor engines

Implement lowering passes that map Python/CUDA-like kernels and ML operators to OxCore execution units and memory hierarchy

Build OxCore-aware optimization passes: instruction selection and scheduling across heterogeneous units; register allocation tuned for OxCore’s register files; memory-access shaping for near-/in-memory compute, coalescing, tiling, and locality; and warp/SIMD-style utilization given OxCore’s SIMT/SIMD/CUDA-compatible execution model

Develop cost models and auto-tuning hooks that understand different OxCore/OxQuilt configurations (ratios of compute, memory, and interconnect)

Collaborate with the OXPython team to ensure Python-based CUDA workloads lower efficiently onto OxCore, preserving semantics while exploiting OxCore features

Work with Capsule/runtime engineers on kernel launch strategies and stream/queue design, heterogeneous dispatch across OxCore and other accelerators, and profiling hooks and debug interfaces

Work closely with OxCore architecture and OxQuilt teams to capture hardware capabilities and constraints into compiler models and to co-design micro-architectural features that unlock compiler-driven performance

Use pre-silicon models, simulators, and FPGA/emulation platforms to validate correctness and drive performance prior to customer silicon

Provide technical leadership for OxCore backend architecture, coding standards, and design reviews

Mentor other engineers on compiler/backend internals, GPU/accelerator performance, and RISC-V nuances

Qualifications

Compiler backend experience, LLVM/MLIR knowledge, GPU architecture intuition, Modern C++ proficiency, RISC-V exposure, Performance profiling tools, Pre-silicon environments, Technical leadership, Mentoring skills, Collaboration skills

Required

7+ years of experience in compiler backend / codegen / low‑level performance engineering (title flexible for exceptional candidates)

Have shipped or led substantial work on compiler backends (LLVM, GCC, MLIR, custom) targeting GPUs or accelerators

Understand deeply: SSA/IR design, CFGs, dataflow, instruction selection & scheduling, register allocation strategies, loop transforms, tiling, vectorization

Have strong GPU/accelerator architecture intuition: SIMD/SIMT, warps/wavefronts, occupancy, memory hierarchies (local/shared, HBM/DRAM, scratchpad), throughput vs latency trade‑offs

Are fluent in modern C++ (and/or Rust) for large systems codebases

Have meaningful exposure to RISC‑V or other ISA‑level work (writing backends, intrinsics, or hand‑tuned assembly is a plus)

Know how to profile and optimize: you've used tools like Nsight, perf, VTune, ROCm tools, or custom profilers to chase down performance wins

Are comfortable operating in pre‑silicon environments (simulators, emulators, performance modeling)

Enjoy working across boundaries: hardware, compilers, runtimes, and ML/graphics workloads

Preferred

Experience with ML compilers / DSLs (e.g., MLIR, TVM, XLA, Triton, Halide, IREE)

Background in GPU IP or licensable core design flows (ARM, IP providers, or custom accelerators)

Familiarity with Python‑first or CUDA‑centric toolchains, and porting CUDA workloads to new backends

Experience with chiplet / heterogeneous SoC design constraints or HW/SW co‑design

Contributions to open‑source compilers or runtimes

Prior work in AI, graphics, or multimodal workloads (rendering, path tracing, transformer models, etc.)

Company

Oxmiq Labs

OXMIQ delivers a comprehensive GPU hardware IP stack scalable from edge AI devices.

Founded in 2023

Campbell, California, USA

11-50 employees

https://oxmiq.ai

H1B Sponsorship

Oxmiq Labs has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference (data from the US Department of Labor).

[Chart: Distribution of Different Job Fields Receiving Sponsorship]

Trends of Total Sponsorships: 2025 (1)

Funding

Current Stage

Early Stage

Total Funding

$20M

Key Investors

MediaTek

2025-08-04 · Seed · $20M

Company data provided by Crunchbase