DeepRec.ai · 2 days ago
Machine Learning Compiler Engineer
Responsibilities
Lower deep learning graphs from common frameworks (PyTorch, TensorFlow, Keras, etc.) to an intermediate representation (IR) for training, with a particular focus on ensuring reproducibility.
Write novel algorithms for transforming intermediate representations of compute graphs between different operator representations.
Take ownership of two of the following compiler areas:
• Front-end: Integrate common Deep Learning Frameworks with our internal IR, and implement transformation passes in ONNX to adapt IR for middle-end consumption.
• Middle-end: Design compiler passes for training-based compute graphs, integrate reproducible Deep Learning kernels into the code generation stage, and debug compilation passes and transformations.
• Back-end: Translate IR from the middle-end to GPU target machine code.
Qualifications
Required
Fundamental knowledge of traditional compilers (e.g., LLVM, GCC) and graph traversals necessary for compiler code development.
Strong software engineering skills, demonstrated by contributing to and deploying production-grade code.
Understanding of parallel programming, particularly concerning GPUs.
Willingness to learn Rust, as it is our company's default programming language.
Ability to work with high-level IR (Clang/LLVM) through middle-end optimization, and/or low-level IR (LLVM targets and target-specific optimizations), especially GPU-specific optimizations.
Highly self-motivated with excellent verbal and written communication skills.
Comfortable working independently in an applied research environment.
Preferred
Thorough understanding of computer architectures specialized for training neural network graphs (e.g., Intel Xeon CPU, GPUs, TPUs, custom accelerators).
Experience in systems-level programming with Rust.
Contributions to open-source compiler stacks.
In-depth knowledge of compilation in relation to High-Performance Computer architectures (CPU, GPU, custom accelerator, or a heterogeneous system).
Strong foundation in CPU and GPU architectures, numeric libraries, and modular software design.
Understanding of recent architecture trends and fundamentals of Deep Learning, along with experience with machine learning frameworks and their internals (e.g., PyTorch, TensorFlow, scikit-learn, etc.).
Exposure to Deep Learning Compiler frameworks like TVM, MLIR, TensorComprehensions, Triton, JAX.
Experience in writing and optimizing highly-performant GPU kernels.