Architecture Intern - Inference jobs in United States

Etched · 2 days ago

Architecture Intern - Inference

Etched is building the world’s first AI inference system purpose-built for transformers, delivering exceptional performance and efficiency. They are seeking an Architecture Intern to contribute to the design of next-generation AI accelerators by developing and optimizing compute architectures for transformer workloads.

Artificial Intelligence (AI), Computer, Hardware, Semiconductor
H1B Sponsor Likely

Responsibilities

Support porting state-of-the-art models to our architecture. Help build programming abstractions and testing capabilities to rapidly iterate on model porting
Assist in building, enhancing, and scaling Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling
Contribute to optimizing routing and communication layers using Sohu’s collectives
Utilize performance profiling and debugging tools to identify bottlenecks and correctness issues
Develop and leverage a deep understanding of Sohu to co-design both HW instructions and model architecture operations to maximize model performance
Implement high-performance software components for the Model Toolkit

Qualifications

C++, Rust, Transformer architectures, Performance-sensitive systems, Distributed systems, PyTorch, JAX, High-performance applications, SIMD optimizations

Required

Progress towards a Bachelor's, Master's, or PhD degree in computer science, computer engineering, or a related field
Proficiency in C++ or Rust
Understanding of performance-sensitive or complex distributed software systems, such as Linux internals, accelerator architectures (e.g. GPUs, TPUs), compilers, or high-speed interconnects (e.g. NVLink, InfiniBand)
Familiarity with PyTorch or JAX
Experience porting applications to non-standard accelerator hardware or hardware platforms
Deep knowledge of transformer model architectures and/or inference serving stacks (vLLM, SGLang, etc.)

Preferred

Experience building low-latency, high-performance applications using both kernel-level and user-space networking stacks
Deep understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols, consistency models, and communication patterns
Solid grasp of Transformer architectures, particularly Mixture-of-Experts (MoE)
Experience building applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths

Benefits

Generous housing support for those relocating
Daily lunch and dinner in our office

Company

Etched

Building the hardware for superintelligence

H1B Sponsorship

Etched has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (9)
2024 (11)
2023 (1)

Funding

Current Stage: Growth Stage
Total Funding: $125.36M
Key Investors: Primary Venture Partners
2024-06-25 · Series A · $120M
2023-05-16 · Seed · $5.36M

Leadership Team

Chris Zhu, Co-Founder
Robert Wachen, Co-Founder and President
Company data provided by Crunchbase