Qualcomm · 10 hours ago
Staff AI Performance Architect
Qualcomm Technologies, Inc. is a leader in mobile technology and AI solutions. The company is seeking a Staff AI Performance Architect to drive hardware enhancements for AI training systems, focusing on architecture and performance analysis in support of scalable AI solutions.
Artificial Intelligence (AI) · Generative AI · Software · Telecommunications · Wireless
Responsibilities
Understand trends in ML network design through customer engagements and the latest academic research, and determine how these trends will affect both SW and HW design
Work with customers to determine hardware requirements for AI training systems
Analyze current accelerator and GPU architectures
Architect enhancements required for efficient training of AI models
Design and architecture of:
Flexible Computational Blocks
Involving a variety of datatypes: floating point, fixed point, microscaling
Involving a variety of precisions: 32/16/8/4/2/1 bits
Capable of optimally performing dense and sparse GEMM, GEMV
Memory Technology and subsystems that are optimized for a range of requirements
Capacity
Bandwidth
Compute in Memory, Compute near memory
Scale-Out and Scale-Up Architectures
Switches, NoCs, Codesign with Communication Collectives
Optimized for Power
Ability to perform Competitive Analysis
Codesign HW with SW/GenAI (LLM) requirements
Define performance models to prove effectiveness of architecture proposals
Pre-Silicon prediction of performance for various ML training workloads
Analyze performance/area/power trade-offs for future HW and SW ML algorithms, including the impact of SoC components (memory and bus)
Qualifications
Required
Master's degree in Computer Science, Engineering, Information Systems, or related field
3+ years of Hardware Engineering experience defining the architecture of GPUs or accelerators used for training AI models
In-depth knowledge of NVIDIA/AMD GPU capabilities and architectures
Knowledge of LLM architectures and their HW requirements
Preferred
Knowledge of computer architecture, digital circuits and hardware simulators
Knowledge of communication protocols used in AI systems
Knowledge of Network-on-Chip (NoC) designs used in System-on-Chip (SoC) designs
Understanding of various memory technologies used in AI systems
Experience in modeling hardware and workloads in order to extract performance and power estimates
High-level hardware modeling experience preferred
Knowledge of AI Training systems such as NVIDIA DGX and NVL72
Experience training and fine-tuning LLMs using distributed training frameworks such as DeepSpeed and FSDP
Knowledge of front-end ML frameworks (e.g., TensorFlow, PyTorch) used for training ML models
Strong communication skills (written and verbal)
Detail-oriented with strong problem-solving, analytical and debugging skills
Demonstrated ability to learn, think and adapt in a fast-changing environment
Ability to code in C++ and Python
Knowledge of a variety of classes of ML models (e.g., CNN, RNN)
Benefits
Competitive annual discretionary bonus program
Opportunity for annual RSU grants
Highly competitive benefits package
Company
Qualcomm
Qualcomm designs wireless technologies and semiconductors that power connectivity, communication, and smart devices.
H1B Sponsorship
Qualcomm has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference (data from the US Department of Labor).
Trends of Total Sponsorships (year: number of sponsorships)
2025: 2,013
2024: 1,910
2023: 3,216
2022: 2,885
2021: 2,104
2020: 1,181
Funding
Current Stage: Public Company
Total Funding: $3.5M
1991-12-20: IPO
1988-01-01: Undisclosed · $3.5M
Company data provided by Crunchbase