Sr. Staff Software Engineer - GPU Network Software, RCCL jobs in United States

AMD · 2 weeks ago

Sr. Staff Software Engineer - GPU Network Software, RCCL

AMD is a company focused on building innovative products that enhance computing experiences across various domains including AI and data centers. The role involves developing multi-node GPU communication libraries to support high-performance computing and machine learning applications as part of the AMD Radeon Open Ecosystem (ROCm).

AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Growth Opportunities
H1B Sponsor Likely
Hiring Manager
Gearóid Ó.

Responsibilities

Support AMD’s RCCL, an open-source, GPU-accelerated collective communication middleware, and related technologies
Design, implement, and test networking features for multi-GPU and multi-node communication libraries
Benchmark, profile and optimize code to maximize throughput on single-GPU, multi-GPU and clustered systems
Deliver high-quality code and documentation following best practices for open source software development
Work with key technical experts across AMD and with our partners and customers to improve ROCm applications, libraries, and tools
Deploy the libraries on large clusters and debug complex system-level issues that can span different layers of the software stack, such as GPU kernel drivers and NIC drivers

Qualifications

C, C++, Python, GPU Networks, RoCE, Libfabric, InfiniBand, Linux Kernel, Device drivers, Collective communication libraries, Open-source contributions, HIP, CUDA, OpenCL, CPU architectures, GPU architectures, Assembly programming, Vectorization

Required

Strong background developing applications and libraries in C, C++, and Python
Experience working with RoCE (RDMA over Converged Ethernet), Libfabric, and InfiniBand
Experience working with the Linux kernel, device drivers, and network drivers
Experience designing and building GPU networks for large-scale clusters
Experience with collective communication libraries (MPI, RCCL, SHMEM) and with optimizing collective communication to scale across distributed systems
In-depth knowledge of best practices in software development, including testing, profiling, debugging, documentation, version control, issue tracking, and planning
Contributions to open-source libraries and applications
B.Sc. or B.Eng. degree in Computer Science, Software Engineering, Electrical Engineering, or equivalent

Preferred

GPU software development using HIP, CUDA, or OpenCL
Understanding of CPU and GPU architectures and low-level optimization techniques including assembly programming and/or vectorization
An advanced degree, such as an M.Sc., M.Eng., or Ph.D.

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

H1B Sponsorship

AMD has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data Powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (836)
2024 (770)
2023 (551)
2022 (739)
2021 (519)
2020 (547)

Funding

Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su, Chair & CEO
Mark Papermaster, CTO and EVP
Company data provided by Crunchbase