Sr. Staff Software Engineer - GPU Network Software, RCCL jobs in United States

Advanced Microdevices Pvt. Ltd. (India) · 3 months ago

Sr. Staff Software Engineer - GPU Network Software, RCCL

Advanced Micro Devices, Inc. is dedicated to transforming lives with its technology, focusing on building products that enhance next-generation computing experiences. The company is seeking a Sr. Staff Software Engineer to develop multi-node GPU communication libraries for high-performance computing and machine learning workloads, collaborating with technical experts across the company to improve ROCm applications.

Biopharma, Biotechnology, Industrial, Manufacturing

Responsibilities

Support AMD’s RCCL, an open-source, GPU-accelerated collective communication middleware, and related technologies
Design, implement, and test networking features for multi-GPU and multi-node communication libraries
Benchmark, profile and optimize code to maximize throughput on single-GPU, multi-GPU and clustered systems
Deliver high-quality code and documentation following best practices for open-source software development
Work with key technical experts across AMD and with our partners and customers to improve ROCm applications, libraries, and tools
Deploy the libraries on large clusters and debug complex system-level issues that can span different layers of the software stack, such as GPU kernel drivers and NIC drivers

Qualification

C, C++, Python, GPU Networks, RoCE, Libfabric, InfiniBand, Linux Kernel, Device drivers, MPI, RCCL, SHMEM, Open-source contributions, HIP, CUDA, OpenCL, CPU architectures, Low-level optimization, Assembly programming, Vectorization

Required

Strong background developing applications and libraries in C, C++, and Python
Experience working with RoCE (RDMA over Converged Ethernet), Libfabric, and InfiniBand
Experience working with the Linux kernel, device drivers, and network drivers
Experience designing and building GPU Networks for Large Scale Clusters
Experience with collective communication libraries (MPI, RCCL, SHMEM) and with optimizing collective communication to scale across distributed systems
In-depth knowledge of best practices in software development, including testing, profiling, debugging, documentation, version control, issue tracking, and planning
Contributions to open-source libraries and applications
B.Sc. or B.Eng. degree in Computer Science, Software Engineering, Electrical Engineering, or equivalent

Preferred

An advanced degree, such as an M.Sc., M.Eng., or Ph.D.
GPU software development using HIP, CUDA, or OpenCL
Understanding of CPU and GPU architectures and low-level optimization techniques including assembly programming and/or vectorization

Benefits

AMD benefits at a glance.

Company

Advanced Microdevices Pvt. Ltd. (India)

Advanced Microdevices (mdi) is a leader in innovative membrane technologies.

Funding

Current Stage
Late Stage

Leadership Team

Nalini Kant Gupta
Founder & Managing Director
Company data provided by Crunchbase