AI/HPC Networking Software Engineer @ Hewlett Packard Enterprise | Jobright.ai
AI/HPC Networking Software Engineer jobs in San Jose, CA

Hewlett Packard Enterprise · 3 hours ago

AI/HPC Networking Software Engineer

Data Center · Enterprise Software
Actively Hiring

Responsibilities

Engage and work with GPU/CPU vendors, customers, AI ISVs, and open-source software communities to validate, tune, and enable high-performance AI applications on the Slingshot Ethernet fabric.
Work on partner engagements for the leading communication libraries, middleware, and frameworks used in AI development today (NCCL, RCCL, UCX, oneCCL, PyTorch, etc.).
Design, implement, and maintain system software that enables communication between GPUs, CPUs, and storage in scale-out AI and HPC systems. Work with all the leading architectures and vendors in the AI and Data Center markets: Nvidia, AMD, Intel.
Work with OEM, ODM, and VAR channel vendors to bring Slingshot to a broader set of customers. Validate and tune the applications driving those engagements.
Develop and own HPE product usage support, upstreaming and community engagements, and internal testing and infrastructure.
Work with cross-disciplinary teams to understand business requirements and align software direction to meet those needs.

Qualification

AI/ML networking software, High Performance Computing, C/C++ programming, Python programming, Networking architecture, GPU architecture, Nvidia GPU infrastructure, AMD GPU infrastructure, Performance analysis, Tuning, Deployment, Ethernet technology, InfiniBand technology, User-based networking, OFI libfabric APIs, Cloud Architectures, DevOps, Distributed Computing, Microservices Fluency, Full Stack Development, Security-First Mindset, Solutions Design, Testing & Automation, User Experience (UX)

Required

Bachelor's or master's degree in computer science, engineering, or a related field.
3+ years of relevant experience with software development and/or architecture in the Data Center, university, government lab, or AI-centric environments.
Familiarity with AI/ML networking software development with an emphasis on performance analysis, tuning, and deployment in a scale-out compute cluster environment.
Ability to participate in and own pieces of the product release pipeline, up to and including package integration and support.
Understanding of networking architecture and communications, including Ethernet and InfiniBand networking technologies.
Understanding of computer architecture, and familiarity with the fundamentals of GPU architecture. Experience with Nvidia and AMD GPU infrastructure and software stacks.
Programming and debugging skills in C, C++, and/or Python. Ability to understand how applications and industry middleware/libraries work in Slingshot-enabled systems and to identify strategies for making these applications meet customer expectations.
Knowledge of user-based networking and OFI libfabric software interfaces and APIs.

Preferred

Cloud Architectures
Cross Domain Knowledge
Design Thinking
Development Fundamentals
DevOps
Distributed Computing
Microservices Fluency
Full Stack Development
Security-First Mindset
Solutions Design
Testing & Automation
User Experience (UX)

Benefits

Health & Wellbeing
Personal & Professional Development
Diversity, Inclusion & Belonging

Company

Hewlett Packard Enterprise

Hewlett Packard Enterprise is an edge-to-cloud company that uses comprehensive solutions to accelerate business outcomes.

Funding

Current Stage
Public Company
Total Funding
$1.35B
2024-09-10 · Post-IPO Equity · $1.35B
2015-11-02 · IPO

Leadership Team

Antonio Neri
President & CEO
Irv Rothman
President & CEO
Company data provided by Crunchbase