Solutions Architect, Infrastructure - Research Computing @ NVIDIA | Jobright.ai
Solutions Architect, Infrastructure - Research Computing jobs in New York, United States

NVIDIA · 4 hours ago

Solutions Architect, Infrastructure - Research Computing

Artificial Intelligence (AI) · GPU
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Serve as a technical advisor for the design, build-out, and optimization of university-level research computing infrastructure that includes GPU-accelerated scientific workflows.
Work with university research computing teams to optimize hardware utilization with software orchestration tools such as NVIDIA Base Command, Kubernetes, Slurm, and Jupyter notebook environments.
Implement systems monitoring and telemetry tools to help optimize resource utilization and track the most demanding application workloads at research computing centers.
Document what you learn. This can include building targeted training, writing whitepapers, blogs, and wiki articles, and working through hard problems with a customer on a whiteboard.
Provide customer requirements and feedback to product and engineering teams.

Qualifications

GPU-accelerated computing, Cluster orchestration, Job scheduling technologies, Systems monitoring, Telemetry, Performance optimization, Datacenter networking, Container tools, High-performance parallel file systems, OpenMPI, NCCL, Debugging tools, Profiling tools

Required

MS or PhD in Engineering, Mathematics, Physical Sciences, or Computer Science (or equivalent experience).
5+ years of relevant work experience.
Strong experience in designing and deploying GPU-accelerated computing infrastructure.
In-depth knowledge of cluster orchestration and job scheduling technologies, e.g. Slurm, Kubernetes, Ansible and/or Open OnDemand.
Experience with container tools (Docker, Singularity, Enroot/Pyxis) including at-scale deployment of containerized environments.
Expertise in systems monitoring, telemetry, and systems performance optimization of research computing environments.
Familiarity with tools like Prometheus, Grafana or NVIDIA DCGM.
Understanding of datacenter networking technologies (InfiniBand, Ethernet, OFED) and experience with network configuration.
Familiarity with power and cooling systems architecture for data center infrastructure.

Preferred

Experience in deploying LLM training and inference workflows in a research computing environment.
Experience working with technical computing customers in the academic research computing space.
Practical knowledge of high-performance parallel file systems.
Applications and systems-level knowledge of OpenMPI and NCCL.
Experience with debugging and profiling tools, e.g. Nsight Systems, Nsight Compute, Compute Sanitizer, GDB, or Valgrind.

Benefits

Equity
Comprehensive benefits package

Company

NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.

H1B Sponsorship

NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2023 (735)
2022 (892)
2021 (696)
2020 (534)

Funding

Current Stage: Public Company
Total Funding: $4.09B
Key Investors: ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity

Leadership Team

Jensen Huang · CEO and Founder
Chris Malachowsky · Co-Founder, SVP

Recent News

No support or updates for Windows 11 on machines not meeting minimum hardware requirements, says Microsoft | CIO
Company data provided by Crunchbase