NVIDIA · 4 hours ago
Solutions Architect, Infrastructure - Research Computing
Artificial Intelligence (AI) · GPU · Growth Opportunities · H1B Sponsor Likely
Responsibilities
Serve as technical advisor for the design, build-out, and optimization of university-level research computing infrastructure, including GPU-accelerated scientific workflows.
Work with university research computing to optimize hardware utilization with software orchestration tools such as NVIDIA Base Command, Kubernetes, Slurm, and Jupyter notebook environments.
Implement systems monitoring and telemetry tools to help optimize resource utilization and track the most demanding application workloads at research computing centers.
Document what you learn. This can include building targeted training, writing whitepapers, blogs, and wiki articles, and working through hard problems with a customer on a whiteboard.
Provide customer requirements and feedback to product and engineering teams.
Qualifications
Required
MS or PhD in Engineering, Mathematics, Physical Sciences, or Computer Science (or equivalent experience).
5+ years of relevant work experience.
Strong experience in designing and deploying GPU-accelerated computing infrastructure.
In-depth knowledge of cluster orchestration and job scheduling technologies, e.g. Slurm, Kubernetes, Ansible and/or Open OnDemand.
Experience with container tools (Docker, Singularity, Enroot/Pyxis) including at-scale deployment of containerized environments.
Expertise in systems monitoring, telemetry, and systems performance optimization of research computing environments.
Familiarity with tools like Prometheus, Grafana or NVIDIA DCGM.
Understanding of datacenter networking technologies (InfiniBand, Ethernet, OFED) and experience with network configuration.
Familiarity with power and cooling systems architecture for data center infrastructure.
Preferred
Experience in deploying LLM training and inference workflows in a research computing environment.
Experience working with technical computing customers in the academic research computing space.
Practical knowledge of high-performance parallel file systems.
Application- and systems-level knowledge of Open MPI and NCCL.
Experience with debugging and profiling tools, e.g. Nsight Systems, Nsight Compute, Compute Sanitizer, GDB, or Valgrind.
Benefits
Equity
Comprehensive benefits package
Company
NVIDIA
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.
H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. The figures below are provided for
reference. (Data from the US Department of Labor)
Trends of Total Sponsorships
2023 (735)
2022 (892)
2021 (696)
2020 (534)
Funding
Current Stage: Public Company
Total Funding: $4.09B
Key Investors: ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity
Company data provided by crunchbase