HPC Infra Engineer @ TestingXperts | Jobright.ai
This job has closed.

TestingXperts · 4 days ago

HPC Infra Engineer

DevOps · Information Technology
H1B Sponsor Likely


Responsibilities

Design and maintain HPC clusters.
Build CI/CD pipelines; perform DevOps operations and IaaS (Terraform).
Write an LSF esub so that the GPU memory calculation bypasses IBM’s standard calculation method, allowing the pipeline management system to rely on this internally built metric.
Investigate and analyze verbal and written requests for infrastructure management.
Excellent teamwork and communication abilities.
Maintain high standards for documentation and deliverables.
Self-motivated and self-managing, with strong organizational skills.
Ability to work with tight deadlines and multiple competing priorities.
Ability to optimize the application for performance.
Interact with development teams to develop a strong understanding of the project and testing objectives.
Participate in troubleshooting of issues with different teams to drive towards root cause identification and resolution.
Documentation skills to track the development and implementations.
Effective communication skills: regularly achieve consensus with peers and provide clear status updates.

Qualification

High Performance Computing, Host Level Parallelism, Advanced LSF Job Submission, LSF Admin Experience, Backend Lustre Storage, Python Programming, GitHub Actions, Jenkins, Ansible, CI/CD Processes, Git, Google Cloud Knowledge, Requirement Analysis, Cloud Understanding

Required

B.Tech, MS, or PhD degree in Computer Science or a similar field.
10-14 years of strong hands-on experience in high-performance computing.
Knowledge of host-level parallelism (e.g., thread / LWP vs process / HWP, OpenMP vs MPI, GPU vs CPU parallelism).
Advanced LSF job submission (e.g., jobs vs arrays, bacct vs bhist, memory requests vs limits).
Knowledge of shared file system backend types, differences, and advantages (e.g., local vs networked storage, shared FS types like NFS / SMB / GPFS / Lustre).
LSF admin experience (e.g., mbatchd vs sbatchd, elim vs esub, deploying an LSF cluster from scratch).
Backend Lustre storage knowledge (e.g., architecture such as MDS / MDT, OSS / OST, MGS / MGT).
Good knowledge of Python programming and object-oriented programming.
Good knowledge of GitHub Actions, Jenkins, Ansible, and CI/CD processes.
Good communication skills and ability to work independently.
Expertise in understanding and analyzing requirements.
Proficiency with modern development tools such as Git.
Ability to suggest enhancements or changes required to keep up with modern security and development best practices.

Preferred

Google Cloud knowledge.
General cloud understanding.

Company

TestingXperts

Next Gen QA & Software Testing Company

H1B Sponsorship

TestingXperts has a track record of offering H1B sponsorships. Note that this does not guarantee sponsorship for this specific role. (Data from the US Department of Labor.)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships: 2020 (1)

Funding

Current Stage: Late Stage

Leadership Team

Manish Gupta, CEO
Archana Gupta, CFO
Company data provided by Crunchbase.