ALOIS Solutions · 19 hours ago
Linux System Administrator
Responsibilities
Install, configure, and maintain Linux operating systems on HPC clusters.
Manage job schedulers such as Slurm or LSF.
Utilize node provisioning tools like Warewulf.
Troubleshoot system issues and provide technical support to users.
Monitor system performance and ensure optimal operation of the HPC environment.
Collaborate with other IT professionals to integrate new technologies into the existing infrastructure.
Progressive experience in HPC system administration, preferably in a Red Hat/CentOS Linux environment.
Expertise in troubleshooting complex system issues.
Experience with parallel file systems and storage solutions.
Strong knowledge of job schedulers such as Slurm or LSF.
Familiarity with node provisioning tools like Warewulf.
Proficiency in Linux OS administration.
Knowledge of job scheduling tools (e.g., Slurm).
Understanding of node provisioning tools (e.g., Warewulf).
Excellent problem-solving abilities.
Strong communication skills.
Ability to work collaboratively in a team-oriented environment.
Security+ certification preferred.
Linux+ certification preferred.
Top Secret Clearance: TS/SCI preferred.
On-site presence at customer location in Stennis, MS.
Availability for some on-call/weekend work.
Hands-on experience setting up an HPC compute cluster.
Install NVIDIA drivers.
Install, manage, and configure a GPU software stack such as PyTorch, TensorFlow, and CUDA Python.
Set up the PBS job scheduler and supporting PBS servers.
Experience with Red Hat and Rocky Linux; Bash scripting.
Nice to have: Docker and Kubernetes experience.
Nice to have: storage knowledge.
Nice to have: networking and DevOps knowledge.
Qualifications
Required
Install, configure, and maintain Linux operating systems on HPC clusters.
Manage job schedulers such as Slurm or LSF.
Utilize node provisioning tools like Warewulf.
Troubleshoot system issues and provide technical support to users.
Monitor system performance and ensure optimal operation of the HPC environment.
Collaborate with other IT professionals to integrate new technologies into the existing infrastructure.
Progressive experience in HPC system administration, preferably in a Red Hat/CentOS Linux environment.
Expertise in troubleshooting complex system issues.
Experience with parallel file systems and storage solutions.
Strong knowledge of job schedulers such as Slurm or LSF.
Familiarity with node provisioning tools like Warewulf.
Proficiency in Linux OS administration.
Knowledge of job scheduling tools (e.g., Slurm).
Understanding of node provisioning tools (e.g., Warewulf).
Excellent problem-solving abilities.
Strong communication skills.
Ability to work collaboratively in a team-oriented environment.
On-site presence at customer location in Stennis, MS.
Availability for some on-call/weekend work.
Hands-on experience setting up an HPC compute cluster.
Install NVIDIA drivers.
Install, manage, and configure a GPU software stack such as PyTorch, TensorFlow, and CUDA Python.
Set up the PBS job scheduler and supporting PBS servers.
Experience with Red Hat and Rocky Linux; Bash scripting.
Bachelor's degree required, preferably in Computer Science, Information Systems, or a related field.
Preferred
Security+ certification preferred.
Linux+ certification preferred.
Top Secret Clearance: TS/SCI preferred.
Nice to have: Docker and Kubernetes experience.
Nice to have: storage knowledge.
Nice to have: networking and DevOps knowledge.
Very good written, verbal, and presentation skills, with experience in a customer-facing role.
Ability to understand requirements in depth, with good analytical and problem-solving skills, interpersonal effectiveness, and a positive attitude.