Linux System Administrator, Senior (HPC) jobs in United States

Resource Management Concepts, Inc. · 14 hours ago

Linux System Administrator, Senior (HPC)

Resource Management Concepts, Inc. (RMC) provides high-quality, professional services to government and commercial sectors. RMC is hiring a Senior Linux System Administrator who will provide Linux-based workstation and server support in an RDT&E environment to support its HPC system.

Consulting
Growth Opportunities

Responsibilities

The applicant must have general experience performing day-to-day administrative tasks related to managing a Red Hat Linux-based cluster, including storage and networking
The applicant must have experience creating custom scripts (Bash, Python, PowerShell, etc.) to assist with administrative and operational management of a Red Hat-based HPC cluster
Install, configure, maintain and monitor software applications, operating system, development tools, hardware and storage of the HPC Cluster
The applicant must have experience with managing a SLURM Workload Manager cluster or other similar job scheduling manager in an HPC cluster
The applicant must have working knowledge of configuration management tools such as Ansible
The applicant must have working knowledge of network technology to include IP addressing, VLANs, trunks, LAG, InfiniBand, switch configuration and networking security practices
The applicant will collaborate with end users and core RDT&E staff to provide technical assistance in troubleshooting issues related to cluster performance and submitted job execution
The applicant must create and maintain detailed documentation of systems configuration, standard operating procedures (SOP), end user guides and troubleshooting guides
The applicant must have working knowledge of DISA STIGs, including the Red Hat STIG, and of applying them to systems
The applicant must provide secondary support to core and end user RHEL systems as needed
The applicant must have working knowledge of storage technology, including configuration of NFS, SCSI, SAN/NAS, SMB/CIFS and other protocols
Administer Splunk forwarders and indexers on RHEL/CentOS systems, managing OS-level services, resource tuning, log ingestion pipelines, and secure data flows using TLS and role-based access controls
Configure and maintain enterprise proxy services (Squid/Apache) and SMTP relays (Postfix/Sendmail) to support outbound system communications, alerting, and automated job notifications across HPC clusters
Implement centralized logging for kernel, auditd, syslog, and application logs, integrating with Splunk for real-time monitoring, custom dashboards, and automated alert thresholds to support high-availability operations

Qualifications

Linux Administration
Red Hat Linux Cluster Management
SLURM Workload Manager
Ansible
Network Technology
Storage Technology
DISA STIGs
Troubleshooting Skills
DevOps
Cloud Administration
Container Administration
Documentation Skills

Required

6 - 10 years of Linux Administration experience
The selected applicant must hold a DoD 8570/5239 IAT Level 3 certification (SecurityX or CISSP) and be able to work independently and as part of a team
The selected applicant must be a Red Hat Certified System Administrator (RHCSA) or hold an equivalent certification
Linux system administration knowledge
Working configuration management knowledge with Ansible and/or other enterprise tools
Working networking technology knowledge
Working networked storage knowledge
Working knowledge of DISA STIGs
Strong troubleshooting and problem-solving skills
Strong knowledge of enterprise system administration
Strong experience in DevOps, cloud and container administration

Benefits

Tuition assistance
Certifications
Paid vacation package with 11 paid federal holidays
High-quality, low-deductible healthcare plans
Pet insurance
Competitive 401(k) package

Company

Resource Management Concepts, Inc.

RMC is a dedicated small business provider of exceptional management and technology solutions.

Funding

Current Stage
Late Stage
Company data provided by Crunchbase