Solutions Engineer - HPC Systems Engineer jobs in United States

Wallero · 1 month ago

Solutions Engineer - HPC Systems Engineer

Wallero Technologies Inc is seeking an HPC Systems Engineer to join their Houston Systems team, responsible for managing a hybrid on-premises and cloud HPC environment. The role involves supporting proprietary SLB Omega software and various production groups while ensuring optimal performance and manageability of HPC systems.

Computer, Cyber Security, Information Technology, Software
Growth Opportunities
H1B Sponsor Likely

Responsibilities

A minimum of 5 years' experience working in a large HPC enterprise environment comprising thousands of servers, large storage solutions, tape and tape automation
Proficient in the installation, configuration and management of Linux-based operating systems, preferably RHEL, CentOS, or Rocky Linux
Experience with IBM's xCAT distributed computing management software
Experience with installation and maintenance of computer hardware including servers, tape drives, robotic tape libraries, GPGPU, SSD, disk arrays
Experience with containerization
Knowledge of networking and datacenter technologies, including switching, routing, high availability, LAN / WAN / WLAN topologies, and system configuration for Ethernet, InfiniBand, and Fibre Channel SAN
Experience with HPC Storage Solutions, for example configuration and operation of HPE ClusterStor systems, NetApp, Dell Isilon, and Pure Storage
Ability to write and troubleshoot Bourne, Bash and C Shell, Perl, Python, Ruby and MRTG scripts
Experience with PostgreSQL and database installation and support
Experience with Google Cloud Platform and Azure public clouds, including the ability to provision and manage instances, build images, and write installation scripts
Experience with configuration tools like Ansible and Terraform
Experience with backup and recovery tools such as IBM Spectrum and Dell NetWorker
Good knowledge of Linux security, including configuration of endpoint security tools
Ability to evaluate HPC system environments and make recommendations for improvement in performance and manageability
Ability to investigate, debug and diagnose system level issues
Conform to local change management philosophies, including full testing on non-production systems, prior to production deployment
Effectively communicate all change activities to all affected parties including a clear description of the change, related service outages and possible effects on the different environments we support
Ensure SLB IT deployment standards are maintained, with verification through reporting systems
Meet KPO requirements for InTouch support processing, including full documentation of problem resolution, creation of knowledge content and best practice items
Show a good understanding of computer equipment, and its care and maintenance
Work with other internal support groups, systems, networking, programming, desktop support, computer operations, and facilities as required to complete administration functions
Work with a variety of vendors in technical environments and in the reporting and investigation of system problems
Provide a written weekly status report to the team manager and be prepared to present and discuss this with the team at a weekly status meeting
Be prepared to work outside normal hours, as system maintenance often must be performed outside of prime time; provide 24/7 support to computer operations; work with other remote support locations, for example Kuala Lumpur, backing follow-the-sun support
Participate in the support on-call schedule, in weekend power outages (normally two per year), and in emergency data center activities
Peer-review all major projects, as part of the normal deployment philosophy
Ensure compliance with all quality assurance, best practice procedures and QHSE requirements, as defined by job position

Qualification

HPC enterprise experience, Linux management, IBM xCAT experience, Containerization, Networking knowledge, HPC Storage Solutions, Scripting languages, PostgreSQL support, Cloud platforms, Configuration tools, Backup and recovery tools, Linux security, System evaluation, Debugging skills

Company

Wallero

Your Partner for Progress in an Ever-Changing Digital World. Wallero was born out of a group of technically savvy individuals whose lives revolve around technology and a passion for solving innovative problems for customers.

H1B Sponsorship

Wallero has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. (Data powered by US Department of Labor)
Total sponsorships by year: 2020 (6)

Funding

Current Stage: Growth Stage

Leadership Team

Krishna Padisetty
Founder & Managing Director
Company data provided by Crunchbase