DevOps Software Developer (KBase Team) jobs in United States

Berkeley Lab · 1 month ago

DevOps Software Developer (KBase Team)

Berkeley Lab’s Environmental Genomics and Systems Biology Division is looking for a DevOps Software Engineer to join the Systems Biology Knowledgebase team. In this role, you will contribute to an open-source platform that enhances collaboration among biologists and data scientists by developing automation and ensuring platform stability and performance.

Research
No H1B

Responsibilities

Develop and implement automation to deploy, configure, and support on-premise compute resources and services (e.g., databases, microservices, LLMs, monitoring systems, object storage such as MinIO, and High-Performance Computing (HPC))
Design, implement, and support robust monitoring, alerting, and logging solutions for infrastructure and platform services
Ensure the security, reliability, and performance of KBase's on-premise hardware and software stack by documenting, hardening, and continuously improving its security posture in adherence with National Lab and DOE security standards
Develop and maintain comprehensive documentation for infrastructure designs, configurations, and operational procedures
Implement DevSecOps pipelines, best practices, and security scanning (SCA/SAST) for infrastructure and software components

Qualifications

DevOps practices, Infrastructure as Code, Containerization technologies, Linux operating systems, Scripting languages, Monitoring tools, Relational databases, NoSQL databases, Version control systems, Cloud-native workflows, Analytical skills, Communication, Interpersonal skills

Required

A Bachelor's Degree (or equivalent knowledge/training) in Computer Science, Engineering, or a related field and a minimum of 5 years of relevant experience as a Software Infrastructure Engineer, DevOps Engineer, Site Reliability Engineer (SRE), or similar role or an equivalent combination of education and experience
Experience with infrastructure as code (IaC) tools (e.g., Terraform, Ansible), containerization technologies (e.g., Docker), and container orchestration platforms (e.g., Kubernetes)
Experience with containerization (Docker) and Kubernetes orchestration, including Helm, operators, and resource management for data-intensive workloads
Experience with version control systems (e.g., Git), CI/CD pipelines, monitoring, and observability tools (e.g., Prometheus, Grafana, ELK stack or similar)
Experience with the deployment and management of relational and/or NoSQL databases
Expert-level knowledge of Linux operating systems, system administration, and proficiency in scripting languages (e.g., Python, Bash, Go)
Proficiency in Python, with the ability to write modular, production-ready software and integrate it into cloud-native workflows
Demonstrated understanding of core DevOps and software engineering principles for on-premise distributed systems, microservices, and HPC architectures
Familiarity with object storage systems such as MinIO or AWS S3 and understanding of data lifecycle management in distributed storage
Familiarity with Apache Spark (PySpark, SparkSQL, or Structured Streaming) and distributed data processing frameworks
Excellent oral and written communication skills, including experience organizing and presenting information to technical and non-technical audiences
Strong analytical skills including experience identifying and solving complex technical problems
Demonstrated interpersonal skills including experience collaborating with a variety of scientific, operations, and technical teams
Must be available to come onsite as required to access the server room for maintenance or troubleshooting

Preferred

A Master's Degree (or equivalent knowledge/training) in Computer Science, Engineering, or a related discipline
Experience with Computational or Systems Biology within an academic or research environment
Experience with virtualization technologies (e.g., KVM), distributed messaging or search systems (e.g., Kafka, Elasticsearch), and MLOps practices and tools (e.g., MLflow, Kubeflow, model serving infrastructure)
Experience with HPC environments and workload managers/schedulers (e.g., Slurm, HTCondor, PBS)

Benefits

Exceptional health and retirement benefits, including pension or 401(k)-style plans
A culture where you’ll belong - we are invested in our teams!
In addition to accruing vacation and sick time, we also have an annual Winter Holiday Shutdown
Parental bonding leave (for both mothers and fathers)
Pet insurance

Company

Berkeley Lab

Berkeley Lab is a national laboratory that creates advanced new tools for scientific discovery.

Funding

Current Stage
Late Stage

Leadership Team

Mary Barnum, MBA
Business Manager, COO Office
Rebecca Rishell
Deputy Chief Operating Officer
Company data provided by crunchbase