TensorDock · 2 hours ago
System Administrator
Cloud Computing
Responsibilities
Manage a global fleet of servers
Work closely with suppliers to resolve issues in a timely manner
Troubleshoot hardware, software, and user-error issues
Analyze customer reports to determine bugs and escalate to our developers, following up as necessary
Document what you learn on the job
Develop automation and optimize our customer support experience
Qualifications
Required
5+ years Linux administration (Ubuntu/Debian focus)
Strong experience with libvirt (KVM) virtualization
Proficient in Python and Bash scripting
Experience with automation tools (preferably Ansible)
Solid networking knowledge
Experience with PostgreSQL
Familiarity with Ceph and NFS storage solutions
Preferred
Experience with GPU virtualization and PCIe passthrough
Knowledge of Proxmox VE, OpenStack, or OpenNebula
Experience with Docker and Kubernetes
Experience with bare metal automation (e.g., Ubuntu MAAS)
Monitoring experience (Prometheus, Grafana, ELK Stack)
Experience with infrastructure-as-code tools (e.g., Terraform)
Experience with Redis
Ability to work from our in-person office in San Francisco (compensation will be adjusted for cost of living)
Prior hands-on use of GPUs, whether to render animations or to train machine learning models
Benefits
Equity
Medical, dental, and vision insurance covered
Short-term and long-term disability insurance
5% 401(k) match
Paid paternity and maternity leave
20 days PTO, 2 floating holidays, and 2 sick days, in addition to federal holidays
Other miscellaneous reimbursements (wellness, phone/internet, education)