Staff Engineer (Fleet Performance) @ DigitalOcean | Jobright.ai
Staff Engineer (Fleet Performance) jobs in Denver
68 applicants
DigitalOcean · 8 hours ago

Staff Engineer (Fleet Performance)

Cloud Computing · DevOps
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Develop and implement comprehensive performance metrics, analysis tools, and reporting systems
Lead initiatives to enhance shared infrastructure, balancing performance optimization with rigorous security standards
Collaborate with hardware engineering teams and vendors to continuously validate GPU fabric performance
Engage with the open-source Linux community to advance virtualization technologies and integrate them into our fleet
Conduct in-depth performance analysis of the Linux kernel, virtualization layer, storage, and network stack to devise optimization strategies
Identify system bottlenecks proactively and drive optimizations across the hypervisor software stack
Work cross-functionally to harness new performance capabilities from evolving hardware architectures
Enhance test frameworks, harnesses, and pipelines to ensure robust performance validation
Investigate and resolve virtual machine downtime and performance issues in our production environment
Participate in on-call rotations as needed to support system reliability

Qualifications

Linux kernel, Performance measurement tools, Go programming, Python programming, Distributed systems performance, Hypervisors, GPU technology, Monitoring, Analyzing infrastructure, Security best practices, Ruby programming, eBPF, XDP, fio, TPCC, MLPerf, NCCL, Chef, AWX, Kubernetes, x86_64 architecture, ARM architecture, ML-based solutions, CPU scheduling, Memory management, File system, I/O

Required

Bachelor's or Master's degree in Computer Science, Mathematics, Statistics or Computer/Electrical Engineering or equivalent work experience
Extensive knowledge of Linux kernel, hypervisors, and open-source operating systems
7+ years of experience with performance measurement tools such as profilers, eBPF, XDP, fio, TPCC, MLPerf, and NCCL
5+ years developing strategies for managing, monitoring, and analyzing infrastructure, applications and services
Strong proficiency in Go, Python, and/or Ruby
Deep understanding of kernel performance aspects, including scheduling, context switching, and hardware acceleration
Expertise in distributed systems performance, including tracing and debugging methodologies
Knowledge of GPU technology, GPU fabrics, and programming for multi-GPU workloads
Demonstrated ability to solve complex problems at scale
Strong security mindset with a proactive approach to implementing best practices
Excellent cross-team collaboration and communication skills
Leadership experience in skills development and mentorship
Professional-level written and spoken English with strong presentation abilities

Preferred

Experience with observability platforms such as Splunk, Prometheus, Grafana, Elastic, or Dynatrace
Proficiency in the C programming language
Proficiency in compiler-level performance optimization techniques
Experience with Chef, AWX, and/or Kubernetes
Familiarity with x86_64 and/or ARM architectures
Successful history of upstreaming Linux kernel patches
In-depth knowledge of at least one Linux subsystem (CPU scheduling, memory management, file system, I/O, etc.)
Experience in developing and deploying ML-based solutions for anomaly detection and dynamic load balancing

Benefits

Reimbursement for relevant conferences, training, and education
Access to LinkedIn Learning's 10,000+ courses
One-time work from home stipend
Wellness allowance
Flexible time off policy
Equity compensation to eligible employees
Equity grants upon hire
Option to participate in our Employee Stock Purchase Program

Company

DigitalOcean

DigitalOcean provides a cloud platform to deploy, manage, and scale applications of any size.

H1B Sponsorship

DigitalOcean has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data from the US Department of Labor)
Trends of Total Sponsorships
2023 (3)
2022 (19)
2021 (19)
2020 (10)

Funding

Current Stage
Public Company
Total Funding
$491.28M
Key Investors
Global Secure Invest, Access Industries, KeyBanc Capital Markets
2021-09-13 · Post-IPO Equity · $34.91M
2021-03-23 · IPO · amount not listed
2021-01-01 · Series Unknown · amount not listed

Leadership Team

Admas Kanyagia
VP, Social Impact
Adrienne Calderone
Vice President, Finance
Company data provided by Crunchbase