Lead Production System Engineer - Ashburn jobs in United States

ByteDance · 13 hours ago

Lead Production System Engineer - Ashburn

ByteDance is a global technology company known for its innovative products like TikTok and CapCut. They are seeking a Lead Production System Engineer to enhance the stability, efficiency, and scalability of their data center and server operations while collaborating with various stakeholders to ensure optimal performance and reliability.

Content · Data Mining · Foundational AI · Internet · Social Media
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Operation: As a Production Systems Engineer, your mission is to contribute to enhancing the stability, efficiency, effectiveness, and scalability of our data center and server operations, platform, and service on a worldwide scale
Lifecycle Enhancement: Participate in and enhance the entire lifecycle of the server fleet - from system design/introduction consultation to launch reviews, deployment, operation, and retirement
Automation: Develop and deploy tools and solutions to enhance the automation, reliability, scalability, and operability of servers in the datacenter
Monitoring: Develop and deploy tools and solutions for improving the availability, latency, and overall service of the datacenter infrastructure, server, and network health
Disaster Recovery: Troubleshoot and resolve complex technical issues in a high-pressure, fast-paced environment. Conduct high-level root-cause analysis for service interruptions and establish preventive measures. Practice sustainable incident response and postmortems
Cross-team Collaboration: Collaborate with stakeholders such as infrastructure architects, project managers, data center operations engineers, platform developers, supply chain teams, and our internal customers to comprehend overarching business objectives. Additionally, you will have the chance to design and implement innovative solutions for our Core IDCs and CDN/Edge
On-call: Engage in our on-call support spanning across regions and incident response teams to address critical issues in the production environment

Qualification

Linux system administration · Scripting in Bash · Scripting in Python · Data center operations · RESTful APIs · Flask · JavaScript · Node.js · SQL · Ansible · Communication · Team management

Required

Bachelor's degree in Computer Science, Electronic Engineering, relevant technical field, or equivalent practical experience
Experience in at least one of the areas below: Server Operations; Tooling Adaptation, Deployment, and Maintenance; Communication
Demonstrated proficiency in Linux system administration tasks
In-depth comprehension of Linux kernels, drivers, and modules
Capable of scripting in Bash and Python to automate routine system operations
In-depth understanding of server hardware and able to conduct troubleshooting or diagnostics
Experience participating in the planning, delivery, and operation of large-scale data centers in different countries
Proficient in customizing operation and maintenance tools to satisfy specific demands for new server hardware
Competent in managing the entire software tool lifecycle, ranging from deployment to continuous maintenance
Experience in developing and maintaining hardware, network, or service monitoring software for more than 10,000 servers
Experience in managing and coordinating teams in the global context
Engage in on-call support spanning across regions and incident response teams to address critical issues in the production environment

Preferred

5 years of work experience in a related field
An intermediate level of expertise in Data Center operations
Proficiency in the operation and maintenance of GPU servers
Full Stack Software Development skills including creating and integrating RESTful APIs using Flask for Python-based back-end development
Proficient in JavaScript and capable of leveraging it, along with Node.js, for both front-end and back-end development tasks
Demonstrated proficiency in SQL for efficient database management, including designing database schemas, composing queries, and ensuring data integrity; familiarity with Redis
Experience in Ansible Configuration Management, Application Deployment, and Task Execution

Company

ByteDance

ByteDance is a technology company that develops content creation platforms and services.

H1B Sponsorship

ByteDance has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data Powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart; the highlighted segment represents job fields similar to this job)
Trends of Total Sponsorships
2025 (1350)
2024 (1123)
2023 (775)
2022 (487)
2021 (417)
2020 (245)

Funding

Current Stage: Late Stage
Total Funding: $9.8B
Key Investors: Capital Today, G42, Tiger Global Management
2025-11-20 · Secondary Market · $300M
2024-07-25 · Secondary Market
2023-03-14 · Secondary Market · $100M

Leadership Team

Jochen Bischoff
Head of Global Business Solutions - Africa
Matty Lin
General Manager, Global Business Solutions, KR
Company data provided by Crunchbase