ByteDance · 2 days ago
Senior Machine Learning Ops Engineer
Content · Data Mining
Responsibilities
Ensure ML systems operate efficiently for large model development, training, evaluation, and inference
Ensure stability of offline tasks/services in multi-data center, multi-region, and multi-cloud scenarios
Manage and plan resources, costs, and budgets for computing and storage resources
Ensure global system disaster recovery, cluster machine governance, and business service stability, and improve resource utilization and operational efficiency
Develop software tools, products, and systems to monitor and manage ML infrastructure and services efficiently
Participate in the global team roster for system and business on-call support
Qualifications
Required
Bachelor's degree or above in computer science, computer engineering, or a related field
Strong proficiency in at least one programming language such as Go, Python, or Shell in a Linux environment
Strong hands-on experience with Kubernetes and containers, with more than 2 years of relevant operations and maintenance experience
Excellent logical analysis skills, with the ability to abstract and decompose business logic sensibly
Good documentation habits, with the ability to write and update workflow and technical documentation on time as required
Strong sense of responsibility, good learning and communication abilities, self-motivation, and team spirit
Preferred
Experience in operation and maintenance of large-scale distributed ML systems
Experience in operation and maintenance of GPU servers
Benefits
100% premium coverage for employee medical insurance
75% premium coverage for dependents
Health Savings Account (HSA) with company match
Dental and Vision insurance plans
Short/Long term Disability insurance
Basic Life and Voluntary Life insurance
AD&D insurance plans
Flexible Spending Account (FSA) options
10 paid holidays per year
17 days of Paid Personal Time Off (PPTO)
10 paid sick days per year
12 weeks of paid Parental leave
8 weeks of paid Supplemental Disability
Mental and emotional health benefits through EAP and Lyra
401K company match
Gym and cellphone service reimbursements
Company
ByteDance
ByteDance is an internet technology company that operates creative content platforms.
H1B Sponsorship
ByteDance has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2023 (502)
2022 (518)
2021 (510)
2020 (272)
Funding
Current Stage: Late Stage
Total Funding: $9.51B
Key Investors: G42, Tiger Global Management, General Atlantic
2023-03-15 · Secondary Market · $100M
2020-12-11 · Private Equity · $2B
2020-03-30 · Secondary Market · Undisclosed
Company data provided by Crunchbase