Alibaba Cloud · 1 day ago

Senior Network Engineer

Alibaba Cloud is a leading cloud computing and data intelligence company. They are seeking a Senior Network Engineer to develop and implement stability solutions, establish monitoring mechanisms, and enhance operational efficiency within their operations and maintenance platforms.

Cloud Data Services · Cloud Management · Data Center · Data Management · Software
H1B Sponsor Likely
Hiring Manager
Moyin Hu

Responsibilities

Maintain a global perspective on stability, and develop and implement stability solutions
Establish and continually optimize monitoring mechanisms for application operations and maintenance; develop and maintain corresponding monitoring platforms/tools
Establish and continuously optimize warning mechanisms for application operations and maintenance, ensuring that faults can be quickly discovered, located, and addressed
Quickly analyze, diagnose, and locate problems, and collaborate with relevant personnel to resolve them; establish and improve rapid service recovery mechanisms to reduce business impact; ensure stable business operations by identifying and eliminating potential risks through stability governance projects and architectural optimizations
Design, develop, and maintain reliable operations and maintenance platforms and tools, such as inspection systems, water level (resource utilization) systems, delivery systems, and cost management systems, to address the delivery, performance, stability, and cost issues encountered by production systems, ensuring business availability and improving performance and efficiency
Responsible for data-driven analysis of operations and maintenance quality; analyze and study daily operations and maintenance metrics, issues, and risks to establish models and provide optimization suggestions for operations and maintenance
Establish operations and maintenance process specifications and standards (such as change standards, protection plans, and cloud product configuration standards) to keep operations and maintenance consistent and standardized, thereby enhancing stability
Develop and implement emergency response specifications and standards for application operations and maintenance faults
Develop and implement alarm handling specifications and standards for application operations and maintenance, as well as Service Level Agreements (SLAs)
Based on business requirements, handle budget preparation, capacity planning, and readiness, and coordinate with development teams to forecast and estimate the consumption of resources such as storage and compute
Analyze business demands, taking water levels, specifications, and billing rules into account while ensuring stability; keep resource estimates in technical solutions reasonable, and collaborate with development teams to reduce resource costs
Provide 24/7 emergency response, handle daily monitoring alerts and emergencies, and continuously identify and rectify existing issues
Responsible for operations and maintenance support during major events (such as National Day, Spring Festival, New Year's Day, and significant activities)
Develop and drill emergency plans, respond to emergencies, and handle faults
Establish a problem/fault record repository, conduct targeted analysis of the repository, and enhance and optimize the emergency plan repository and standard process repository
Responsible for system architecture upgrades, such as kernel upgrades, architecture upgrades, service migration between data center rooms, and containerization
Responsible for the design and implementation of disaster recovery architecture, such as local disaster recovery and multi-active geographically distributed setups

Qualifications

Operations · Maintenance experience · Architecture design · Performance optimization · Stability optimization · Cloud service proficiency · Network protocols knowledge · Intelligent operations tools · Business understanding · Influence within business line · Fluent in Chinese · Problem-solving skills

Required

Fluent Chinese communication skills, with the ability to clearly articulate technical issues and solutions
Over 3 years of experience in operations and maintenance in related fields such as applications, networks, and containerization
Basic proficiency in architecture design, performance optimization, and stability optimization
Capable of applying intelligent and automated operations and maintenance platforms and tools, designing and utilizing complex workflows and daily operational templates, quickly identifying, locating, and resolving relatively complex faults, thereby improving operational efficiency
Able to summarize and consolidate issues discovered in daily operations and maintenance into operational experience, and apply this knowledge to enhance capabilities within the operations and maintenance platform
Proficient in protocols such as TCP/IP, DNS, and HTTP, with the ability to perform preliminary analysis of network traffic and troubleshoot network issues
Familiar with at least one cloud service platform (such as AWS, Alibaba Cloud, Azure, etc.) and its related mainstream products (such as Flink, MaxCompute, Log Service, RDS, Redis, etc.), able to preliminarily troubleshoot and resolve basic issues related to the use of corresponding cloud products

Preferred

Familiarity with DPDK (Data Plane Development Kit) and experience in enhancing network processing performance
Some software development ability, used to advance operations and maintenance automation
Strong business understanding, capable of independently handling complex issues, with real-world case examples
Independent judgment on business issues, with the ability to skillfully use processes and tools to identify risks and formulate solutions
A degree of influence within the business line and recognition from surrounding teams

Company

Alibaba Cloud

Alibaba Cloud develops cloud computing and data management services. It is a sub-organization of Alibaba Group.

H1B Sponsorship

Alibaba Cloud has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (18)
2024 (14)
2023 (2)
2022 (1)

Funding

Current Stage: Late Stage
Total Funding: $1.2B
Key Investors: Alibaba Group
2015-07-29 · Series B · $1B
2012-09-20 · Series A · $200M
Company data provided by Crunchbase