McAfee · 5 hours ago

Senior Site Reliability Engineer

US, Texas, Frisco

Full-time

Hybrid

Mid, Senior Level

4+ years exp

McAfee is a leader in personal security for consumers, focused on protecting people in an always-online world. The Senior Site Reliability Engineer will be responsible for maintaining service levels, engaging with various teams to support business needs, and ensuring the availability and reliability of mission-critical services.

Consumer Electronics, Enterprise Software, Information Technology, Network Security

H1B Sponsor Likely

Responsibilities

Proactively monitor the mission-critical production environment and respond quickly to trend breaches or issues

Troubleshoot, debug, and escalate issues with proper analysis to the relevant teams to ensure maximum availability

Troubleshoot problems in real time, interacting with DevOps/Engineering and internal support representatives to deliver maximum customer satisfaction

Detect and triage all operational incidents and requests

Work extensively to help reduce Mean Time to Restore (MTTR) and Mean Time to Detect (MTTD)

Work across Engineering and Support teams to ensure we meet our goals for service reliability, availability, and efficiency

Ensure security events and alerts are addressed in a timely manner

Own the availability and performance of mission-critical services. Build automation to prevent problem recurrence and to respond to all non-exceptional service conditions

Help maintain and improve service operations by following established processes and procedures and periodically updating SOPs and documents in Confluence

Create and manage day-to-day processes including Change Management, Incident Management, and Problem Management

Support automation initiatives to improve Mean Time to Restore (MTTR) and Mean Time to Detect (MTTD)

Help track Key Performance Indicators (KPIs) to support operational performance and service reliability

Participate in incident retrospectives and assist in managing the incident lifecycle

Plan and deploy patches and product enhancements to our environments

Engage in readiness reviews before changes or deployments into production environments

Support product engineering teams on SRE-related activities to establish optimal SLAs for all pre-defined activities and provide a high-quality customer experience

Provide detailed summaries of all high-priority issues to stakeholders, ensuring the quality of the data provided

Participate early in the SDLC to ensure reliability is built in from the beginning, and create plans for successful implementations/launches and a smooth transition to the SRE team

Produce accurate root cause analyses of production issues and help provide long-term solutions to fix them

Continually evaluate and adopt the latest industry technologies to optimize costs and streamline processes

Communicate effectively and present team progress to leadership

Lead by example technically and establish credibility with quality technical execution

Mentor and coach other SRE team members

Qualifications

SRE experience, Cloud operations, Monitoring tools, CI/CD tools, Container technologies, AWS knowledge, Incident Management, Change Management, Problem Management, Soft skills

Required

4 to 5+ years of software development and/or technical operations experience, and experience running large-scale applications

Prior experience in SRE / DevOps, Infrastructure Engineering, and Systems Engineering required

Experience defining and implementing monitoring for highly resilient and reliable applications

Experience maintaining and operating production systems (> 99.95% SLA) in the cloud

Able to monitor, debug, and perform root cause analysis (RCA) for any service failures

Exceptional communication skills that cross both team and geographical boundaries

Advanced knowledge and skills within a specific technical or professional discipline with understanding of the impact of work on other areas of the organization

Enjoy working with a large variety of services and technologies

Experience with monitoring, logging, APM, and other tools: APMs, Grafana, CloudWatch, etc.

Experience with CI/CD tools: Git, Jenkins, Harness, etc.

Experience with container technologies: Kubernetes, Docker

Experience with both Windows and Linux Operating Systems

Strong knowledge of AWS cloud service offerings covering serverless and containerized workloads

Experience working well in a fast-paced, high-growth environment

Ability to work some non-standard hours to support a global team and initiatives

Preferred

ITIL, HDI, AWS, or other cloud certifications are good to have

Benefits

Bonus Program

Pension and Retirement Plans

Medical, Dental and Vision Coverage

Paid Time Off

Paid Parental Leave

Support for Community Involvement

Company

McAfee

Glassdoor rating: 3.8

McAfee is an online security company that provides virus alerts and analysis on malware, network security threats, and web vulnerabilities.

Founded in 1987

Santa Clara, California, USA

1001-5000 employees

http://www.mcafee.com

H1B Sponsorship

McAfee has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)

[Chart: Distribution of Different Job Fields Receiving Sponsorship]

Trends of Total Sponsorships

2025 (28)

2024 (24)

2023 (12)

2022 (26)

2021 (46)

2020 (84)

Funding

Current Stage

Public Company

Total Funding

unknown

2022-03-01: Private Equity

2022-03-01: Debt Financing

2021-11-08: Acquired

Leadership Team

Craig Boundy

Chief Executive Officer

Steve Grobman

Executive Vice President and Chief Technology Officer

Recent News

Globes

McAfee buys privacy app of Israeli co Mine

2025-11-25

PR Newswire

MineOS Completes Strategic Sale of Consumer Privacy Business, Accelerating the Shift to Autonomous Privacy for Enterprises

2025-11-24

Business Wire

Veeam Names Allison Cerra as Chief Marketing Officer

2025-11-11

Company data provided by Crunchbase