SRE Team Lead - Monitoring and Support jobs in United States

Frontline Education · 1 month ago

SRE Team Lead - Monitoring and Support

Frontline Education is reimagining education through AI-driven solutions. The SRE Team Lead will oversee the 24/7 monitoring and first-response function, ensuring effective incident triage and leading a globally distributed support team.

Employment, Human Resources, Recruiting, Software
Work & Life Balance
H1B Sponsor Likely

Responsibilities

Provide day‑to‑day direction for a team of Level 1 support engineers, fostering a culture of ownership, prompt communication and operational excellence; manage offshore staffing, schedules and rotations to ensure around‑the‑clock coverage
Oversee the configuration of monitoring platforms and continuously improve the signal‑to‑noise ratio by refining alert thresholds, health checks and dashboards; implement best practices drawn from industry standards
Perform initial triage of production incidents by validating alerts, assessing impact and urgency and documenting context; collaborate with development and platform engineers to isolate issues and ensure timely escalation
Maintain high‑quality runbooks, knowledge‑base articles and shift‑hand‑off documentation to support rapid recovery cycles; ensure that post‑incident notes capture lessons learned and drive continuous improvement
Promote the use of large‑language‑model‑based tools (e.g., ChatGPT, generative AI) for faster log analysis and summarization of incidents; explore opportunities to automate repetitive triage tasks
Define and refine processes for alert routing, status updates and stakeholder communications; ensure adherence to change control and operational readiness standards while keeping the team aligned with Frontline’s core values

Qualifications

Monitoring tools expertise, Incident triage experience, Team leadership experience, Log analysis platforms, Basic scripting, IT Service Management, Mentoring offshore teams, Collaboration tools, Continuous improvement mindset, Communication skills

Required

5+ years in monitoring, incident triage or production support roles with at least 1 year of team‑lead experience
Deep knowledge of monitoring and APM tools such as Dynatrace, Nagios, Prometheus, Datadog or similar, with an ability to tune alerts and dashboards
Experience with log analysis platforms (ELK/Splunk) and basic scripting (Python, Bash) for automation
Familiarity with incident management platforms (PagerDuty, OpsGenie) and ticketing systems (JIRA), plus comfort using collaboration tools (Slack, Teams) for real‑time updates
Strong communication and documentation skills; ability to lead and mentor offshore teams and implement efficient support processes
Interest or experience in using LLMs or Agentic AI to augment triage and documentation
Experience working with IT Service Management processes and frameworks; familiarity with ITIL best practices is highly desirable
Enterprise experience – proven ability to operate and deliver support within large-scale enterprise environments

Benefits

Personalized Time Off: Take time when it’s needed most — whether that’s a family vacation, a reset day, or simply time to rest and refocus.
Paid Sick Time: Separate, dedicated sick leave to care for yourself or loved ones.
Volunteer Time Off: Paid time to give back and support causes that matter to you.
Ten Paid Holidays: Enjoy meaningful moments and traditions throughout the year.
World-Class Learning Access: Explore thousands of on-demand courses through platforms like LinkedIn Learning.
Leadership & Technical Skill Building: Develop new capabilities and chart your own professional path.
AI Empowerment: Use OpenAI tools to build fluency with emerging technology and harness AI as a creative partner for innovation and problem-solving.
Tuition Reimbursement: Invest in formal education to advance your skills and career.
Ongoing Learning Culture: Participate in company-led webinars on AI, inclusion, and industry trends—designed to inspire curiosity and continuous improvement.
Wellness Initiatives: Company-sponsored programs that support physical, mental, and emotional well-being.
Employee Assistance Program (EAP): Confidential support for you and your family’s needs.
Comprehensive Benefits: Health and financial benefits that support your happiness and future.

Company

Frontline Education

Frontline Education provides integrated insights software focused primarily on human capital management.

H1B Sponsorship

Frontline Education has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2023 (3)
2021 (1)
2020 (1)

Funding

Current Stage: Late Stage
Total Funding: unknown
2022-08-30: Acquired

Leadership Team

Matt Strazza, President & CEO
Chris Tonas, CTO
Company data provided by Crunchbase