AI Operations Engineer jobs in United States

Frontline Education · 3 weeks ago

AI Operations Engineer

Frontline Education is reimagining what’s possible by becoming an AI-first organization, transforming how they serve educators. The AI Operations Engineer supports the automation and modernization of IT operations using AI and machine learning, focusing on developing AI systems and tools for enterprise IT teams.

Employment, Human Resources, Recruiting, Software
Work & Life Balance
H1B Sponsor Likely

Responsibilities

Develop AI-driven monitoring and analytics: Build, train and maintain machine learning models that analyze operational data (logs, metrics, events and traces) to detect anomalies, pinpoint root causes and predict incidents. Configure AI agents that detect problems, understand context and execute remediation steps autonomously.
Integrate AIOps tools with cloud and enterprise systems: Work with DevOps and SRE teams to integrate AIOps platforms into continuous integration/continuous delivery (CI/CD) pipelines and cloud infrastructure. Leverage APIs to connect AI monitoring services to enterprise systems such as IT service management platforms and configuration management databases.
Evaluate and deploy AIOps platforms: Assist in evaluating, implementing and tuning commercial AIOps tools (e.g., Moogsoft, Dynatrace, Splunk, BigPanda, Datadog). Configure agentic features in these platforms to enable automatic root cause analysis and remediation.
Collaborate with cross-functional teams: Partner with developers, operations staff and cybersecurity teams to ensure AIOps integrations align with business goals. Explain AI-driven insights and recommend improvements to stakeholders.
Stay current on emerging technologies: Follow developments in agentic AI frameworks, generative AI, machine learning operations and observability. Continuously learn and experiment with new tools to improve system efficiency and reliability.

Qualifications

Machine Learning, AIOps Tools, Python, Cloud Platforms, Observability Stacks, APIs, Analytical Thinking, Problem-Solving, Communication, Collaboration

Required

Some college coursework in computer science, data science, information systems or a related field. Candidates without a degree should have an equivalent combination of training and experience
1–2 years of hands-on experience in DevOps, site reliability or IT operations roles that used agentic AI or AIOps tools. Experience may include internships or co-ops
Proficiency in Python or another scripting language for automating tasks and interacting with APIs
Understanding of machine learning concepts and frameworks (PyTorch)
Familiarity with observability stacks such as Prometheus, Grafana and ELK/OpenTelemetry, and basic knowledge of data visualization
Exposure to public cloud platforms (AWS) and infrastructure as code tools (Terraform, Ansible)
Ability to interpret telemetry data and spot trends; comfortable working with charts, logs and metrics
Strong problem-solving and analytical thinking, with the ability to troubleshoot issues and implement solutions
Excellent communication and collaboration skills to work with cross-functional teams

Preferred

Hands-on experience with commercial AIOps platforms or building machine-learning-based incident response systems
Knowledge of Kubernetes, container orchestration and service mesh architecture
Background in log analysis, time-series forecasting or unsupervised anomaly detection
Familiarity with IT service management (ITSM) processes and how they integrate with AI-driven operations
Exposure to agentic AI frameworks (e.g., LangChain/LangGraph) or generative AI pipelines, particularly for RAG or LLM-based ChatOps

Benefits

Personalized Time Off: Take time when it’s needed most — whether that’s a family vacation, a reset day, or simply time to rest and refocus.
Paid Sick Time: Separate, dedicated sick leave to care for yourself or loved ones.
Volunteer Time Off: Paid time to give back and support causes that matter to you.
Ten Paid Holidays: Enjoy meaningful moments and traditions throughout the year.
World-Class Learning Access: Explore thousands of on-demand courses through platforms like LinkedIn Learning.
Leadership & Technical Skill Building: Develop new capabilities and chart your own professional path.
AI Empowerment: Use OpenAI tools to build fluency with emerging technology and harness AI as a creative partner for innovation and problem-solving.
Tuition Reimbursement: Invest in formal education to advance your skills and career.
Ongoing Learning Culture: Participate in company-led webinars on AI, inclusion, and industry trends—designed to inspire curiosity and continuous improvement.
Wellness Initiatives: Company-sponsored programs that support physical, mental, and emotional well-being.
Employee Assistance Program (EAP): Confidential support for you and your family’s needs.
Comprehensive Benefits: Health and financial benefits that support your happiness and future.
A Culture That Cares: At Frontline Education, we want every team member to learn, grow, and thrive—personally, professionally, and purposefully.

Company

Frontline Education

Frontline Education provides integrated insights software, focusing primarily on human capital management.

H1B Sponsorship

Frontline Education has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data provided by the US Department of Labor.)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2023 (3)
2021 (1)
2020 (1)

Funding

Current Stage: Late Stage
Total Funding: unknown
2022-08-30: Acquired

Leadership Team

Matt Strazza, President & CEO
Chris Tonas, CTO
Company data provided by Crunchbase