Northeastern University · 5 days ago
AI Operations Specialist
Northeastern University is seeking an AI Operations Specialist to manage and support the university's AI systems and data pipelines. This role involves ensuring the reliability and performance of AI solutions while implementing operational improvements and automation practices.
Responsibilities
Monitor AI system and data pipeline health, performance, and availability using established monitoring tools and dashboards. Detect, triage, and resolve incidents affecting AI systems and their data infrastructure, coordinating with technical teams as needed. Implement proactive measures to prevent recurring issues and minimize service disruptions.
Perform routine operational tasks to maintain AI systems and data pipelines, including model updates, data refreshes, pipeline maintenance, and system patches. Implement scheduled maintenance activities with minimal service disruption. Manage user access and permissions for AI platforms according to security policies.
Analyze AI system and data pipeline performance metrics, identify bottlenecks and inefficiencies, and implement optimizations to improve response times, data flow, accuracy, and resource utilization. Monitor for model drift and data quality issues, coordinating retraining or pipeline adjustments when necessary.
Create and maintain comprehensive operational documentation, including runbooks, standard operating procedures, and knowledge base articles. Document system configurations, data pipeline dependencies, and recovery procedures to ensure operational continuity.
Identify opportunities for process improvement and automation in AI operations. Develop and implement scripts and workflows to automate routine tasks, reducing manual effort and minimizing human error. Contribute to the evolution of MLOps practices based on operational experience and emerging best practices.
Qualifications
Required
Bachelor's degree in Computer Science, Information Technology, or related field
Technical certifications in relevant areas (e.g., cloud platforms, MLOps, data engineering) preferred
Minimum of 3 years of experience in IT operations
At least 1 year focused on AI/ML systems and data pipeline support
Experience with cloud platforms (AWS, Azure, or GCP) and their AI/ML and data engineering service offerings
Demonstrated experience in operationalizing and maintaining machine learning models in production environments, including deployment, monitoring, and lifecycle management
Extensive experience maintaining and troubleshooting data pipelines built with tools like Apache Airflow, Prefect, cloud data services (AWS, Azure, GCP), and data processing frameworks (Spark, Kafka)
Proficiency in monitoring AI system and data pipeline performance, detecting anomalies, and implementing proactive measures to ensure system reliability and availability
Experience in troubleshooting, diagnosing, and resolving AI system and data infrastructure issues, with the ability to prioritize incidents based on business impact
Knowledge of techniques to optimize AI system and data pipeline performance, including resource allocation, scaling strategies, and performance tuning
Experience implementing changes to production AI systems and data pipelines with minimal disruption, including testing, validation, and rollback procedures
Understanding of data quality principles and their impact on AI system performance, with the ability to identify and address data-related issues in processing pipelines
Excellence in creating and maintaining operational documentation, runbooks, and knowledge articles for AI systems and data pipelines
Ability to create and implement automation scripts and workflows to streamline routine operational tasks for both AI systems and data flows
Familiarity with DevOps and CI/CD principles as applied to AI systems and data pipelines, including containerization, orchestration, and infrastructure as code
Understanding of security best practices for AI operations and data handling, including access control, data protection, and vulnerability management
Company
Northeastern University
Founded in 1898, Northeastern is a global research university with a distinctive, experience-driven approach to education and discovery.
Funding
Current Stage
Late Stage