AIOps Technical Associate jobs in United States

Milestone Technologies, Inc. · 4 hours ago

AIOps Technical Associate

Milestone Technologies, Inc. is seeking an AIOps Technical Associate to manage operational workflows for AI/ML model deployments and monitor their performance across various cloud platforms. The role involves cost management, dashboarding, and cross-functional coordination to ensure efficient AI operations and governance.
Application Performance Management · Consumer Electronics · Information Technology
No H1B
Hiring Manager
Liz Walker

Responsibilities

Manage operational workflows for model deployments, updates, and versioning across GCP, Azure, and AWS
Monitor model performance metrics: latency, throughput, error rates, token usage, and inference quality
Track model drift, accuracy degradation, and performance anomalies - escalating to engineering as needed
Support knowledge base operations including vector embedding pipeline health, chunk quality, and refresh cycles in Vertex AI
Maintain model inventory and documentation across multi-cloud environments
Coordinate model evaluation cycles with Responsible AI and Core Engineering teams
Monitor AI agent health, performance, and reliability (AutoGen-based agents, MCP servers)
Track agent execution metrics: task completion rates, tool call success/failure, latency, and error patterns
Support agent deployment and configuration management workflows
Document agent behaviors, known issues, and operational runbooks
Coordinate with Core Engineering on agent updates, testing, and rollouts
Monitor MCP server availability, connection health, and integration status
Track and analyze AI/ML cloud spend across GCP (Vertex AI), Azure (OpenAI), and AWS (Bedrock)
Build cost dashboards with breakdowns by model, application team, use case, and environment
Monitor token consumption, inference costs, and embedding/storage costs
Identify cost optimization opportunities - model selection, caching, batching, rightsizing
Provide cost allocation reporting for chargeback/showback to consuming application teams
Forecast spend trends and flag budget anomalies
Partner with Infrastructure and Finance teams on AI cost governance
Build and maintain dashboards for platform performance, model health, agent metrics, and operational KPIs
Create executive and stakeholder reports on platform adoption, usage trends, and cost allocation
Develop Responsible AI dashboards tracking hallucination rates, accuracy metrics, guardrail triggers, and safety incidents
Monitor APIGEE gateway traffic patterns and API consumption trends
Provide regular reporting to product management on use case performance
Support release management processes with pre/post-deployment validation checks
Track release health metrics for models, agents, and platform components
Maintain release documentation, runbooks, and operational playbooks
Coordinate with QA, Performance Engineering, and Infrastructure teams during releases
Monitor guardrail effectiveness and flag anomalies to the Responsible AI team
Track and report on hallucination detection, content safety triggers, and accuracy trends
Support LLM Red Teaming efforts by collecting and organizing evaluation data
Maintain audit logs and compliance documentation for AI governance
Serve as operational point of contact for application teams consuming DxT AI APIs
Coordinate with Corporate Security on audit requests and compliance reporting
Partner with Infrastructure team on capacity tracking and resource utilization
Support Performance Engineering with load test analysis and results documentation

Qualifications

GCP · AI/ML Concepts · Cloud Cost Management · SQL · Python · Monitoring Tools · Dashboarding Tools · Analytical Skills · Communication Skills · Problem-Solving Skills

Required

2-4 years in Ops/Technical Operations (MLOps, AIOps, DataOps, Platform Ops, or similar)
Must have experience with AI/ML Concepts
Must have experience with Cloud Cost Management and FinOps
REQUIRED: GCP, SQL, reporting tools and basic Python as well as monitoring/observability tools
2-4 years in an Ops, Analytics, or Technical Operations role (MLOps, AIOps, DataOps, Platform Ops, or similar)
Understanding of AI/ML concepts: models, inference, embeddings, vector databases, LLMs, tokens, prompts
Experience with cloud cost management and FinOps - tracking, analyzing, and optimizing cloud spend
Strong proficiency with dashboarding and visualization tools (Looker, Tableau, Grafana, or similar)
Working knowledge of GCP (required); familiarity with Azure and AWS a plus
Comfortable with SQL and basic Python for data analysis and scripting
Experience with monitoring and observability platforms (Datadog, Prometheus/Grafana, Cloud Monitoring, or similar)
Understanding of APIs and API gateways - ability to read logs, trace requests, analyze traffic
Strong analytical and problem-solving skills with attention to detail
Excellent communication skills - able to translate technical metrics into stakeholder insights
College degree in Computer Science, BIS, MIS, EE, ME or similar is required

Preferred

Hands-on experience with LLM platforms: Vertex AI, Azure OpenAI, AWS Bedrock
Familiarity with AI agents and agentic architectures (AutoGen, LangChain, or similar)
Exposure to MCP (Model Context Protocol) or agent-tool integration patterns
Experience with vector databases and RAG (Retrieval-Augmented Generation) operations
Understanding of MLOps lifecycle: model registry, versioning, deployment patterns, A/B testing
Experience with APIGEE or similar API management platforms
Familiarity with Responsible AI metrics - hallucination, bias, content safety, guardrails
FinOps certification or formal cloud cost management experience
Experience supporting enterprise platform teams with multiple consuming applications
Familiarity with ML pipeline tools (Kubeflow, MLflow, Vertex AI Pipelines)
Exposure to prompt management and evaluation frameworks
ITIL or operational process framework experience
Experience creating runbooks and operational documentation

Benefits

Comprehensive benefits

Company

Milestone Technologies, Inc.

Milestone Technologies is a global IT Services and Digital Solutions company based in Silicon Valley that helps hundreds of leading corporations deliver technology around the globe.

Funding

Current Stage
Late Stage
Total Funding
$42.5M
Key Investors
H.I.G. Capital
2022-12-13 · Acquired
2015-08-11 · Private Equity · $42.5M

Leadership Team

Sameer Kishore
President and Chief Executive Officer
Mayank K Agrawal
Chief Financial Officer
Company data provided by Crunchbase