Prime Solutions Group, Inc. · 1 day ago

Lead DevOps Engineer (AI/ML Ops)

Goodyear, AZ

Full-time

Hybrid

Senior Level, Lead/Staff

5+ years exp

Prime Solutions Group, Inc. (PSG) is seeking a Lead DevOps Engineer (AI/ML Ops) to serve as both a senior technical contributor and a team leader. The role involves designing, implementing, and operating secure, automated machine learning and data pipelines across cloud and on-premises environments while ensuring the compliance and performance of ML systems.

Information Services, Information Technology, Security, Software

No H1B

Security Clearance Required

U.S. Citizen Only

Responsibilities

Lead the design, implementation, and operation of ML-focused CI/CD pipelines supporting data ingestion, feature engineering, model training, evaluation, and deployment across dev, test, staging, and production environments

Apply and adapt MLOps best practices within existing DevSecOps workflows, including data quality checks and schema validation, model validation and promotion gates, and model performance and drift monitoring

Architect and oversee training and inference platforms, including experiment tracking, model registries, and automated retraining pipelines

Oversee secure integration of Infrastructure-as-Code, containerization, and orchestration (Docker, Kubernetes) for ML and data workloads, including GPU and high-performance compute resources

Mentor and guide engineers in MLOps and DevSecOps practices, promoting automation, observability, and security-first design

Collaborate with cross-functional teams (data science, software engineering, research, IT, cybersecurity, systems engineering) to ensure ML system reliability, performance, and compliance

Lead technical risk assessments and contribute to incident response for ML and data systems (e.g., model degradation, data quality issues, pipeline failures)

Serve in a hybrid role as both a senior hands-on engineer contributing to pipelines, infrastructure, and monitoring and a technical leader guiding small to mid-sized MLOps initiatives

Make informed technical decisions across ML, data, security, and operations domains, resolving complex multi-disciplinary challenges

Evaluate ethical and operational considerations in AI/ML deployment (e.g., bias, data constraints, mission risk) and recommend appropriate mitigations

Stay current on emerging MLOps, AI platform, and data engineering technologies, recommending adoption where beneficial

Qualifications

MLOps, DevSecOps, CI/CD tools, Infrastructure-as-Code, Python, Kubernetes, Data engineering, Machine learning, Leadership, Communication, Mentoring, Collaboration

Required

U.S. Citizenship

Active Top Secret clearance or higher

Bachelor's degree in Computer Science, Engineering, Data Science, Applied Mathematics, or related field

5–9+ years of experience in one or more of the following: MLOps or ML platform engineering; DevOps, DevSecOps, or SRE supporting data or ML workloads; data engineering with production ML integration; applied machine learning in production environments

Strong experience with CI/CD tools (Jenkins, GitLab CI, GitHub Actions, CircleCI) and modern Git workflows

Hands-on experience with Infrastructure-as-Code (Terraform, Ansible, CloudFormation) and Kubernetes

Proficiency with ML and data technologies, including: Python and ML/data libraries (NumPy, pandas, scikit-learn, PyTorch, TensorFlow); workflow/orchestration tools (Airflow, Kubeflow, Prefect, Dagster); experiment tracking and model registries (MLflow, Weights & Biases, SageMaker)

Experience integrating security and governance into ML environments (image/dependency scanning, SBOMs, secrets management, IAM)

Familiarity with NIST, FedRAMP, and DoD RMF compliance frameworks as applied to ML and data systems

Strong scripting or programming skills (Python, Bash, Go, or similar)

Demonstrated experience leading technical efforts and mentoring engineers

Ability to communicate clearly with both technical and non-technical stakeholders

Preferred

Security, cloud, or ML certifications (e.g., CISSP, AWS Security Specialty, AWS ML Specialty, CKS, GIAC)

Experience implementing Zero Trust architectures

Experience with observability and monitoring tools (Prometheus, Grafana, ELK/EFK, OpenTelemetry) for ML services

Hands-on experience with: feature stores and data validation frameworks (e.g., Great Expectations); data governance and lineage tooling; policy-as-code for ML environments (OPA, Kyverno, admission controllers)

Prior experience supporting defense, aerospace, or government-secured AI/ML programs

Experience operating enterprise-scale or mission-critical ML systems, including high-availability inference and rigorous performance monitoring

Benefits

Competitive compensation and benefits

Professional development and tuition assistance

A collaborative, mission-driven culture

Direct impact on national security through secure AI/ML solutions

Company

Prime Solutions Group, Inc.

Prime Solutions Group, Inc. (PSG) provides engineering services and software data processing products for remote sensing systems.

Founded in 2007

Goodyear, Arizona, USA

51-200 employees

https://psg-inc.net

Funding

Current Stage

Growth Stage

Company data provided by Crunchbase