Prime Solutions Group, Inc. · 1 month ago
Senior DevOps Engineer (AI/ML Ops)
Prime Solutions Group (PSG), Inc. is seeking a Senior DevOps Engineer (AI/ML Ops) to lead the development of secure, scalable, and automated ML platforms powering mission-critical AI/ML programs. In this high-impact role, you will architect and operate end-to-end ML pipelines across classified and unclassified environments, enabling next-generation AI capabilities for defense and advanced sensing systems.
Information Services, Information Technology, Security, Software
Responsibilities
Design, build, and maintain ML-focused CI/CD pipelines with automated testing, security checks, and model validation gates
Architect and implement data ingestion, ETL/ELT, and feature engineering pipelines using modern data engineering frameworks
Lead development of training, evaluation, and retraining workflows with experiment tracking and model registry integration
Containerize and deploy ML models (REST/gRPC microservices, batch jobs, and streaming inference) using Docker and Kubernetes across cloud and on-prem environments
Implement Infrastructure-as-Code (IaC) using Terraform, Ansible, or similar tools for provisioning compute, storage, networking, and GPU resources
Integrate data quality checks, drift detection, and model performance monitoring into production ML systems
Ensure ML workloads comply with NIST, RMF, FedRAMP, and PSG security baselines (image scanning, SBOMs, secrets management, hardening)
Partner with data scientists and software engineers to move models from experimentation to production, including packaging, dependency management, and optimization
Monitor ML infrastructure using Prometheus/Grafana, ELK/EFK, or similar observability stacks; lead incident root-cause analysis
Independently lead projects, influence architecture decisions, and navigate tool selection for enterprise ML platforms
Integrate ML-specific security and quality testing into workflows (SAST/DAST, container security scanning, policy-as-code)
Develop technical documentation, runbooks, diagrams, and risk assessments for ML platforms
Mentor junior staff and provide guidance on architecture, pipelines, code quality, and operational best practices
Participate in architecture reviews, compliance assessments, and configuration management processes
Qualifications
Required
U.S. Citizenship (required)
Active Top Secret clearance (or higher)
Bachelor's degree in Computer Science, Engineering, Data Science, Mathematics, or related field
4–6+ years of experience in at least one of the following: MLOps / ML platform engineering; DevOps / DevSecOps / SRE for ML workloads; data engineering with production ML workflows; applied ML in production environments
Strong experience with secure CI/CD pipelines and IaC (GitLab CI, Jenkins, GitHub Actions, Terraform, Ansible)
Hands-on expertise with Docker, Kubernetes, and at least one major cloud provider (AWS/Azure/GCP), including GPU/HPC support
Strong understanding of the full ML lifecycle (data → features → training → validation → deployment → monitoring → retraining)
Proficiency with Python and standard ML/data libraries (NumPy, pandas, scikit-learn, PyTorch, TensorFlow)
Strong scripting skills (Python, Bash, PowerShell) for automation
Familiarity with RMF, DISA STIGs, and secure ML deployment practices
Ability to lead projects, make architecture decisions, and mentor technical staff
Excellent communication and documentation skills
Preferred
Master's degree in a related field
Active Security Clearance above minimum requirements (SCI, CI Poly)
Industry certifications: AWS ML Specialty, AWS DevOps, CKA/CKS, etc.
Experience with MLflow, Weights & Biases, SageMaker, or similar model registries and experiment-tracking tools
Experience with orchestration frameworks (Airflow, Kubeflow, Prefect, Dagster)
Experience with feature stores and data validation tools (Feast, Great Expectations)
Experience with Zero Trust, SBOMs, and secure software supply chain principles
Familiarity with NIST 800-53, FedRAMP, and ISO 27001 as they relate to ML/AI systems
Kubernetes security expertise (RBAC, network policies, hardened images)
Background supporting defense, intelligence, or other high-assurance environments
Benefits
Competitive compensation & benefits
Professional development & tuition assistance
Collaborative, mission-driven culture
A small-company environment where innovation moves fast
Direct impact on high-visibility government programs leveraging advanced AI/ML
Company
Prime Solutions Group, Inc.
Prime Solutions Group, Inc. (PSG) provides engineering services and software data processing products for remote sensing systems.
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase