Enterprise Solutions Architect (Cloud) jobs in United States

ChatGPT Jobs · 2 days ago

Enterprise Solutions Architect (Cloud)

Arch Systems is seeking a hands-on Enterprise Solutions Architect (Cloud) to design, build, and deliver secure, production-grade AI systems supporting federal civilian missions. This role requires deep technical execution combined with executive-level solutioning and stakeholder engagement, leading end-to-end AI solution delivery while mentoring teams and presenting solutions to federal stakeholders.

Computer Software
No H1B · U.S. Citizen Only

Responsibilities

Write production-grade Python and build FastAPI-based microservices for AI/ML and GenAI workloads
Design and implement GenAI/RAG pipelines, including embeddings, vector databases, prompt orchestration, and evaluation frameworks
Containerize workloads using Docker and deploy to Kubernetes (EKS and/or AKS) using CI/CD pipelines
Spend ≥50% of active build phases coding, pairing with engineers, and conducting design and code reviews
Design and operate AI systems on AWS and/or Azure, including GovCloud or Azure Government environments when required
Implement Infrastructure as Code using Terraform and/or CloudFormation/Bicep
Architect secure cloud networking (VPC/VNet, private endpoints, VPN/ExpressRoute/Direct Connect)
Integrate cloud AI services on AWS (SageMaker, Bedrock, OpenSearch, EKS) and/or Azure (Azure ML, Azure OpenAI, AKS, Cognitive Search)
Define and implement HA/DR strategies, autoscaling, and reliability patterns across regions and availability zones
Stand up and operate MLOps platforms using MLflow, Databricks, SageMaker, or Azure ML
Manage model lifecycle: experimentation, registry, gated promotions, canary releases, rollback
Implement automated testing, monitoring, and alerting for model drift, bias, robustness, and performance
Engineer prompt flows, grounding strategies, guardrails, and policy enforcement
Define offline and online evaluation using golden datasets and human-in-the-loop workflows
Monitor and optimize factuality, relevance, toxicity, latency, and token usage
Embed NIST RMF (800-37), 800-53, 800-171, FISMA, and FedRAMP controls into system design
Implement IAM, encryption at rest and in transit, secrets management, logging, and auditing
Contribute to SSPs, POA&Ms, and continuous monitoring; coordinate with ISSOs and 3PAOs
Instrument systems with OpenTelemetry for logs, metrics, and traces
Define SLIs/SLOs, dashboards, and alerts; conduct game days and post-mortems
Model total cost of ownership (TCO) and manage cloud spend using FinOps practices
Optimize performance and cost through right-sizing, autoscaling, caching, batching, quantization, and distillation
Build executive-ready decks and demos for federal stakeholders
Clearly communicate mission value using KPIs (cycle-time reduction, precision/recall, latency, cost per query, compliance posture)
Support RFIs, RFPs, technical volumes, and orals; lead technical Q&A with mixed audiences
Lead cross-functional agile teams and mentor engineers through pairing and reviews
Define a “definition of done” that includes tests, documentation, security scans, and performance baselines
Publish reusable accelerators: reference architectures, IaC modules, pipeline templates, and security baselines
Maintain ADRs, runbooks, data contracts, and user guides; ensure Section 508 compliance

Qualifications

Cloud-native AI/ML systems
AWS
Azure
GenAI/RAG architectures
Production-grade Python
FastAPI
Kubernetes (EKS/AKS)
MLOps platforms
Infrastructure as Code
Federal compliance experience
Agile team leadership
Mentoring engineers
Model lifecycle management
Automated testing
Cost optimization
Communication skills

Required

10+ years of software and/or ML engineering experience; 5+ years cloud experience
Hands-on, recent experience building production AI systems using Python, PyTorch or TensorFlow, and FastAPI
Proven delivery of GenAI/RAG solutions, including embeddings, vector databases (FAISS, Milvus, pgvector), and evaluation frameworks
Strong cloud experience on AWS and/or Azure, including security, networking, monitoring, and operations
Production experience running AI workloads on Kubernetes (EKS and/or AKS)
Experience implementing cloud-native MLOps using SageMaker, Databricks, or Azure ML with CI/CD and model registries
Federal delivery experience with FISMA/FedRAMP/RMF and ATO processes
Excellent communication skills with the ability to translate technical solutions into mission impact and ROI

Preferred

Experience supporting HHS, DHS, USDA, NOAA, or IRS
Active or eligible Public Trust clearance
Experience in AWS GovCloud and/or Azure Government environments
Certifications: AWS Solutions Architect (Associate or Professional), Azure Solutions Architect Expert, DP-100, DP-203
Experience with model risk management, adversarial testing/red teaming, and Section 508 evaluations

Benefits

Medical

Company

ChatGPT Jobs

We find the best job offers for experts in ChatGPT and related technologies.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase