RIT Solutions, Inc. · 2 months ago
AI Operations Platform Consultant
RIT Solutions, Inc. is seeking an AI Operations Platform Consultant to manage and optimize AI inference services. The role involves deploying and troubleshooting containerized services on Kubernetes and managing MLOps/LLMOps pipelines for production environments.
Staffing & Recruiting
Responsibilities
Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications (a minimal health-check sketch follows this list)
Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
Managing, operating, and supporting MLOps/LLMOps pipelines that use TensorRT-LLM and Triton Inference Server to deploy inference services in production (see the client sketch after this list)
Setting up and operating AI inference service monitoring for performance and availability (see the metrics sketch after this list)
Experience deploying and troubleshooting LLM models on a containerized platform, including monitoring and load balancing
Experience with standard processes for operating a mission-critical system: incident management, change management, event management, etc.
Managing scalable infrastructure for deploying and managing LLMs
Deploying models in production environments, including containerization, microservices, and API design
Working with Triton Inference Server, including its architecture, configuration, and deployment
Applying model optimization techniques using Triton with TensorRT-LLM, as well as general techniques such as pruning, quantization, and knowledge distillation (see the quantization sketch after this list)
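As a rough illustration of the Kubernetes operations work above, here is a minimal sketch using the official `kubernetes` Python client to check whether an inference deployment has all replicas ready. The deployment name and namespace are illustrative assumptions, not taken from this posting.

```python
# Minimal sketch: check rollout health of a hypothetical Triton deployment
# on Kubernetes, using the official `kubernetes` Python client.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
apps = client.AppsV1Api()

# "triton-inference" and "ml-serving" are illustrative names.
dep = apps.read_namespaced_deployment(name="triton-inference", namespace="ml-serving")
desired = dep.spec.replicas or 0
ready = dep.status.ready_replicas or 0
print(f"{ready}/{desired} replicas ready")
if ready < desired:
    # A real operator would inspect pod status and recent events, alert, or roll back here.
    print("Deployment is degraded; investigate pod status and recent events.")
```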
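For the TensorRT-LLM/Triton deployment bullets, a minimal client-side sketch using `tritonclient` over HTTP. The server URL, model name, and tensor names are assumptions; real input/output names and datatypes come from the model's config.pbtxt and must match it exactly.

```python
# Minimal sketch: query a Triton Inference Server over HTTP with tritonclient.
# "my_trtllm_model" and the tensor names "INPUT"/"OUTPUT" are illustrative.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
assert client.is_server_live() and client.is_model_ready("my_trtllm_model")

inp = httpclient.InferInput("INPUT", [1, 8], "INT32")
inp.set_data_from_numpy(np.arange(8, dtype=np.int32).reshape(1, 8))

result = client.infer(model_name="my_trtllm_model", inputs=[inp])
print(result.as_numpy("OUTPUT"))
```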
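For the monitoring bullet, a sketch that scrapes Triton's Prometheus metrics endpoint (exposed on port 8002 by default). The host is an assumption; the metric names shown are standard Triton counters. In practice these would be scraped by Prometheus and alerted on rather than polled by hand.

```python
# Minimal sketch: read Triton's Prometheus metrics endpoint and surface a few
# serving-health counters (successes, failures, queue time).
import requests

text = requests.get("http://localhost:8002/metrics", timeout=5).text
for line in text.splitlines():
    if line.startswith(("nv_inference_request_success",
                        "nv_inference_request_failure",
                        "nv_inference_queue_duration_us")):
        print(line)
```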
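And for the optimization bullet, a sketch of post-training dynamic quantization in PyTorch, one of the techniques named above (pruning and knowledge distillation are analogous in spirit). The toy model is an illustrative assumption; production LLM quantization would typically happen at TensorRT-LLM engine-build time instead.

```python
# Minimal sketch: post-training dynamic quantization with PyTorch.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8  # quantize Linear weights to int8
)
x = torch.randn(1, 128)
print(quantized(x).shape)  # same interface, smaller and faster Linear layers
```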
Qualifications
Required
LLM and Kubernetes experience, as detailed under Responsibilities above
Company
RIT Solutions, Inc.
Jobdiva Job Portal: https://www1.jobdiva.com/candidates/myjobs/searchjobsdone.jsp?a=xbjdnwgjodtga1y1im2g881fkkeiwd0775lbvq8yqgps8vb2q36w2vj1ga6xxork&compid=-1
Services: recruitment (contingency search and campus selection).
H1B Sponsorship
RIT Solutions, Inc. has a track record of offering H1B sponsorships. Note that this does not guarantee sponsorship for this specific role; additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of different job fields receiving sponsorship (chart omitted; one field is marked as similar to this job)
Trends of total sponsorships: 2023 (2), 2025 (1)
Funding
Current Stage: Growth Stage
Company data provided by Crunchbase