Software Developer/Engineer (AI/ML Engineer or LLM Engineer) jobs in United States
This job has closed.

Diversity Nexus · 10 hours ago

Software Developer/Engineer (AI/ML Engineer or LLM Engineer)

Diversity Nexus is seeking a Software Developer/Engineer with expertise in AI/ML or LLM engineering. The role involves deploying open-source LLMs, implementing vector databases, and ensuring security and governance in on-prem environments.
Staffing & Recruiting
No H1B · U.S. Citizen Only

Responsibilities

Deploy open-source LLMs such as Meta Llama 3 and Mistral / Mixtral in on-prem or private environments
Use Python for LLM inference, prompt engineering, and integration
Run CPU-based inference, apply model quantization, and tune performance
Work with open-source vector databases such as Qdrant, Chroma, Milvus, or pgvector
Implement Retrieval-Augmented Generation (RAG) pipelines
Generate and manage embeddings, and apply metadata filtering
Meet data privacy, air-gapped deployment, and enterprise security requirements
Implement access controls and audit logging
Deliver a reference architecture and deployment guidance, a working prototype (LLM + vector DB + RAG), and documentation and knowledge transfer to internal teams

Qualifications

Open-source LLM deployment, Python proficiency, Vector databases experience, Retrieval-Augmented Generation, Data privacy understanding, Access controls implementation, Model quantization, Performance tuning, Docker familiarity, Kubernetes familiarity, Inference frameworks knowledge, Rust exposure, Go exposure, C++ exposure

Required

Hands-on experience deploying open-source LLMs such as Meta Llama 3 and Mistral / Mixtral in on-prem or private environments
Strong proficiency in Python for LLM inference, prompt engineering, and integration
Experience with CPU-based inference, model quantization, and performance tuning
Practical experience with open-source vector databases such as Qdrant, Chroma, Milvus, or pgvector
Proven implementation of Retrieval-Augmented Generation (RAG) pipelines
Experience generating and managing embeddings and metadata filtering
Understanding of data privacy, air-gapped deployments, and enterprise security requirements
Experience implementing access controls and audit logging

Preferred

Experience with LangChain or LlamaIndex
Exposure to Rust, Go, or C++ for high-performance services
Familiarity with Docker and Kubernetes for on-prem deployments
Knowledge of inference frameworks (e.g., vLLM, llama.cpp, Hugging Face Transformers)
Prior work in regulated or enterprise environments

Company

Diversity Nexus

Funding

Current Stage
Growth Stage
Company data provided by Crunchbase