1.68 Agentic AI/ML Engineer - Multimodal jobs in United States

FieldAI · 1 hour ago

1.68 Agentic AI/ML Engineer - Multimodal

FieldAI is a company focused on transforming how robots interact with the real world through advanced AI systems. The AI/ML Engineer will drive research and model development for multimodal data, focusing on computer vision and agentic AI, while contributing to broader perception and insight-generation initiatives.

Enterprise Software · Robotic Process Automation (RPA) · Robotics
H1B Sponsor Likely

Responsibilities

Train and fine-tune million- to billion-parameter multimodal models, with a focus on computer vision, video understanding, and vision-language integration
Track state-of-the-art research, adapt novel algorithms, and integrate them into FiFM
Curate datasets and develop tools to improve model interpretability
Build scalable evaluation pipelines for vision and multimodal models
Contribute to model observability, drift detection, and error classification
Fine-tune and optimize open-source VLMs and multimodal embedding models for efficiency and robustness
Build and optimize multi-vector RAG pipelines with vector DBs and knowledge graphs
Create embedding-based memory and retrieval chains with token-efficient chunking strategies

Qualifications

Computer Vision · Multimodal Models · Python · PyTorch · MLOps Best Practices · AWS · Temporal Modeling · Model Fine-tuning · Problem Solving · Creativity

Required

Master's/Ph.D. in Computer Science, AI/ML, Robotics, or equivalent industry experience
2+ years of industry experience or relevant publications in CV/ML/AI
Strong expertise in computer vision, video understanding, temporal modeling, and VLMs
Proficiency in Python and PyTorch with production-level coding skills
Experience building pipelines for large-scale video/image datasets
Familiarity with AWS or other cloud platforms for ML training and deployment
Understanding of MLOps best practices (CI/CD, experiment tracking)
Hands-on experience fine-tuning open-source multimodal models using HuggingFace, DeepSpeed, vLLM, FSDP, LoRA/QLoRA
Knowledge of precision tradeoffs (FP16, bfloat16, quantization) and multi-GPU optimization
Ability to design scalable evaluation pipelines for vision/VLMs and agent performance

Preferred

Experience with Agentic/RAG pipelines and knowledge graphs (LangChain, LangGraph, LlamaIndex, OpenSearch, FAISS, Pinecone)
Familiarity with agent operations logging and evaluation frameworks
Background in optimization: token cost reduction, chunking strategies, reranking, and retrieval latency tuning
Experience deploying models with quantized (int4/int8) inference and distributed multi-GPU inference
Exposure to open-vocabulary detection, zero/few-shot learning, multimodal RAG
Knowledge of temporal-spatial modeling (event/scene graphs)
Experience deploying AI in edge or resource-constrained environments

Company

FieldAI

FieldAI is pioneering the development of a field-proven, hardware-agnostic brain technology that enables many different types of robots to operate autonomously in hazardous, off-road, and potentially harsh industrial settings, all without GPS, maps, or any pre-programmed routes.

H1B Sponsorship

FieldAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor.)
Distribution of Different Job Fields Receiving Sponsorship (chart; the highlighted field is the one similar to this job)
Trends of Total Sponsorships: 2025 (9)

Funding

Current Stage: Early Stage
Total Funding: $405M
2025-08-20 · Series Unknown · $91M
2025-08-20 · Series A · $314M

Leadership Team

Ali Agha
Founder and CEO
Company data provided by Crunchbase