This job has closed.

FieldAI · 6 months ago

Agentic AI/ML Engineer - Multimodal

Field AI is transforming how robots interact with the real world, focusing on building reliable AI systems for complex robotics challenges. As an AI/ML Engineer on the Field-insight Foundation Model (FiFM) team, you will drive research and model development to transform multimodal data from autonomous robots into actionable insights, with an emphasis on creativity and rigorous problem-solving.
Enterprise Software · Robotics · Robotic Process Automation (RPA)
H1B Sponsor Likely

Responsibilities

Train and fine-tune million- to billion-parameter multimodal models, with a focus on computer vision, video understanding, and vision-language integration
Track state-of-the-art research, adapt novel algorithms, and integrate them into FiFM
Curate datasets and develop tools to improve model interpretability
Build scalable evaluation pipelines for vision and multimodal models
Contribute to model observability, drift detection, and error classification
Fine-tune and optimize open-source VLMs and multimodal embedding models for efficiency and robustness
Build and optimize Multi-Vector RAG pipelines with vector DBs and knowledge graphs
Create embedding-based memory and retrieval chains with token-efficient chunking strategies

Qualifications

Multimodal models · Computer vision · Python · PyTorch · MLOps best practices · Video understanding · Temporal modeling · AWS · Model interpretability · Agentic AI · Soft skills

Required

Master's/Ph.D. in Computer Science, AI/ML, Robotics, or equivalent industry experience
2+ years of industry experience or relevant publications in CV/ML/AI
Strong expertise in computer vision, video understanding, temporal modeling, and VLMs
Proficiency in Python and PyTorch with production-level coding skills
Experience building pipelines for large-scale video/image datasets
Familiarity with AWS or other cloud platforms for ML training and deployment
Understanding of MLOps best practices (CI/CD, experiment tracking)
Hands-on experience fine-tuning open-source multimodal models using HuggingFace, DeepSpeed, vLLM, FSDP, LoRA/QLoRA
Knowledge of precision tradeoffs (FP16, bfloat16, quantization) and multi-GPU optimization
Ability to design scalable evaluation pipelines for vision/VLMs and agent performance

Preferred

Experience with Agentic/RAG pipelines and knowledge graphs (LangChain, LangGraph, LlamaIndex, OpenSearch, FAISS, Pinecone)
Familiarity with agent operations logging and evaluation frameworks
Background in optimization: token cost reduction, chunking strategies, reranking, and retrieval latency tuning
Experience deploying models with quantized (int4/int8) inference and distributed multi-GPU inference
Exposure to open-vocabulary detection, zero/few-shot learning, multimodal RAG
Knowledge of temporal-spatial modeling (event/scene graphs)
Experience deploying AI in edge or resource-constrained environments

Company

FieldAI

FieldAI is the general-purpose brain making robots autonomous in complex, risky, real-world environments.

H1B Sponsorship

FieldAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Chart: Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2026 (6)
2025 (9)

Funding

Current Stage: Growth Stage
Total Funding: $405M
Key Investors: Hyundai Motor Group, Bezos Expeditions, Prysm Capital, Temasek Holdings
2026-02-22 · Corporate Round
2025-08-20 · Series Unknown · $91M
2025-08-20 · Series A · $314M

Leadership Team

Ali Agha
Founder and CEO
Company data provided by Crunchbase