This job has closed.

Ogha Technologies · 9 hours ago

Senior AI/ML Engineer – LLMs & Fine‑Tuning

San Jose, CA

Full-time

Onsite

Mid, Senior Level

3+ years exp

Ogha Technologies is seeking a senior AI/ML engineer with deep experience in large language models (LLMs). The role involves owning the end-to-end development of LLM-powered features, including data pipelines, training, evaluation, and integration with products.

Analytics · Artificial Intelligence (AI) · Big Data · Consulting · Data Integration · Information Technology · Natural Language Processing

Responsibilities

Design, train, and fine‑tune LLMs (e.g., GPT‑class, Llama, Mistral, Qwen) using techniques such as LoRA/QLoRA, instruction tuning, and domain adaptation on proprietary datasets

Build data pipelines for collecting, cleaning, labeling, and augmenting text datasets for supervised fine‑tuning, preference modeling, and evaluation

Develop and maintain scalable training and inference pipelines using frameworks such as PyTorch, TensorFlow/JAX, Hugging Face Transformers, and associated tooling

Implement and optimize RAG (Retrieval‑Augmented Generation) systems leveraging vector databases (e.g., FAISS, Pinecone, Weaviate, Qdrant) and document stores

Design prompt strategies, tools, and agents to improve reliability, controllability, and latency of LLM‑based applications

Define and implement evaluation frameworks (automatic metrics, human evals, red‑team tests) for quality, safety, and robustness of model outputs

Optimize models for inference via quantization, distillation, pruning, and efficient serving (GPU/CPU, batch inference, caching)

Collaborate with product, engineering, and domain experts to translate business problems into LLM‑based solutions and iterate quickly on prototypes

Deploy, monitor, and maintain LLM services in production using MLOps best practices (CI/CD, experiment tracking, model/version management, A/B testing)

Proactively track state‑of‑the‑art research in LLMs, multimodal models, and alignment, and bring relevant advances into our stack

Qualifications

Large Language Models · Python Programming · Deep Learning Architectures · PyTorch · MLOps Best Practices · NLP Experience · Data Pipeline Development · Model Evaluation · Soft Skills

Required

Bachelor's or Master's degree in Computer Science, Machine Learning, Mathematics, or related technical field (or equivalent practical experience)

Strong Python programming skills and software engineering fundamentals (testing, code review, modular design, performance profiling)

3+ years of hands‑on experience building ML systems end‑to‑end, with at least 1–2 years focused specifically on NLP/LLMs

Proven experience fine‑tuning LLMs (parameter‑efficient and/or full‑parameter methods) on real‑world datasets and shipping them into production

Deep understanding of modern deep learning and transformer architectures: attention mechanisms, positional encodings, tokenization, optimization (AdamW, schedulers), and regularization

Experience with PyTorch (preferred) or TensorFlow/JAX, plus Hugging Face ecosystem (Transformers, Datasets, Accelerate, PEFT, TRL, etc.)

Experience building and operating training and inference workloads on cloud platforms (AWS, GCP, Azure) and GPUs (CUDA, distributed training with DDP/DeepSpeed/Lightning, etc.)

Strong skills in designing and querying data stores for RAG (SQL/NoSQL, vector databases) and integrating them with LLMs

Familiarity with deploying models as APIs/microservices using frameworks such as FastAPI, Flask, or similar

Solid understanding of model evaluation, observability, and monitoring (quality, drift, bias, safety)

Preferred

Experience implementing RAG systems at scale (indexing, retrieval optimization, hybrid search, metadata‑aware ranking)

Experience with RLHF, DPO, or other alignment techniques and preference‑based training

Experience with multimodal models (e.g., vision‑language, speech‑language) or tool‑using/agentic architectures

Exposure to security, compliance, and privacy aspects of training and serving LLMs on sensitive data

Contributions to open‑source ML/LLM projects, publications, or notable public demos

Experience integrating with major LLM APIs and managed services (OpenAI, Anthropic, Google, Azure, etc.) as well as self‑hosted models

Ability to work closely with non‑ML engineers, PMs, and stakeholders, explaining complex concepts clearly and pragmatically

Product mindset with focus on impact, reliability, and user experience rather than just model metrics

Self‑driven, comfortable with ambiguity, and able to own projects from idea to production

Company

Ogha Technologies

Ogha Technologies is an IT company that provides big data, data integration, advanced analytics, and IT strategy consulting services.

San Jose, California, USA

11-50 employees

http://oghatech.com

Funding

Current Stage

Early Stage

Leadership Team

Bhanu Murthy Nallagonda

Co-Founder and Director

Company data provided by Crunchbase