Trusted by AI-first companies worldwide

I build production AI systems at the intersection of language models, retrieval, and backend infrastructure. I enjoy bringing complex architectures to life, and that curiosity has taken me across LLM fine-tuning, semantic search, autonomous agents, and scalable backends, from ideation to production. What I find most satisfying is solving hard problems and pushing the boundaries of what AI systems can reliably do at scale.

LLM Systems & Fine-Tuning

I build end-to-end LLM pipelines — from fine-tuning foundation models with QLoRA and distillation, to structured extraction, prompt engineering systems, and evaluation frameworks that ensure production reliability.

Search & Retrieval Infrastructure

I specialize in building hybrid search systems that combine BM25 lexical retrieval with dense embeddings, custom ML ranking signals, query expansion, and reranking pipelines that deliver precise results at scale.
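As a rough illustration of how a lexical ranking and a dense-embedding ranking can be merged, here is a minimal sketch using reciprocal rank fusion (RRF). RRF is one common fusion technique and is used here as an assumption for illustration, not a description of any client system. The function name, the document ids, and the constant `k=60` (a conventional default) are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by multiple retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; doc id breaks ties deterministically.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a BM25 query and a dense-vector query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Because RRF works on ranks rather than raw scores, it avoids having to normalize BM25 scores against cosine similarities. A reranking model would typically run on the top of this fused list.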

Agent Systems & Workflows

I design and build autonomous agent systems using LangGraph state machines, multi-step tool-calling, dynamic task routing, and human-in-the-loop oversight for reliable real-world execution.

AI Infrastructure & MLOps

I architect production inference pipelines with vLLM serving, INT8/INT4 quantization, FastAPI microservices, and Celery + Redis task orchestration — optimized for throughput, latency, and cost.
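To make the idea behind INT8 quantization concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a simplified illustration under stated assumptions, not how vLLM or any specific serving stack implements it; production schemes usually use per-channel or per-group scales and calibration data. The function names and sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization.

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float weights from INT8 codes."""
    return [c * scale for c in codes]


# Hypothetical weights for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# The reconstruction error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The trade-off is visible in `max_error`: each weight is stored in one byte instead of two or four, at the cost of a bounded rounding error that grows with the largest weight in the tensor.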

Strategy & Consulting

I leverage deep expertise to analyze your AI needs, evaluate emerging architectures, and chart a roadmap for success — from assessing existing ML infrastructure to recommending LLM strategies tailored to your organization.

On-Demand AI Engineering

Get a dedicated channel between me and your team where you can ask anything, from specific model issues and inference optimization to pipeline architecture and production debugging.

Experience
Apolis | Senior AI / Machine Learning Engineer
Oct 2025 – Present | Remote
  • Architected an end-to-end ML retrieval and ranking platform in Python across 2.7M+ candidate resumes, using Elasticsearch BM25, dense embeddings, and distributed scoring pipelines.
  • Built containerized embedding-driven semantic search APIs using vLLM inference and FastAPI, deployed on Docker with semantic scoring, synonym expansion, and guided JSON generation.
  • Designed MLOps pipelines for resume and job description extraction at scale, including evaluation frameworks benchmarking field accuracy, reasoning consistency, and hallucination detection across model versions.
  • Designed a two-stage retrieval and ranking system combining high-recall search with ML scoring across skill overlap, role similarity, and experience alignment, significantly improving relevance.
  • Fine-tuned Qwen2.5 Instruct models using QLoRA (4-bit NF4 quantization) and implemented teacher–student distillation pipelines to improve extraction accuracy while reducing inference cost and latency.
Valory | Senior AI Developer
Sep 2023 – Oct 2025 | Remote
  • Built distributed autonomous agent systems in Python for portfolio management, scaling to 5M+ requests and $10M+ managed assets with <200ms latency.
  • Engineered LangGraph-based ML pipelines with state-machine orchestration and Apache Airflow for production scheduling, integrating GPT-4, Claude, and fine-tuned Llama models.
  • Operated production AI infrastructure with CI/CD pipelines, containerized deployment, and observability across multi-protocol blockchain integrations.
Pibit.ai | Founding Team / Machine Learning
Jun 2020 – Aug 2023 | Gurugram, India
  • Built Python-based ETL pipelines for large-scale document processing on AWS, improving throughput by 30% while reducing processing errors by 25%.
  • Developed document intelligence pipelines using NLP and computer vision, improving structured data extraction accuracy by 35%, deployed via Docker, AWS Lambda, and SageMaker.
  • Designed and deployed fraud detection systems combining supervised anomaly detection, unsupervised clustering, and rule-based pattern analysis, with containerized APIs on AWS.
#  | Project | Client | Year | Tags | Domain
01 | Hybrid Resume Search Platform | Apolis | 2026 | #llm #search #fine-tuning #elasticsearch | Enterprise
02 | Long Document LLM Extraction | Apolis | 2025 | #fine-tuning #nlp #llm #qlora | Enterprise
03 | Autonomous DeFi Agent System | Valory | 2024-2025 | #agents #langgraph #on-chain #defi | Autonomous
04 | RAG Intelligence Platform | Smarter | 2022-2024 | #rag #llm #qdrant #reranking | Enterprise
05 | Invoice AI Agents | Apolis | 2025 | #agents #llm #validation #auditing | Healthcare
06 | Fashion Sales Forecasting | Smarter | 2023 | #clip #gpt-3 #embeddings #forecasting | Retail
07 | Document AI Extraction | Pibit.ai | 2020-2022 | #nlp #cv #aws #sagemaker | Fintech
08 | Tax Fraud Detection | Pibit.ai | 2021 | #anomaly-detection #clustering #ml | Fintech

Gaurav is the rare engineer who combines deep AI research knowledge with the pragmatism to ship production systems on tight timelines. His work on NexGig — a talent intelligence platform processing 2.7 million resumes with fine-tuned LLMs — was technically ambitious and delivered measurable business impact from day one.

I'd work with him again without hesitation.

Gaurav brought a rare combination of LLM depth and systems-level thinking to Valory. He architected our autonomous agent infrastructure from the ground up — workflows that scaled to millions of on-chain operations managing real assets.

His work directly shaped how we build autonomous services at the protocol level.