Arjun Gupta

AI & ML Engineer

Building production-grade AI systems at scale.
Specializing in Conversational AI, LLM Fine-tuning, and AI Infrastructure.

experience_log

[Current]

AI Engineer Intern

Omli Kids
  • Optimized production conversational AI systems, reducing user-reported errors by ~20%.
  • Developed and fine-tuned language models on 50K+ curated samples, improving task-specific accuracy by 15–25%.
  • Lowered inference latency by ~25% through evaluation and deployment optimizations.
[Past]

MLE, Applied Science Intern

Stimuler
  • Built and optimized the core LLM response pipeline for a voice-first AI tutoring platform (1M+ monthly users).
  • Led migration from commercial LLM APIs to self-hosted fine-tuned models using Ray Serve.
  • Reduced inference cost by 45% while improving infrastructure control and deployment flexibility.
[Past]

NLP Intern

Neural Nurture (ntwo.ai)
  • Built LLM safety and privacy auditing pipelines for 7B–70B models.
  • Optimized inference infrastructure using FastAPI, vLLM, and TensorRT-LLM.
  • Improved evaluation runtime by 30% and detection reliability by 22% over baseline.
[Past]

Research Intern

CyPSi Lab, IIC, University of Delhi
  • Developed uncertainty-based active learning pipelines in PyTorch.
  • Achieved 25% improvement in label efficiency over random sampling.
  • Benchmarked BADGE and AnchorAL on Indian language datasets.
[Past]

AI Research Intern

IIT Jammu
  • Built transformer-based PII leak detection (96.6% accuracy).
  • Reduced model size by 75% via quantization and pruning.
  • Deployed on Android using TFLite/ONNX with <100 ms latency.

skills

ML_Libraries

PyTorch, TensorFlow, Keras, Scikit-learn, Transformers, HuggingFace, spaCy, NLTK, Pandas, NumPy, FAISS

Inference_&_Ops

FastAPI, vLLM, TensorRT, Ray Serve, ONNX, TFLite, Docker, Kubernetes, AWS, Locust, Git, CI/CD

Observability

OpenTelemetry, Jaeger

Databases_&_Tools

PostgreSQL, ChromaDB, Gradio, Linux

Languages

Python, C++, C, Java, MATLAB, R, Bash, SQL

publications

Design of an Optimal Planning Framework for Cryosurgical Treatment of Brain Tumor Using CNN Segmentation of MRI Images

Published in Cryobiology, Volume 123, 105619 (2026)

DOI: 10.1016/j.cryobiol.2026.105619

executed_projects

Talk2Doc

src_link_active

Autonomous Documentation Assistant with end-to-end observability.

  • Intelligent documentation search with FastAPI & ChromaDB.
  • Modular LLM routing (Local + Cloud).
  • Full-stack telemetry with OpenTelemetry + Jaeger.

Montreal Forced Aligner (MFA) Pipeline

src_link_active

Fully automated pipeline for speech-to-text alignment and analysis.

  • Automated TextGrid generation & phonetic alignment.
  • End-to-end analysis scripts & visualization.
  • Full Docker environment setup.

Multilingual Semantic Search

live_demo

Bilingual search system for NIC Codes (Top 5 Nationally).

  • FAISS + Sentence Transformers (English + Hindi).
  • 40 ms inference latency with caching.
  • Voice query support & offline-first capability.

Active Learning Systems

local_only

AL pipelines for vision and NLP datasets.

  • Benchmarked the BADGE and AnchorAL methods.
  • Improved sample efficiency on low-resource data.
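
For readers unfamiliar with uncertainty-based active learning (mentioned in the experience log), entropy sampling is one common form of it: score each unlabeled example by the entropy of the model's predicted class distribution and send the highest-scoring ones for labeling. The sketch below is illustrative only and is not the project's code; the function names and the toy probabilities are assumptions, and it uses plain Python rather than PyTorch to stay self-contained.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs, budget):
    """Return indices of the `budget` pool items with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Toy softmax outputs for four unlabeled examples (made-up numbers).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # fairly uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # almost uniform, highest entropy
]

picked = select_most_uncertain(pool, budget=2)
print(picked)  # indices of the two examples to label next
```

Methods such as BADGE differ in that they also account for diversity among the selected examples, which this sketch does not attempt.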

PII Detection

local_only

Transformer-based PII detection in network traffic.

  • Optimized compact models (TFLite/ONNX).
  • Neural embeddings + classifier architecture.

Kinematic Control of Robot Manipulator

local_only

Control systems for Open Manipulator-X arm.

  • Built forward & inverse kinematics algorithms.
  • Designed GUI for 3D trajectory visualization.
  • Enabled physical arm to write text in 3D space.
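
To illustrate what forward and inverse kinematics involve, here is a minimal sketch for a two-link planar arm. It is a deliberate simplification and not the project's code: the Open Manipulator-X has more joints, and the link lengths, function names, and target point below are hypothetical values chosen only for the example.

```python
import math


def forward_kinematics(theta1, theta2, l1, l2):
    """End-effector (x, y) of a two-link planar arm from its joint angles."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y, l1, l2):
    """Joint angles reaching (x, y); returns the solution with theta2 >= 0."""
    cos_t2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0 + 1e-9:
        raise ValueError("target is outside the reachable workspace")
    cos_t2 = max(-1.0, min(1.0, cos_t2))  # guard against rounding error
    theta2 = math.acos(cos_t2)
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2


# Hypothetical link lengths in metres (not the real arm's dimensions).
L1, L2 = 0.13, 0.12
target = (0.20, 0.10)

angles = inverse_kinematics(*target, L1, L2)
reached = forward_kinematics(*angles, L1, L2)
print(angles, reached)  # `reached` should match `target` up to rounding
```

Feeding the inverse solution back through the forward model, as done here, is a simple consistency check before sending angles to hardware.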