Arjun Gupta
AI & ML Engineer
Building production-grade AI systems at scale.
Specializing in Conversational AI, LLM Fine-tuning, and AI
Infrastructure.
experience_log
[Current]
AI Engineer Intern
Omli Kids
- Optimized production conversational AI systems, reducing user-reported errors by ~20%.
- Developed and fine-tuned language models on 50K+ curated samples, improving task-specific accuracy by 15–25%.
- Lowered inference latency by ~25% through evaluation and deployment optimizations.
[Past]
MLE, Applied Science Intern
Stimuler
- Built and optimized the core LLM response pipeline for a voice-first AI tutoring platform (1M+ monthly users).
- Led migration from commercial LLM APIs to self-hosted fine-tuned models using Ray Serve.
- Reduced inference cost by 45% while improving infrastructure control and deployment flexibility.
[Past]
NLP Intern
Neural Nurture (ntwo.ai)
- Built LLM safety and privacy auditing pipelines for 7B–70B models.
- Optimized inference infrastructure using FastAPI, vLLM, and TensorRT-LLM.
- Improved evaluation runtime by 30% and detection reliability by 22% over baseline.
[Past]
Research Intern
CyPSi Lab, IIC, University of Delhi
- Developed uncertainty-based active learning pipelines in PyTorch.
- Achieved 25% improvement in label efficiency over random sampling.
- Benchmarked BADGE and AnchorAL on Indian language datasets.
[Past]
AI Research Intern
IIT Jammu
- Built transformer-based PII leak detection (96.6% accuracy).
- Reduced model size by 75% via quantization and pruning.
- Deployed on Android using TFLite/ONNX with <100ms latency.
skills
ML_Libraries
Inference_&_Ops
Observability
Databases_&_Tools
Languages
publications
Design of an Optimal Planning Framework for Cryosurgical Treatment of Brain Tumor Using CNN Segmentation of MRI Images
Published in Cryobiology, Volume 123, 105619 (2026)
DOI: 10.1016/j.cryobiol.2026.105619
executed_projects
Talk2Doc
src_link_active
Autonomous Documentation Assistant with end-to-end observability.
- Intelligent documentation search with FastAPI & ChromaDB.
- Modular LLM routing (Local + Cloud).
- Full-stack telemetry with OpenTelemetry + Jaeger.
Montreal Forced Aligner (MFA) Pipeline
src_link_active
Complete automated pipeline for speech-to-text alignment and analysis.
- Automated TextGrid generation & phonetic alignment.
- End-to-end analysis scripts & visualization.
- Full Docker environment setup.
Multilingual Semantic Search
live_demo
Bilingual search system for NIC Codes (Top 5 Nationally).
- FAISS + Sentence Transformers (English + Hindi).
- 40ms inference latency with caching.
- Voice query support & offline-first capability.
Active Learning Systems
local_only
AL pipelines for vision and NLP datasets.
- Benchmarked the BADGE and AnchorAL methods.
- Improved sample efficiency on low-resource data.
PII Detection
local_only
Transformer-based PII detection in network traffic.
- Optimized compact models (TFLite/ONNX).
- Neural embeddings + classifier architecture.
Kinematic Control of Robot Manipulator
local_only
Control systems for Open Manipulator-X arm.
- Built forward & inverse kinematics algorithms.
- Designed GUI for 3D trajectory visualization.
- Enabled physical arm to write text in 3D space.