Arjun Gupta

ML & NLP Engineer

Building reliable, real-time AI systems.
Specializing in LLM Safety, RAG Systems, and Inference Pipelines.

experience_log

[Current]

NLP Intern

Neural Nurture (ntwo.ai)
  • Built LLM safety and privacy auditing pipelines for 7B–70B models.
  • Optimized inference infrastructure using FastAPI, vLLM, and TensorRT-LLM.
  • Reduced evaluation runtime by 30% and improved detection reliability by 22% over baseline.

[Past]

Research Intern

CyPSi Lab, IIC, University of Delhi
  • Developed uncertainty-based active learning pipelines in PyTorch.
  • Achieved 25% improvement in label efficiency over random sampling.
  • Benchmarked BADGE and AnchorAL on Indian language datasets.

[Past]

AI Research Intern

IIT Jammu
  • Built transformer-based PII leak detection (99.6% accuracy).
  • Reduced model size by 75% via quantization and pruning.
  • Deployed on Android using TFLite/ONNX with <100ms latency.

Skills

Turning research into production-ready AI systems.

ML_Libraries

PyTorch, TensorFlow, Keras, Scikit-learn, Transformers, HuggingFace, spaCy, NLTK, Pandas, NumPy, FAISS

Inference_&_Ops

FastAPI, vLLM, TensorRT, ONNX, TFLite, Docker, Kubernetes, AWS, Locust, Git, CI/CD

Observability

OpenTelemetry, Jaeger

Databases_&_Tools

PostgreSQL, ChromaDB, Gradio, Linux

Languages

Python, C++, C, Java, MATLAB, R, Bash, SQL

publications

Design of an Optimal Planning Framework for Cryosurgical Treatment

Submitted to Cryobiology

Proposed a pipeline using CNN segmentation for cryosurgery planning of brain tumors.

executed_projects

Talk2Doc

src_link_active

Autonomous Documentation Assistant with end-to-end observability.

  • Intelligent documentation search with FastAPI & ChromaDB.
  • Modular LLM routing (Local + Cloud).
  • Full-stack telemetry with OpenTelemetry + Jaeger.

Montreal Forced Aligner (MFA) Pipeline

src_link_active

Complete automated pipeline for speech-to-text alignment and analysis.

  • Automated TextGrid generation & phonetic alignment.
  • End-to-end analysis scripts & visualization.
  • Full Docker environment setup.

Multilingual Semantic Search

live_demo

Bilingual search system for NIC Codes (Top 5 Nationally).

  • FAISS + Sentence Transformers (English + Hindi).
  • 40ms inference latency with caching.
  • Voice query support & offline-first capability.

Active Learning Systems

local_only

AL pipelines for vision and NLP datasets.

  • Benchmarked the BADGE and AnchorAL methods.
  • Improved sample efficiency on low-resource data.

PII Detection

local_only

Transformer-based PII detection in network traffic.

  • Optimized compact models (TFLite/ONNX).
  • Neural embeddings + classifier architecture.

Kinematic Control of Robot Manipulator

local_only

Control systems for Open Manipulator-X arm.

  • Built forward & inverse kinematics algorithms.
  • Designed GUI for 3D trajectory visualization.
  • Enabled physical arm to write text in 3D space.