Nikita Verma

Experienced Data Scientist specializing in Automatic Speech Recognition (ASR), Natural Language Processing (NLP), LLMs, and Generative AI. Proficient in Python, SQL, and AWS services, with expertise in data modeling, ETL processes, and analytics.

Role
Forward Deployed AI Engineer
Years of Experience
4.1 years

Skillsets

Django
OpenAI
pandas
PyTorch
Rasa
Redshift
S3
scikit-learn
T5
TensorFlow
AWS
Azure STT
Connect
CUDA
Data Modeling
NumPy
Google STT
gRPC
NLP
NVIDIA Parakeet
Polly
REST APIs
Riva
Shell
Speech AI
SQL
Uvicorn
WebSocket
Whisper
FastAPI
Big Data
Python - 4.0 Years
Flask
Apache Kafka
Kibana
Power BI
Grafana
Amazon RDS
AWS Lambda
Bedrock
Data Mining
ETL
Scala
MongoDB
MySQL
PySpark
SageMaker
Apache Spark
Athena
BERT
Docker
EC2
GitHub
Glue
Lex
Machine Learning

Professional Summary

4.1 Years

Oct, 2025 - Present 8 months
AI Engineer
EXL
Mar, 2025 - Aug, 2025 5 months
Senior ML Engineer
Earnin
Nov, 2021 - Dec, 2024 3 yr 1 month
Senior Data Scientist
SakhaTech Information Systems

Applications & Tools Known

Python
Scala
Shell Scripting
ETL
Big Data
Data Mining
MySQL
MongoDB
Amazon RDS
AWS Lambda
S3
Glue
EC2
Amazon Redshift
REST APIs
Flask
FastAPI
Power BI
Kibana
Grafana
Docker
GitHub
Apache Kafka
Apache Spark

Work History

4.1 Years

AI Engineer

EXL

Oct, 2025 - Present 8 months

Built production-grade multilingual (English and Spanish) real-time ASR/STT systems using NVIDIA Parakeet-TDT-0.6B-v3, Parakeet-CTC-0.6B-ES, Parakeet-RNNT-1.1B, Nemotron Speech Streaming, Whisper Large-v3 Turbo, Google STT v2, and Azure Speech Services, supporting live microphone and audio file streaming with partial and final transcript generation.
Architected WebSocket ASR servers with FastAPI and Uvicorn featuring chunk-wise PCM streaming, silence-based finalization, manual flush triggers, and JSON partial/final transcript events.
Tuned VAD systems (WebRTC, Adaptive Energy, Silero) with silence thresholds (200 ms to 900 ms) and chunk durations (10/20/30 ms) for near real-time delivery without premature cutoffs.
Conducted structured ASR model benchmarking across Parakeet-TDT/CTC-ES/RNNT, Riva, Whisper, Google STT, and Azure STT, evaluating TTFT, TTFB, WER, first/average latency, context-switching score, and final transcript quality; generated detailed Excel-based evaluation reports with per-file model analysis.
Containerized GPU-accelerated ASR inference services using Docker, CUDA 12.x (nvidia/cuda runtime), Whisper, Google STT, NIM deployments, and optimized Dockerfiles; resolved critical issues including NeMo import crashes, torch/CUDA mismatches, segfaults, PyAudio/ffmpeg build failures, and pydantic conflicts for high-availability production deployment.
Deployed and operated real-time ASR services on GCP (Compute Engine, Artifact Registry, Cloud Run) and AWS EC2, managing VM provisioning, Docker registry pushes, port/firewall rules, SSH tunneling, and WebSocket/gRPC remote endpoint validation and health monitoring.
Evaluated and compared NVIDIA Triton Inference Server, NIM, Riva, and vLLM for speech AI model serving, assessing GPU utilization, streaming endpoint behavior, supported model discovery, and deployment strategies to determine optimal production scalability and inference performance.

Senior ML Engineer

Earnin

Mar, 2025 - Aug, 2025 5 months

Designed, built, and maintained ML/statistical models to optimize marketing performance across audience targeting, channels, campaigns, and customer journeys.
Specialized in Marketing Mix Modeling (MMM), LTV prediction, lift estimation, and causal inference to inform strategic decisions.
Designed and deployed a financial IVA chatbot in Python using Amazon Lex, Lambda, and Bedrock to support real-time user queries on Cash Out and Early Pay, with ongoing integration of Amazon Polly and Connect for voice support.
Performed large-scale data processing and exploratory analysis using Python, SQL, and Spark to uncover actionable insights and evaluate marketing performance metrics.

Senior Data Scientist

SakhaTech Information Systems

Nov, 2021 - Dec, 2024 3 yr 1 month

Orchestrated AI/ML/LLM models, improving automation by 30% using NLP and SciPy, and leveraging image recognition for enhanced insights.
Developed and deployed ASR and TTS machine learning models using PyTorch, TensorFlow, and NumPy for real-time translation, integrated face and OCR models for automated identification and data extraction, and enhanced customer satisfaction by 20% through AI solution integration.
Built a traditional Rasa-based conversational chatbot and later integrated LLaMA and RAG to transform it into an intelligent virtual assistant (IVA) for real-time interactions.
Deployed, customized, and fine-tuned GPT-3 and GPT-4 on proprietary datasets, which elevated the response accuracy and efficiency of low-performing models by over 40%.
Specialized in supervised learning and classification algorithms, enhancing model accuracy to 90% with dynamic retraining, data augmentation, and hyperparameter tuning.
Managed MySQL database systems and designed real-time data pipelines using Apache Kafka and PySpark.
Skilled in coding with Python, Scala, and Shell scripting, and proficient in project management using Git/GitLab.

Achievements

Improved process automation and insight generation by 30% using NLP and image recognition.
Enhanced customer satisfaction by 20% through AI solution integration.
Elevated response accuracy and efficiency of low-performing models by over 40% by fine-tuning GPT-3 and GPT-4 on proprietary datasets.
Enhanced model accuracy to 90% with dynamic retraining, data augmentation, and hyperparameter tuning.
Cut data processing time by 30% with AWS Lambda and Amazon Redshift.
Organiser of ProSang 2020, the technical summit of MNNIT Allahabad.

Major Projects

2 Projects

Custom Health Bot

Developed a custom health bot using Rasa and implemented user authentication for a product website using Django, MySQL, and SMTP.

Real-time Data Streaming Solutions

Designed real-time line charts for monitoring car RPM with nanosecond precision using Java and Apache Spark.

Education

Bachelor of Technology
Motilal Nehru National Institute of Technology (2020)
Intermediate
New Standard Public School