Raghav Bansal

I’m a Machine Learning Engineer with almost 4 years at Samsung, building production-grade Machine Learning systems across on-device, cloud, and generative AI pipelines. I’ve worked on multilingual Natural Language Processing (NLP) models, Deep Learning, and Retrieval-Augmented Generation (RAG) systems, and I’ve deployed Large Language Models (LLMs) at scale using Triton Inference Server and AWS.

My strengths include on-device AI (TensorFlow Lite, model quantization, unsupervised clustering), contextual search (Word2Vec, BERT), and building robust ML APIs integrated with Android and web services.

I’m currently open to opportunities focused on LLMs, on-device intelligence, Deep Learning, or scalable Machine Learning infrastructure.

  • Role

    Software, FastAPI, Machine Learning Engineer

  • Years of Experience

    9.58 years

Skillsets

  • fine-tuning
  • MCP
  • model quantization
  • NLP
  • Ollama
  • Room
  • S3
  • SQS
  • Triton Inference Server
  • Bash
  • LLMs
  • Hugging Face
  • Jenkins
  • Linux
  • NumPy
  • pandas
  • PostgreSQL
  • PyTorch
  • RAG
  • Scikit-learn
  • Auto Scaling
  • Python
  • Java
  • SQL
  • AWS
  • Docker
  • Flask
  • Kotlin
  • MLOps
  • TensorFlow
  • ChromaDB
  • EC2
  • FAISS
  • FastAPI
  • Gerrit
  • Git
  • Jira
  • LangChain

Professional Summary

9.58 Years
  • Feb, 2026 - Present 3 months

    Lead Engineer

    Samsung R&D Institute India - Bangalore
  • Jul, 2022 - Feb, 2026 3 yr 7 months

    Software Engineer

    Samsung R&D Institute India - Bangalore
  • Feb, 2022 - Jul, 2022 5 months

    R&D Intern

    Samsung R&D Institute India - Bangalore
  • Jul, 2021 - Aug, 2021 1 month

    Data Science Intern

    Sabudh Foundation
  • Jul, 2021 - Oct, 2021 3 months

    Research Intern

    Bharti Institute of Public Policy, ISB
  • Oct, 2021 - Jan, 2022 3 months

    Data Analyst

    The Ballot House UK (India)
  • Feb, 2021 - Oct, 2021 8 months

    Data Science Trainee (COE)

    Sabudh Foundation
  • Jul, 2020 - Aug, 2021 1 yr 1 month

    Mentor

    Mentors Without Borders
  • Feb, 2020 - Jul, 2022 2 yr 5 months

    Student Convener

    Computer Society of India, Student Chapter-GNDEC, Ludhiana
  • May, 2019 - Jan, 2020 8 months

    Committee Member

    Computer Society of India, Student Chapter-GNDEC, Ludhiana

Work History

9.58 Years

Lead Engineer

Samsung R&D Institute India - Bangalore
Feb, 2026 - Present 3 months

Software Engineer

Samsung R&D Institute India - Bangalore
Jul, 2022 - Feb, 2026 3 yr 7 months

    Health Orchestrator | FastAPI, LangGraph, MCP, PostgreSQL, WebSockets, Android, LLMs: Architected a multi-agent orchestration layer handling requests via WebSockets, reducing response latency by 30% while eliminating agent collisions through deterministic sequencing. Engineered PostgreSQL-backed session memory to persist context across 20+ turn interactions, improving clinical instruction recall by 40% and ensuring rigid agent boundaries. Deployed MCP-governed guardrails with custom hallucination checks, blocking 100% of non-compliant tool calls and securing tool-chain execution for safety-critical queries.

    Samsung Personal Health Records (PHR) | Triton, EC2, SQS, S3, Auto Scaling, Docker, LLMs: Orchestrated a scalable LLM inference service on AWS EC2 via Triton Inference Server, utilizing custom AMIs and Auto Scaling groups to optimize compute costs by 20%. Engineered an asynchronous processing pipeline using SQS and S3, decoupling data ingestion to reduce end-to-end latency by 70% and increasing throughput by 3x.

    Samsung Internet Browser Semantic Search | Python, ML, RAG, NLP, TensorFlow, FastAPI, LLMs: Engineered a hybrid semantic-search and autocorrect system by ensembling BERT contextual embeddings with Word2Vec and Doc2Vec, achieving 97% Top-k accuracy on complex queries. Productionized a 94-language polyglot NLU model served via FastAPI and integrated with Android (Retrofit), optimizing inference for real-time multilingual support. Architected a RAG pipeline using dense vector embeddings and GPT for contextual query generation, improving intent resolution rates by 30% via semantic grounding.

    Smart Vibration and Ringtone Adjustment System | TensorFlow Lite, CNN, Android On-device ML: Engineered an ultra-compact 719 KB on-device CNN for surface classification (98.8% accuracy), optimizing inference to run within 610 KB RAM and 10 MB ROM to drive adaptive haptic feedback.

R&D Intern

Samsung R&D Institute India - Bangalore
Feb, 2022 - Jul, 2022 5 months
    Text Summarization Model Development | Encoder-Decoder NLP, Python: Developed a text summarizer using an encoder-decoder architecture with an attention mechanism, achieving 97% accuracy.

Data Analyst

The Ballot House UK (India)
Oct, 2021 - Jan, 2022 3 months

Research Intern

Bharti Institute of Public Policy, ISB
Jul, 2021 - Oct, 2021 3 months

Data Science Intern

Sabudh Foundation
Jul, 2021 - Aug, 2021 1 month

Data Science Trainee (COE)

Sabudh Foundation
Feb, 2021 - Oct, 2021 8 months

Mentor

Mentors Without Borders
Jul, 2020 - Aug, 2021 1 yr 1 month
    Helping underprivileged young people become passionate professionals

Student Convener

Computer Society of India, Student Chapter-GNDEC, Ludhiana
Feb, 2020 - Jul, 2022 2 yr 5 months

Committee Member

Computer Society of India, Student Chapter-GNDEC, Ludhiana
May, 2019 - Jan, 2020 8 months

Major Projects

2 Projects

Smart Vibration and Ringtone Adjustment System

    Trained a CNN model on accelerometer data for real-time surface classification, achieving 98.8% accuracy and enabling automatic vibration/ringtone adjustment. Collected and processed sensor data, and optimized a TensorFlow model for on-device inference under 10 MB ROM and 610 KB RAM.

Samsung Cloud Emergency Backup

    Developed an on-device, ML-based emergency backup feature with TensorFlow Lite and unsupervised clustering, reducing backup latency by 50%.

Education

  • Bachelor of Technology in Computer Science and Engineering

    Guru Nanak Dev Engineering College (2022)