Akash Singh

Innovative AI leader with 7+ years of experience driving technological advancements and scaling solutions in a rapidly growing startup environment.
  • Role

    Technical Lead AI

  • Years of Experience

    7.3 years

  • Professional Portfolio

    View here

Skillsets

  • SNAC
  • Docker
  • ELMo
  • FastAPI
  • FastSpeech
  • HiFi-GAN
  • Hugging Face Transformers
  • LangGraph
  • LoRA
  • Mamba
  • Model distillation
  • Quantization
  • Azure
  • Speaker recognition
  • SSM
  • Tacotron
  • TTS
  • ULMFiT
  • WebRTC
  • Whisper
  • Parler-TTS
  • RVQ
  • X-vector
  • D-vector
  • TensorFlow
  • C
  • Candle
  • Kaldi
  • MLX
  • Python
  • PyTorch
  • Rust
  • GGML
  • C++
  • Android
  • Apple Silicon
  • ASR
  • AWS

Professional Summary

7.3 Years
  • Jan, 2025 - Present (1 yr 5 months)

    Technical Lead AI

    Paytm
  • Aug, 2021 - Jan, 2025 (3 yr 5 months)

    Research Scientist

    Saarthi.AI
  • Dec, 2018 - Jul, 2021 (2 yr 7 months)

    Deep Learning Engineer

    Saarthi.AI

Applications & Tools Known

  • Azure
  • AWS
  • Android

Work History

7.3 Years

Technical Lead AI

Paytm
Jan, 2025 - Present (1 yr 5 months)
    Designed and shipped a real-time voice-to-voice agent over WebRTC, integrating streaming ASR, LLM reasoning, and TTS into a full-duplex production pipeline for travel booking. Architected a multi-agent conversational AI system (LangGraph StateGraph, GPT-4) coordinating 10+ agents for search, filtering, and booking; achieved 95% intent recognition accuracy. Integrated live travel APIs via async clients with circuit breaker patterns; maintained 99.5% system uptime in production. Led hallucination mitigation through prompt engineering, session feedback, and synthetic evaluation.

Research Scientist

Saarthi.AI
Aug, 2021 - Jan, 2025 (3 yr 5 months)
    Led end-to-end TTS research across Tacotron, FastSpeech, and HiFi-GAN, covering single-speaker, multi-speaker, and multilingual settings across 11 Indian languages at a scale of 5M calls/day. Built and deployed streaming ASR systems (DeepSpeech, Whisper, Kaldi) and developed a full NLU pipeline from data creation to cloud deployment on Azure and AWS. Distilled a large recommendation model into a compact on-device model, deployed in production on Android inside a keyboard product for real-time content recommendation. Led a cross-functional team of engineers, linguists, and CUX designers across the full research-to-deployment lifecycle.

Deep Learning Engineer

Saarthi.AI
Dec, 2018 - Jul, 2021 (2 yr 7 months)
    Trained ELMo and ULMFiT language models from scratch in 9 Indian languages; applied them to entity tagging, text classification, semantic role labelling, and POS tagging. Built speaker recognition pipelines (X-vector, D-vector) achieving 95%+ accuracy for phrase and non-phrase tasks; developed keyword spotting with TensorFlow.js for browser deployment. Built a dialog policy via deep RL.

Major Projects

3 Projects

mlx-audio-train

    LoRA fine-tuning pipeline for TTS models on Apple Silicon. Shipped a Hindi language adapter and a speaker adapter for Qwen3-TTS; added Parler-TTS support for description-guided, zero-shot multilingual synthesis.

Hedgehog-Mamba

    MLX implementation of the Hedgehog linear attention kernel combined with Mamba selective state-space layers, targeting efficient long-context inference on Apple Silicon without attention's quadratic memory cost.

Technical Blog

    Deep-dives on neural audio codecs, on-device TTS, SSM/Mamba architectures, and Apple Silicon inference benchmarking.

Education

  • B.Tech, Computer Science & Engineering

    IET Lucknow (2018)