Vinayak Mararthe

Experienced Data Scientist & AI Developer with over 8 years of industry experience, including 3+ years in data science and AI-driven solution development.
  • Role

    Data Scientist & AI Developer

  • Years of Experience

    8 years

Skillsets

  • OpenCV
  • Lambda
  • LangChain
  • Lex
  • LlamaIndex
  • LLM
  • MLflow
  • MLOps
  • NLP
  • OCR
  • Kendra
  • Pydantic
  • PyTorch
  • RAG
  • REST API
  • SageMaker
  • Streamlit
  • TensorFlow
  • Tesseract
  • Azure
  • Python
  • R
  • SQL
  • Power BI
  • React
  • Tableau
  • Agentic AI
  • AutoGen
  • AWS
  • Django
  • Azure Form Recognizer
  • Computer Vision
  • DevOps
  • Docker
  • DVC
  • EC2
  • FastAPI
  • GenAI

Professional Summary

8 Years
  • Jul, 2024 - Present (1 yr 8 months)

    Data Scientist & AI Developer

    Sanctuari Platforms
  • Sep, 2022 - Jun, 2024 (1 yr 9 months)

    Data Scientist

    Almabetter
  • Jun, 2020 - Aug, 2022 (2 yr 2 months)

    Data Scientist

    Shubham Vision
  • Nov, 2017 - Jun, 2020 (2 yr 7 months)

    Data Analyst

    Shubham Vision

Applications & Tools Known

  • SQL
  • Voice AI
  • Vite
  • Tableau CRM
  • React
  • Microsoft Power BI
  • Deep Learning
  • Automation Anywhere
  • Python
  • Azure Machine Learning Studio
  • Django

Work History

8 Years

Data Scientist & AI Developer

Sanctuari Platforms
Jul, 2024 - Present (1 yr 8 months)
    Built scalable AI solutions for insurance analytics using Python, PyTorch, FastAPI, SQL, and REST APIs; deployed ML/NLP pipelines and integrated RAG systems for contextual retrieval. Developed LLM-based interfaces with React, Vite, Agentic AI, the Gemini API, and GenAI; powered document/image chat and form automation using GPT/BERT via OpenAI and Hugging Face. Containerized services with Docker, used AWS SageMaker, and implemented CI/CD with DevOps teams for seamless deployment and scalable model APIs. Leveraged Computer Vision frameworks (e.g., OpenCV) and OCR tools (e.g., Azure Form Recognizer) to automate insurance document processing, improving accuracy and efficiency.

Data Scientist

Almabetter
Sep, 2022 - Jun, 2024 (1 yr 9 months)
    Developed and deployed ML/DL models for classification, forecasting, and recommendation use cases, integrating SQL, Agentic AI, FastAPI, and RAG systems for context-aware predictions. Built interactive dashboards in Power BI and Tableau to visualize KPIs, model outputs, and operational insights across domains. Utilized Computer Vision frameworks (e.g., OpenCV) and OCR tools (e.g., Tesseract, Azure Form Recognizer) to develop and optimize image and document processing solutions.

Data Scientist

Shubham Vision
Jun, 2020 - Aug, 2022 (2 yr 2 months)
    Designed and deployed a recommendation model to personalize entertainment service plans based on customer preferences, geographic location, and affordability. Achieved a 900% increase in customer base by leveraging data-driven outreach and recommendation strategies.

Data Analyst

Shubham Vision
Nov, 2017 - Jun, 2020 (2 yr 7 months)
    Developed dynamic dashboards and reports using Power BI, enabling data-driven decisions by visualizing KPIs, trends, and business metrics across departments. Utilized advanced Excel features (VLOOKUP, PivotTables, macros, data validation) to automate reports, perform complex data analysis, and improve operational efficiency.

Major Projects

4 Projects

Business Risk Intelligence Report Generator

Apr, 2024 - Nov, 2025 (1 yr 7 months)
    Designed and developed a GenAI- and RAG-powered platform that automatically generates comprehensive business risk reports from company websites, annual reports, stock market data, and regulatory sources. Integrated multi-source data pipelines to provide evidence-backed risk assessments and built an ML-based insurance recommender system. Implemented LLM-driven summarization and reporting workflows for structured, executive-ready risk reports.

Legal Judgment Analyzer

Mar, 2025 - Jun, 2025 (3 months)
    Built a multi-agent, RAG-based legal analysis system using AWS Bedrock (Anthropic Claude), SageMaker, and Pinecone to automate legal judgment processing. Parsed legal PDFs to extract facts, arguments, sections, and parties; performed recursive chunking and semantic embedding for long-document comprehension. Used Chain-of-Thought and memory-based reasoning agents to answer legal queries with context fidelity, and integrated hallucination validation using a BLEU/ROUGE and Detoxify pipeline.

Document Insight Engine: Summarize & Chat

Dec, 2024 - Feb, 2025 (2 months)
    Developed an AI-powered tool to upload PDFs or images, extract text using Amazon Textract/OCR, and summarize insights instantly. Used OpenAI/Gemini for document summarization, table reconstruction, and Q&A, backed by Pinecone and RAG architecture for accurate semantic retrieval. Enabled chat with documents via vectorized sentence chunking and real-time query-to-chunk matching using cosine similarity.

Zomato Restaurant Clustering & Review Sentiment Analyzer

Apr, 2023 - May, 2023 (1 month)
    Engineered a data pipeline to cluster over 10,000 restaurants based on cuisine type, pricing, ratings, and location using unsupervised learning. Performed aspect-based sentiment analysis on customer reviews using LLMs combined with traditional NLP. Leveraged Pinecone + RAG-style retrieval to semantically align customer reviews with restaurant clusters, enhancing personalized recommendation logic.

Education

  • Bachelor of Engineering (EXTC)

    University of Mumbai