Vetted Talent

Prasun Sinha

Seeking a challenging role as a Technical Lead in AI & ML, to leverage extensive experience and expertise of 7+ years to drive innovative projects, lead cross-functional teams, and contribute to the development of cutting-edge solutions in artificial intelligence and machine learning. The goal is to lead impactful initiatives, foster collaboration, and deliver high-quality AI and ML solutions that drive business growth and technological advancement.
  • Role

    Data Scientist

  • Years of Experience

    7 years

Skillsets

  • Model deployment & serving
  • Version control systems
  • Time Series Analysis
  • Research and innovation
  • Reinforcement Learning
  • Recommendation Systems
  • Python
  • Project Management
  • Predictive Modelling
  • Performance Optimization
  • Natural Language Processing (NLP)
  • Model evaluation and validation
  • Anomaly detection
  • ML algorithms
  • Hyperparameter tuning
  • GPT-3.5 fine-tuning
  • Generative Adversarial Networks (GANs)
  • Data Visualization
  • Data Manipulation
  • Cross-functional Leadership
  • Convolutional Neural Networks (CNNs)
  • Cloud Computing
  • CI/CD Pipelines
  • Automated Machine Learning (AutoML)

Vetted For

12 Skills
  • Role: Data Scientist (Remote), AI Screening
  • Result: 66%
  • Skills assessed: Communication Skills, Jira, Retrieval-Augmented Generation, Computer Vision, Deep Learning, PyTorch, TensorFlow, GitLab, Machine Learning, NLP, NoSQL, Python
  • Score: 59/90

Professional Summary

7 Years
  • Jul, 2024 - Present (1 yr 11 months)

    Assoc. Tech Specialist

    Harbinger Group
  • Apr, 2021 - Jun, 2024 (3 yr 2 months)

    Senior Software Engineer

    Harbinger Group
  • Nov, 2019 - Apr, 2021 (1 yr 5 months)

    Software Developer

    Extentia Information Technology
  • Jan, 2017 - Oct, 2019 (2 yr 9 months)

    Associate IT Applications Specialist

    Symantec Software India Pvt Ltd

Applications & Tools Known

  • MLflow
  • Docker
  • ServiceNow
  • MATLAB
  • PyTorch
  • Scikit-learn
  • Keras

Work History

7 Years

Assoc. Tech Specialist

Harbinger Group
Jul, 2024 - Present (1 yr 11 months)
    Develop and maintain high-quality software solutions in accordance with project specifications, while adhering to coding standards and best practices. Collaborate with cross-functional teams including product management, design, and quality assurance to define requirements, plan features, and ensure successful product delivery. Participate in design and architectural discussions, providing technical expertise and guidance to drive effective decision-making. Write clean, efficient, and maintainable code across multiple programming languages and platforms, ensuring scalability, performance, and security. Conduct code reviews, provide constructive feedback, and mentor junior engineers to foster a culture of learning and continuous improvement. Investigate and troubleshoot complex technical issues, implementing effective solutions and optimizations to meet project deadlines and objectives. Actively participate in Agile development processes, including sprint planning, daily stand-ups, and retrospectives, to drive project success and team collaboration.

Senior Software Engineer

Harbinger Group
Apr, 2021 - Jun, 2024 (3 yr 2 months)
    Develop and maintain high-quality software solutions in accordance with project specifications, while adhering to coding standards and best practices. Collaborate with cross-functional teams including product management, design, and quality assurance to define requirements, plan features, and ensure successful product delivery. Participate in design and architectural discussions, providing technical expertise and guidance to drive effective decision-making. Write clean, efficient, and maintainable code across multiple programming languages and platforms, ensuring scalability, performance, and security. Conduct code reviews, provide constructive feedback, and mentor junior engineers to foster a culture of learning and continuous improvement. Investigate and troubleshoot complex technical issues, implementing effective solutions and optimizations to meet project deadlines and objectives. Actively participate in Agile development processes, including sprint planning, daily stand-ups, and retrospectives, to drive project success and team collaboration.

Software Developer

Extentia Information Technology
Nov, 2019 - Apr, 2021 (1 yr 5 months)
    Developed applications using Pandas for ETL processes. Created APIs utilizing pre-trained models and optimized their performance using multithreading techniques. Maintained and designed applications to meet industry standards, ensuring both existing and new systems were robust and efficient. Implemented a chatbot using NLP techniques such as intent recognition, named entity recognition, and dialogue management. Engineered an innovative attendance tracking system leveraging computer vision and gesture recognition, enabling contactless attendance marking by waving a hand in front of a webcam. Enhanced codebases by revising, modularizing, and updating old code to adhere to modern development standards, resulting in reduced operating costs and improved functionality.

Associate IT Applications Specialist

Symantec Software India Pvt Ltd
Jan, 2017 - Oct, 2019 (2 yr 9 months)
    Executed Python automation and optimization initiatives, including streamlining processes such as database patching and refreshing using shell scripts and Python. Collaborated with the team to ensure smooth database system operations and optimize performance, resulting in a productivity increase of over 70%. Successfully conducted a Proof of Concept for conference room monitoring using IoT technology. Spearheaded the development of a dashboard to support day-to-day on-call activities. Automated the SOX Audit Report with Python, significantly reducing manual effort and improving efficiency during quarterly publishing. Implemented Python automation for MySQL backups, enhancing data reliability and minimizing the risk of data loss. Collaborated with Release Engineering team members to design new application systems aligned with client requirements.

Achievements

  • Technical Star Award | Harbinger Group | May & Jul 2023, Jun 2024
  • Superstar Award | Harbinger Group | Jun 2023
  • Team Player - Quarterly Award | Harbinger Group | Apr-May 2023
  • Wow Award | Symantec | Jun 2017

Major Projects

11 Projects

Medical Insurance Document Processing Automation

Apr, 2021 - Jun, 2024 (3 yr 2 months)
    Implemented an end-to-end automation solution for processing medical insurance documents using a combination of UiPath, Python, and advanced natural language processing (NLP) techniques. The project aimed to enhance efficiency and accuracy in handling medical insurance documents while ensuring compliance with privacy regulations.

Predictive Healthcare Model for Mortality Risk Assessment

Apr, 2021 - Jun, 2024 (3 yr 2 months)
    Built a predictive model to estimate mortality risk using medical data such as blood pressure, blood sugar levels, cardiac history, age, gender, lifestyle factors, and daily activity. Employed machine learning algorithms to analyse and predict potential life-threatening events.

Attendance System with Hand Gesture Recognition

Apr, 2021 - Jun, 2024 (3 yr 2 months)
    Developed an innovative attendance system leveraging computer vision and gesture recognition technology. Implemented a solution where employees can mark their attendance by waving their hands in front of a webcam.

Employee Emotion Detection and Notification System

Apr, 2021 - Jun, 2024 (3 yr 2 months)
    Developed an emotion detection system utilizing facial recognition technology from webcam images. Implemented automated analysis to detect signs of sadness or stress in employees. Integrated with email notification to HR and managers for timely intervention and support.

Object Detection using YOLO

Apr, 2021 - Jun, 2024 (3 yr 2 months)
    Implemented object detection using the state-of-the-art YOLO (You Only Look Once) model. Developed a robust system capable of real-time detection and recognition of objects in images or videos. Applied in various domains such as surveillance, autonomous vehicles, and augmented reality, demonstrating versatility and practical application of computer vision technology.

Resume Parsing & ATS Matching Project

Nov, 2019 - Apr, 2021 (1 yr 5 months)
    Developed a resume parsing and Applicant Tracking System (ATS) matching solution aimed at efficiently analyzing resumes, extracting relevant skills, and matching candidates to suitable job opportunities within the company.

Restaurant Chatbot

Nov, 2019 - Apr, 2021 (1 yr 5 months)
    Created a chatbot using AllenNLP and Elasticsearch, and integrated cosine similarity over BERT embeddings, resulting in a 70% accuracy improvement.

Translation API using Whisper

Nov, 2019 - Apr, 2021 (1 yr 5 months)
    Created a translation API leveraging Whisper technology for secure and private communication. Developed an efficient and accurate translation service to bridge language barriers. Contributed to global communication and accessibility by enabling seamless translation in various contexts.

ServiceNow Monitoring Dashboard

Jan, 2017 - Oct, 2019 (2 yr 9 months)
    Created Python Flask and MySQL web dashboards to oversee ServiceNow tickets and server performance, enhancing team efficiency and enabling proactive interventions. Achieved over 30% productivity increase by optimizing SLA management.

SOX Report Automation

Jan, 2017 - Oct, 2019 (2 yr 9 months)
    Automated the SOX Audit Report with Python, significantly reducing manual efforts during quarterly publishing.

LMS Course Template Conversion Automation

Jan, 2017 - Oct, 2019 (2 yr 9 months)
    Developed an automated solution in Python to convert legacy JSON templates to a new format for Learning Management System (LMS) courses. This project aimed to enhance content management and ensure template consistency across the LMS platform.

Education

  • MCA Artificial Intelligence

    Pune University, India (2017)
  • BCA

    Sikkim Manipal University, India (2014)

Certifications

  • Executive Post Graduate ML & AI, IIITB, 2024

  • Certified Information Security & Ethical Hacking v8, Mar 2018

  • Agile project management

AI-interview Questions & Answers

Could you help me understand your background by giving a brief introduction of yourself? Okay. So my name is Prashant Kumar. I have around 7 years of experience, and I mainly work with technologies like NLP, CNNs, and Python. Apart from this, I completed my post-graduation at IIIT-B, and I'm also certified in information security and ethical hacking.

Using a vector database in Python can significantly enhance the effectiveness of AI models. A vector database is a special type of database designed to store and query vectors, which are numerical representations of data points. Suppose you have a large contextual dataset and you want to store its numerical representation; you use a vector database. These vectors are often high-dimensional and represent the features of the data points, such as text, images, or other complex data types. Vector databases excel at performing similarity searches. When you want to find items by cosine similarity, you can use a vector database. We can use vector databases for similarity search, recommendation systems, anomaly detection, clustering, and classification. To implement a vector database in Python, we first need to choose one. There are many available, including Facebook's FAISS, Annoy, and Pinecone. We can install the necessary packages and use any one of them. That's the basic approach to using a vector database.
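
The similarity search described in this answer can be illustrated with a minimal, standard-library-only sketch. It is a brute-force stand-in for what libraries such as FAISS or Annoy do at scale, not their API; the `index` contents, the document ids, and the helper names `cosine_similarity` and `top_k` are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Return the k (id, score) pairs most similar to the query vector."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical 3-dimensional embeddings; real embeddings have hundreds of dimensions.
index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.7, 0.7, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.05, 0.0], index, k=2)
```

A dedicated vector database replaces the linear scan in `top_k` with an approximate nearest-neighbour index, which is what makes the search practical on large collections.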

When faced with high-dimensional data, how would you use TensorFlow to perform dimension reduction before applying a machine learning algorithm? Okay, so first of all, what is TensorFlow? So TensorFlow is a framework. Okay? And it provides a way to look at the data, understand its structure, and get the context of the data. It provides several tools for dimension reduction, including autoencoders. We have principal component analysis, and we can use autoencoders for dimension reduction. So what are autoencoders? Sorry for my pronunciation. Autoencoders are neural networks designed to learn an efficient representation of the data, typically for the purpose of dimension reduction. They consist of two main parts: the encoder and the decoder. The encoder compresses the data, and the decoder then reconstructs it.
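
Of the two techniques named in this answer, principal component analysis is the simpler to sketch. The example below uses NumPy rather than TensorFlow, which is a substitution made here for brevity; the function name `pca_reduce` and the random input data are hypothetical. An autoencoder would learn a nonlinear version of the same compression.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))  # hypothetical data: 100 samples, 10 features
X_reduced = pca_reduce(X, n_components=3)
```

The reduced matrix keeps one row per sample and can be passed to a downstream model in place of the original features.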

What is your approach to training deep learning models on imbalanced datasets, and how would you ensure the model's performance remains robust using PyTorch or TensorFlow? Okay. So, first of all, training deep models on imbalanced datasets can be challenging because models tend to be biased towards the majority classes. So, how can we deal with this situation? We start with understanding the problem and data exploration. So, first of all, we look at the dataset and examine the class distribution. And we analyze how class imbalance might affect our predictions. Then we can apply several techniques, like resampling. So, what is a resampling technique? We increase the number of samples in the minority classes by duplicating existing samples or generating new ones. We can use a technique like SMOTE, the Synthetic Minority Oversampling Technique. We can also do class weighting, where we give a higher weight to the classes that are more important to us. Then we choose a proper model architecture to avoid overfitting. We ensure that the model's complexity is appropriate for the size of the minority class. And then there are training strategies we can use, such as balanced batch generators. We create each batch with an equal number of samples for each class, so the model sees balanced data during each training step.
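
Two of the techniques mentioned in this answer, class weighting and oversampling, can be sketched with the standard library alone. The weights use the common inverse-frequency heuristic, and the oversampler duplicates minority samples at random; it is not SMOTE, which synthesizes new points instead. The function names and the 80/20 toy labels are hypothetical.

```python
import random
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels

labels = [0] * 8 + [1] * 2   # hypothetical 80/20 class imbalance
samples = list(range(10))    # hypothetical sample ids
weights = class_weights(labels)
balanced_samples, balanced_labels = random_oversample(samples, labels)
```

The resulting weights could be supplied to a weighted loss in PyTorch or TensorFlow, while the oversampled lists would feed an ordinary training loop.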

Can you provide a strategy for converting a machine learning model in Python into a production-ready system using a NoSQL database for data storage? Okay. So to answer this question, I'll break it down. Yeah. So converting a machine learning model developed in Python into a production-ready system that utilizes a NoSQL database involves multiple steps. First of all, that will be model optimization, in which we will do model serialization. We will convert our trained machine learning model into a serialized format that can be easily loaded into the production system. So we can use pickle, or we can use TensorFlow's SavedModel API. Then we will do model versioning to keep track of our trained model. Next, we can choose any NoSQL database. We have MongoDB, Cassandra, Redis, and Elasticsearch. And then we design our schema. We set up the APIs. And then we handle real-time data ingestion. We can use Kafka for this purpose. And, like, some other things we can consider.
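
The serialization and versioning steps in this answer can be sketched as follows. A plain dictionary stands in for a NoSQL collection such as one in MongoDB, and a dictionary of parameters stands in for a trained model; `model_store`, `save_model`, `load_model`, and `predict` are hypothetical names, not part of any database client.

```python
import pickle

# Hypothetical learned parameters standing in for a trained model object.
trained_model = {"weights": [0.4, -1.2], "bias": 0.1}

# A plain dict stands in for a NoSQL collection keyed by (name, version).
model_store = {}

def save_model(name, version, model):
    """Serialize the model and store it as a versioned document."""
    model_store[(name, version)] = {
        "name": name,
        "version": version,
        "payload": pickle.dumps(model),
    }

def load_model(name, version):
    """Fetch a versioned document and deserialize the model.

    pickle should only ever be used to load data from a trusted source.
    """
    return pickle.loads(model_store[(name, version)]["payload"])

def predict(model, features):
    """Linear score computed from the stored parameters."""
    return sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]

save_model("risk_model", 1, trained_model)
restored = load_model("risk_model", 1)
score = predict(restored, [1.0, 0.5])
```

Keeping the version in the document key lets a serving API pin a specific model while a newer one is being validated.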

Can you describe a situation where you used computer vision with PyTorch to solve a real-world problem? Yes. So, in my recent project, we used object detection techniques, which we can showcase online. So, I'll explain how I used PyTorch step by step. For a manufacturing plant, the quality of products must be checked. We need to detect scratches, dents, or incorrect assembly. To address this situation, we developed a computer vision system using PyTorch to automate defect detection. The system used a deep learning model to identify defective items in real time. We started with client data, then performed data preprocessing, including image annotation. After that, we applied data augmentation and did transfer learning. We used a pre-trained ResNet-18 model and fine-tuned it. Finally, we validated and tested the model.

A section of code is used for preprocessing data in a pipeline. Please explain the error in this code snippet, which is supposed to termulate the data. So the error is residing in the formula itself. There is a logical error in the

Given the following Python code snippet, what is the issue with the code for correctly creating a machine learning model pipeline? So in the pipeline, the instruction is that there is a

What method do you use to interface computer vision models in TensorFlow with Python-based NLP models, ensuring cross-compatibility and efficiency in data handling? Okay. So to answer this question, to interface computer vision models in TensorFlow with Python-based NLP models, we can use the following approaches. First, we will use a unified data pipeline. So, we will do the data preprocessing. We will do feature extraction, then we will do the model interfacing, such as a shared intermediate representation, and we can use custom layers or models. Then, we can do the cross-model communication, and we can apply techniques like batch processing. And we can use TensorFlow Extended to build an ML pipeline that can integrate both computer vision and NLP models.

Handling missing or corrupted data in a large dataset is crucial for building robust machine learning models. In Python, we have several techniques to address this type of issue. We can make use of libraries like pandas, scikit-learn, and NumPy. Okay, so first, what we can do is profile the data with pandas. For that, we use built-in functions like isnull or info to get the dataset information and understand the dataset. Then, we either remove the missing data or use mean, median, or mode techniques to impute the missing values. Then, we do data validation and outlier removal. Next, we use techniques like data augmentation.
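
The inspect-then-impute flow in this answer can be sketched without external libraries. With pandas the same steps would typically go through `isnull` and `fillna`; here `None` marks a missing value, and the records and the helper names `count_missing` and `impute_median` are hypothetical.

```python
from statistics import median

def count_missing(rows, columns):
    """Count None values per column, similar in spirit to a null check."""
    return {col: sum(row[col] is None for row in rows) for col in columns}

def impute_median(rows, column):
    """Return new rows with None in `column` replaced by the column median."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = median(observed)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 30, "bp": 120},
    {"age": None, "bp": 135},
    {"age": 50, "bp": None},
    {"age": 40, "bp": 110},
]

missing = count_missing(rows, ["age", "bp"])
cleaned = impute_median(impute_median(rows, "age"), "bp")
```

Median imputation is used here because it is less sensitive to outliers than the mean; the mode would be the usual choice for categorical columns.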

Can version control with GitLab aid collaboration for a remote data engineer deploying a TensorFlow model? So, first of all, GitLab is like a repository to store our code. Okay? It allows us to do versioning and collaborative development. The deployment of a TensorFlow model with a remote dataset can be streamlined into a workflow, which can enhance collaboration. We can use techniques like a branching strategy, or we can set up CI/CD for model training and deployment. Additionally, we can use collaborative notebooks and remote dataset scaling. And we can do monitoring and feedback.