Vetted Talent

Prasun Sinha

Seeking a challenging role as a Technical Lead in AI & ML, to leverage extensive experience and expertise of 7+ years to drive innovative projects, lead cross-functional teams, and contribute to the development of cutting-edge solutions in artificial intelligence and machine learning. The goal is to lead impactful initiatives, foster collaboration, and deliver high-quality AI and ML solutions that drive business growth and technological advancement.

Role
Data Scientist
Years of Experience
7 years

Skillsets

Model Deployment & Serving
Version Control Systems
Time Series Analysis
Research and Innovation
Reinforcement Learning
Recommendation Systems
Python
Project Management
Predictive Modelling
Performance Optimization
Natural Language Processing (NLP)
Model Evaluation and Validation
Anomaly Detection
ML Algorithms
Hyperparameter Tuning
GPT-3.5 Fine-Tuning
Generative Adversarial Networks (GANs)
Data Visualization
Data Manipulation
Cross-functional Leadership
Convolutional Neural Networks (CNN)
Cloud Computing
CI/CD Pipelines
Automated Machine Learning (AutoML)

Vetted For

12 Skills

Roles & Skills
Results
Details

Data Scientist (Remote) AI Screening
66%

Skills assessed: Communication Skills, Jira, Retrieval-Augmented Generation, Computer Vision, Deep Learning, PyTorch, TensorFlow, GitLab, Machine Learning, NLP, NoSQL, Python
Score: 59/90

Professional Summary

7 Years

Jul, 2024 - Present (1 yr 3 months)
Assoc. Tech Specialist
Harbinger Group
Apr, 2021 - Jun, 2024 (3 yr 2 months)
Senior Software Engineer
Harbinger Group
Nov, 2019 - Apr, 2021 (1 yr 5 months)
Software Developer
Extentia Information Technology
Jan, 2017 - Oct, 2019 (2 yr 9 months)
Associate IT Applications Specialist
Symantec Software India Pvt Ltd

Applications & Tools Known

MLFlow
Docker
ServiceNow
MATLAB
PyTorch
Scikit-learn
Keras

Work History

7 Years

Assoc. Tech Specialist

Harbinger Group

Jul, 2024 - Present (1 yr 3 months)

Develop and maintain high-quality software solutions in accordance with project specifications, while adhering to coding standards and best practices. Collaborate with cross-functional teams including product management, design, and quality assurance to define requirements, plan features, and ensure successful product delivery. Participate in design and architectural discussions, providing technical expertise and guidance to drive effective decision-making. Write clean, efficient, and maintainable code across multiple programming languages and platforms, ensuring scalability, performance, and security. Conduct code reviews, provide constructive feedback, and mentor junior engineers to foster a culture of learning and continuous improvement. Investigate and troubleshoot complex technical issues, implementing effective solutions and optimizations to meet project deadlines and objectives. Actively participate in Agile development processes, including sprint planning, daily stand-ups, and retrospectives, to drive project success and team collaboration.

Senior Software Engineer

Harbinger Group

Apr, 2021 - Jun, 2024 (3 yr 2 months)

Software Developer

Extentia Information Technology

Nov, 2019 - Apr, 2021 (1 yr 5 months)

Developed applications using Pandas for ETL processes. Created APIs utilizing pre-trained models and optimized their performance using multithreading techniques. Maintained and designed applications to meet industry standards, ensuring both existing and new systems were robust and efficient. Implemented a chatbot using NLP techniques such as intent recognition, named entity recognition, and dialogue management. Engineered an innovative attendance tracking system leveraging computer vision and gesture recognition, enabling contactless attendance marking by waving a hand in front of a webcam. Enhanced codebases by revising, modularizing, and updating old code to adhere to modern development standards, resulting in reduced operating costs and improved functionality.

Associate IT Applications Specialist

Symantec Software India Pvt Ltd

Jan, 2017 - Oct, 2019 (2 yr 9 months)

Executed Python automation and optimization initiatives, including streamlining processes such as database patching and refreshing using shell scripts and Python. Collaborated with the team to ensure smooth database system operations and optimize performance, resulting in a productivity increase of over 70%. Successfully conducted a Proof of Concept for conference room monitoring using IoT technology. Spearheaded the development of a dashboard to support day-to-day on-call activities. Automated the SOX Audit Report with Python, significantly reducing manual effort and improving efficiency during quarterly publishing. Implemented Python automation for MySQL backups, enhancing data reliability and minimizing the risk of data loss. Collaborated with Release Engineering team members to design new application systems aligned with client requirements.

Achievements

Technical Star Award | Harbinger Group | May & Jul 2023, Jun 2024
Superstar Award | Harbinger Group | Jun 2023
Team Player - Quarterly Award | Harbinger Group | Apr-May 2023
Wow Award | Symantec | Jun 2017

Major Projects

11 Projects

Medical Insurance Document Processing Automation

Apr, 2021 - Jun, 2024 (3 yr 2 months)

Implemented an end-to-end automation solution for processing medical insurance documents using a combination of UiPath, Python, and advanced natural language processing (NLP) techniques. The project aimed to enhance efficiency and accuracy in handling medical insurance documents while ensuring compliance with privacy regulations.

Predictive Healthcare Model for Mortality Risk Assessment

Apr, 2021 - Jun, 2024 (3 yr 2 months)

Built a predictive model to estimate mortality risk using medical data such as blood pressure, blood sugar levels, cardiac history, age, gender, lifestyle factors, and daily activity. Employed machine learning algorithms to analyse and predict potential life-threatening events.

Attendance System with Hand Gesture Recognition

Apr, 2021 - Jun, 2024 (3 yr 2 months)

Developed an innovative attendance system leveraging computer vision and gesture recognition technology. Implemented a solution where employees can mark their attendance by waving their hands in front of a webcam.

Employee Emotion Detection and Notification System

Apr, 2021 - Jun, 2024 (3 yr 2 months)

Developed an emotion detection system utilizing facial recognition technology from webcam images. Implemented automated analysis to detect signs of sadness or stress in employees. Integrated with email notification to HR and managers for timely intervention and support.

Object Detection using YOLO

Apr, 2021 - Jun, 2024 (3 yr 2 months)

Implemented object detection using the state-of-the-art YOLO (You Only Look Once) model. Developed a robust system capable of real-time detection and recognition of objects in images or videos. Applied in various domains such as surveillance, autonomous vehicles, and augmented reality, demonstrating versatility and practical application of computer vision technology.

Resume Parsing & ATS Matching Project

Nov, 2019 - Apr, 2021 (1 yr 5 months)

Developed a resume parsing and Applicant Tracking System (ATS) matching solution aimed at efficiently analyzing resumes, extracting relevant skills, and matching candidates to suitable job opportunities within the company.

Restaurant Chatbot

Nov, 2019 - Apr, 2021 (1 yr 5 months)

Created a chatbot using AllenNLP and Elasticsearch, and integrated cosine similarity over BERT embeddings, resulting in a 70% accuracy improvement.

Translation API using Whisper

Nov, 2019 - Apr, 2021 (1 yr 5 months)

Created a translation API leveraging Whisper technology for secure and private communication. Developed an efficient and accurate translation service to bridge language barriers. Contributed to global communication and accessibility by enabling seamless translation in various contexts.

ServiceNow Monitoring Dashboard

Jan, 2017 - Oct, 2019 (2 yr 9 months)

Created Python Flask and MySQL web dashboards to oversee ServiceNow tickets and server performance, enhancing team efficiency and enabling proactive interventions. Achieved over 30% productivity increase by optimizing SLA management.

SOX Report Automation

Jan, 2017 - Oct, 2019 (2 yr 9 months)

Automated the SOX Audit Report with Python, significantly reducing manual efforts during quarterly publishing.

LMS Course Template Conversion Automation

Jan, 2017 - Oct, 2019 (2 yr 9 months)

Developed an automated solution in Python to convert legacy JSON templates to a new format for Learning Management System (LMS) courses. This project aimed to enhance content management and ensure template consistency across the LMS platform.

Education

MCA Artificial Intelligence
Pune University, India (2017)
BCA
Sikkim Manipal University, India (2014)

Certifications

Executive Post Graduate ML & AI, IIITB, 2024
Certified Information Security & Ethical Hacking v8, Mar 2018
Agile Project Management

AI-interview Questions & Answers

Could you help me understand your background and give a brief introduction of yourself?
Okay. My name is Prashant Kumar. I have around 7 years of experience, and I mainly work with technologies like NLP, CNNs, and Python. Apart from this, I completed my post-graduation at IIITB, and I am also certified in information security and ethical hacking.

How would you leverage a vector database in Python to enhance the efficiency of AI models that require similar-item retrieval?
Okay. I will start with an introduction to vector databases. Using a vector database in Python can significantly enhance the effectiveness of AI models. A vector database is a special type of database designed to store and query vectors, which are numerical representations of data points. So if you have a large amount of contextual data and want to store its numerical representation, you use a vector database. These vectors are often high-dimensional and represent the features of data points such as text, images, or other complex data types. Vector databases excel at similarity searches, so when you want to do something like a cosine similarity lookup, you can use one. We can use them for similarity search, recommendation systems, anomaly detection, clustering, and classification. To implement this in Python, we first have to choose a vector database. There are many available, such as Facebook's FAISS, Annoy, and Pinecone, and we can use any one of them. We install the necessary packages, and then we can use that database. So that is the role of a vector database.
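As an illustration of the similarity search described in this answer, here is a minimal brute-force sketch in plain Python. It is not FAISS, Annoy, or Pinecone; the `TinyVectorIndex` class and the document IDs are hypothetical names introduced only for this example, and a real vector database would replace the linear scan with an approximate nearest-neighbor index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


class TinyVectorIndex:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (item_id, vector) pairs

    def add(self, item_id, vector):
        self._items.append((item_id, list(vector)))

    def search(self, query, k=3):
        """Return the k stored items most similar to the query vector."""
        scored = [(cosine_similarity(query, vec), item_id)
                  for item_id, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(item_id, score) for score, item_id in scored[:k]]


index = TinyVectorIndex()
index.add("doc_a", [1.0, 0.0, 0.0])
index.add("doc_b", [0.9, 0.1, 0.0])
index.add("doc_c", [0.0, 1.0, 0.0])

top = index.search([1.0, 0.0, 0.0], k=2)
print(top)
```

The linear scan costs time proportional to the number of stored vectors per query, which is the cost that dedicated vector databases are designed to avoid.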

When faced with high-dimensional data, how would you use TensorFlow to perform dimensionality reduction before applying a machine learning algorithm?
Okay. First of all, what is TensorFlow? TensorFlow is an architecture that basically provides a way to look at the data, understand the sentences, and get the context of the data. It provides several tools for dimensionality reduction, including autoencoders. We have principal component analysis, and we can use autoencoders for dimensionality reduction. So what are autoencoders? Sorry for my pronunciation. Autoencoders are neural networks designed to learn efficient representations of the data, typically for the purpose of dimensionality reduction. They consist of two main parts, an encoder and a decoder. The encoder compresses the data, and the decoder does the reverse.
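The answer names principal component analysis alongside autoencoders. As a dependency-free illustration of the PCA idea only (this is not TensorFlow and not an autoencoder), the sketch below finds the top principal direction by power iteration and projects the data onto it. The function names and the sample points are assumptions made for this example.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (column means, unit vector of the top principal direction)."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: multiply by the covariance matrix and renormalize.
    # Assumes the all-ones start vector is not orthogonal to the top direction.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # data has no variance
        v = [x / norm for x in w]
    return means, v


def project(rows, means, component):
    """Reduce each row to one coordinate along the given component."""
    return [sum((row[j] - means[j]) * component[j]
                for j in range(len(component)))
            for row in rows]


# Points lying on the line y = 2x, so one dimension captures all variance.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, component = first_principal_component(points)
reduced = project(points, means, component)
print(component, reduced)
```

An autoencoder generalizes this: with nonlinear layers, the encoder can learn a curved low-dimensional representation rather than a straight-line projection.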

What is your approach to training a deep learning model on an imbalanced dataset, and how would you ensure the model's performance remains robust using PyTorch or TensorFlow?
Okay. First of all, training deep models on an imbalanced dataset can be challenging because the model tends to be biased toward the majority classes. So how can we deal with this situation? We start with understanding the problem and exploring the data. First, we look at the dataset, check the class distribution, and analyze how class imbalance might affect our predictions. Then we can apply several techniques, such as resampling. What is resampling? We increase the number of samples in the minority classes by duplicating existing samples or generating new ones, using a technique like SMOTE (Synthetic Minority Oversampling Technique). We can also do class weighting, where we assign higher weights to the classes that are more important to us. Then we select a proper model architecture to avoid overfitting, ensuring that the model complexity is appropriate for the size of the minority class. Then there are training strategies we can use, such as balanced batch generators, which create batches with an equal number of samples for each class so that the model sees balanced data during each training step.
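Two of the techniques mentioned in this answer, class weighting and oversampling, can be sketched in plain Python. The weight formula below is the inverse-frequency heuristic `n_samples / (n_classes * class_count)`. The oversampler only duplicates existing minority samples at random; it is not SMOTE, which synthesizes new points. The function names and toy labels are assumptions for illustration.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: rarer classes get larger weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


labels = [0] * 8 + [1] * 2      # 80/20 imbalance
samples = list(range(10))       # placeholder feature rows

weights = balanced_class_weights(labels)
new_samples, new_labels = random_oversample(samples, labels)
print(weights, Counter(new_labels))
```

Weights computed this way are typically passed to the loss function, while oversampling changes the training set itself; the two are usually alternatives rather than applied together.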

Can you provide a strategy for converting a machine learning model in Python into a production-ready system using a NoSQL database for data storage?
Okay. Let me think about this question. Converting a machine learning model developed in Python into a production-ready system that uses a NoSQL database involves multiple steps. First is model optimization, in which we do model serialization: we convert our trained machine learning model into a serialized format that can be easily loaded into the production system. We can use pickle, or we can use the TensorFlow SavedModel format. Then we do model versioning to keep track of our trained models. Then we can choose any NoSQL database, such as MongoDB, Cassandra, Redis, or Elasticsearch. Then we design our schema and set up the APIs. Then we handle real-time data ingestion, for which we can use Kafka, and things of that kind.
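The serialization and versioning steps from this answer can be sketched with the standard library alone. In the sketch below, a plain dictionary stands in for a NoSQL document store such as MongoDB, and the "model" is a hypothetical dictionary of linear coefficients; the document fields (`name`, `version`, `created_at`, `artifact`) are an assumed schema, not one taken from the interview.

```python
import pickle
from datetime import datetime, timezone

# Hypothetical stand-in for a NoSQL collection, keyed by (name, version).
model_store = {}


def save_model(store, name, version, model):
    """Serialize the model and store it as a versioned document."""
    store[(name, version)] = {
        "name": name,
        "version": version,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "artifact": pickle.dumps(model),
    }


def load_model(store, name, version):
    """Fetch a document and deserialize its model artifact.

    Only unpickle artifacts from a trusted source: pickle can run code on load.
    """
    return pickle.loads(store[(name, version)]["artifact"])


def predict(model, features):
    """Linear prediction using the stored coefficients."""
    return sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]


trained = {"weights": [0.5, -1.0], "bias": 2.0}  # toy "trained" model
save_model(model_store, "demo_regressor", "1.0.0", trained)

restored = load_model(model_store, "demo_regressor", "1.0.0")
prediction = predict(restored, [4.0, 1.0])
print(prediction)
```

Keying documents by name and version lets an API endpoint pin a specific model version and roll back by changing only the version it requests.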

Can you describe a situation where you used computer vision with PyTorch to solve a real-world problem?
Yes. In my most recent project, we used object detection techniques. Let me go step by step through how I used PyTorch. For a manufacturing plant, we had to check the quality of the product: whether it had scratches, dents, or incorrect assembly. To address this, we developed a computer vision system using PyTorch to automate defect detection. The system used a deep learning model to inspect and identify defective items in real time. First, there was client data, and we did the data preprocessing, such as image annotation. After applying data augmentation, we did transfer learning. We used a pretrained ResNet-18 model and fine-tuned it. Then we did the validation and testing of that model.

A section of code is used for preprocessing data enable pipeline. Please explain the error in this code snippet, which is used to supposed to termulate the data. So the error is residing in the formula itself. There is a logical error in the

Given the following Python code snippet, what is the issue with the code for correctly creating a machine learning model pipeline?
So in the pipeline, the instruction, there is a

What method would you use to interface computer vision models in TensorFlow with Python-based NLP models, ensuring cross-compatibility and efficiency in data handling?
Okay. To interface computer vision models in TensorFlow with Python-based NLP models, we can use the following approaches. We will use a unified data pipeline: we do the data preprocessing, then the feature extraction, and then the model interfacing, such as a shared intermediate representation, and we can use custom layers or models. Then we can do cross-model communication, and we can apply techniques like parallel processing and batch processing. We can also make use of TensorFlow Extended (TFX) to build an end-to-end ML pipeline that can integrate both computer vision and NLP models.

Discuss the techniques in Python to automatically handle missing or corrupted data in a large dataset that might affect machine learning model performance.
Okay. Handling missing or corrupted data in a large dataset is crucial for building robust machine learning models. In Python, we have several techniques to address this type of issue, and we can make use of libraries like pandas, scikit-learn, and NumPy. First, we can do pandas profiling, where we use built-in functions like isnull or info to get information about the dataset and understand it. Then we remove the missing data. Then we apply mean, median, or mode techniques, basically to find the average, and handle the missing data we identified. Then we do data validation, and then outlier removal. Then we use techniques like data augmentation. And that's all.
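The detect-then-impute flow in this answer can be sketched without pandas, using only the standard library. The sketch treats `None`, NaN, and unparseable strings as missing or corrupted, then fills them with the median of the valid values. The helper names and the sample column are hypothetical; with pandas, the equivalent steps would typically use `isnull` and `fillna`.

```python
import math
import statistics


def to_float_or_none(value):
    """Parse a value as float; return None if it is missing or corrupted."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None  # None, or text that is not a number
    return None if math.isnan(number) else number


def impute_median(values):
    """Replace missing or corrupted entries with the median of valid ones."""
    cleaned = [to_float_or_none(v) for v in values]
    observed = [v for v in cleaned if v is not None]
    if not observed:
        raise ValueError("no valid values to impute from")
    fill = statistics.median(observed)
    return [fill if v is None else v for v in cleaned]


raw_column = ["1.5", None, "abc", 3.5, float("nan"), 2.5]
clean_column = impute_median(raw_column)
print(clean_column)
```

The median is used here rather than the mean because it is less sensitive to outliers, which matters when outlier removal happens after imputation as in the order described above.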

Can you illustrate how version control with GitLab would aid collaboration for a remote data team deploying TensorFlow models?
Okay. First of all, GitLab is like a library where we store our code. We can do data versioning and collaborative development using GitLab. Deploying a TensorFlow model with a remote dataset can become a streamlined workflow, which can enhance collaboration. We can use techniques like a branching strategy, or we can do CI/CD for model training and deployment. We can also use collaborative notebooks and remote dataset scaling, and we can do monitoring and feedback.
