Vetted Talent

Yogesh Joshi

I am a developer proficient in Python and AI, with hands-on experience in designing, programming, and validating complex systems and models. My skill set includes a deep understanding of descriptive and predictive modelling, backed by strong analytical, logical, and statistical skills and a passion for continuous learning. I am proficient in Python, R, C#, and C++, as well as data visualization.

Areas of expertise: research and development of AI software; web and software development; machine learning techniques; deep learning methodology; prompt engineering with the OpenAI API; TensorFlow, Keras, PyTorch, Spark, etc.; web frameworks (Flask, Django, etc.); Agile methodology; database modelling; UI/UX design and development; micro-computing and IoT; cloud integration and cloud deployment; use of Hugging Face, Weights & Biases (wandb), and OpenAI.
  • Role

    AI Engineer

  • Years of Experience

    12.1 years


Skillsets

  • Cloud deployment
  • Deep learning methodology
  • Prompt Engineering
  • OpenAI
  • Machine learning techniques
  • Hugging Face
  • Docker containers
  • Database modeling
  • Data Visualization
  • PyTorch
  • AI software development
  • Agile Methodology
  • UI/UX Design
  • Python
  • Keras
  • Django
  • Flask
  • TensorFlow

Vetted For

18 Skills
  • Senior Generative AI Engineer (AI Screening)
  • Skills assessed: BERT, Collaboration, Data Engineering, Excellent Communication, GNN, GPT-2, Graphs, Large Language Models, Natural Language Processing, SageMaker, Deep Learning, Neural Network Architectures, PyTorch, TensorFlow, Machine Learning, Problem Solving Attitude, Python, Vertex AI
  • Score: 55/100

Professional Summary

12.1 Years
  • Mar, 2024 - Present (2 yr 3 months)

    Consultant

    Deloitte
  • Nov, 2022 - Mar, 2024 (1 yr 4 months)

    Associate Data Scientist

    Reflexion.ai
  • Sep, 2021 - Oct, 2022 (1 yr 1 month)

    Associate Data Scientist

    Sciffer Analytics Pte Ltd
  • Aug, 2020 - Sep, 2021 (1 yr 1 month)

    AI Developer

    Dataviv Technologies
  • Jun, 2019 - Aug, 2020 (1 yr 2 months)

    Python Developer

    Biosense Technologies
  • Feb, 2019 - May, 2019 (3 months)

    Machine Learning Intern

    Biosense Technologies
  • Jun, 2016 - Feb, 2017 (8 months)

    Cyber Ambassador

Applications & Tools Known

  • Python
  • Docker
  • AWS (Amazon Web Services)
  • Google Cloud Platform
  • Django
  • FastAPI
  • Hugging Face
  • TensorFlow Hub
  • PyTorch
  • TensorFlow
  • OpenCV

Work History

12.1 Years

Consultant

Deloitte
Mar, 2024 - Present (2 yr 3 months)

Associate Data Scientist

Reflexion.ai
Nov, 2022 - Mar, 2024 (1 yr 4 months)
    Worked on an existing AI/deep-learning-based video metadata generation product, improving it and adding new modules using various deep learning architectures.

Associate Data Scientist

Sciffer Analytics Pte Ltd
Sep, 2021 - Oct, 2022 (1 yr 1 month)

AI Developer

Dataviv Technologies
Aug, 2020 - Sep, 2021 (1 yr 1 month)
    Managed the development lifecycle for entire projects. Designed and developed efficient ML and DL algorithms to meet client requirements. Led the team in meeting tight deadlines and finding innovative solutions.

Python Developer

Biosense Technologies
Jun, 2019 - Aug, 2020 (1 yr 2 months)

Machine Learning Intern

Biosense Technologies
Feb, 2019 - May, 2019 (3 months)
    Started as an ML Intern on the Research and Development team and earned a permanent position as Python Developer in June 2019. Worked with the team to build AI-enabled software for healthcare diagnostics, and built ML/AI projects on Raspberry Pi.

Cyber Ambassador

Jun, 2016 - Feb, 2017 (8 months)
    Educated children and adults about cyber-crime and cyber fraud, giving them the insight needed to recognize such threats and stay vigilant against them.

Achievements

  • Awarded "Mr. Reliable" by Reflexion.ai for steady development and timely completion
  • Named Best Host of the Year by Late G N Sapkal College for communication skills and improvisation
  • Received the Avid Readers award in high school for communication and storytelling skills

Major Projects

4 Projects

Blood Counter (AutoMicroscope)

    Part of a team building a deep-learning-based solution for counting blood cells in microscope images. Worked as a Python ML developer.

Retail Analytics based on CCTV feed

    Analysis of retail consumer information gathered from CCTV cameras. Modules include Walk-in Counter, Facial Attendance, Social Distancing Monitor, Mask Detection, and Threat Detection.

Traffic and Quality of Traffic Prediction

    Predicting traffic and quality of traffic at any location using Census data and application ping data by implementing multi-output regression.

Dental Treatment and Diagnosis

    Supported the AI developer in building and deploying diagnosis solutions for dental issues using Mask R-CNN. Dental X-ray images were used to predict the treatment plan.

Education

  • Bachelor in Computer Engineering

    KCT Late G N Sapkal College, SPPU (2017)
  • HSC in Science Stream with Computer Science

    G D Sawant College, Nashik (2013)
  • SSC

    St. Francis High School, Nashik (2011)

Certifications

  • Data Science and R Programming

    ETLHive (Pune) (Oct, 2017)
  • TensorFlow Developer Certificate

    Udemy (Aug, 2023)
    Credential ID: UC-4dd17338-7d01-4e48-9d61-d5b5ceacaefa
  • AWS Certified Machine Learning Specialty

    Udemy (Aug, 2023)
    Credential ID: UC-96ab07af-8255-46b6-874a-a1645a680f47
  • Deep Reinforcement Learning

    Hugging Face (Aug, 2023)
    Credential ID: yogjoshi14

AI-interview Questions & Answers

I'm Yogesh Joshi. I have approximately 5 years of experience in data science, AI, and ML, as well as Python development. I have a versatile portfolio covering many different types of machine learning projects, mostly in deep learning and computer vision. My expertise lies in deep learning and computer vision, but as the trend has shifted towards LLMs (large language models), I have also been familiarizing myself with those technologies. About a year ago, I started working on LLMs. One of the projects I am working on is based on vision transformers as well as the latest video released with the code. I have expertise in the Python language as well. Apart from that, I have a good understanding of core principles such as database management, data pipelines, and deployment. I'm also eager to learn new technologies: I have recently gained exposure to MLflow and other deployment technologies, and I have experience with the latest version of the PostgreSQL database.

The techniques depend on the size of the data we need to train on. Preferably, I'll use continuous integration and continuous development techniques, where I would have a base model prepared on a large dataset. Depending on the base model's accuracy and its performance benchmarks on different datasets, I'll build on that with hyperparameter tuning. I can do manual hyperparameter tuning guided by experimentation, or opt for an automated hyperparameter tuning tool. There are also tools such as Weights & Biases and TensorBoard that help with logging and monitoring the parameters used in each experiment.
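
As a rough illustration of the automated tuning mentioned above, here is a minimal random-search sketch in plain Python. The `validation_loss` function is a hypothetical stand-in for training a model and measuring it on held-out data, and the search ranges are assumptions chosen only for the example. The `history` list plays the role that an experiment tracker such as Weights & Biases or TensorBoard would play in practice.

```python
import random

def validation_loss(learning_rate, batch_size):
    """Hypothetical stand-in for 'train a model and return its validation loss'."""
    # Toy surface with its minimum at learning_rate=0.01, batch_size=64.
    return (learning_rate - 0.01) ** 2 * 1e4 + ((batch_size - 64) / 64) ** 2

def random_search(n_trials, seed=0):
    """Sample configurations at random and keep the one with the lowest loss."""
    rng = random.Random(seed)
    history = []  # every trial is recorded, as an experiment tracker would do
    for _ in range(n_trials):
        config = {
            "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform in [1e-4, 1e-1]
            "batch_size": rng.choice([16, 32, 64, 128]),
        }
        history.append((validation_loss(**config), config))
    best_loss, best_config = min(history, key=lambda item: item[0])
    return best_config, best_loss, history

best_config, best_loss, history = random_search(n_trials=50)
print(best_config, round(best_loss, 4))
```

Random search is used here only because it is the simplest automated strategy to show; a real project might swap in a dedicated tuning library.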

There are different ways to approach the choice of loss function. Primarily, I would start from the loss functions that have worked best in research papers for the problem we are tackling. That is one of the mechanisms I'll use when approaching a deep learning problem and choosing the loss function. If other resources or materials suggest better loss functions, I can opt for those as well. However, published research papers and conferences usually indicate which loss function has performed better for a given case, and my approach would be to use that one. If I have to switch loss functions, I would take an experimental approach and try multiple similar loss functions. For example, if I'm choosing the cross-entropy loss, I would also make sure that it works well on classification-type data.

Task assignment is what matters most when working on cross-functional tasks, whatever form they take. As trends and technologies have evolved, it has become common to combine multiple sources or types of technology to get a better output. One example is PostgresML, where you can now run model inference through a basic SQL script. I take that as a lesson, and I would prioritize individual tasks over team tasks, because only once individual tasks have been completed can the dependencies on other team members' tasks be fulfilled. So while implementing AI features in cross-functional development, I would put individual tasks first: in a cross-functional environment, individual work matters more than teamwork, because each person's work and the rest of the cross-functional team depend on one another.

There are multiple approaches to implementing an NLP model that trains continuously on new incoming data. One of the most efficient is to have a storage mechanism where incoming traffic is stored and then fed into a data pipeline. The data pipeline's job is to clean the data and prepare it for the model. Then there is a training pipeline that takes the cleaned data and trains the model to perform the NLP task. Once training is done, there are two further pipelines: a deployment pipeline and an inference pipeline. Within the deployment pipeline, we need to add a scheduling mechanism so that the model periodically retrains itself as the solution requires, but without catastrophic forgetting, where it forgets previous weights and outputs. That is something we need to monitor. To tackle those problems, I think newer technologies such as vector databases can be used as the storage mechanism.
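
The cleaning step and the concern about catastrophic forgetting can be sketched as follows. This is a minimal illustration, not the pipeline described above: it assumes one common mitigation, replaying a sample of older examples alongside each new batch, and the names `ReplayBuffer`, `clean`, and `build_training_batch` are hypothetical. Scheduling, the actual model update, and any vector database are left out.

```python
import random

class ReplayBuffer:
    """Keeps a bounded random sample of past examples (reservoir sampling)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen.
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = example

def clean(text):
    """Hypothetical cleaning step: collapse whitespace and lowercase."""
    return " ".join(text.split()).lower()

def build_training_batch(new_texts, buffer, replay_ratio=0.5):
    """Mix cleaned new data with replayed old data, then update the buffer."""
    new_examples = [clean(t) for t in new_texts if t.strip()]
    n_replay = min(len(buffer.items), int(len(new_examples) * replay_ratio))
    replayed = buffer.rng.sample(buffer.items, n_replay)
    for example in new_examples:
        buffer.add(example)
    return new_examples + replayed

buffer = ReplayBuffer(capacity=100)
first = build_training_batch(["  Hello   World ", "", "First batch"], buffer)
second = build_training_batch(["Second  BATCH", "More data"], buffer)
print(first)
print(second)
```

A scheduler would call `build_training_batch` on each retraining cycle and pass the result to the training pipeline; the replayed examples give the model a chance to keep seeing older data.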

There are multiple strategies for mitigating bias in the data. One is to use representative data that covers all types of cases, regardless of the problem we need to solve. Another is to make sure the data covers all regions and countries demographically. That is also one of the approaches we can use to mitigate skewness and bias in the data. There are also imputation and augmentation techniques that can be used to normalize the data. I think these are the strategies we can use. Apart from that, other strategies can be researched and implemented depending on the data and the scenario we are working on.

The self-attention mechanism wouldn't take three inputs of x; it would take a single input. The basic transformer block has the attention layer followed by a feedforward neural network, so I think that is something which is missing. Also, before the attention layer there should be some additional layers, such as an embedding layer. If we are incorporating it within the transformer block, we can add the positional embedding layer within the transformer block as well. I think that's something. The input shape for the self-attention mechanism, I think that's something which is
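
The point about a single input can be illustrated with a small scaled dot-product self-attention sketch in plain Python, where the queries, keys, and values are all projections of the same `x`. The identity projection matrices and the toy embeddings are assumptions made for readability; the embedding, positional, and feedforward layers mentioned above are not included.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention: Q, K and V all come from the same x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # one distribution per token
    return matmul(weights, v), weights

# Three tokens with 2-dimensional embeddings; identity projections for readability.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
print(output)
```

Each row of `weights` sums to 1 and says how much one token attends to every token in the same sequence, which is why only one input is needed.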

Mean absolute error is somewhat nonstandard as a loss for an LLM. The most frequently, or most prominently, used loss for any kind of generative model is mostly the root mean squared error, so that the steps the model takes along the gradient are larger and it reaches its global minimum faster. So I think that is something we can implement here.

There are multiple readily available architectures that we can use for text generation. One of them is BERT (Bidirectional Encoder Representations from Transformers), where basically it has only the decoder layer. Stacking the decoder layers together can be helpful for text summarization problems, and the output layer can be a softmax layer.

For an ingestion pipeline, there are multiple cleaning and preprocessing steps that can be done beforehand to identify anomalies. Once the data meets the expected conditions, cleaning and preprocessing it is a particular step that helps avoid anomalies in the data. Apart from that, we can also have a CloudWatch mechanism that checks the data. It has crawlers. We can use different types of crawlers even before the data is ingested into the pipeline to sort it and eliminate any kind of anomaly.
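
A pre-ingestion check of the kind described above might look like the following sketch. It is plain Python rather than CloudWatch or a crawler, and it assumes one specific rule: a record is anomalous if a numeric field is missing or its modified z-score (based on the median absolute deviation) exceeds a threshold. The field name `latency_ms`, the threshold, and the sample records are hypothetical.

```python
import statistics

def split_anomalies(records, field, threshold=3.5):
    """Separate records whose `field` is missing or a robust outlier.

    Assumes at least one record carries a numeric value for `field`.
    """
    values = [r[field] for r in records if isinstance(r.get(field), (int, float))]
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    clean, anomalies = [], []
    for record in records:
        value = record.get(field)
        if not isinstance(value, (int, float)):
            anomalies.append(record)  # missing or non-numeric value
            continue
        # Modified z-score; 0.6745 makes the MAD comparable to a standard deviation.
        score = 0.0 if mad == 0 else 0.6745 * abs(value - median) / mad
        if score > threshold:
            anomalies.append(record)
        else:
            clean.append(record)
    return clean, anomalies

records = [
    {"id": 1, "latency_ms": 100},
    {"id": 2, "latency_ms": 102},
    {"id": 3, "latency_ms": 98},
    {"id": 4, "latency_ms": 101},
    {"id": 5, "latency_ms": 5000},
    {"id": 6, "latency_ms": None},
]
clean, anomalies = split_anomalies(records, "latency_ms")
print([r["id"] for r in clean], [r["id"] for r in anomalies])
```

The median-based score is used instead of a mean and standard deviation so that a single extreme value does not mask itself.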

There are different types of benchmarks for validating LLM output. For the sentiment analysis part, I think there are different benchmarks readily available. I don't remember, off the top of my head, the benchmarks for validating this particular process. But apart from the readily available benchmarks, we can also have a human-in-the-loop mechanism where a human goes through the output and validates whether it is acceptable.
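
One simple way to use human review as a check, sketched here under the assumption that a reviewer has labelled the same items as the model, is to measure agreement and list the items where the two differ. The function name `agreement_report` and the label values are hypothetical.

```python
def agreement_report(model_labels, human_labels):
    """Return the agreement rate and the indices where model and human differ."""
    if len(model_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not model_labels:
        raise ValueError("at least one labelled item is required")
    disagreements = [
        index
        for index, (model, human) in enumerate(zip(model_labels, human_labels))
        if model != human
    ]
    accuracy = 1 - len(disagreements) / len(model_labels)
    return accuracy, disagreements

model_labels = ["positive", "negative", "negative", "positive"]
human_labels = ["positive", "negative", "positive", "positive"]
accuracy, disagreements = agreement_report(model_labels, human_labels)
print(accuracy, disagreements)
```

The disagreement indices give the reviewer a short list of outputs to inspect rather than the full result set.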

Depending on the use case, we can choose either TensorFlow or PyTorch. In both cases, the programming model and the logic are somewhat similar, so choosing either one won't hamper the overall development. For simplicity's sake, I would prefer PyTorch as it is open source. Although TensorFlow is also open source, it has its own limitations and works well with TPU-based architectures, whereas PyTorch can handle almost all types of architecture. Converting TensorFlow and PyTorch models to a quantized or optimized version is somewhat similar. Both can be condensed into smaller forms. Both have batch fetching mechanisms, which can be useful. Both have their own limitations and enhancements, but I think either is fine. For transformer-based text models, most examples have been made available in PyTorch. So I think PyTorch is preferable to TensorFlow as the backend, since Hugging Face uses PyTorch. But again, if we are choosing to develop a Google-based architecture like Bard, then I think TensorFlow is much better. So depending on which you choose for the alternative model, it can be beneficial to have a