Vetted Talent

Madhumitha Kolkar

Seasoned Machine Learning Engineer with 3.3 years of professional experience, specializing in Natural Language Processing, Computer Vision, Deep Learning, and Generative AI.

  • Role

    Senior Machine Learning Engineer - Research Specialist

  • Years of Experience

    6.1 years

  • Professional Portfolio

    View here

Skillsets

  • RAG
  • MediaPipe
  • MySQL
  • NumPy
  • OpenCV
  • pandas
  • pg_trgm
  • Pinecone
  • PostgreSQL
  • Prompt Engineering
  • MCP
  • ResNet
  • semantic search
  • Stable Diffusion
  • Streamlit
  • wav2vec
  • Word2Vec
  • XGBoost
  • YOLO
  • FAISS
  • PyTorch - 4 Years
  • TensorFlow - 4 Years
  • Transformers - 4 Years
  • Scikit-learn - 4 Years
  • BERT
  • CNNs
  • DeepFace
  • Diffusion models
  • Python - 5 Years
  • FastAPI
  • Flask
  • Git
  • GloVe
  • Keras
  • Kotlin
  • LSTMs
  • Matplotlib

Vetted For

10 Skills
  • Machine Learning Scientist II (Places) - Remote
    AI Screening
  • 82%
  • Skills assessed: Large POI Database, Text Embeddings Generation, ETL pipeline, LLM, Machine Learning Model, NLP, Problem Solving Attitude, Python, R, SQL
  • Score: 74/90

Professional Summary

6.1 Years
  • Jul, 2025 - Present 11 months

    Senior Machine Learning Research Specialist

    Nokia
  • Senior Machine Learning Engineer - Research Specialist

    Nokia
  • Aug, 2024 - Jul, 2025 11 months

    Machine Learning Engineer

    Stealth AI Startup
  • Dec, 2020 - Apr, 2021 4 months

    Data Scientist

    Deloitte
  • Apr, 2021 - Feb, 2024 2 yr 10 months

    Machine Learning Engineer

    Mercedes-Benz Research and Development India

Applications & Tools Known

  • OpenCV
  • NumPy
  • Dialogflow
  • MediaPipe
  • Streamlit
  • Pandas
  • PyTorch
  • scikit-learn
  • Android SDK
  • MySQL
  • MongoDB
  • Git
  • Flask
  • FastAPI
  • PostgreSQL
  • FAISS
  • Pinecone
  • AWS Lambda
  • AWS S3
  • AWS EC2
  • Docker
  • Hugging Face
  • Airflow
  • AWS

Work History

6.1 Years

Senior Machine Learning Research Specialist

Nokia
Jul, 2025 - Present 11 months

Senior Machine Learning Engineer - Research Specialist

Nokia
    Designed a multi-agent AI framework and LLM-powered Simulation Assistant using RAG, enabling natural-language-driven simulations and documentation access for researchers. Implemented context-aware conversational recommendations with a user feedback loop to improve relevance and system performance. Architected a PostgreSQL-backed RAG pipeline for a documentation agent, integrating short-term session memory and long-term user memory for contextual and personalized interactions. Contributed to the Nokia AI Literacy program by creating ML/AI training content and delivering biweekly live knowledge-transfer sessions, supporting organization-wide AI adoption and upskilling.

Machine Learning Engineer

Stealth AI Startup
Aug, 2024 - Jul, 2025 11 months

Machine Learning Engineer

Mercedes-Benz Research and Development India
Apr, 2021 - Feb, 2024 2 yr 10 months

Data Scientist

Deloitte
Dec, 2020 - Apr, 2021 4 months

Achievements

  • Exemplary Performance: Recognized as a "Star Performer" for consistently exceeding established company benchmarks by 40%.
  • Mentorship and Talent Acquisition: Trained and mentored over 15 individuals, and actively participated in hiring for senior positions (T7/T8/T9) and freshers, contributing to attracting top talent for company growth.
  • Open-Source Advocate: Made 4 notable contributions to popular Machine Learning libraries like Keras, TensorFlow, and OpenAI Whisper, actively promoting collaborative development within the Machine Learning community.
  • Speaker for Google: "Conscious Algorithms: A Talk on AI Safety".
  • Star Performer award
  • Google Dev Conference speaker
  • Mentorship of 15+ individuals
  • Presented expertise at Google Dev Conference
  • Star Performer
  • Mentorship
  • AI Safety Presentation

Major Projects

3 Projects

Moodmap

    Engineered a novel Multimodal AI system (speech-to-text, DeepFace, Conv1D-BiLSTM-GRU) for real-time emotion analysis, achieving an F1 score of 0.86 for speech emotion classification.

Say What Now

    Developed an LSTM-based next-word prediction model trained on OpenWebText for text generation.

PaperScribe

    Architected and implemented a Retrieval-Augmented Generation (RAG) system powered by GPT-3 for question answering over research papers.

Education

  • Bachelor Of Engineering - Computer Science

    SDMCET (2020)

Certifications

  • Machine Learning

    DeepLearning.AI - Stanford (Apr, 2024)

Interests

  • Filmmaking
  • Travel
  • Art
  • Photography

AI-interview Questions & Answers

    So, hey. My name is Madhumitha Kolkar, and I am a machine learning engineer with 3.3 years of experience working on natural language processing, computer vision, generative AI, and speech recognition. So I have worked in 2 companies before, and my previous organization was Mercedes-Benz, where I worked as a machine learning engineer on projects mainly related to natural language processing, computer vision, and generative AI. And before that, I was working at Deloitte, where I worked on speech recognition. So that was speech-to-text for a legal firm, where we were trying to create a product that would help them convert recordings of legal hearings into text because they had issues with human errors. And when it comes to the work at Mercedes, there was one project which was related to creating a customer service chatbot for booking online appointments. And this was actually useful because it cut down on the overhead charges that used to happen when people would miss appointments and we would still have to pay technical staff, and it covered a lot of other use cases. But mainly for this, what we did was we built an in-house LSTM, which was an encoder-decoder based model. So it was capable of doing mainly three different kinds of things: it could classify the intent to pick the team which the issue was supposed to be routed to, and it could predict whether the issue was self-diagnosable or something that required a service center appointment. And if it was self-diagnosable, then it used to learn from a subset of data that we had given, which was related to diagnosis or hot fixes for certain issues, quick fixes, and it was able to suggest those. So after doing this, we were actually able to get the funds for scaling this up. And once we scaled it up, we took it to Google's Vertex AI, and we used BERT along with it. So we created a use case for transfer learning and sample-based learning for it. 
So, I have experience with large language models, productionizing it, and deploying it. And even when it comes to computer vision. We used Yolo for object detection. The basic thing behind this was we were trying to automate the UI process of testing, which engineers actually had to do manually because there's a lot of security in the car, and we can't directly write test scripts like scripts to automate the whole thing. So we had to automate the UI. So for that, we used Yolo, and then for more fine-grained classification of icons and stuff, we used ResNet 50. So, yeah, it was CNN-based image classification, and then we used Python scripting for automating the whole thing once image classification was done properly. So, yeah, this is mainly that. And then, yeah, generative AI research, and we were building applications to integrate it into our systems. And so, yeah, that was there were a lot of learning aspects to that too. Again, I've worked on retrieval-augmented generation, basically, to come up with a nice application use case for this. That is, my experience.

    So the pipeline is basically an extract, transform, load pipeline. Right? So if we wanna achieve consistency, then we could follow a lot of methods for this. So, basically, to automate the whole thing, I think we would start off with something like profiling the data. You wanna understand the structure, the content, the quality of what kind of data you actually have. And for this, you could use tools such as Informatica Data Quality or Apache Griffin, and implement some kind of data profiling so that we can gather these statistics and identify patterns within the data. And then I would go for something like validating the data so that we can define some fixed rules to make sure that our data adheres to these transformation rules. And I think for that, I would just go with using SQL scripts. And so I would initially start by creating some sort of rules, putting some kind of constraints, and then I would try to maintain integrity and mandatory fields. Like, this has to comply with this. And, yeah, I would apply these at all stages, like, before ETL, during ETL, and even after ETL so that I can keep verifying the data. And then I would start automating these testing frameworks, you know, to implement something that can run automatically so people don't have to run it manually. So, I think we could use something from Amazon Web Services for this. You know, there is Deequ. So we could use Deequ. And, basically, we wanna set expectations of what we want, and then we can integrate these tests and make sure that our data is fulfilling all of this in the pipeline. And it can run at certain intervals on its own. And after that, monitoring, you know, you have to keep monitoring to make sure that the pipeline is working properly and that it alerts you when there are certain inconsistencies. So I think for this, we can use tools like Grafana. I think AWS has something called CloudWatch, which can also be used. 
So we basically set up some kind of tool for checking the ETL process status and the data quality, based upon certain metrics that we have defined. And based on that, we can decide if we need to make changes or if something is not working properly. Apart from that, I think we would have some kind of auditing. Like, you can use something like Apache Atlas or Informatica again. So you would basically create something that would keep track of the transformations for the data and maintain logs for it. And on top of that, I think we would basically use version control again, like Git, together with something like Jenkins for CI. So by using version control, we can just make sure that we're always up to date, and we can always roll back if we want to. So we can just set up some CI/CD pipeline to test and deploy the ETL.
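
    The rule-based validation step described above can be sketched in a few lines. This is a minimal illustration rather than how Deequ or any specific tool works: the rule names, the `RULES` table, and the `validate_rows` helper are all hypothetical, and a real pipeline would run equivalent checks before, during, and after the ETL stages.

```python
from typing import Callable

# A rule is a predicate over a single row (represented as a dict).
Rule = Callable[[dict], bool]

# Hypothetical rule set: mandatory name, coordinates inside valid ranges.
RULES: dict[str, Rule] = {
    "name_present": lambda row: bool(str(row.get("name") or "").strip()),
    "latitude_in_range": lambda row: isinstance(row.get("lat"), (int, float))
    and -90 <= row["lat"] <= 90,
    "longitude_in_range": lambda row: isinstance(row.get("lon"), (int, float))
    and -180 <= row["lon"] <= 180,
}


def validate_rows(rows, rules=RULES):
    """Return (row_index, failed_rule_name) pairs for every violated rule."""
    failures = []
    for index, row in enumerate(rows):
        for rule_name, predicate in rules.items():
            if not predicate(row):
                failures.append((index, rule_name))
    return failures


sample = [
    {"name": "Cafe Aroma", "lat": 12.97, "lon": 77.59},
    {"name": "  ", "lat": 95.0, "lon": 77.59},  # blank name, invalid latitude
]
failures = validate_rows(sample)
print(failures)
```

    The returned pairs could feed the monitoring and alerting layer mentioned in the answer, for example by counting failures per rule on each run.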

    How can an LLM be utilized to enhance an existing NLP based system? Okay, so I could relate this to my personal experience. So, like I mentioned, initially, we were using our own in-house LSTM, which was encoder-decoder based, and we trained it on a lot of data that we got. And later, as I mentioned, once we scaled it up, we started using BERT. So, in that case, large language models have a lot of benefits and implications because they can significantly enhance the capability of a system, because the amount of data that they are trained on is obviously humongous. And they have learned so much that it's hard for us to come up with a local NLP model that has seen that amount of data. So, what could be certain benefits is, you can do text generation and summarization because it improves the quality of the generated text. Because the embeddings are much more sophisticated, and there is a variety of things that the model knows. So it enhances that. So if I use something like GPT-3.5 or GPT-4 for generating high-quality, coherent summaries for something, that would prove very beneficial compared to something that's local. Right? Like, these systems are much more sophisticated. So the main thing would be, like, using something like transfer learning would be good because it would reduce how much work we have to do to train it again, because we already have things that are learned. It would be good for automating certain things like content creation for blogs, you know, because you have so much data that you can work on. LLMs also have this really good understanding of language. Right? So if I'm using it for something like sentiment analysis, that would also give a higher level of accuracy if I leverage LLMs. Because sometimes, you know, sentiments have this kind of nuance. You don't really realize what the user is trying to say. 
And for me to train something from scratch for that is a bit hard. But if I use an existing LLM and its pre-learned knowledge, it would be very useful. It can also be used for certain things like feedback, if I want to analyze some customer's feedback, or even in cases where there's named entity recognition. It would help in enhancing that. And other use cases would be, like, machine translation, you know, and making it more fluent. You're making your model more fluent. And so, basically, integrating this LLM into your NLP system would be good. Then we have conversational AI, like our chatbots, basically. So if we have LLMs integrated, then we can actually increase the level of natural language understanding and get context-aware responses, which is good. Then you could use it for, like, classification or topic modeling, or information retrieval in the case of retrieval-augmented generation for, like, Q&A. So, yeah, basically, there are a lot of things, and mainly the fact that it's trained on so much data gives you that upper edge over your base model. So yeah.

    Devise a strategy for implementing an SQL-based solution for real-time POI metadata enrichment. Okay. So, for a point of interest, metadata enrichment. Let's see. Well, this would be a multistep approach. I think we would start with the normal flow: data acquisition, real-time processing, enriching the data, which is sort of augmenting it, and then maintaining a database. Okay. So the first step I would do is go with data acquisition. Collecting data is the first thing that you're supposed to do. So I would collect this data for my points of interest from internal and external sources, just for enrichment, to make sure that I have a good set of data. I would collect primary data. That would include basic details, names, addresses, coordinates, or stuff like that. And then I would go for external data, where I could identify and integrate external data sources just to enrich my data collection. I could go for something like demographic information, weather information, or social media check-ins. Then, so I have the data now, and I would probably want to integrate it. Right? So, I could design an architecture to handle real-time data processing. Basically, I would use ETL tools, like message queues and SQL databases, to manage the flow of the whole data. I would probably use Kafka for real-time streaming along with MySQL databases. Then the next step would be to go for real-time data ingestion. So if I want to ingest this point of interest data and the real-time data that I got from these external sources, then I would have to use stream processing, maybe go for Kafka to get all this data from external sources as well. I would try to enrich the metadata that I have. I would use SQL and stored procedures to enrich the data while it's ingested, and we could probably employ batch ingestion to do this. And after that, one of the major steps would be the database schema. 
So you want to create tables for the POIs, external data, and enriched data. You want to index it, and then you want to optimize your query performance. And then you want to go for real-time processing of the logic. You could write stored procedures to handle the enrichment logic or create triggers for it for your insertions and updates. And basically, at the end of all this, you want to do your monitoring and alerts. So you could probably use tools like Grafana for this or Prometheus. So you just want to monitor the data pipeline, and you can set alerts when something is not performing well or you just have some bottlenecks for this. So, yeah, that's the flow that I would do it.
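
    The trigger-based enrichment mentioned above can be illustrated with a small sketch. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in for a production database, and the `category_info` lookup table, the `price_band` attribute, and the trigger name are hypothetical examples of "external data", not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical external data keyed by POI category.
    CREATE TABLE category_info (category TEXT PRIMARY KEY, price_band TEXT);

    CREATE TABLE pois (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category TEXT,
        lat REAL,
        lon REAL
    );

    CREATE TABLE enriched_pois (
        poi_id INTEGER PRIMARY KEY,
        name TEXT,
        category TEXT,
        price_band TEXT
    );

    -- Enrich each POI as it is inserted by joining it to the external data.
    CREATE TRIGGER enrich_poi_after_insert AFTER INSERT ON pois
    BEGIN
        INSERT INTO enriched_pois (poi_id, name, category, price_band)
        VALUES (
            NEW.id, NEW.name, NEW.category,
            (SELECT price_band FROM category_info WHERE category = NEW.category)
        );
    END;
    """
)

conn.execute("INSERT INTO category_info VALUES ('hotel', 'premium')")
conn.executemany(
    "INSERT INTO pois (name, category, lat, lon) VALUES (?, ?, ?, ?)",
    [("Sea View Hotel", "hotel", 15.3, 73.9), ("City Museum", "museum", 15.4, 73.8)],
)

enriched = conn.execute(
    "SELECT poi_id, name, category, price_band FROM enriched_pois ORDER BY poi_id"
).fetchall()
print(enriched)
```

    A POI whose category has no match in the lookup table is still written, with a NULL enrichment value, which keeps ingestion from blocking on missing external data.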

    Propose a technique for incorporating vector database technology in a POI matching algorithm. The technique is to use a vector database to categorize and find similar and relevant points of interest (POIs) based on their features, such as descriptions, geography, or certain attributes. The first step is data preparation, where we collect and extract relevant features from POI data to create embeddings. This involves collecting data including names, addresses, coordinates, and other relevant attributes. Next, we perform feature extraction to extract meaningful features from the data, which could be in text format. These features could include descriptions, categories, demographic data, feedback ratings, and other attributes. We then create embeddings by converting these features into vector representations. We can use pre-trained models such as BERT or GPT, or custom-trained models, to convert textual data into vectors, or use fastText for text-based data. For geographical data, we can convert latitudes and longitudes into vectors. We combine both text and spatial data into a single vectorized format and create a vector database to store and manage these embeddings. We can use libraries such as FAISS, an open-source library developed by Facebook for efficient similarity search and clustering of dense vectors. We design a schema to store our data along with its metadata and perform data ingestion, either in batch or in real time. We then apply a similarity search, such as k-nearest neighbors, to find the proximity between POIs, using metrics such as Euclidean distance or cosine similarity. Finally, we write logic for our application to translate user queries into vector search operations and handle requests properly, displaying relevant results.
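
    The similarity-search step above can be sketched without any external library. The code below is a brute-force cosine-similarity search in plain Python, standing in for what a vector index such as FAISS would do at scale; the three-dimensional embeddings and POI identifiers are made up for illustration, and real embeddings would come from a model such as BERT or fastText.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, index, k=2):
    """Return the k (poi_id, score) pairs most similar to the query vector."""
    scored = [(poi_id, cosine_similarity(query, vec)) for poi_id, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up embeddings; a real system would store model-generated vectors.
index = {
    "cafe_aroma": [0.9, 0.1, 0.0],
    "coffee_house": [0.8, 0.2, 0.1],
    "city_garage": [0.0, 0.1, 0.9],
}

matches = top_k_similar([1.0, 0.0, 0.0], index, k=2)
print(matches)
```

    Brute force is linear in the number of POIs per query, which is why an approximate nearest-neighbor index becomes necessary for a large POI database.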

    Describe a system designed to automate the recognition and flagging of outdated POI listings. So this would involve setting up a system that would be continuously monitoring. And we need to evaluate and update the status of the point of interest. This would actually depend upon a variety of data, including interactions, detailed descriptions, and whatever information we can get related to it. So that depends initially on the data source. I need to collect it to assess all that information for the status of the POI. So my primary data source would be the original listings from my database, and I could also go for external sources where I could get social media information, check-ins, websites, government records. I could also go for geographic data. And what I could do after that is come up with a system architecture. So, I need a data ingestion layer that's going to collect data from these various sources, and then I need to process it. So I'd have to have a processing layer, then I need to store it. So after you store it, you'd need something like an action layer. Like, I can notify and flag outdated listings. Or I can send a trigger when there's something that's not right. So when it comes to data ingestion, I need to continuously keep collecting and then updating data from all of these various sources that I have. I think for that, the best thing would be an ETL pipeline, using Airflow or some Apache tool for it. And then I could do real-time data streaming using Kafka, or ingestion APIs for social media platforms. And then when I do have the data, I need to do my feature extraction, basically, my data engineering and processing. So I'd have to be extracting relevant features from the ingested data that I already have. I think for this, I could go with natural language processing. I could analyze the text of the reviews and then pick out the things that I have detected. 
I could also do image processing if I have image data so I could analyze the images from social media stuff, and then I can detect some visual cues that I'm seeing in it. Or, I could do geospatial analysis if I have information related to the location and stuff or activity monitoring, maybe if I had a business or something. And then when it comes to outdated listings, I would have to propose for what is actually considered to be outdated. So for that, I need to set certain rules. And for my rule-based detection, I could do stuff like when the timestamp was last updated. You could check that. Or you could use machine learning models, like classification models, and maybe to classify what is actually outdated so that it could learn from that or anomaly detection to see these outliers. And then you would start flagging and notifying. So you'd have database flags and a notification system to tell them. You could set up automated triggers for that. And basically, the whole thing, you have to do a monitoring for this. You'd have some performance metrics, maybe track the metrics, have some feedback mechanism or something. So you just continuously keep monitoring, getting the data, and just keep updating it.
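
    The rule-based part of the detection described above, checking when a listing was last updated, can be sketched as follows. The `flag_outdated` helper, the one-year threshold, and the listing fields are hypothetical; a real system would combine this with the classification or anomaly-detection models mentioned in the answer.

```python
from datetime import datetime, timedelta


def flag_outdated(listings, now, max_age_days=365):
    """Return the ids of listings whose last update is older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [item["id"] for item in listings if item["last_updated"] < cutoff]


# Fixed reference time so the example is reproducible.
now = datetime(2024, 6, 1)
listings = [
    {"id": "poi-1", "last_updated": datetime(2024, 3, 15)},
    {"id": "poi-2", "last_updated": datetime(2022, 1, 10)},
]

outdated = flag_outdated(listings, now)
print(outdated)
```

    The returned ids could then set a database flag or be sent to the notification system.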

    SELECT name, location, category FROM POIS WHERE category IN ('hotel', 'restaurant') AND location IS NOT NULL ORDER BY name DESC LIMIT 10. The SQL query is written for selecting POIs with certain attributes. It is trying to retrieve a list of POIs that belong to the category 'hotel' or 'restaurant' and that don't have a null location. The results are sorted by the name of the POI in descending order and limited to the top 10 entries. The query seems to be fine, with no obvious issues. However, I would like to point out a few things. First, the query assumes that the actual column names are 'name', 'location', and 'category'. If these are not the actual column names, it could result in an error. Second, the ORDER BY clause is sorting by 'name' in descending order. Depending on the use case, this might not be the most useful way to present the POIs. Third, the query might impact performance if the POIs table is large. Ensuring that there are appropriate indexes on the 'category', 'location', or 'name' columns could help improve query performance.
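
    To make the query's behavior concrete, here is a small runnable sketch using an in-memory SQLite database via Python's `sqlite3` module. The table contents are invented for illustration, and storing the location as a text column is an assumption, since the original schema is not given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pois (name TEXT, location TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO pois VALUES (?, ?, ?)",
    [
        ("Alpine Hotel", "46.5,8.0", "hotel"),
        ("Zen Restaurant", "12.9,77.6", "restaurant"),
        ("Metro Museum", "40.7,-74.0", "museum"),  # filtered out by category
        ("Lost Hotel", None, "hotel"),  # filtered out by NULL location
    ],
)

rows = conn.execute(
    """
    SELECT name, location, category
    FROM pois
    WHERE category IN ('hotel', 'restaurant')
      AND location IS NOT NULL
    ORDER BY name DESC
    LIMIT 10
    """
).fetchall()
print(rows)
```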

    def match_pois(poi_a, poi_b): if poi_a.lower().strip() == poi_b.lower().strip() and len(poi_a) > 5 or len(poi_b) > 5: return True; return False. Given this Python function meant for matching POI names, can you spot any logical errors that might cause incorrect matching? Okay. So we're defining a function to match the POIs. We're given 2 arguments, poi_a and poi_b. If a.lower().strip(), so we're lowercasing and removing the surrounding spaces, is equal to b.lower().strip(), and the length of poi_a is greater than five or the length of poi_b is greater than five, return True. Else return False. Okay. Can you spot any logical errors that might cause incorrect matching? So I think when we're doing this check over here for equality, I don't think checking the length of both these names individually again is necessary. It just seems a little redundant because if we get to that point, they're already equal names, and equal names would already be equal in length. Right? So this could just be simplified, and we don't really need to check that. Okay. Strip and lower are chained, so they're done at the same time. There is also really no error handling mechanism here. Like, we're not catching any errors, so we should have, like, a try-catch. But I think the major issue in the logic is that we really don't need to check the length of a and b: if there's an and condition after the equality check, then they must already be equal for it to reach that condition, and if they're equal, then, obviously, their lengths are equal, so we don't need to check the length of both of them separately. The simplified function would be:

    def match_POIs(poi_a, poi_b):
        if poi_a.strip().lower() == poi_b.strip().lower():
            return True
        return False
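
    One further point is worth illustrating, under the assumption that the function as read out really combined `and` and `or` in a single condition without parentheses (the transcript is not fully clear on this). In Python, `and` binds tighter than `or`, so such a condition would return True for any pair in which the second name is longer than five characters, whether or not the names match. Both function names below are hypothetical, and the "fixed" variant reflects one guess at the intended behavior.

```python
def match_pois_as_read(poi_a, poi_b):
    # Assumed original condition. Python parses it as:
    # (names_equal and len(poi_a) > 5) or len(poi_b) > 5
    if poi_a.lower().strip() == poi_b.lower().strip() and len(poi_a) > 5 or len(poi_b) > 5:
        return True
    return False


def match_pois_fixed(poi_a, poi_b):
    # Guess at the intent: names must match after normalization
    # and be longer than five characters.
    name_a = poi_a.lower().strip()
    name_b = poi_b.lower().strip()
    return name_a == name_b and len(name_a) > 5


# Different names, but the second one is longer than five characters.
buggy_result = match_pois_as_read("Cafe", "Central Station")
fixed_result = match_pois_fixed("Cafe", "Central Station")
print(buggy_result, fixed_result)
```

    Wrapping the length checks in parentheses, or dropping them as suggested in the answer, removes the false positive.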

    Okay. Well, I'm not specifically very great at R. I'm more of a Python person, but, yeah, I can try attempting this question. So I think this would involve something like model selection, training again, and evaluation. So I think for, like, I would start off with data collection because that's always what we do first. So, initially, I would gather my data from various sources, you know, social media check-ins, geography, everything that I've mentioned before. I would do the data integration, you know, combine these datasets into some cohesive format, which would be suitable for my analysis purpose. I would preprocess the data, you know, handle missing values, do feature engineering, you know, try to get something that's more valuable and less redundant, maybe the total number of reviews, average ratings, certain things like that. Then all of these features that I get, I would normalize them, into a 0 to 1 scale maybe so that this would later help when we're trying to find similarity or our minimum. Then I would try to explore the dataset which I have now, like, EDA. I do exploratory data analysis for it. This could be through visualizations or using plots or graphs to see it so I can see the relationship between the features and the target variable. So I would do this correlation analysis to see how these features are related to each other. And after doing that, I'd have to go for good model selection. And that's basically choosing the algorithm, then I would have to consider most of them. Like, you can go for linear regression, decision trees, random forest, gradient boost, SVMs, neural networks. There are so many things. So I think we could use grid search for that. I would have a cross-validation set so that I could check it out later and see the performance between these models. I wanna do fine-tuning so that there is no overfitting. So once I select the model, then I need to train it. 
So I would split the data, train it, do my hyperparameter tuning once I start realizing which model is performing well. And, yeah, I do the whole training, fine-tuning, all of that, then check it on my cross-validation data for the fine-tuning, then try my test data. And when it goes to evaluation, I'd have to use the right evaluation metrics. So this could be mean absolute error depending upon what my data is in the nature of what I'm trying to do, or I could go for MSE, mean squared error, or RMSE, and then I would validate it to see what is giving me the minimum loss. Then I'd go ahead and deploy this model. I'd use this trained model to predict the POIs' popularity for new data that's coming in. And, yeah, I'd monitor this and continuously monitor the model's performance and retrain it if it's necessary periodically, like, with new data to maintain this accuracy.
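
    The evaluation metrics named above are simple to compute by hand. The sketch below implements MAE and RMSE in plain Python on made-up popularity scores; the function names and sample values are illustrative only, and in practice a library implementation would normally be used.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared prediction error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Made-up popularity scores for four POIs.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)
rmse = root_mean_squared_error(y_true, y_pred)
print(mae, rmse)
```

    RMSE penalizes large errors more heavily than MAE, so the choice between them depends on how costly an occasional large miss is for the use case.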

    Optimize SQL queries used in the ETL process for greater efficiency without sacrificing data integrity. Okay, let's see. Optimizing SQL queries means there are already queries in place, and we improve them without sacrificing integrity. So, first of all, I would analyze and understand what the current queries are. I would profile them, maybe use tools like EXPLAIN or EXPLAIN ANALYZE, because I want to understand the execution of the current queries. And then I would identify bottlenecks, like queries running slowly or frequent scans that are redundant and lead to high I/O operations or consume too much energy. I would check for that, analyze it first, and then I would optimize indexing. I would use appropriate indexing, like ensuring that the columns used in WHERE, JOIN, and ORDER BY clauses are properly indexed. I would also regularly maintain index integrity. Then I could optimize the query itself, maybe use subqueries or common table expressions. I would replace subqueries with CTEs if they provide better readability. I would reduce the usage of SELECT *, avoid it when not necessary, and select only the necessary columns to reduce the amount of data processed and transferred. I would make sure my join types, inner join, outer join, left join, right join, are optimized and appropriately used. I would minimize any redundant calculations and maybe partition the data. If I'm using large datasets, I would probably partition them to work more efficiently on smaller blocks. I would then try to optimize storage by normalizing tables to reduce redundant data. I would archive historical data to separate storage to improve query performance on active data. Then I could do batch processing, parallel processing using GPUs, and hardware configuration. Finally, I would monitor the whole thing: regular monitoring, automated alerts, keep triggers, and performance tuning.
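
    The profile-then-index loop described above can be demonstrated on a toy table. The sketch uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module as a stand-in for `EXPLAIN` / `EXPLAIN ANALYZE` in a production database; the table, the index name, and the `query_plan` helper are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pois (id INTEGER PRIMARY KEY, name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO pois (name, category) VALUES (?, ?)",
    [("Alpine Hotel", "hotel"), ("Zen Restaurant", "restaurant"), ("Metro Museum", "museum")],
)

# Select only the needed column rather than SELECT *.
sql = "SELECT name FROM pois WHERE category = 'hotel'"

plan_before = query_plan(conn, sql)  # full table scan expected
conn.execute("CREATE INDEX idx_pois_category ON pois (category)")
plan_after = query_plan(conn, sql)  # should now reference the index

print(plan_before)
print(plan_after)
```

    Comparing the two plans confirms that the index is actually used before relying on it, which matters because an unused index only adds write overhead.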

    What approach would you take to integrate an LLM in an existing Python based ETL pipeline for enhanced NLP processing? So, first, I would check what exactly the use case is. I'd set some kind of objective, like the specific tasks or cases where LLMs could actually enhance the natural language understanding, such as text extraction or sentiment analysis or named entity recognition. Next, I would choose an LLM that's specifically good for my purpose. I have a wide variety to choose from, such as GPT-3 or GPT-2, and I'd decide based upon my requirements. Once I've selected that and I know what my use case is, then I would go for the integration steps. That could be setting up my environment if I need to install dependencies, like my OpenAI-related API keys and other libraries that I may need for GPT-3. Then I would verify the API key and the tokens related to using these LLM services. After that, I would preprocess my data so I can prepare it well, clean it, and encode it to be used by the LLM. Then I need to integrate the API with the ETL pipeline. This could involve sending requests to the API and then handling the responses. Then I need to have something for error handling, so I could implement robust error handling mechanisms to make sure that the pipeline does not fail when an API call does. We could consider scaling, ensuring the pipeline can scale to handle larger volumes of text data efficiently, considering API rate limits or performance requirements, and some kind of metrics. After this, I would go to validation and testing. I would write unit test cases, integration testing, performance testing, anything like that. So I wanna validate the end-to-end functionality of the pipeline. After I've done all of this, deployed it, and tested it, then I would go for my final stage where I need to monitor and maintain the whole thing. So I would do this by keeping some kind of logging mechanism, regular maintenance, you know? 
Like, you keep checking for updates or changes maybe to the APIs of the LLM, and if there are any dependencies or updates that are blocking you, or you could keep some feedback loop, so you can maybe ask your users to give you certain feedbacks or maybe stakeholders if you have any. And this could lead to continuous improvement on the pipeline and integration.
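
    The error-handling and rate-limit concerns raised above are often addressed with retries and exponential backoff around the LLM call. The sketch below shows that pattern inside a transform step; `call_with_retries`, `enrich_records`, and `FlakySentimentClient` are hypothetical names, and the client is a local fake that simulates one transient failure rather than a real LLM API.

```python
import time


def call_with_retries(func, text, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(text), retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(text)
        except Exception:  # real code should catch the client's specific errors
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


def enrich_records(records, llm_call, **retry_options):
    """Transform step: attach a sentiment label from the LLM to each record."""
    enriched = []
    for record in records:
        label = call_with_retries(llm_call, record["text"], **retry_options)
        enriched.append({**record, "sentiment": label})
    return enriched


class FlakySentimentClient:
    """Fake LLM client: fails on the first call, then answers with a keyword rule."""

    def __init__(self):
        self.calls = 0

    def __call__(self, text):
        self.calls += 1
        if self.calls == 1:
            raise TimeoutError("simulated transient failure")
        return "positive" if "great" in text.lower() else "neutral"


client = FlakySentimentClient()
records = [{"id": 1, "text": "Great service"}, {"id": 2, "text": "Opens at nine"}]
# Pass a no-op sleep so the example runs instantly.
result = enrich_records(records, client, sleep=lambda seconds: None)
print(result)
```

    Injecting the `sleep` function keeps the retry logic unit-testable, which fits the testing stage described in the answer.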