Highly skilled AI Engineer with a proven track record of developing and deploying over 50 Conversational AI applications across diverse sectors. Recognized for driving revenue growth, optimizing operations, and leading high-performing teams to deliver innovative AI solutions. Proficient in Python, JavaScript, GPT-3, ChatGPT, and various frameworks. Adept at integrating AI solutions with business goals and mentoring junior members for enhanced performance.
AI Engineer (Conversational AI), Red Global
AI Consultant, Upwork
GCP CCAI Engineer, Miratech
Software Engineer, TCS
Lead AI Engineer, Landslo
AI Engineer, Vocalime
Python

JavaScript

ChatGPT

HTML, CSS and JavaScript

PyTorch

Dialogflow
Docker

MongoDB

GitHub
AWS (Amazon Web Services)
pytest

neural network architectures
I've been working in this field for almost five and a half years. I started as a Python developer, where I guided my team in writing proofs of concept, helped PhD candidates complete their projects, and worked on Python-based problems and automation. I was introduced to machine learning back in 2018, and from that point on I have been working with cutting-edge AI and ML techniques. Later I joined TCS, where I got the opportunity to work on natural language processing problems such as chatbot building. I took that opportunity and built a chatbot for a US-based banking client within three months, and it was a great success. I was promoted to lead engineer, which brought more challenges in a client-facing role. I not only helped clients build processes but also converted client requirements into technical requirements, broke those into tasks for my teammates, and reviewed their work. After TCS, I was the first employee at Landslo, a research-based company. My role was to lead a team of more than five people working on natural language processing, chatbot development, and ChatGPT-based generative AI applications. We also worked on scalability, DevOps, and code reviews, and I was hands-on with all of it. We were also among the first in the market to introduce generative AI into the risk space. I worked there for one and a half years. For the last one and a half years, I've been working on large language models such as BERT, ChatGPT, and GPT-3. My primary focus has been deploying and scaling these models and making sure they are accessible to the general public at a reasonable cost.
We're currently building something very similar to ChatGPT for the real estate space, using an open-source model with 75 billion parameters. Recently, we also got a chance to work with Vertex AI from Google, where we worked with AutoML and the Model Garden. We were able to build some good model endpoints and make them available to users through an API.
The loss function is a very important choice. When choosing one, you need to consider the type of data you have and how the model performs over different epochs. The criterion also depends on the type of problem: regression calls for one kind of loss and classification for another. If you're dealing with text or transformer-based generative applications, you need different loss functions again, where human feedback on how the model went wrong becomes really important. So while developing a model, it's important to check the loss scores. My approach is to run the model with different loss functions for a fixed number of epochs and see whether the loss scores converge or diverge on the train and test datasets, and whether the curves show overfitting or underfitting. The loss function needs to provide useful feedback to the model so that it makes the right decisions, and the loss values tell you whether it is giving the right amount of feedback. Whether that feedback is too strong or too weak depends on the loss function and the type of problem you are solving. It also depends heavily on the dataset: whether there is any data drift, whether there is any class imbalance, and whether it's a classification problem, a regression problem, or generative AI such as text generation or image generation.
For images, it depends on the quality of the data: whether the images are high quality and the pixels are good. I consider all of these aspects when designing a deep learning model, because the loss function is really important here.
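A minimal sketch of the two ideas in this answer: matching the loss to the problem type, and reading train and test loss curves across epochs. The function names and the gap_tol threshold are illustrative assumptions, not something taken from the interview.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: a common loss for regression targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy: a common loss for two-class classification."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def diagnose(train_losses, test_losses, gap_tol=0.1):
    """Rough reading of per-epoch loss curves (gap_tol is a made-up threshold)."""
    if train_losses[-1] >= train_losses[0]:
        return "not converging"        # training loss never went down
    if test_losses[-1] > min(test_losses) + gap_tol:
        return "overfitting"           # test loss rose again after its best epoch
    return "converging"

print(diagnose([1.0, 0.5, 0.2], [1.0, 0.6, 0.9]))
```

In practice the same check is usually done by plotting both curves; the function only encodes the rule of thumb that a test loss climbing while the training loss keeps falling points to overfitting.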
With a large-scale dataset, you first need to make sure you don't have any redundant data. You should analyze the data first and then transform it. In the analysis part, you plot some graphs, observe the skew of the data, and check whether there is any redundant data, applying data reduction techniques where needed. Then you transform the data: for example, apply a log to a certain column and see whether it makes any impact. You also check whether you need all of the data, and you maintain a good split between the training and test sets. With a large-scale dataset, you also need to decide on the modeling approach: whether you want a convolutional neural network, a regular neural network, a transformer-based approach, or fine-tuning an existing model. When a sufficient amount of data is there, it is definitely possible to build a good machine learning model; when the data is scarce, it's a challenge. With larger datasets, you also need to perform a scaling operation, because there's a very high chance that there are outliers in the dataset. You need to bring them back to a normal scale and make sure that your data falls under the bell curve. If it does, and it sits within a normalized range, it works really well for conventional machine learning models. And of course, with the advancements in technology, we can address all of those things.
At the same time, we also need to make sure that we are not overfitting the data. So you analyze first, plot graphs, and transform the data. Then, based on the type of application and the characteristics of the data, you look at the target variable and see whether the data is linearly separable. If it is, conventional methods such as bagging and boosting algorithms work really well for that kind of data. We also have deep learning methods, such as transformer-based models and other neural networks, which you can use to train a model on large-scale datasets.
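The transform, scale, and split steps described above can be sketched with the standard library alone. The prices column is made-up data, and log1p plus z-score scaling is one common choice rather than the only one.

```python
import math
import random
import statistics

def log_transform(values):
    """log1p compresses a long right tail; assumes non-negative values."""
    return [math.log1p(v) for v in values]

def standardize(values):
    """Rescale to zero mean and unit variance (z-scores)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)   # would be 0 for a constant column
    return [(v - mean) / std for v in values]

def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

prices = [100, 120, 130, 150, 10_000]      # made-up column with one outlier
scaled = standardize(log_transform(prices))
train, test = train_test_split(list(range(10)))
```

Scaling statistics would normally be computed on the training split only and then reused on the test split, so that no information leaks from test to train.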
When you define a neural network architecture, you need to account for at least four things: the dots, then the edges, then the curves, and so on. Take an image classification problem. The first layer identifies whether a pixel, a dot, is there or not. In the second layer, those dots form lines, whether inclined or straight. In the next layer, combinations of those lines form shapes, like a square, a rhombus, or a circle. Later layers combine those shapes into objects, for example whether there is a head or not. So my approach would be to define at least four layers in the model, with each layer covering one of these levels. I would also avoid going straight to dense layers and use convolution instead, because convolution keeps the number of operations and calculations manageable. Depending on how the number of layers and the dimensionality grow, I'll probably include dropout so that the model does not memorize the data; sometimes I want to drop part of the signal intentionally so that the model generalizes better. With convolution, dropout, and the right loss function, we can reach a balance between complexity and performance. And the nice thing is that you can build an intuition for how well your model is performing.
You also need to watch what happens as the number of layers increases: the complexity increases, and the model becomes really hard to manage. So you add capacity only as far as the problem requires and keep an eye on the growing complexity. You should use dropout and give importance to the initial and intermediate stages, because if you are not capturing the early information, you will not get the end result. Deciding which information matters, and choosing the right layers to capture it, is important.
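The point about preferring convolution over dense layers can be made concrete with a parameter count. This is a worked arithmetic sketch; the 32x32 RGB input and the 16 output feature maps are assumed numbers chosen for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels

# Hypothetical input: a 32x32 RGB image mapped to 16 feature maps of the same size.
height, width, channels, filters = 32, 32, 3, 16

dense = dense_params(height * width * channels, height * width * filters)
conv = conv2d_params(channels, filters, kernel_size=3)

print(f"dense layer: {dense:,} parameters")
print(f"3x3 conv:    {conv:,} parameters")
```

The convolution reuses the same small kernel at every position, which is why its parameter count does not depend on the image size.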
So, the question is about the loss function and what kind of loss function it should be. For generating text, the code is using the loss function defined here, but it might not be a good one, because it's really meant for regression problems. If the target variable is continuous, like the price of a house or an age, it is a really good loss function: you have a target, your model predicts something, and the loss tells you how far off it was. So it's a good estimate for regression problems, but not for generative applications, which are really different. There, you want the next appropriate word, and there may be a number of appropriate words because we have a lot of synonyms and similar words. So there are a lot of possibilities for the next piece of text. Generally speaking, I think the training signal for generative text needs to draw on human feedback. That's where fine-tuning and methods like RLHF and PEFT, the parameter-efficient fine-tuning methods, come in. We also have different methodologies, such as supervised, semi-supervised, and unsupervised learning. So my opinion is that for generative text we should not use conventional loss functions and should instead choose what is applicable for this type of application. That holds not only for TensorFlow but also for PyTorch and other frameworks. It's irrespective of the framework you are using, or whether you are opting for a cloud-based tool like Vertex AI, Gemini, or SageMaker.
You need to make sure that you do the data analysis, the data evaluation, and the training properly. And for that, choosing the loss function wisely is really important.
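For reference, language models are typically pretrained and fine-tuned with token-level cross-entropy over the vocabulary, with human-feedback methods applied as a later stage. The sketch below shows that loss for a single position; the three-word vocabulary and the logits are made-up values.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_loss(logits, target_index):
    """Cross-entropy for one position: -log p(target token)."""
    return -math.log(softmax(logits)[target_index])

vocab = ["house", "home", "car"]      # toy vocabulary (assumption)
logits = [2.0, 1.9, -3.0]             # made-up model scores for the next word

loss_house = next_token_loss(logits, vocab.index("house"))
loss_home = next_token_loss(logits, vocab.index("home"))
loss_car = next_token_loss(logits, vocab.index("car"))
```

Because the model here spreads probability over both "house" and "home", either target gives a moderate loss, while the unrelated word is penalized heavily. A squared-error loss on token IDs would have no such notion of several acceptable next words.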
Recently, we have been building a generative AI application. We are working with an open-source language model called Mistral, using the 7-billion-parameter model, which is openly available. We are trying to build something for the real estate space: a ChatGPT-like interface for leadership people that can help them with cold emailing, SEO, and publishing their leadership material. The problem we faced was that data was scarce and that deployment methods and strategies for open-source LLMs were very limited. So we tried a lot of options. We tried vLLM, one of the ways to deploy large language models; you can deploy it in-house or go with a cloud offering and access it through an API, but these are all costly. So we implemented our own in-house deployment: we downloaded the entire 7-billion-parameter model and served inference through FastAPI, a Python framework for building APIs, so that inference runs in real time, and we tied that to our SaaS platform, which makes calls to it directly from our interface. The challenge was mostly around deploying and scaling it, making sure it stays available, and making sure it is not hallucinating. We also needed to fine-tune the model, which we did on a custom-generated dataset for real estate applications. For fine-tuning, we had to decide which methodology to use, and we tried different approaches, such as PEFT and human-in-the-loop feedback. We ended up with our model running, and it's working really well.
We're really happy with the progress of the model. It's not only about the problem or the challenge; it's also about the implementation, how we deal with the situation, and how we divide the entire task into simple pieces. You have a big problem, and you first need to understand how to solve each part of it in order to solve the whole. That's my strategy, and that's how we did it.
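A framework-agnostic sketch of the request handling that sits between an interface and a self-hosted model. In a FastAPI service, a function like handle_generate would be the body of a POST route. Every name here, including fake_generate, the payload fields, and the 512-token cap, is a hypothetical stand-in rather than the team's actual code.

```python
from dataclasses import dataclass

@dataclass
class GenerateRequest:
    prompt: str
    max_new_tokens: int = 128

def fake_generate(prompt, max_new_tokens):
    """Stand-in for the real model call (hypothetical)."""
    return f"[draft for: {prompt[:40]}]"

def handle_generate(payload, generate_fn=fake_generate, token_cap=512):
    """Validate a JSON-like payload and return a JSON-serializable reply."""
    prompt = str(payload.get("prompt", "")).strip()
    if not prompt:
        return {"status": 422, "error": "prompt must not be empty"}
    req = GenerateRequest(prompt, int(payload.get("max_new_tokens", 128)))
    # Cap generation length so one request cannot monopolize the model.
    req.max_new_tokens = max(1, min(req.max_new_tokens, token_cap))
    text = generate_fn(req.prompt, req.max_new_tokens)
    return {"status": 200, "text": text, "max_new_tokens": req.max_new_tokens}

reply = handle_generate({"prompt": "Write a cold email", "max_new_tokens": 9999})
```

Keeping the model call behind generate_fn means the same handler can be exercised with a stub in tests and wired to the loaded model in production.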
Yep. In the given code, the error says that the transformer model object has no attribute from_pretrained. The main reason is that whatever was imported from the transformers package has no method called from_pretrained. You need to import from the transformers module and make sure the model is actually available in the Hugging Face library. You need to get the model name, use the tokenizer, and so on, but make sure you use the transformers library. Here the transformer model was imported, but from_pretrained is not an attribute, which means we're not using the correct package or the correct import. So you need to use the correct import and make sure the model is available in the transformers space. That might be one reason why it's giving you the error. Also, the code might work for previous versions, like version 1 or version 2, so you need to check which version you have installed and whether anything has been deprecated or removed in that version; code that works on one version might not work on the latest. You also need to make sure that all the dependencies of the transformers library are installed correctly in your environment or virtual environment. If the error is still there, you go to Stack Overflow or check with the community, see whether anyone else is facing the issue, and search for it on Google.
That's how I'd try to resolve the problem. If nothing works, I'll look at what is causing the issue by going through the source code of the transformers library, and I'll probably submit an issue on their GitHub page. If I'm able to solve it, I'll explain the problem, work out how to fix it, and submit a possible PR with that. So that's my approach to this problem.
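The first two checks for a "has no attribute" error, namely which version is installed and whether the attribute name is simply misspelled or missing on that object, can be sketched with the standard library. DemoModel is a hypothetical stand-in for whatever class was imported; nothing here calls the transformers library itself.

```python
import difflib
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def suggest_attribute(obj, wanted):
    """List attributes of obj whose names are close to the one requested."""
    return difflib.get_close_matches(wanted, dir(obj), n=3, cutoff=0.6)

class DemoModel:
    """Hypothetical stand-in for a library class."""
    @classmethod
    def from_pretrained(cls, name):
        return cls()

print(installed_version("transformers"))              # None if not installed
print(suggest_attribute(DemoModel, "from_pretrain"))  # near-miss attribute names
```

If the suggestion list is empty, the object is probably not the class you meant to import, which points back at the import statement rather than the method name.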
PyTorch is also a really good framework for building machine learning models. In the transformer block, the layer definitions look fine, and the forward pass takes x as given. But the potential problem is that you're passing the data directly, three times, into the attention layer. You need to pass it through in phases: first through the previous layer, then to the next layer, step by step. For the attention parameters, you also need to consider whether you want to do anything with the data first, like preprocessing. But here you're going directly to self-attention and passing the parameters as they are. I would say it's not really a good implementation for this kind of problem. PyTorch has really good documentation online, so definitely check the best practices for defining such transformer blocks. I hope the issue can be solved easily if you pay attention to the forward method and also to how the layers are initialized. For the transformer block, you need to follow the general approaches for neural networks, the same ones used in deep learning and convolutional neural networks, and then move on to transformers. So you should give importance to the attention layer. That way you'll be able to succeed with these kinds of solutions and define your network architecture really well with PyTorch. And then it becomes really easy, like a cup of coffee.
The standard structure has to be right in the forward method and in the initialization as well. I hope that answers it.
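The code under discussion is not shown, so the following is only a reference sketch of how data flows through a self-attention step. In plain self-attention the queries, keys, and values are all derived from the same input; a real transformer block adds learned projections, layer normalization, and a feed-forward sublayer, all omitted here. The identity projections and the toy input are assumptions.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(x):
    """Single-head scaled dot-product attention with identity projections.

    x is a list of token vectors; the same x supplies queries, keys, and values.
    """
    d = len(x[0])
    out, weights = [], []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        out.append([sum(w[j] * x[j][i] for j in range(len(x))) for i in range(d)])
    return out, weights

def block_forward(x):
    """Attention followed by a residual connection (no layer norm or MLP here)."""
    attended, _ = self_attention(x)
    return [[a + b for a, b in zip(row, att)] for row, att in zip(x, attended)]

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy 3-token, 2-dim input
y = block_forward(tokens)
```

The output keeps the input shape, which is what lets blocks be stacked one after another in the forward pass.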
State-of-the-art generative models, for example PaLM or ChatGPT, are pretty recent, and they don't really have SDKs in each and every language. Most probably they have SDKs for Python and a few others. So when you want to integrate them with legacy systems, where your applications were built 10 or 15 years back, like SAP, Java, or PHP applications, it's really a challenge. One thing I did was integrate this with ServiceNow, a platform that handles user requests and acts on them. We sat with the client on the requirements and designed the solution. What we did was take the generative AI part entirely outside the legacy system: build an API, deploy the generative model, and access it through that API. Whatever you want to do, you do it outside the system and interface with it through an API call. With pretty much any legacy system, depending on the type of system, there is a possibility of interfacing through an API, an extension, or a custom script. So I'm talking about an in system and an out system. The out system is the generative AI model, fine-tuned for your use case, deployed, and ready to take calls. The in system is the legacy system, and the middleware is the interface between the two.
If it's something like a web-based application, your middleware is an API. In other cases, it's something like an action. So it's really important to separate the systems. The legacy system has to be separate from the out system, which is the machine learning or generative AI system, and the middleware is the connection between the two. This not only helps you avoid disturbing the existing systems, but in the long run it is also easier to manage, and finding problems and debugging is really easy when you run the systems separately and integrate with the existing system, which is really important in my opinion.
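The separation described here, a legacy system on one side, a deployed model on the other, and a thin middleware in between, can be sketched as an adapter. The field names (TICKET_ID, DESC, AI_SUMMARY) and stub_model are invented for illustration; a real deployment would replace stub_model with an HTTP call to the model endpoint.

```python
import json

def legacy_to_request(record):
    """Map a legacy record (hypothetical field names) to a model request."""
    return {"prompt": f"Summarize ticket {record['TICKET_ID']}: {record['DESC']}",
            "max_new_tokens": 64}

def response_to_legacy(record, response):
    """Write the model reply back in the shape the legacy system expects."""
    return {"TICKET_ID": record["TICKET_ID"], "AI_SUMMARY": response["text"]}

def middleware(record, call_model):
    """Only this function knows both sides; either side can change alone."""
    request_body = json.dumps(legacy_to_request(record))
    response = json.loads(call_model(request_body))
    return response_to_legacy(record, response)

def stub_model(request_body):
    """Stand-in for an HTTP call to the deployed model (assumption)."""
    prompt = json.loads(request_body)["prompt"]
    return json.dumps({"text": prompt.upper()})

result = middleware({"TICKET_ID": "T-1", "DESC": "printer offline"}, stub_model)
```

Because the two mapping functions are the only place where legacy field names appear, swapping the model or changing the legacy schema touches one function each.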
When we're using generative AI models, there is a chance that the data is skewed or that there is drift in it, and in my opinion the impact on the generated content is important. When we give input to a generative AI model, there are different layers. When we ask a question, it first needs to go through a sequence of steps before it reaches the generative model. And when the output comes back to the user or to the system, it also needs to go through a set of layers. When we give the input, it goes through a vector store to see whether there is any relevant context, for example similar queries that were answered previously. You build a vector store, find similar content, and add it to the context of your input message. Similarly, when you collect the output at the last layer, you need to perform a similar kind of check. Only that way will you be able to see whether your data is good and whether the model is performing well, in my opinion. If it is not, a good solution is to fine-tune the large language model, for example an open-source model. We have a lot of open-source models, and a lot of good people on the Internet are doing impressive work with them.
So you take the model and see how you can fine-tune it to mitigate the skew in the generated content. I'll probably pick a model, test it against already known answers, and if the model's performance is not good, I fine-tune it. If the number of parameters is growing too much, I know how to reduce it and bring down my cost, while making sure that the generated content is not skewed, that it gives correct answers, and that it's not hallucinating or repeating itself. That's my approach to these kinds of generative AI problems. At the end of the day, you need to monitor the performance of these models regularly, and for that you adopt MLOps: deployment, scaling, CI/CD. You set up a pipeline where these MLOps practices work well, and you'll be able to identify issues in the system early.
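A toy version of the vector-store step: embed stored texts, retrieve the one most similar to the question, and prepend it as context. The bag-of-words embedding is a deliberate simplification (a real system would use a learned embedding model), and the stored sentences are made up.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a learned model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class VectorStore:
    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def build_prompt(store, question):
    """Prepend the best-matching stored text to the question as context."""
    context = "\n".join(store.search(question, k=1))
    return f"Context:\n{context}\n\nQuestion: {question}"

store = VectorStore()
store.add("The listing at 12 Oak Street has three bedrooms")
store.add("Cold emails should have a short subject line")
prompt = build_prompt(store, "how many bedrooms does 12 oak street have")
```

The same similarity score can be logged per request; a steady drop in the best match score over time is one simple signal that incoming questions are drifting away from the stored data.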
For a pre-trained model for a chatbot project, I'll probably choose Mistral. It's one of my favorite models for chatbot projects because it works really well and it's a small model, so fine-tuning is easy. With 7 billion parameters, I can fine-tune it easily, and it works really well for the given tasks. It's also an open-source model, so many people have done a lot of work on it. If the number of parameters weren't a concern, or if multimodal prompting were important, I'd choose LaMDA. But for text-only chatbots, I'll go with the model whose architecture tells me it is suitable for our chatbot project. All these things matter while selecting a pre-trained model. A lot has also happened in the chatbot space, so I'd probably tie the model to an open-source chatbot framework like Rasa, which is one of the leading chatbot frameworks. I'd use Rasa's framework capabilities, and the model can be integrated easily with any chatbot framework. That's the essence of it: you expose it as an API, run inference on it, or build an Android app on top; everything is possible. So I'll choose an open-source framework. At the same time, I'll make sure I can deploy it easily and, if there's an issue, back it up cleanly. I'd also look at how well the model handles the data for its parameter count, whether that's challenging or not, and so on.
For fine-tuning a GPT-2 model, the domain-specific language is what you need to focus on, because it's really different from what the model was originally trained to do. So I would use an approach called RLHF. There are different fine-tuning methods: supervised fine-tuning, semi-supervised fine-tuning, and fully unsupervised fine-tuning. And it differs depending on whether you are fine-tuning a generative AI model or a non-generative model. Supervised fine-tuning is an option, but I would say RLHF is one of the better approaches here because the language is domain-specific. The human feedback helps make sure that the GPT-2 model, a generative pre-trained transformer, understands the domain correctly, and it also helps correct the model if it's going in the wrong direction. You also need to keep in mind that the number of parameters you are working with can increase when you fine-tune, so you need strategies like parameter pruning to reduce the number of parameters. If your model becomes too complex and too large, it becomes difficult to maintain; cost increases, deployment cost rises, and so on. If all of that works, good; otherwise it's a waste of time. So my approach would be human feedback, because the language is so domain-specific.
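Parameter pruning, mentioned above as a way to keep a fine-tuned model manageable, is often done by magnitude: the weights closest to zero are set to zero. A minimal sketch on a made-up weight list, assuming unstructured pruning of a single layer:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def nonzero_count(weights):
    """Number of weights that survive pruning."""
    return sum(1 for w in weights if w != 0.0)

layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.0004, -0.9]   # made-up weights
pruned = magnitude_prune(layer, sparsity=0.5)

print(nonzero_count(layer), "->", nonzero_count(pruned))
```

Zeroed weights only reduce cost if the storage format or runtime exploits the sparsity, and accuracy should be re-checked after pruning, usually with a short round of further fine-tuning.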