ML Engineer
Brenin Inc
AI Engineer
Independent Freelancer
Python
Scikit-Learn
PyTorch
Keras
Pandas
NumPy
NLTK
MySQL
Matplotlib
Seaborn
Visual Studio Code
Git
Hugging Face
React
Vector DB
I have done my MCA from Technical University, with a major in data science and machine learning. I have also completed certificate programs from Intel, Udemy, and many other online courses, and I have done many projects on machine learning and deep learning, such as text classification and image classification.
To make our machine learning model more accurate, we can reduce the number of layers, use grid search or random search to find the best parameters, or use an optimization technique that best suits our problem.
To tune hyperparameters for a deep learning model, we can use grid search or random search to find the parameter values that work best with our model, so that overall we achieve higher accuracy. We can use libraries such as TensorFlow and scikit-learn for this.
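As a minimal sketch of the grid search mentioned above, assuming a small scikit-learn classifier; the model choice and the parameter grid here are illustrative, not from the interview:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy dataset standing in for real training data (illustrative only).
X, y = make_classification(n_samples=500, n_features=20, random_state=42)

# Hypothetical parameter grid; real values depend on the problem.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}

# Exhaustively try every combination with 5-fold cross-validation.
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best CV accuracy:", search.best_score_)
```

RandomizedSearchCV has the same interface and samples the grid randomly, which is the random search alternative mentioned above.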
Okay. When we have imbalanced or messy data, we can use imputation techniques to fix it, such as filling with the mean, median, or mode, or using the KNN imputer, which corrects the dataset automatically. If we have categorical values, we must first convert them into numerical values, and similarly with text and images. We can detect skewness in the data through visualization with matplotlib or seaborn, and we can detect any outliers in the dataset using a box plot, since outliers heavily affect the model's predictions. We can also use normalization and standardization techniques on the dataset, so that the whole dataset has the same characteristics, such as zero mean and a standard deviation of one; we can use StandardScaler from scikit-learn to standardize the data.
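As a minimal sketch of the standardization step described above, using StandardScaler on an illustrative feature matrix:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative feature matrix; the two columns are on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Fit on the data, then transform it to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled.mean(axis=0))  # approximately 0 for each column
print(X_scaled.std(axis=0))   # approximately 1 for each column
```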
When we work with an RNN to process sequential data, we have a problem: a simple RNN is not able to remember long-range dependencies in the sequence. The RNN model is specifically designed for sequential data, and we can do classification as well as regression with it, but the hidden state it carries across the time steps of the input loses information over long sequences. Transformer-based models such as T5, ChatGPT, and other models based on the transformer have the ability to capture long-range dependencies and provide coherent and logical answers, because self-attention lets every position in the input attend to every other position. This is the drawback of the RNN that the transformer covers.
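As a minimal sketch of a simple RNN on sequential data in Keras, assuming a toy many-to-one binary classification setup; all shapes and sizes here are illustrative:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy batch: 32 sequences, each 50 time steps of 8 features (illustrative).
X = np.random.rand(32, 50, 8).astype("float32")
y = np.random.randint(0, 2, size=(32,))

# A simple RNN carries a hidden state across time steps; over long
# sequences this state tends to forget early inputs, which is the
# long-range dependency problem described above.
model = keras.Sequential([
    keras.Input(shape=(50, 8)),
    layers.SimpleRNN(16),
    layers.Dense(1, activation="sigmoid"),  # binary classification head
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=8, verbose=0)
```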
To select the features that we want to use to train our model, we can use tools such as correlation matrices and covariance matrices to check the dependence of features on one another, how they influence one another, and how they will influence our model. This is part of feature engineering. Similarly, if we are working with time series, we have to look at the date range, that is, at what point the data starts and ends, so as to make a time frame and perform operations such as a rolling mean or rolling statistics, defining the window and predicting the upcoming values. Feature engineering is important because our model depends entirely on the features to provide the best answers. We can discard outliers because they are not needed by our machine learning model, and we can also look for columns that will not help our model predict the output, such as date columns, index columns, or names of years and their counts, and so on.
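As a minimal sketch of the correlation check and rolling mean mentioned above, on an illustrative pandas time series; the column names are hypothetical:

```python
import numpy as np
import pandas as pd

# Illustrative time-series dataset with a daily date index and two features.
dates = pd.date_range("2023-01-01", periods=100, freq="D")
df = pd.DataFrame({
    "sales": np.random.rand(100) * 100,
    "visitors": np.random.rand(100) * 1000,
}, index=dates)

# Correlation matrix: how strongly each feature moves with the others.
print(df.corr())

# Rolling mean over a 7-day window to smooth the series.
df["sales_7d_mean"] = df["sales"].rolling(window=7).mean()
print(df[["sales", "sales_7d_mean"]].tail())
```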
Categorical encoding. This code converts a categorical variable into numerical encodings. It contains the line df[column] = df[column].apply(...), where a lambda function is used to encode the features, mapping each category to an index. The function, encode_categorical, takes two parameters: first the data frame, and second the feature or column that we want to convert from categorical to numerical; applying the lambda function converts that column into numerical values.
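A plausible reconstruction of the snippet being described, assuming it maps each distinct category to an integer index; the function name encode_categorical and the mapping details are inferred from the spoken description, not shown in the transcript:

```python
import pandas as pd

def encode_categorical(df: pd.DataFrame, column: str) -> pd.DataFrame:
    # Map each distinct category in the column to an integer index.
    categories = {cat: idx for idx, cat in enumerate(df[column].unique())}
    # Apply a lambda that replaces every category with its index.
    df[column] = df[column].apply(lambda cat: categories[cat])
    return df

# Usage on a toy frame (illustrative).
df = pd.DataFrame({"color": ["red", "green", "red", "blue"]})
df = encode_categorical(df, "color")
print(df)  # the color column is now 0, 1, 0, 2
```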
Prediction service. What do we have here? A class PredictionService with a static method load_model, which takes a model path and sets PredictionService.model by loading the model from that path. Another static method, predict, takes the input features; if PredictionService.model is None, meaning the model is not loaded, it loads the model first, and then it calls PredictionService.model.predict on the input features. Just by looking at it, I am not sure I can point out the problem; I think it would take time for me to run this code in my notebook and then find the potential causes of any error.
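A plausible reconstruction of the class being read out, assuming a lazy-loading pattern around a class-level model attribute; the loader (joblib.load here) and the default path are stand-ins, since the transcript does not name them:

```python
import joblib  # assumed loader; the original may use a custom Model.from_path

class PredictionService:
    model = None  # class-level cache for the loaded model

    @staticmethod
    def load_model(model_path):
        # Load the model once and store it on the class.
        PredictionService.model = joblib.load(model_path)

    @staticmethod
    def predict(input_features):
        # Lazily load the model on first use if it is not loaded yet.
        if PredictionService.model is None:
            PredictionService.load_model("model.pkl")  # hypothetical default path
        return PredictionService.model.predict(input_features)
```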
First, when I have to work with TensorFlow, specifically for image classification, I will need my data. If I have data, then I can start working on it, and if I do not, then first I have to collect data, which can be done through a web scraping tool. With the data available, I will first check whether it is in the right format and whether all the data types have the right structure. I will also perform checks such as whether there are null values. Then I will check the format of the images, whether all the images that I want to train my model on are usable, because it sometimes happens that some image formats cannot be used with the model. That leads to an error while loading the images, and it often comes to our attention only later, after we have finished the model architecture.

To work with images, we generally use CNNs because they work best. A convolutional neural network has parameters such as filters, which capture information in the images; we can define the filter size through the kernel size parameter, set the number of filters, define the activation function in the model architecture, and define the input dimension, the dimension of the images the model will train on, in the first layer. A CNN architecture starts with the input layer, then the hidden layers come in between, and finally the output layer. We can also use a dropout layer to randomly drop some neurons.

Then we can divide our dataset with a train-test split. After dividing, we can train our model with the model.fit method, passing our dataset and defining the epochs and batch size. When our model has trained successfully on the dataset, we can predict and test on our test dataset using the model.predict method, and lastly we can check the accuracy of the model on the training dataset and the testing dataset.
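A minimal sketch of the CNN workflow described above in Keras, assuming small single-channel images and two classes; all shapes, layer sizes, and hyperparameters here are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras
from tensorflow.keras import layers

# Toy image data standing in for a real dataset: 200 28x28 grayscale images.
X = np.random.rand(200, 28, 28, 1).astype("float32")
y = np.random.randint(0, 2, size=(200,))

# Divide the dataset with a train-test split, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),          # input dimension of the images
    layers.Conv2D(16, kernel_size=(3, 3),    # 16 filters of size 3x3
                  activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Dropout(0.25),                    # randomly drop some activations
    layers.Flatten(),
    layers.Dense(32, activation="relu"),     # hidden layer
    layers.Dense(1, activation="sigmoid"),   # output layer for two classes
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Train with explicit epochs and batch size, then evaluate and predict.
model.fit(X_train, y_train, epochs=3, batch_size=16, verbose=0)
print("Test accuracy:", model.evaluate(X_test, y_test, verbose=0)[1])
predictions = model.predict(X_test, verbose=0)
```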
For transfer learning, we can use pre-trained models when we do not want to create a model of our own, which often happens when we are working with large datasets. Such models consume too much space and time, and it is not feasible for everyone to train complex models on a personal laptop, because it requires hardware such as GPUs and TPUs. So when we are working with large, high-dimensional datasets, it is better to use a pre-trained model, because it does not take as long to train. We can customize the output layers, setting trainable equal to true only for them, and then define the output layer we want; for example, when working with 10 classes, we define 10 outputs in the final layer. We can use the Hugging Face library for leveraging transfer learning models and customize them according to our needs. It works similarly to what I said before: first we have to gather the data and split it, then train our model on it, and after successful training we predict and check the accuracy. For transfer learning we can use GPT, BERT, and many other models, such as VGG trained on ImageNet, based on the type of problem we are working on, whether we are working with text data or with images and videos.
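A minimal sketch of the transfer-learning setup described above in Keras, assuming a frozen VGG16 base pre-trained on ImageNet and a new 10-class head; the head sizes and hyperparameters are illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16

# Load VGG16 pre-trained on ImageNet, without its original classifier head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained layers; only the new head trains

# New head for a hypothetical 10-class problem, as in the example above.
model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # one output per class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```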
Performance. What strategies could you apply to monitor the performance of production systems? I think here we can use version control to manage our code and to provide continuous integration and continuous deployment. We can use GitLab or AWS to track our code, update it, and fix bugs, and in that way keep track of the performance of production. Yes, that was all. Thank you.