
Machine Learning, Big Data, Cloud, LLM Consultant
Remote Consultant

Manager Data Science, Abzooba
Data Scientist / Sr. Consultant - Analytics & Big Data, TCG Digital
Senior Technical Analyst - Machine Learning, Pubmatic
Lead Architect - Data Mining, Wedoria
Technical Architect Security Tool Group, Mphasis
Senior Software Engineer, CA Technology
Technical Manager, FairFest Media
Senior Consultant, PayPal
Software Engineer, Total Computer System
Technical Consultant, Credit-Suisse
Skills: MySQL, JavaScript, Python, C++, PHP, Java, Play, Neo4j, AWS, Node.js, Selenium, R, MongoDB, Hadoop, Spark, Pentaho, Kibana, Tableau, NLTK, Docker, Kubernetes, Nagios, Azure, Socket, IPC, HTTP, REST, SOAP, Spring, Flask, Falcon, Django, GDB, Makefile, CUDA, Visual Studio, PyCharm, MATLAB, Oracle, Sybase, SQLite, Elasticsearch, Hive, Keras, pandas, Snowflake, Airflow, dbt, Linux, Windows, Router, Switch, NMAP, Kali Linux, Suricata, GCP, Terraform, Cryptography, Analytics, BERT, Shell, SQL, Scala, Elixir, Go, Rust, Unity, UML, WCF, Unix, Git, Cacti
https://www.linkedin.com/in/sayan-mukhopadhyay-61634511/
Okay. My name is Sayan Mukhopadhyay. I did my B.E. in Electronics and Instrumentation Engineering from Jadavpur University and an M.Tech (Research) in Computational and Data Science from IISc Bengaluru. I got the chance to work as a full-time employee at Credit Suisse, PayPal, CA Technology, Mphasis, TCG Digital, and Abzooba. After November 2016 I started my freelancing career. I started working for startups like Sunvo and Future Today, then for mid-sized companies like Pubmatic and SymphonyAI, and for big companies like Crossover. Technology-wise, my main skills are machine learning and data analytics. I was part of the Credit Suisse risk analytics team; in Pubmatic I was a senior technical analyst in machine learning; and in Abzooba and TCG Digital I was a data scientist and manager of data science. My other field is infrastructure. I worked in the data center team at Credit Suisse and was later promoted within that team. At CA Technology I worked on a product, Spectrum, which is basically a network monitoring tool. And then I was a technical architect in the security tool group at Mphasis. So I have experience in all aspects of data; I can claim to be a full-stack data analytics professional. I can do the front end, I can do the back end, and I can do everything in between. That's all.
Okay. So we were building a price prediction system for a startup. In online video ad bidding, if your predicted price is too low, the ad sells automatically at that low price. Then in the next iteration, when you train your model with that data, the data is even lower, so your next prediction is lower still. In this way the price gradually goes down, and the revenue of the site goes down with it. The solution we proposed: we watch the revenue. If the revenue is going up, we do nothing; we do not fix it until it is broken. But if the revenue is going down, we look at the fill rate. If the fill rate is going up, that means we are selling more but selling at a low price, so we make our prediction a little bit higher. If the fill rate is going down, that means we are asking too high a price and cannot sell enough ads, so we make our prediction a little bit lower. The predicted price of an advertisement on the sell-side platform is known as the floor in the ad industry, so we named this algorithm "dancing floor". It is implemented on Google Cloud Platform, with a workflow-based prediction model. In the ad industry there is a practice of stopping the model from time to time, and sometimes running without any model at all, but with this approach you can run for a long time without stopping the model to correct its predictions.
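A minimal sketch of this feedback rule, under stated assumptions: the function name, the trend signals, and the 2% step size are all illustrative, not from the production system.

```python
def adjust_floor(floor, revenue_trend, fill_rate_trend, step=0.02):
    """Nudge the floor price based on observed revenue and fill-rate trends.

    Positive trend values mean the metric is rising. The step size is an
    illustrative 2%; the real system's adjustment policy is not specified.
    """
    if revenue_trend >= 0:
        return floor                   # revenue is up: do not fix what isn't broken
    if fill_rate_trend > 0:
        return floor * (1 + step)      # selling plenty but cheap: ask a bit more
    return floor * (1 - step)          # not selling enough: ask a bit less

# Revenue falling while fill rate rises: raise the floor slightly
print(adjust_floor(1.50, revenue_trend=-0.1, fill_rate_trend=0.3))  # 1.53
```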
So a signal is time-series data. I will look at the autocorrelation function; a peak in the autocorrelation function gives the periodicity. I will also look at the moving average, and from the moving average you can see the trend. If neither of these two things is present, then an ARMA model, the standard statistical time-series model, can be applied. If a trend is there, then an ARIMA model, autoregressive integrated moving average, is the one that handles the trend. And if the periodicity is also there, then the seasonal model, SARIMA, handles the seasonality. If you want to treat it as a deep learning problem, you can use a recurrent neural network for this kind of time-series signal. There is also the frequency domain: you can take the Fourier transform of the signal, see the frequencies, and do the analysis in the frequency domain. And for a long-running system, you should learn from the error: you build a prediction model for the error itself, predict the error along with the value, correct the value with it, and give that as the answer. It will increase your accuracy.
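A compressed sketch of that workflow with statsmodels on a synthetic series; the series, the ARIMA order, and the seasonal period are illustrative choices, not from any specific project.

```python
import numpy as np
from statsmodels.tsa.stattools import acf
from statsmodels.tsa.arima.model import ARIMA

# Synthetic series with a linear trend and a period-12 seasonality
t = np.arange(120)
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + np.random.default_rng(0).normal(0, 1, 120)

# Difference away the trend; the ACF peak then reveals the periodicity
r = acf(np.diff(y), nlags=40)
print(np.argmax(r[1:]) + 1)  # ~12, the seasonal period

# Trend present -> integrated (d=1); seasonality present -> seasonal terms (SARIMA)
fit = ARIMA(y, order=(2, 1, 2), seasonal_order=(1, 0, 1, 12)).fit()
print(fit.forecast(steps=5))
```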
Yeah, I worked on a project with around 24,000,000 customers, where we had to decide whom to target with ads. What we did: for the people who bought their products, we set a label column to 1, otherwise 0. With this column I looked at the correlation of the features and kept the ones that are correlated. All these features were then used for clustering with a Euclidean distance function. Each cluster was then classified per product, i.e. whether a customer buys that product or not, and we calculated the probability of buying the product. Above some threshold on the probability, we chose those customers, leaving out the ones who were already buying. Also, the conventional collaborative filtering algorithm is correlation-based, but correlation is not a distance, so we used cosine distance instead. This was implemented in Spark, and for the classification we used Naive Bayes and Random Forest.
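A compressed PySpark sketch of that pipeline. The input file, column names, cluster count, and the single shared classifier (rather than one model per cluster) are simplifications for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans
from pyspark.ml.classification import RandomForestClassifier

spark = SparkSession.builder.getOrCreate()
df = spark.read.parquet("customers.parquet")  # hypothetical: features plus bought (0/1)

# Assemble the selected features and cluster with Euclidean k-means
feats = VectorAssembler(inputCols=["age", "spend", "visits"],
                        outputCol="features").transform(df)
clustered = KMeans(k=10, featuresCol="features",
                   predictionCol="cluster").fit(feats).transform(feats)

# Classify buy / no-buy; the 'probability' column gives P(buy) for thresholding
rf = RandomForestClassifier(labelCol="bought", featuresCol="features").fit(clustered)
scored = rf.transform(clustered)
scored.select("cluster", "probability").show(5)
```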
So for data validation there are frameworks you can use for these checks. You can verify that the data has the right sample size and the right start date and end date, and you can validate that each event date falls between the start date and the end date. You can run a boolean test for each row of a column, or over the whole column, and get a record saying whether your condition is true or false. That is one thing you can do. For the trend, apply exploratory data visualization techniques: look at the trend and at the moving average, say the average of the last 10 points. Seasonality you can find with the autocorrelation function: if the autocorrelation is high at some lag, there is a seasonality or periodicity at that point. You can also find seasonality in the frequency domain: after a Fourier transform, the frequencies with the highest values correspond to the time periods. So those two things, trend and seasonality, should be checked for significant change. And if the accuracy of the model has not changed dramatically, that also tells you the data is fine.
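A minimal pandas sketch of such row- and column-level boolean checks; the file, columns, and thresholds are hypothetical.

```python
import pandas as pd

df = pd.read_csv("events.csv", parse_dates=["event_date"])  # hypothetical input

start, end = pd.Timestamp("2024-01-01"), pd.Timestamp("2024-12-31")
checks = {
    "sample_size_ok": len(df) >= 10_000,                        # expected row count
    "dates_in_range": df["event_date"].between(start, end).all(),
    "amount_not_null": df["amount"].notna().all(),              # per-row boolean test
}
print(checks)  # each check resolves to True or False
```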
So I prefer free software, and I use JavaScript. I have built things with the Google Visualization API, but some clients have a problem with sending their data to Google, so instead you can use D3 or dc.js, data visualization libraries in JavaScript. On the back end I write HTTP APIs in Python, mainly using the Flask library, and if high performance is required I use Falcon. If it is Google Cloud Platform, there is Google Data Studio: you can make charts like in Excel; whatever you can do in Excel, you can do in Google Data Studio as well. On Amazon, if you have Redshift, there are a lot of data analytics and visualization tools around it in AWS. I am also somewhat familiar with the conventional BI tools, Tableau and Power BI. Apart from that, I know the front-end logic; tables and all these things I can build myself using AngularJS. Actually, my JavaScript is at an average level, so I am not a very good front-end developer, but I can build a front end. On the back end I am an expert: I have done back-end development for very big companies. And I have worked with data throughout my career, so I can handle data very well.
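A minimal sketch of the back end described here: a Flask endpoint serving JSON that a D3 or dc.js front end could chart. The route and payload are illustrative.

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/revenue")
def revenue():
    # A real endpoint would query a database; static data keeps the sketch runnable
    return jsonify([{"month": "Jan", "revenue": 120},
                    {"month": "Feb", "revenue": 135}])

if __name__ == "__main__":
    app.run(port=5000)  # the JS front end would fetch /api/revenue and render a chart
```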
Yeah, I use it. Basically I work in pandas, and with pandas' to_csv I dump the data from Python out for Excel. Then in Excel I use formulas like COUNT, I do conditional things, and I create different kinds of charts. I also pivot the table, converting rows into columns. That I can do in Excel. I am not very good at Excel, but I can handle things in it. And I know a little bit of VBA scripting too, so VBA automation. I have also heard that Python is now available inside Excel; I have not worked with it, but if that is available, I am an expert there.
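A small sketch of that round trip in pandas, including the row-to-column pivot; the data and file names are made up, and writing .xlsx assumes openpyxl is installed.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "month":  ["Jan", "Feb", "Jan", "Feb"],
    "sales":  [100, 120, 90, 140],
})

# The row-to-column pivot described above, done in pandas before handing off to Excel
pivot = df.pivot_table(index="region", columns="month", values="sales", aggfunc="sum")

df.to_csv("sales.csv", index=False)   # dump for Excel
pivot.to_excel("sales_pivot.xlsx")    # requires openpyxl
print(pivot)
```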
Yeah, that is a good question. If it is business people, do not go into the technology; show the impact, how much lift in the revenue or how much lift in the views or the audience. That is what you should concentrate on and emphasize. If it is technical people, then go into the architecture diagram, sequence diagram, class diagram, those kinds of things, and the algorithm should be explained in a lucid manner, because they are interested in how it is done. Apart from that, keep it short and formal, and do not make anything up. And it is a very good idea to look at the audience's LinkedIn profiles and so on, see their capabilities and expertise, and keep your presentation in those domains. Do some homework.
So, large volumes of data. I worked at Mphasis in 2013-14, and they handled 100,000,000 records every day. Then around 2015 I worked for a company called Future Today, and in PySpark I developed an ad price prediction system for them. Later I also worked in Scala Spark, and at Future Today too I worked with Scala Spark. So that is PySpark and Spark. Another way, besides Spark, is in machine learning: there is a thing called transfer learning, where you split the data into small chunks and each small chunk updates your model. Your old training is not neglected; all the training is considered, and your model keeps getting updated. That is done through transfer learning; it is possible in deep learning, and with Bayesian models it is also possible. So that can be used.
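The chunk-by-chunk model updating described here can be sketched with scikit-learn's partial_fit, an incremental-learning stand-in for the transfer/Bayesian updating mentioned; the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor()
rng = np.random.default_rng(0)
true_w = np.array([1.0, 2.0, -1.0, 0.5, 0.0])

# Stream the data in small chunks; each chunk updates the model without
# discarding what was learned from earlier chunks.
for _ in range(100):
    X = rng.normal(size=(1_000, 5))  # one chunk of a dataset too big for memory
    y = X @ true_w + rng.normal(scale=0.1, size=1_000)
    model.partial_fit(X, y)

print(model.coef_.round(2))  # converges toward true_w chunk by chunk
```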
So for Google data, I fetch the data, whether it is in BigQuery or another database, into a Google Sheet. In a Google Sheet, whatever you can do in Excel, you can do there too. And with Google Data Studio you can build beautiful plots on top of it; you can change a parameter and your plot changes with it. So that can be done. I have worked with the Google BigQuery database, and with other Google Cloud databases: I have worked with Datastore and with Bigtable. I also worked with Google DoubleClick for a short period of time, on the analytics side. I did the same kind of work on the advertiser side: AdRoll was the partner, and from there we would fetch all the impression data for the whole day, process it every day in batch mode as a job on an AWS cluster, do analytics on that data, and build a recommendation system for which ad should be recommended to which people.
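A minimal sketch of pulling BigQuery data into pandas for such analytics; the project, table, and columns are hypothetical, and GCP credentials are assumed to be configured.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses configured GCP credentials

# One day's impression data, pulled into pandas for downstream analytics
query = """
    SELECT user_id, ad_id, impressions
    FROM `my_project.ads.daily_impressions`
    WHERE event_date = '2024-06-01'
"""
df = client.query(query).to_dataframe()
print(df.head())
```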