
Architect - AI/ML
Confidential

Machine Learning, Big Data, Cloud, LLM
Remote Consultant

Manager Data Science
Abzooba

Lead Architect - Data Mining
Wedoria

Technical Architect Security Tool Group
Mphasis

Data Scientist / Sr. Consultant Analytics & Big Data
TCG Digital

Senior Technical Analyst Machine Learning
Pubmatic

Senior Software Engineer
CA Technology

Technical Manager
FairFest Media

Software Engineer
Total Computer System

Technical Consultant (Risk & Algo Trading)
Credit-Suisse

Senior Consultant
PayPal

MySQL

JavaScript

Python
C++

PHP

Java

Play

Neo4j

AWS

Node.js

Selenium

R

MongoDB

Hadoop

Spark

Pentaho

Kibana

Tableau

NLTK
Docker

Kubernetes

Nagios
Azure

Socket

IPC

HTTP

REST

SOAP

Spring
Flask

Falcon

Django

GDB

Makefile

CUDA

Visual Studio

PyCharm

MATLAB

R

Oracle

Sybase

SQLite

Elasticsearch

Hive

Kibana

Tableau

Keras

NLTK

pandas
Snowflake

Airflow
dbt

Linux

Windows

Router

Switch

NMAP
Kali Linux

Suricata

AWS

GCP

Terraform

Spring

CUDA

R

SQLite

Hive

Kibana

Tableau

NLTK

pandas

Linux

NMAP

GCP

Terraform

Cryptography

GDB

Analytics

SQLite

Kibana

Tableau

NLTK

pandas

BERT

Shell

SQL

Scala
Elixir

Go

Rust

Unity

UML

WCF

Linux

Suricata

AWS

GCP

Terraform

SOAP

SQLite

Tableau

pandas

Linux

Unix

CUDA

Git

R

Sybase

SQLite

Kibana

Tableau

NLTK

Pandas

BERT

Linux

Kubernetes

Cacti

Suricata

AWS

GCP

Terraform
https://www.linkedin.com/in/sayan-mukhopadhyay-61634511/
I did my B.Tech in electronics and instrumentation engineering at Jadavpur University and my M.Tech (Research) in computational and data science at IISc Bengaluru. I worked as a full-time employee at Credit Suisse, PayPal, CA Technology, Mphasis, and TCG Digital. After November 2016, I started my freelancing career. I worked for startups like Sunvo and Future Today, then for mid-sized companies like PubMatic and SymphonyAI, and for a big company like Crossover. Technology-wise, my main skills are machine learning and data analytics. I was part of the Credit Suisse risk analytics team. I was a senior technical analyst in machine learning and data science at TCG Digital. I have also been a data scientist and a manager of data science. My other field is infrastructure. I worked in the data center team at Credit Suisse and was later promoted. At CA Technology I worked on a product called Spectrum, which is basically a network monitoring tool. Then I was a technical architect in the security tool group at Mphasis. So I have experience in all aspects of data. I can call myself a full-stack data analytics professional. I can do the front end, I can do the back end, and I can do everything in between.
We were building an ad price prediction system for a startup. What happens if your prediction is low is that, in online bidding for video ads, if you ask for a low price, the ad will automatically sell. In the next iteration, when you retrain your model on that data, the prices in your data are lower, so the next iteration's prediction is lower too. In this way the price gradually goes down, and at the same time the site's revenue goes down. So this is the solution we proposed. What we watch is the revenue. If the revenue is going up, we do nothing. We don't fix it until it's broken. But if the revenue is going down, we look at the situation. If sales volume is going up, that means we are selling more but at a low price, so we make our prediction a little higher. If sales volume is going down, that means we are asking too high a price, which is why we can't sell enough ads, so we make our prediction a little lower. In the ad industry, this price is known as the floor price of the ad, and we gave this algorithm the name "Dancing Floor." It is implemented on the Google Cloud Platform, and the prediction model is based on the flow work model. In the ad industry, it is common practice to keep the model frozen and sometimes to run without a model. Here, however, the price prediction model can keep running for a long time without being stopped.
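The adjustment rule described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `adjust_floor`, the use of a change in ads sold as the "selling more" signal, and the 5% step are all hypothetical and are not taken from the actual system.

```python
def adjust_floor(predicted_price, revenue_change, volume_change, step=0.05):
    """Nudge a predicted floor price using recent revenue and sales volume.

    revenue_change and volume_change are differences versus the previous
    period (hypothetical inputs); step is an assumed relative adjustment.
    """
    if revenue_change >= 0:
        # Revenue is not falling: leave the prediction alone.
        return predicted_price
    if volume_change > 0:
        # Revenue fell while more ads sold: the price was too low, raise it.
        return predicted_price * (1 + step)
    # Revenue fell and fewer ads sold: the price was too high, lower it.
    return predicted_price * (1 - step)


# Example: walk one price through three periods of (revenue, volume) changes.
price = 2.00
for revenue_change, volume_change in [(50, 10), (-30, 25), (-40, -15)]:
    price = adjust_floor(price, revenue_change, volume_change)
    print(round(price, 4))
```

Because the correction is applied on top of the model output, the model itself never has to be stopped; the step size controls how quickly the floor "dances" back toward a price that sells.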
A signal is time series data. So I look at the autocorrelation function, and where the autocorrelation function peaks, I take that lag as the periodicity. I also look at the moving average, and from the moving average you can see the trend. And I remove the noise. If neither of these two things is there, then the ARMA model applies, the auto-regressive moving average model from statistical time series analysis. If a trend is there, then it is the ARIMA model, meaning the auto-regressive integrated moving average model, which handles the trend. And if periodicity is also there, then it is the seasonal model (SARIMA). If you want to treat it as a deep learning problem, a recurrent neural network can be used for this kind of time series signal. Apart from deep learning, in signal processing you can take the Fourier transform of the signal, see the frequencies, and do the analysis in the frequency domain. And since this is a long-running problem, you have to learn from the errors: you actually build a prediction model for the errors. Along with predicting the value, you predict the error, correct the value, and give that as the answer. It will increase your accuracy.
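As a rough illustration of that decision process, the sketch below uses only the standard library. The helper names, the 0.5 thresholds, and the synthetic 12-step signal are assumptions made for the example, and reading the "seasonality model" as SARIMA is an interpretation. In practice the model itself would be fitted with a library such as statsmodels.

```python
import math
import statistics


def moving_average(x, window):
    """Simple moving average; the output is shorter than x by window - 1."""
    return [sum(x[i - window:i]) / window for i in range(window, len(x) + 1)]


def difference(x):
    """First difference, the 'integrated' step that removes a linear trend."""
    return [b - a for a, b in zip(x, x[1:])]


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


def dominant_period(x, max_lag, threshold=0.5, min_lag=2):
    """Lag with the highest autocorrelation above threshold, else None."""
    best_lag, best_r = None, threshold
    for lag in range(min_lag, max_lag + 1):
        r = autocorr(x, lag)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag


def has_trend(x, window, tol=0.5):
    """True if the moving average drifts by more than tol standard deviations."""
    ma = moving_average(x, window)
    return abs(ma[-1] - ma[0]) > tol * statistics.pstdev(x)


def suggest_model(trend, period):
    """Map the two checks to a model family (assumed mapping)."""
    if period is not None:
        return "SARIMA"
    if trend:
        return "ARIMA"
    return "ARMA"


# Synthetic signal: a 12-step cycle plus a linear trend.
seasonal = [math.sin(2 * math.pi * i / 12) for i in range(120)]
trended = [s + 0.1 * i for i, s in enumerate(seasonal)]

# The trend is differenced away before looking for the period.
period = dominant_period(difference(trended), max_lag=30)
print(period, suggest_model(has_trend(trended, window=12), period))
```

The series is differenced before the autocorrelation check because a strong trend makes every lag look correlated and hides the periodic peak.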
Yes, I worked on a project with 24 million people where we had to decide which customers to target. What we do is this: for people who bought the product, we make a column with 1, otherwise 0. With this column, I look at which features are correlated with it. All these correlated features are then used for clustering with the Euclidean distance function. Each cluster is then classified by product, which means whether the customer buys that product or not, and we calculate the probability of buying the product. Above some threshold of that probability, we choose those customers. For customers who are already buying a lot, we do conventional collaborative filtering, also known as correlation filtering, but correlation is not a distance, so we just take the p-value distance. We did this in Spark. For the classification, we used naive Bayes and random forest.
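The first two steps, building the 1/0 purchase column and keeping the features correlated with it, might look like the sketch below. The customer fields, the buyer IDs, and the 0.3 threshold are made up for illustration, and plain Python stands in for the Spark job described above.

```python
import math


def label_buyers(customers, buyer_ids):
    """Return a 1/0 column: 1 if the customer bought the product, else 0."""
    return [1 if c["id"] in buyer_ids else 0 for c in customers]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_features(customers, labels, feature_names, threshold=0.3):
    """Keep features whose absolute correlation with the label is high enough."""
    selected = []
    for name in feature_names:
        r = pearson([c[name] for c in customers], labels)
        if abs(r) >= threshold:
            selected.append(name)
    return selected


# Hypothetical data: 'visits' tracks purchases, 'noise' does not.
customers = [
    {"id": 1, "visits": 1, "noise": 5},
    {"id": 2, "visits": 2, "noise": 3},
    {"id": 3, "visits": 8, "noise": 5},
    {"id": 4, "visits": 9, "noise": 3},
    {"id": 5, "visits": 2, "noise": 4},
    {"id": 6, "visits": 10, "noise": 4},
]
labels = label_buyers(customers, buyer_ids={3, 4, 6})
print(select_features(customers, labels, ["visits", "noise"]))
```

The selected features would then feed the clustering and per-cluster classification steps.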
There are data validation frameworks you can use for this. You check that you have the right sample size, start date, and end date. To validate the data, you simply check whether the event date falls between the start date and the end date. You can run a Boolean test for each row of the column, so that for every record you get True or False. That is one thing we can do. For the trend, you can apply visualization as a data analysis technique. You look at the trend through the moving average, say the average of the last 10 points. For seasonality, you can find it with the autocorrelation function: if the autocorrelation is high, that means there is seasonality or periodicity. With the Fourier transform you can also find the seasonality: the frequencies where the Fourier transform has its highest values give you the time periods. So that is the check in the ACF and in the Fourier frequency domain. These are the two things that should be checked for the accuracy of the model, and they also tell you whether the data is accurate.
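A minimal version of the row-level date check could look like this. The record layout and the field name `event_date` are assumptions for the example, not part of any particular validation framework.

```python
from datetime import date


def validate_event_dates(records, start, end):
    """Return one True/False per record: is event_date within [start, end]?"""
    return [start <= record["event_date"] <= end for record in records]


# Hypothetical records checked against a January 2024 window.
records = [
    {"id": 1, "event_date": date(2024, 1, 15)},
    {"id": 2, "event_date": date(2023, 12, 31)},
    {"id": 3, "event_date": date(2024, 1, 31)},
]
flags = validate_event_dates(records, date(2024, 1, 1), date(2024, 1, 31))
print(flags)
print("invalid ids:", [r["id"] for r, ok in zip(records, flags) if not ok])
```

Both bounds are treated as inclusive here; whether the end date should be inclusive depends on how the data set defines its window.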
I prefer free software, and I use JavaScript. I mostly build with the Google Visualization API, but some clients have a problem with sending data to Google. In that case you can use D3 or C3, which are JavaScript libraries. On the back end, I mainly write HTTP services using the Python Flask library, sometimes as a Flask API. If high performance is required, I will use Firebase. On the Google Cloud Platform, there is Google Data Studio. You can build on a Google Sheet and make charts like in Excel. Whatever you can do there, you can also do in Google Data Studio. As on the Google Cloud Platform, on Amazon, if you have a database ready, AWS has a lot of data analytics and visualization tools for it. I am somewhat familiar with conventional BI tools like Tableau and Power BI. Apart from that, I know front-end components like tables. All these things I can build myself using AngularJS or React. Actually, my JavaScript is at an average level. So I'm not a very good front-end developer, but I can build a front end. On the back end, I am an expert. I worked in development for a very big company. And on the data side, I have worked with several companies, so I can handle data very well.
Basically, I work in pandas and process CSV files. I dump the data from Python to Excel. Then in Excel, I use formulas like COUNT. I did correlation analysis. I created different kinds of charts. I also use pivot tables, which convert rows into columns. So that's what I can do in Excel. I have experience with Excel. I'm not very good, but I can handle things in Excel. I also know a little bit of Power BI scripting, that is, Power BI automation. I have not worked with it, but I heard that Python is now available in Excel. If that is available, I am an expert in it.
That's a good question. If the audience is business people, don't go into the technology. Show the impact: how much lift in revenue or how much lift in the audience. You should concentrate on that, and that should be emphasized. If the audience is technical people, then go to the architecture diagram, sequence diagram, class diagram, those kinds of things. The algorithm should still be explained in a lucid manner to the business people, because they are interested in how it's done. Apart from that, if it's technical, I'll make it short. So make it short and formal. Don't include anything that isn't needed. Focus on the key points. It is also very good to look at the audience's LinkedIn profiles, see their capabilities and expertise, and keep your presentation within those domains. That would be a good idea. Do some homework.
On large volumes of data: I worked at Matthews in 2013 and 2014, and they handled 100,000,000 records every day. I did formatting work on that data. Then I worked in PySpark. I worked for a company called Future Today, and in PySpark I developed an ad price prediction system for them. Then I worked for another company, and that time I also worked in Scala Spark. At Future Today I worked in Scala Spark as well. So that is one way. Another way, apart from Spark, is in machine learning. There is a thing called transfer learning: you separate the data into small chunks, and each small chunk updates your model. So your old training is not neglected. Instead, all the training is considered, and it updates your model. That is done through transfer learning. With deep learning it is possible, and with Bayesian models it is also possible. So that can be used.
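The chunk-by-chunk updating described above is often called incremental or online learning. Here is a small Bayesian sketch of the idea, using a Beta prior over a click rate that is updated one chunk at a time. The click data, the chunk size, and the choice of a Beta-Bernoulli model are assumptions for illustration only.

```python
def chunks(data, size):
    """Yield consecutive slices of data with at most `size` items each."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


def update_beta(alpha, beta, chunk):
    """Update a Beta(alpha, beta) belief with a chunk of 0/1 outcomes."""
    successes = sum(chunk)
    return alpha + successes, beta + len(chunk) - successes


# Hypothetical click outcomes (1 = click, 0 = no click).
clicks = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]

# Start from a uniform prior and fold in one small chunk at a time.
alpha, beta = 1, 1
for chunk in chunks(clicks, size=3):
    alpha, beta = update_beta(alpha, beta, chunk)

# Processing everything in a single pass gives the same posterior.
alpha_all, beta_all = update_beta(1, 1, clicks)
print((alpha, beta), (alpha_all, beta_all), alpha / (alpha + beta))
```

Earlier chunks are never discarded: their effect lives on in `alpha` and `beta`, so the full data set never has to be held in memory at once.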
For Google data, I will fetch the data whether it is in BigQuery or another database. Whatever it is, I will fetch it into a Google Sheet. Whatever you can do in Excel, you can do in Google Sheets. In Google Data Studio you can also build beautiful plots from it, and when you change a parameter, the plot changes. Google Data Studio gives you a Google dataset. I have worked with the Google BigQuery database. Among other storage, I have worked with Google Cloud Storage and Bigtable. Google also has Google DoubleClick. I worked on it for a short time, but I did the analytics the same way. From there, we would take all the impression data for the whole day and process it in batch mode every day on an AWS cluster. I did that, did analytics on the data, and built a recommendation system for which ad we should recommend to which people so that we