Vetted Talent

Sayan Mukhopadhyay

Capitalising on extensive domain knowledge in Data Science & Machine Learning, and on leadership experience, to steer companies and clients toward new business avenues; targeting senior-level assignments in Machine Learning/Solution Architecture/FinTech with an organization of high repute.

Role
ML, DNS, Big Data, Cloud, LLM Engineer
Years of Experience
16 years
Professional Portfolio
View here

Skillsets

Snowflake
WCF
TextBlob
Terraform
Sybase
SVN
Storm
Spring
Spark
Socket.IO
SOAP
XGBoost
Shell
Rust
REST
Redis
React
Play
Oracle
OpenMP
OpenCV
NLTK
D3
DialoGPT
Unity
three.js
Scikit-learn
POSIX
OpenGL
Node.js
Google APIs
Dash
MySQL
C3
Asterisk
Kafka
Vivado
Suricata
Nagios
Kali Linux
Cacti
ZeroMQ
C++
Airflow
ActiveMQ
PyTorch
pandas
Neo4j
MongoDB
Hive
Hadoop
Go
Elixir
Angular
TensorFlow
SQL
Scala
Python
Kubernetes
Java
GCP
Elasticsearch
Docker
Azure
Flask
MPI
Milvus
MapReduce
LightFM
LangGraph
LangChain
Keras
Jenkins
Git
FPGA
AWS
FastAPI
Falcon
Django
dbt
CUDA
ChromaDB
C#
BERT
Ansible

Vetted For

0 Skills

Roles & Skills: Digital Data Scientist (AI Screening)
Results: 60%
Details: Score: 60/100

Professional Summary

16 Years

Jun, 2015 - Nov, 2016 1 yr 5 months
Manager Data Science
Abzooba
Jul, 2014 - Jun, 2015 11 months
Data Scientist / Sr. Consultant - Analytics & Big Data
TCG Digital
Oct, 2013 - Jul, 2014 9 months
Technical Architect Security Tool Group
Mphasis
Sep, 2011 - Apr, 2012 7 months
Senior Software Engineer
CA Technology
May, 2012 - Apr, 2013 11 months
Senior Technical Analyst - Machine Learning
Pubmatic
May, 2013 - Aug, 2013 3 months
Lead Architect - Data Mining
Wedoria
Nov, 2010 - Sep, 2011 10 months
Technical Manager
FairFest Media
Mar, 2010 - Nov, 2010 8 months
Senior Consultant
PayPal
Jun, 2008 - Feb, 2010 1 yr 8 months
Technical Consultant
Credit-Suisse
Jan, 2000 - Dec, 2003 3 yr 11 months
Software Engineer
Total Computer System

Applications & Tools Known

MySQL
JavaScript
Python
C++
PHP
Java
Play
Neo4j
AWS
Node.js
Selenium
R
MongoDB
Hadoop
Spark
Pentaho
Kibana
Tableau
NLTK
Docker
Kubernetes
Nagios
Azure
Socket
IPC
HTTP
REST
SOAP
Spring
Flask
Falcon
Django
GDB
Makefile
CUDA
Visual Studio
PyCharm
MATLAB
Oracle
Sybase
SQLite
Elasticsearch
Hive
Keras
pandas
Snowflake
Airflow
dbt
Linux
Windows
Router
Switch
NMAP
Kali Linux
Suricata
GCP
Terraform
Cryptography
Analytics
BERT
Shell
SQL
Scala
Elixir
Go
Rust
Unity
UML
WCF
Unix
Git
Cacti

Work History

16 Years

Manager Data Science

Abzooba

Jun, 2015 - Nov, 2016 1 yr 5 months

Directed a team of 7 data scientists to deliver complex programs on time and within budget, finishing initiatives 4 weeks early. Owned the full lifecycle from design and planning to risk management and implementation, improving stakeholder satisfaction by 100%. Built risk models, including credit scoring using social data and health claim status predictors, boosting model accuracy by 72%. Fostered collaboration across product, engineering, and QA to streamline handoffs and accelerate adoption. Architected simple backend sharding and caching to support peak loads 3x higher than before, achieving 99.95% availability with automated failover mechanisms. Introduced CI checks and unit tests, increasing coverage from 55% to 82% and reducing production incidents by 40% within three months.

Data Scientist / Sr. Consultant - Analytics & Big Data

TCG Digital

Jul, 2014 - Jun, 2015 11 months

Led sentiment analysis and predictive modeling initiatives for aviation and manufacturing, improving insights and decision-making by 50%. Developed neural networks to forecast passenger load and optimized processing pipelines, reducing computation time by 20%. Implemented K-Means clustering to resolve data anomalies, cutting error rates by 80%. Partnered with stakeholders to translate requirements into deliverables and analytics roadmaps. Improved pipeline reliability and uptime to 99% through monitoring and retry logic.

Technical Architect Security Tool Group

Mphasis

Oct, 2013 - Jul, 2014 9 months

Managed and coached a team of 12 engineers to build a transaction monitoring system on an open-source stack, serving 9 enterprise clients. Led requirements discovery, feasibility analysis, and delivery roadmaps, increasing project throughput by 30%. Implemented security controls, monitoring, and alerting, reducing incident rates by 90%. Streamlined deployments and environment automation, improving reliability and shortening release timelines by 20%.

Lead Architect - Data Mining

Wedoria

May, 2013 - Aug, 2013 3 months

Served as Lead Architect for data mining initiatives at ABP Group.

Senior Technical Analyst - Machine Learning

Pubmatic

May, 2012 - Apr, 2013 11 months

Enhanced the Hadoop platform using predictive approximation for big data queries, improving query performance by 100%. Instituted validation and quality checks, decreasing production issues by 90%.

Senior Software Engineer

CA Technology

Sep, 2011 - Apr, 2012 7 months

Worked as Senior Software Engineer on Network Management System projects.

Technical Manager

FairFest Media

Nov, 2010 - Sep, 2011 10 months

Held the role of Technical Manager, overseeing development and systems.

Senior Consultant

PayPal

Mar, 2010 - Nov, 2010 8 months

Served as Senior Consultant via CSC supporting payments systems.

Technical Consultant

Credit-Suisse

Jun, 2008 - Feb, 2010 1 yr 8 months

Worked as Technical Consultant via FCS contributing to investment banking solutions.

Software Engineer

Total Computer System

Jan, 2000 - Dec, 2003 3 yr 11 months

Served as Software Engineer building applications for different educational clients.

Achievements

Golden Award Silver 2020 Codility
Selected for National Math Olympiad, India
Ranked 352 in West Bengal Joint Entrance Examination (Engineering)
All India rank 50 in Graduate Aptitude Test in Engineering (Instrumentation Engineering)
B certificate holder, National Cadet Corps (Army)
Managed and developed highly effective analytical solutions for a system receiving 100 million new records daily, on behalf of a leading online advertising company.
Designed and developed a parser for FIX format files in C++ that improved efficiency ten-fold in comparison with the Unix grep command; deployed to support high-frequency trading servers of a major investment bank.
Led the development of an enterprise network management system, involving complex bug fixing and development of new features in one core of a heterogeneous, distributed codebase.
A parallel algorithm for molecular dynamics simulation
Variance of difference as a distance like measure in synchronous time series microarray data clustering
Advanced Data Analytics Using Python, Apress, Sayan Mukhopadhyay (book; 1st and 2nd editions)

Testimonial

Abzooba

Pubmatic

https://www.linkedin.com/in/sayan-mukhopadhyay-61634511/

Major Projects

4 Projects

Ad Price Predictor System

Developed the full pipeline, from data collection to ML model, for Sulvo ad price prediction.

Nostradamus Approximation Framework

Inventory estimation for Pubmatic.

METAL - Trading Time Risk Analysis Tool

Developed for Credit-Suisse, focused on Risk Analysis.

Real Time Latency Monitoring

High-frequency trading latency monitoring for Credit-Suisse.

Education

M.Tech (Research) in Computational & Data Science
Indian Institute of Science (2014)
B.Eng. in Instrumentation & Electronics
Jadavpur University (2004)

Certifications

AWS
Udemy (Jan, 2023)
Angular
Codecademy (Dec, 2015)
SAS Certified Base Programmer
NCC B Certificate
Cryptography from Coventry University

Interests

Acting

Exercise

Writing

Watching Movies

AI-interview Questions & Answers

Okay. My name is Sayan Mukhopadhyay. I did my B.Eng. in electronics and instrumentation engineering from Jadavpur University and my M.Tech (Research) in computational and data science from IISc Bengaluru. I got the chance to work as a full-time employee at Credit Suisse, PayPal, CA Technology, Mphasis, TCG Digital, and Abzooba. After November 2016, I started a freelancing career. I worked for startups like Sulvo and Future Today. Then I worked for mid-sized companies like Pubmatic and SymphonyAI, and for a big company like Crossover. Technology-wise, my main skills are machine learning and data analytics. I was part of the Credit Suisse risk analytics team. At Pubmatic, I was a senior technical analyst in machine learning, and at Abzooba and TCG Digital I was a data scientist and manager of data science. Another field is infrastructure. I worked in the data center team at Credit Suisse and was later promoted to the [unclear] team. I worked at CA Technology on the product Spectrum, which is basically a network monitoring tool. Then I was a technical architect in the security tool group of Mphasis. So I have experience in all aspects of data. I can call myself a full-stack data analytics professional. I can do the front end, I can do the back end, and I can do everything in between. That's all.

Okay. So we were building a price prediction system for a startup. What happens is, if your prediction is low, then in online bidding, if you ask a low price, it will automatically sell. In the next iteration, when you train your model with that data, your data will be lower, and so your next prediction will be lower still. In this way the price gradually goes down, and the revenue of the site also goes down. So we proposed a solution for it. What we look at is the revenue. If the revenue is going up, we do nothing; we do not fix it until it is broken. But if the revenue is going down, we look at the [unclear]. If it is going up, that means we are selling more but selling at a low price, so we make our prediction a little higher. And if it is going down, that means we are asking a high price, which is why we cannot sell enough ads, so we make our prediction a little lower. This predicted price of the advertisement from the sell-side platform is known as the floor in the ad industry, and we named this algorithm the dancing floor. It is implemented on Google Cloud Platform, and the prediction model was a [unclear] based model. In the ad industry there is a practice of running with the model for some time and sometimes running without any model. But here you can run for a long time without stopping the model. So it is
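
The feedback rule described in this answer can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production system: the function name `adjust_floor`, the `step` size, and the use of a sell-through signal (`sell_delta`) as the second input are hypothetical, since the transcript does not clearly name that metric.

```python
def adjust_floor(predicted_floor, revenue_delta, sell_delta, step=0.05):
    """Nudge a model-predicted floor price using two feedback signals.

    predicted_floor: floor price proposed by the prediction model.
    revenue_delta:   change in revenue versus the previous period.
    sell_delta:      change in the share of inventory sold (assumed signal).
    step:            hypothetical relative adjustment size.
    """
    # Revenue is not falling: leave the prediction alone.
    if revenue_delta >= 0:
        return predicted_floor
    # Revenue falls while more inventory sells: the price is too low.
    if sell_delta > 0:
        return predicted_floor * (1 + step)
    # Revenue falls while less inventory sells: the price is too high.
    if sell_delta < 0:
        return predicted_floor * (1 - step)
    return predicted_floor
```

Treating flat revenue as "not broken" is a design choice made for this sketch; the answer only covers the rising and falling cases.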

So a signal is time series data. I will look at the autocorrelation function, and from the autocorrelation function I will detect periodicity. Then I will also look at the moving average, and from the moving average you can see the trend. If these two things are not there, then the ARMA model, the autoregressive moving average model, is the statistical time series model to use. But if a trend is there, then the ARIMA model, meaning the autoregressive integrated moving average model, handles the trend. And if periodicity is also there, there is the seasonal ARIMA model. If you want to look at a deep learning model, then we can use a recurrent neural network for this kind of time series signal. In the frequency domain, you can do the Fourier transform of the signal, see the frequencies, and do the analysis in the frequency domain. And [unclear] you have to learn from the error: you make a prediction model for the error, and along with the prediction of the value you predict the error, correct the value, and give the answer. It will increase your accuracy. That's
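
As a rough illustration of the checks mentioned in this answer (autocorrelation for periodicity, a moving average for trend, and a Fourier transform for the dominant frequency), here is a small NumPy sketch. The helper names are hypothetical and the synthetic sine signal is made up for the example.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

def moving_average(x, window):
    """Simple moving average; a smooth version of x that exposes the trend."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def dominant_period(x):
    """Period (in samples) of the strongest non-zero frequency component."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x))
    k = np.argmax(spectrum[1:]) + 1  # skip the zero-frequency bin
    return 1.0 / freqs[k]

# Synthetic signal with a period of 20 samples (illustrative data only).
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 20)
acf = autocorrelation(signal, max_lag=40)
period = dominant_period(signal)
```

A high autocorrelation at some lag and a spectral peak at the matching frequency both point to the same seasonality, which is why the two checks are usually read together.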

Yeah. I worked on a project [unclear] to choose which customers they should target for their ads. So what we do: for the people who buy their [unclear], we set a column to 1, otherwise 0. With this column, I look at the correlation of the features and keep the ones which are correlated. All these features are then used for clustering with a Euclidean distance function. Each cluster is then classified for each product, that is, whether you buy that product or not, and we calculate the probability of buying the product. Above some threshold of the probability, we choose those customers. For those customers who are already buying, we use the conventional collaborative filtering algorithm. It is correlation based, but correlation is not a distance, so we just take the cosine distance. This is implemented in Spark. For the classification, we use naive Bayes, random forest, and

So there is a data validation framework that you can use for the checks. [unclear] the sample size, start date, and end date. If you want to validate the data, you check whether the event date is between the start date and the end date or not. You can do a [unclear] test for each row of the column, and for each row you get a record that your condition is true or false. So this is one thing we can do. For trend, [unclear] visualization of the data [unclear]. You look at the trend with an average, the last 10 points average. Seasonality, if it is there, you can find with the autocorrelation function. If the autocorrelation is high, that means at that point there is a seasonality or periodicity. In the frequency domain you can also find seasonality: after the Fourier transform, the frequency components with the highest values give the time period. [unclear] significant change. So these two things should be checked. And accuracy of model is [unclear] and also you'll find that data is also
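
The row-level check and the last-10-points average mentioned in this answer could look something like the sketch below. It uses only the standard library; the field names (`event_date`, `start_date`, `end_date`) and the helper names are assumptions made for illustration, not the API of any particular validation framework.

```python
from datetime import date

def event_in_window(row):
    """True when start_date <= event_date <= end_date for one record."""
    return row["start_date"] <= row["event_date"] <= row["end_date"]

def validate_rows(rows):
    """Return one True/False flag per record, in input order."""
    return [event_in_window(r) for r in rows]

def last_n_average(values, n=10):
    """Average of the most recent n points (or of all points if fewer)."""
    window = values[-n:]
    return sum(window) / len(window)

# Illustrative records only.
rows = [
    {"event_date": date(2024, 1, 15),
     "start_date": date(2024, 1, 1), "end_date": date(2024, 1, 31)},
    {"event_date": date(2024, 2, 5),
     "start_date": date(2024, 1, 1), "end_date": date(2024, 1, 31)},
]
flags = validate_rows(rows)
```

Returning a flag per row, rather than failing on the first bad record, keeps the full picture of which rows broke the rule.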

So I prefer free software, and I use JavaScript. I have built charts with the Google Visualization API, but some clients have a problem with sending data to Google. That is why you can use D3 or C3, which are JavaScript libraries. On the back end, I write HTTP APIs in Python, mainly using the Flask library, and sometimes I use [unclear]. If high performance is required, then I will use Falcon. If it is Google Cloud Platform, there is Google Data Studio. You can make charts like in Excel; whatever you can do in Excel, you can do in Google Sheets also. On Amazon, [unclear] has a lot of data analytics and visualization features in AWS. Conventional BI tools like Tableau and Power BI I am somewhat familiar with. Apart from that, I know front-end logic like tables. All these things I can build myself using AngularJS or [unclear]. Actually, my JavaScript is at an average level. So I am not a very good front-end developer, but I can build a front end. On the back end, I am an expert; I have worked on back-end development for very big companies. And on the data side, I have worked with [unclear] companies, so I can handle data very well.

Yeah. So, basically, I work in pandas, and with pandas to_csv I dump the data from Python to Excel. Then in Excel I use formulas like COUNT, I do correlation, and I create different kinds of charts. I also pivot tables, which converts rows to columns. So that I can do in Excel. I am not very good at Excel, but I can handle things in Excel. I know a little bit of VB scripting also, so VBA automation. I have not worked with it, but I heard that Python is now available in Excel. If that is available, I am an expert in

Yeah. That is a good question. If it is business people, don't go into the technology, but show the impact: how much lift in the revenue, or how much lift in the [unclear] or the audience. That is what you should concentrate on and emphasize. If it is technical people, then go to the architecture diagram, sequence diagram, class diagram; these kinds of things. The algorithm should be explained in a lucid manner to the business people, because they are interested in how it is done. Apart from that, [unclear], make it short and formal. [unclear]. It is also very good to look at the audience's LinkedIn profiles, see their capabilities and expertise, and keep your presentation in those domains. So that would be a good idea. Do some homework.

So, large volumes of data. I worked in [unclear] in 2013-14, and they handle 100 million records every day. It is for [unclear] I did. And then [unclear], I worked for a company called Future Today. In PySpark, I developed an ad price prediction system for them. Then I worked for [unclear] again, and that time I worked in Scala Spark also. At Future Today also, I worked on Scala Spark. So that is Spark. Another way, in machine learning, is something called transfer learning: you separate the data into small chunks, and each small chunk updates your model. So your old training is not neglected; all the training is considered as the model is updated. That is done through transfer learning. In deep learning it is possible, and with Bayesian models [unclear] it is possible. So that can be used.
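
The chunk-by-chunk updating described in this answer is often called incremental or online learning. As a minimal sketch of that idea, assuming a plain linear model rather than the deep learning or Bayesian models the answer mentions, the class below accumulates sufficient statistics per chunk so that earlier chunks are never discarded. The class name and the synthetic data are hypothetical.

```python
import numpy as np

class ChunkedLinearRegression:
    """Least-squares fit updated one chunk of rows at a time."""

    def __init__(self, n_features):
        self.xtx = np.zeros((n_features, n_features))  # running X^T X
        self.xty = np.zeros(n_features)                # running X^T y
        self.coef_ = np.zeros(n_features)

    def partial_fit(self, X, y):
        """Fold one chunk into the running statistics and refresh the fit."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.coef_ = np.linalg.pinv(self.xtx) @ self.xty
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.coef_

# Illustrative data only: 300 rows processed as three chunks of 100.
rng = np.random.default_rng(0)
X_all = rng.normal(size=(300, 3))
true_coef = np.array([1.0, -2.0, 0.5])
y_all = X_all @ true_coef

model = ChunkedLinearRegression(n_features=3)
for start in range(0, 300, 100):
    model.partial_fit(X_all[start:start + 100], y_all[start:start + 100])
```

Because the statistics simply add up, the final coefficients match what a single fit over all rows would give, while only one chunk needs to be in memory at a time.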

So Google data: I will fetch that data, whether it is BigQuery or a database. Whatever it is, I will fetch it into Google Sheets. In Google Sheets, whatever you can do in Excel, you can do. In Google Data Studio you can build beautiful plots also, and you can change a parameter and your plot will change. [unclear]. I have worked with the Google BigQuery database. Other stores, like Google Cloud Storage and Bigtable, I have worked with. And Google has Google DoubleClick; I worked with it for a short period of time, doing analytics the same way. [unclear] AdRoll is the partner. From there, we fetch all the impression data for the whole day, in batch processing mode every day, [unclear] in an AWS cluster. [unclear] and do analytics on the data to build a recommendation system: which ad should be recommended for which people so that we
