Vetted Talent

Sayan Mukhopadhyay

Leveraging deep domain knowledge in Data Science & Machine Learning to lead teams, open new business avenues for companies and clients, and reach new horizons; targeting senior-level assignments in Machine Learning / Solution Architecture / FinTech with an organization of high repute.
  • Role

    Machine Learning, Big Data, Cloud, LLM Consultant

  • Years of Experience

    16 years

  • Professional Portfolio

    View here

Skillsets

  • OpenMP
  • Snowflake
  • Shell
  • Rust
  • REST
  • Redis
  • React
  • POSIX
  • Play
  • Oracle
  • SOAP
  • OpenCV
  • NLTK
  • MySQL
  • MPI
  • Milvus
  • MapReduce
  • LightFM
  • LangGraph
  • LangChain
  • WCF
  • Vivado
  • Suricata
  • Scikit-learn
  • Nagios
  • Kali Linux
  • Cacti
  • .NET
  • ZeroMQ
  • XGBoost
  • Keras
  • TextBlob
  • Terraform
  • Sybase
  • SVN
  • Storm
  • Spring
  • Spark
  • Socket.IO
  • SQL
  • Node.js
  • Neo4j
  • MongoDB
  • Hive
  • Hadoop
  • Go
  • Elixir
  • C++
  • TensorFlow
  • pandas
  • Scala
  • Python
  • Kubernetes
  • Java
  • GCP
  • Elasticsearch
  • Docker
  • Azure
  • ChromaDB
  • Jenkins
  • Git
  • FPGA
  • Flask
  • FastAPI
  • Falcon
  • Django
  • dbt
  • CUDA
  • AWS
  • C#
  • BigQuery
  • BERT
  • Ansible
  • Angular
  • Airflow
  • ActiveMQ
  • PyTorch

Vetted For

  • Digital Data Scientist (AI Screening)
  • Score: 60/100

Professional Summary

16 Years
  • Nov, 2016 - Present (9 yr 1 month)

    Machine Learning, Big Data, Cloud, LLM Consultant

    Remote Consultant
  • Jun, 2015 - Nov, 2016 (1 yr 5 months)

    Manager Data Science

    Abzooba
  • Jul, 2014 - Jun, 2015 (11 months)

    Data Scientist / Sr. Consultant - Analytics & Big Data

    TCG Digital
  • Oct, 2013 - Jul, 2014 (9 months)

    Technical Architect, Security Tool Group

    Mphasis
  • May, 2013 - Aug, 2013 (3 months)

    Lead Architect - Data Mining

    Wedoria
  • May, 2012 - Apr, 2013 (11 months)

    Senior Technical Analyst - Machine Learning

    Pubmatic
  • Sep, 2011 - Apr, 2012 (7 months)

    Senior Software Engineer

    CA Technology
  • Nov, 2010 - Sep, 2011 (10 months)

    Technical Manager

    FairFest Media
  • Mar, 2010 - Nov, 2010 (8 months)

    Senior Consultant

    PayPal
  • Jun, 2008 - Feb, 2010 (1 yr 8 months)

    Technical Consultant

    Credit-Suisse
  • Jan, 2000 - Dec, 2003 (3 yr 11 months)

    Software Engineer

    Total Computer System

Applications & Tools Known

  • MySQL
  • JavaScript
  • Python
  • C++
  • PHP
  • Java
  • Play
  • Neo4j
  • AWS
  • Node.js
  • Selenium
  • R
  • MongoDB
  • Hadoop
  • Spark
  • Pentaho
  • Kibana
  • Tableau
  • NLTK
  • Docker
  • Kubernetes
  • Nagios
  • Azure
  • Socket
  • IPC
  • HTTP
  • REST
  • SOAP
  • Spring
  • Flask
  • Falcon
  • Django
  • GDB
  • Makefile
  • CUDA
  • Visual Studio
  • PyCharm
  • MATLAB
  • Oracle
  • Sybase
  • SQLite
  • Elasticsearch
  • Hive
  • Keras
  • pandas
  • Snowflake
  • Airflow
  • dbt
  • Linux
  • Windows
  • Router
  • Switch
  • Nmap
  • Kali Linux
  • Suricata
  • GCP
  • Terraform
  • Cryptography
  • Analytics
  • BERT
  • Shell
  • SQL
  • Scala
  • Elixir
  • Go
  • Rust
  • Unity
  • UML
  • WCF
  • Unix
  • Git
  • Cacti

Work History

16 Years

Machine Learning, Big Data, Cloud, LLM Consultant

Remote Consultant
Nov, 2016 - Present (9 yr 1 month)
    Spearheading the creation of new data science capabilities for diverse clients. Envisioning and executing strategies to drive business performance through data-driven decision-making. Responsible for the end-to-end lifecycle of large-scale data analyses and model development, validation, and deployment.

Manager Data Science

Abzooba
Jun, 2015 - Nov, 2016 (1 yr 5 months)
    Promoted to lead a team of seven data scientists, delivering complex client projects ahead of schedule and within budget. Directed the entire project lifecycle, from design and planning to risk monitoring and implementation. Successfully architected a credit scoring model using social media data and a Health Insurance Claim Status Prediction System using HIPAA data.

Data Scientist / Sr. Consultant - Analytics & Big Data

TCG Digital
Jul, 2014 - Jun, 2015 (11 months)
    Led a sentiment analysis project for brand development and developed a Neural Network-based passenger load predictor for two major aviation clients. Designed and implemented a KMeans clustering solution to resolve data errors for a leading electronics manufacturer.

Technical Architect Security Tool Group

Mphasis
Oct, 2013 - Jul, 2014 (9 months)
    Managed and coached a team of 12 engineers, taking full accountability for building and deploying a complex transaction monitoring system using an open-source stack. Served nine key clients, including Coca-Cola and Verizon, by gathering requirements, performing feasibility analysis, and developing project roadmaps.

Lead Architect - Data Mining

Wedoria
May, 2013 - Aug, 2013 (3 months)

Senior Technical Analyst - Machine Learning

Pubmatic
May, 2012 - Apr, 2013 (11 months)
    Enhanced the Hadoop platform by applying predictive approximation techniques for big data queries and deployed a new framework for query estimation and prediction.

Senior Software Engineer

CA Technology
Sep, 2011 - Apr, 2012 (7 months)
    Worked on CA Spectrum, a network monitoring product.

Technical Manager

FairFest Media
Nov, 2010 - Sep, 2011 (10 months)

Senior Consultant

PayPal
Mar, 2010 - Nov, 2010 (8 months)

Technical Consultant

Credit-Suisse
Jun, 2008 - Feb, 2010 (1 yr 8 months)

Software Engineer

Total Computer System
Jan, 2000 - Dec, 2003 (3 yr 11 months)
    Worked as a software engineer.

Achievements

  • Golden Award (Silver), Codility, 2020
  • Selected for the National Math Olympiad, India
  • Ranked 352 in the West Bengal Joint Entrance Examination (Engineering) and All India Rank 50 in the Graduate Aptitude Test in Engineering (Instrumentation)
  • B Certificate holder, National Cadet Corps (Army)
  • Managed and developed highly effective analytical solutions for a system receiving 100 million new records daily, on behalf of a leading online advertising company.
  • Designed and developed a parser for FIX-format files in C++ that improved efficiency ten-fold compared with the Unix grep command; deployed to support the high-frequency trading servers of a major investment bank.
  • Led the development of an enterprise network management system, involving complex bug fixing and development of new features in one core of a heterogeneous, distributed codebase.
  • Publication: "A parallel algorithm for molecular dynamics simulation"
  • Publication: "Variance of difference as a distance-like measure in synchronous time series microarray data clustering"
  • Book: Advanced Data Analytics Using Python, Apress, Sayan Mukhopadhyay (1st and 2nd editions)

Testimonial

Abzooba

Pubmatic

https://www.linkedin.com/in/sayan-mukhopadhyay-61634511/

Major Projects

4 Projects

Ad Price Predictor System

    Developed from data collection to ML model for Sulvo Ad Price Prediction.

Nostradamus Approximation Framework

    Inventory estimation for Pubmatic.

METAL - Trading Time Risk Analysis Tool

    Developed for Credit-Suisse, focused on Risk Analysis.

Real Time Latency Monitoring

    High-frequency trading latency monitoring for Credit-Suisse.

Education

  • M.Tech (Research) in Computational & Data Science

    Indian Institute of Science (2014)
  • B.Eng. in Instrumentation & Electronics

    Jadavpur University (2004)

Certifications

  • AWS

    Udemy (Jan, 2023)
  • Angular

    Code Academy (Dec, 2015)
  • SAS Certified Base Programmer

  • NCC B Certificate

  • Cryptography

    Coventry University

Interests

  • Acting
  • Exercise
  • Writing
  • Watching Movies

AI-interview Questions & Answers

    Okay. My name is Sayan Mukhopadhyay. I did my B.Eng. in Electronics and Instrumentation Engineering at Jadavpur University and an M.Tech (Research) in Computational and Data Science at IISc Bengaluru. I got the chance to work as a full-time employee at Credit-Suisse, PayPal, CA Technology, Mphasis, TCG Digital, and Abzooba. After November 2016 I started my freelancing career: I began working for startups like Sulvo and Future Today, then for mid-sized companies like Pubmatic and SymphonyAI, and for big companies like Crossover. Technology-wise, my main skills are machine learning and data analytics. I was part of the risk analytics team at Credit-Suisse; at Pubmatic I was a Senior Technical Analyst in machine learning; and at Abzooba and TCG Digital I was a Data Scientist and then Manager of Data Science. My other field is infrastructure: I worked in the data center team at Credit-Suisse and was later promoted within the team, and at CA Technology I worked on a product called Spectrum, which is basically a network monitoring tool. Then I was a Technical Architect in the security tools group at Mphasis. So I have experience in all aspects of data; I can claim to be a full-stack data professional. I can do the front end, I can do the back end, and I can do everything in between. That's all.

    Okay. So we were building a price prediction system for a startup. In online ad bidding, if your predicted price is low, the ad sells automatically. Then in the next iteration, when you train your model with that data, your data will be even lower, so your next prediction will be lower still. In this way the price gradually goes down, and the revenue of the site goes down with it. When we learned this, the solution we proposed was: we watch the revenue. If the revenue is going up, we do nothing; we do not fix it until it is broken. But if the revenue is going down, we look at the fill rate. If the fill rate is going up, that means we are selling more but selling at a low price, so we make our prediction a little bit higher. And if the fill rate is going down, that means we are asking a high price and cannot sell enough ads, so we make our prediction a little bit lower. This predicted price of the advertisement on the sell-side platform is known as the floor in the ad industry, so we gave this algorithm the name dancing floor. It was implemented on Google Cloud Platform, and the prediction pipeline ran as a workflow-based model. In the ad industry there is a practice of keeping the model stopped some of the time, sometimes running without any model at all; but here you can run long-term without ever stopping the model from making predictions.
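The feedback rule described above can be sketched as a small adjustment step. This is a minimal illustration, not the production logic: the function name, the trend encoding (+1 rising, -1 falling), and the 5% step size are all assumptions.

```python
def adjust_floor(floor, revenue_trend, fill_rate_trend, step=0.05):
    """One iteration of the 'dancing floor' rule sketched above.

    revenue_trend / fill_rate_trend: +1 if the metric is rising, -1 if falling.
    The 5% step is an illustrative value, not the production one.
    """
    if revenue_trend > 0:
        return floor               # revenue rising: don't fix what isn't broken
    if fill_rate_trend > 0:
        return floor * (1 + step)  # selling plenty, but too cheaply: raise the floor
    return floor * (1 - step)      # not selling enough: lower the floor
```

Each day's metrics feed the next day's floor, so the floor "dances" around the market-clearing price instead of drifting monotonically downward.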

    So a signal is time-series data. First I look at the autocorrelation function; a peak in the autocorrelation gives the periodicity. Then I look at the moving average; from the moving average you can see the trend. If neither of these two things is there, a plain ARMA model, the standard statistical time-series model, can be fitted. If a trend is there, then you use an ARIMA model, meaning autoregressive integrated moving average, and that handles the trend. And if the periodicity is also there, you use the seasonal model, SARIMA. If you want to treat it as a deep learning problem, you can use a recurrent neural network for this kind of time-series signal. You can also work in the frequency domain: take the Fourier transform of the signal, look at the peak frequencies, and do the analysis there. And over a long series you can learn from the error: build a prediction model for the error itself, then combine the predicted error with the predicted value to give the answer. That will increase your accuracy.
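The frequency-domain check mentioned above can be sketched with NumPy alone. The helper name and the clean sine input are illustrative; real data would first need detrending and windowing.

```python
import numpy as np

def dominant_period(signal):
    """Estimate a signal's periodicity from its strongest FFT bin,
    as described above (a sketch for clean, detrended data)."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()        # drop the zero-frequency (mean) term
    spectrum = np.abs(np.fft.rfft(signal))
    spectrum[0] = 0.0                      # ignore any residual DC component
    peak_bin = int(np.argmax(spectrum))    # bin of the strongest frequency
    return len(signal) / peak_bin          # period expressed in samples

# A sine with period 25 samples (8 full cycles in 200 samples):
t = np.arange(200)
period = dominant_period(np.sin(2 * np.pi * t / 25))
```

The same peak shows up as high autocorrelation at lag 25, which is the time-domain version of the check.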

    Yeah, I worked on a project with around 24,000,000 users whom we had to target for their ads. So what we did: for the people who bought a particular product, we made a column with value 1, otherwise 0. Against this column I looked at the correlation of the features and kept the ones that were correlated. All these features were then used for clustering with a Euclidean distance function. Each cluster was then classified per product, that is, whether you buy that product or not, and we calculated the probability of buying the product. Above some threshold on that probability we chose those customers, leaving out those who were already buying. The conventional collaborative filtering algorithm is correlation-based, but correlation is not a distance, so we just took cosine distance instead. This was implemented in Spark, and for the classification we used Naive Bayes and Random Forest.
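Since the answer hinges on swapping correlation for a true distance, here is a minimal cosine-distance helper; the function name is illustrative, and the real system ran on Spark rather than NumPy.

```python
import numpy as np

def cosine_distance(u, v):
    """1 minus cosine similarity: the distance used in place of
    correlation, which does not satisfy the properties of a metric."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
```

Orthogonal feature vectors score 1.0 and parallel ones score 0.0 regardless of magnitude, which is what makes it usable inside a clustering step.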

    So on the data validation framework side, you can check that the data has the right sample size, start date, and end date, and, if you want to validate the data, that each event date falls between the start date and the end date or not. You can run a boolean test for each row of a column, or over the whole column, and you get a record saying whether the condition is true or false. So that is one thing we can do. For trend, you apply visualization and exploratory data analysis techniques: look at the trend, the moving average, say the average of the last 10 points. Seasonality, if it is there, you can find with the autocorrelation function: if the autocorrelation is high at some lag, that means there is a seasonality or periodicity at that point. You can also find seasonality in the frequency domain: after a Fourier transform, the frequencies with the highest values give you the time periods. So those two things should be checked, along with any significant change. And if the accuracy of the model has changed, that also tells you the data has changed.
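The row-wise boolean checks described above might look like the sketch below. The field name, function name, and return shape are assumptions for illustration, not a specific validation library's API.

```python
from datetime import date

def validate_rows(rows, start, end, min_rows=1):
    """Check sample size and that each row's event_date lies in [start, end].

    Returns (ok, per_row_flags): one boolean per row plus an overall
    verdict, mirroring the per-row true/false records described above.
    """
    flags = [start <= row["event_date"] <= end for row in rows]
    ok = len(rows) >= min_rows and all(flags)
    return ok, flags

# Two rows, one of which falls outside the expected window:
rows = [{"event_date": date(2024, 1, 5)}, {"event_date": date(2024, 2, 1)}]
ok, flags = validate_rows(rows, start=date(2024, 1, 1), end=date(2024, 1, 31))
```

The per-row flags let you report exactly which records violated the window rather than just failing the whole batch.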

    So I prefer free software, and I use JavaScript. I have built dashboards with the Google Visualization API, but some clients have a problem with sending data to Google, so instead you can use D3 or dc.js, which are JavaScript charting libraries. On the back end I write HTTP APIs in Python, mainly using the Flask library; if high performance is required, I will use Falcon. If it is Google Cloud Platform, there is Google Data Studio: you can make charts there much like in Excel, and whatever you can do in Excel you can do in Google Data Studio as well. On Amazon, AWS has a lot of data analytics and visualization tooling. The conventional BI tools, Tableau and Power BI, I am somewhat familiar with. Apart from that, I can build front-end logic like tables myself using AngularJS. My JavaScript is average level, so I am not a very good front-end developer, but I can build a front end. On the back end I am an expert; I have done back-end development for very big companies, and I have worked with data-heavy companies, so I can handle data very well.
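A minimal version of the Flask back end described above, serving chart data as JSON for a JavaScript front end; the route, payload shape, and data are hypothetical.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical chart data; a real dashboard back end would query a database.
SALES = [{"month": "Jan", "value": 120}, {"month": "Feb", "value": 95}]

@app.route("/api/sales")
def sales():
    # A D3/dc.js (or Google Visualization) front end fetches this as JSON.
    return jsonify(SALES)
```

Keeping the API as plain JSON over HTTP is what lets the charting library be swapped (Google Visualization to D3/dc.js) without touching the back end.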

    Yeah, I use it. Basically, I work in pandas, and from pandas I dump the data to CSV, from Python to Excel. Then in Excel I use formulas like COUNT, I do conditional things, and I create different kinds of charts. I also pivot the table, which converts rows into columns. That I can do in Excel. At Excel I am not very good, but I can handle things in it. And I know a little bit of VB scripting too, so VBA automation. I have not worked with it myself, but I have heard that Python is now available in Excel; if that is available, I am an expert in it.
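The pandas-to-CSV hand-off and the pivot step can be sketched like this; the column names, values, and output file name are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "sales":   [100, 150, 80, 120],
})

# The pivot described above: quarters become columns, regions become rows.
pivot = df.pivot_table(index="region", columns="quarter", values="sales")

# Dump to CSV for the hand-off into Excel.
df.to_csv("sales.csv", index=False)
```

`pivot_table` plays the role of Excel's PivotTable, so the reshaping can happen before the data ever reaches the spreadsheet.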

    Yeah, that is a good question. If it is business people, don't go into the technology; show the impact: how much lift in the revenue, or how much lift in the views or the audience. That is what you should concentrate on and emphasize. If it is technical people, then go to the architecture diagram, sequence diagram, class diagram, those kinds of things. Still, the algorithm should be explained in a lucid manner to the business people, because they are interested in how it is done. Apart from that, if it is a formal setting, make it short: short and formal, and don't pad it with anything the audience doesn't want. And it is a very good idea to look at the audience, their LinkedIn profiles and so on, see their capabilities and expertise, and keep your presentation within those domains. Do some homework.

    So, large volumes of data. I worked at Pubmatic in 2012-13, and they handled 100,000,000 records every day; I did the MapReduce work there. Then around 2014-15 I worked for a company called Future Today, and in PySpark I developed an ad price prediction system for them. Later I worked for a mid-sized company again, and that time I worked on the Scala Spark side as well; at Future Today I also worked with Scala Spark. So that is PySpark and Spark. Another way, in machine learning, is what is called transfer learning: you separate the data into small chunks, and with each small chunk you update your model, so the old training is not neglected; all the training is considered, and your model keeps being updated. That is possible with deep learning, and with Bayesian, online-style models as well. So that can be used.
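The chunk-by-chunk model updating described above (closer to online/incremental learning than transfer learning in the strict sense) can be sketched with plain NumPy; the learning rate, function name, and toy data are illustrative.

```python
import numpy as np

def sgd_update(w, X_chunk, y_chunk, lr=0.01):
    """One incremental pass over a chunk for linear least squares: the
    model keeps its previous weights, so earlier chunks are not forgotten."""
    for x, y in zip(X_chunk, y_chunk):
        grad = 2 * (x @ w - y) * x  # squared-error gradient for one row
        w = w - lr * grad
    return w

# Recover y = 3*x by streaming the data in two chunks instead of all at once.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3 * X[:, 0]
w = np.zeros(1)
for i in range(2):
    w = sgd_update(w, X[i * 100:(i + 1) * 100], y[i * 100:(i + 1) * 100])
```

The second chunk refines rather than replaces what the first chunk learned, which is the property that makes this workable when the full dataset never fits in memory.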

    So for Google data, whether it is BigQuery or a database, whatever it is, I will fetch it into a Google Sheet; whatever you can do in Excel, you can do in a Google Sheet. And with Google Data Studio you can build beautiful plots on top of that, change a parameter, and the plot will change accordingly. So that can be done. I have worked with the Google BigQuery database, and with other Google Cloud storage: I have worked with Bigtable. Google also has DoubleClick; I worked with it for a short period of time, on the analytics side, working for the advertiser side, where AdRoll was the partner. From there we exchanged all the impression data for the whole day, in batch-processing mode every day, running the jobs in an AWS cluster. I did that for the advertiser side: analytics on that data, and building a recommendation system for which ad should be recommended to which people.