
JITENDRA SUTAR

Experienced Senior Data Engineer with over 12 years of experience on Cloud and Big Data platforms, with an excellent reputation for resolving problems and improving customer satisfaction.
  • Role

    Search Engineer

  • Years of Experience

    12.42 years

Skillsets

  • Cassandra
  • Big Data tech
  • MISC
  • Languages
  • ETL - 10 Years
  • DB
  • Unix
  • TeamCity
  • Jira
  • Jenkins
  • Hive
  • HBase
  • Hadoop
  • Framework
  • DevOps
  • CI/CD
  • Python - 9 Years
  • NoSQL - 8 Years
  • AWS - 4 Years
  • CloudSQL
  • Dataproc
  • Informatica
  • Analytics
  • Automation
  • Dataflow
  • Stackdriver
  • Cloud Services
  • BigQuery
  • PySpark
  • Data Mining
  • SQL

Professional Summary

12.42 Years
  • Mar 2025 - Present (1 yr 2 months)

    Manager - Projects

    Cognizant
  • Senior Data Engineer

    Publicis Sapient
  • Sep 2021 - Mar 2025 (3 yr 6 months)

    Senior Big Data Engineer

    Publicis Sapient
  • Apr 2018 - Jun 2019 (1 yr 2 months)

    Data Engineer

    Comcast
  • Jun 2019 - May 2021 (1 yr 11 months)

    Senior Data Engineer

    Equinix
  • May 2021 - Sep 2021 (4 months)

    Senior BigData Engineer

    Flipkart

Applications & Tools Known

  • Google Cloud Platform
  • Airflow
  • Hive
  • Pentaho
  • Informatica
  • Cloud Function
  • Informatica Power Center
  • AWS
  • EMR
  • S3
  • Athena
  • PySpark
  • Python
  • Unix
  • SQL
  • CI/CD
  • DevOps
  • Automation

Work History

12.42 Years

Manager - Projects

Cognizant
Mar 2025 - Present (1 yr 2 months)

Senior Data Engineer

Publicis Sapient
    Developed pipelines in the UK_SME department to generate historical and incremental data from different bank sources, consumed by the modelling team to calculate PD, LGD, EAD, and RWA for credit risk measurement. Interacted daily with Business Analysts to gather functional requirements and translate business logic into SQL queries implemented as PySpark transformations on AWS infrastructure. Migrated pipelines from a shared EMR cluster to a container-based approach, reducing run time by 60% for 18 years of historical data generation. Sourced data into AWS S3 from SAS and an on-prem cluster so that workflows could run on top of those tables. Used Airflow to build the ETL solution and manage complex workflows. Automated deployment from Bitbucket/GitLab to AWS EMR using a TeamCity CI/CD pipeline. Developed a PySpark data-quality utility that runs on EMR and checks the data quality of data mart tables with hundreds or thousands of columns. Tech stack: PySpark, Python, AWS, Airflow, TeamCity.

Senior Big Data Engineer

Publicis Sapient
Sep 2021 - Mar 2025 (3 yr 6 months)

Senior BigData Engineer

Flipkart
May 2021 - Sep 2021 (4 months)
    As a Search Engineer, built a robust framework for an efficient search engine. The engine ingests data from various sources and, per the framework, ranks and sorts results efficiently so that the most relevant products are displayed to the end user.

Senior Data Engineer

Equinix
Jun 2019 - May 2021 (1 yr 11 months)
    PLP (Power-Load-Projection): As part of the data analytics team, created a robust framework that ingests data through pipelines (Dataflow and PySpark) from various sources and populates Google Cloud Storage. Created a serverless Cloud Function to ingest business files for various regions across the globe. Built a framework to collect power and sensor data from Hive and load it to GCS storage, running as a pipeline in different batches. PWA (Predictive-Workforce-Analytics): To provide reliable data for each region, delivered solutions to create or stop VMs per timezone, on which the Data Science team can run models predicting upcoming customers and workforce based on metrics provided by different sources.

Data Engineer

Comcast
Apr 2018 - Jun 2019 (1 yr 2 months)

Achievements

  • Created PySpark pipelines orchestrated with Airflow for data flows
  • Created serverless pipelines using Cloud Functions
  • Involved in platform migrations driven by business needs
  • Implemented a framework to ingest network traffic data daily
  • Developed compliance frameworks for multi-site data warehousing efforts
  • Designed a process-control framework in Python

Major Projects

3 Projects

PLP (Power-Load-Projection)

    As part of the data analytics team, created a robust framework that ingests data through pipelines (Dataflow and PySpark) from various sources and populates Google Cloud Storage. Created a serverless Cloud Function to ingest business files for various regions across the globe. Built a framework to collect power and sensor data from Hive and load it to GCS storage, running as a pipeline in different batches.

PWA (Predictive-Workforce-Analytics)

    To provide reliable data for each region, delivered solutions to create or stop VMs per timezone, on which the Data Science team can run models predicting upcoming customers and workforce based on metrics provided by different sources.

ODS (Operational Data Store)

    The information factory is an existing Cigna asset used to integrate business data from different sources of truth and deliver consistent information from a common platform. It converts data from BDRS and other sources into consumable information, and CIF enables consistent, stable access to data across all channels, both internal and external. Created the high-level design for the healthcare ID card generation dataflow, building mappings in Informatica Power Center tools: Source Analyzer, Target Designer, Mapping & Mapplet Designer, and Transformation Designer. Performed thorough end-to-end unit testing of the functionality and resolved defects during system testing. The asset integrates 9 years of business data (loaded incrementally) from different Oracle sources into the Hadoop system. Correlated the business requirements with technical aspects and produced a high-level design matched to the Hadoop environment. Following that design, extracted data from the existing Oracle ecosystem into Hadoop using frameworks such as Hive, HDFS, and Sqoop. Created Hive and HBase tables to extract and load data into Oracle for easy extraction on the user end. Extracted data through different Pig queries, with job automation handled by Oozie.

Education

  • Bachelor of Technology (B.Tech)

    Biju Pattnaik University Of Technology (2013)