
With 7 years of experience in SQL, Python, PySpark, AWS, ETL processes, BI
tools, and data visualization, I have delivered impactful solutions, including
driving $3.7B+ in GMS value and reducing manual workloads by 6.5 man-hours through automation.
My expertise extends to people and program management, enabling efficient collaboration across teams. I am eager to bring this blend of technical and leadership skills to optimize your data infrastructure and drive measurable outcomes.
Subject Matter Expert (Analytics, Intelligence and Engineering)
AMAZON
Lead Tech Operations Associate (Analytics, Intelligence and Engineering)
AMAZON
Tech Operations Associate 1 (Analytics, Intelligence and Engineering)
AMAZON
Python
PySpark
AWS (Amazon Web Services)
Spark SQL
Amazon Redshift
Apache Spark
Microsoft Power BI
Amazon QuickSight
GitHub
MySQL Workbench
Microsoft Excel
Docker
Jenkins
Responsibilities:
1. Identified the various input sources and centralized them into an
Amazon S3 data lake.
2. Transformed raw data using PySpark on AWS Glue to meet the
requirements of downstream analytics systems.
3. Loaded processed data into Amazon Redshift for fast querying
and analytics.
4. Automated the pipeline using AWS Step Functions to maintain a
scalable and reliable ETL process.
5. Implemented data quality checks using AWS Glue DataBrew and
monitored the pipeline with Amazon CloudWatch, collecting logs and
metrics and defining alarms to detect and respond to errors.
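The orchestration in steps 4 and 5 could look roughly like the following Step Functions state machine (a minimal sketch; the job names, topic ARN, and state names are hypothetical placeholders, not the actual pipeline definition):

```json
{
  "Comment": "Sketch: run Glue transform, then quality checks; alert on failure",
  "StartAt": "RunGlueTransform",
  "States": {
    "RunGlueTransform": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": { "JobName": "transform-raw-to-curated" },
      "Catch": [ { "ErrorEquals": ["States.ALL"], "Next": "NotifyFailure" } ],
      "Next": "RunQualityChecks"
    },
    "RunQualityChecks": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": { "JobName": "data-quality-checks" },
      "Catch": [ { "ErrorEquals": ["States.ALL"], "Next": "NotifyFailure" } ],
      "Next": "PipelineSucceeded"
    },
    "NotifyFailure": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "TopicArn": "arn:aws:sns:REGION:ACCOUNT_ID:etl-alerts",
        "Message": "ETL pipeline failed"
      },
      "Next": "PipelineFailed"
    },
    "PipelineSucceeded": { "Type": "Succeed" },
    "PipelineFailed": { "Type": "Fail" }
  }
}
```

The `.sync` service integrations make each Glue job run to completion before the next state starts, which is what keeps the pipeline reliable without custom polling code.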
1. Created a star schema consisting of a fact table and
multiple dimension tables.
2. Optimized Redshift performance through query tuning and schema
optimization.
3. Utilized AWS Glue ETL jobs with PySpark to clean, transform, and load data into Redshift.
4. Automated the ETL pipeline using Amazon MWAA (Managed Workflows
for Apache Airflow).
5. Validated data consistency between source and destination tables.
6. Built dashboards in Amazon QuickSight to showcase insights from the data mart.
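The consistency validation in step 5 can be sketched as a reconciliation of row counts and an order-independent checksum (a pure-Python illustration; in practice `source_rows` and `dest_rows` would come from queries against the source system and Redshift):

```python
from hashlib import md5

def table_fingerprint(rows):
    """Row count plus an order-independent checksum of all rows."""
    checksum = 0
    for row in rows:
        # Hash each row, then XOR the digests: the result does not
        # depend on row order, only on the set of rows.
        digest = md5("|".join(map(str, row)).encode()).hexdigest()
        checksum ^= int(digest, 16)
    return len(rows), checksum

def validate_consistency(source_rows, dest_rows):
    """True when source and destination tables hold the same rows."""
    return table_fingerprint(source_rows) == table_fingerprint(dest_rows)

# Example: destination matches the source even when row order differs.
source = [(1, "item-a", 10.0), (2, "item-b", 12.5)]
dest = [(2, "item-b", 12.5), (1, "item-a", 10.0)]
print(validate_consistency(source, dest))       # True
print(validate_consistency(source, dest[:1]))   # False (missing a row)
```

Comparing count-plus-checksum fingerprints avoids pulling both tables into memory side by side, which matters when the destination is a large Redshift table.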