
Supreeth Gowda

Build and maintain data solutions for enterprise-grade applications using various AWS services and data engineering tools.

Building and maintaining configuration-driven data ingestion and data orchestration frameworks using AWS services such as Lambda, DynamoDB, and MWAA, with Python.

Building and maintaining a modern data stack built on Fivetran, Airflow, Snowflake (enterprise data platform), dbt (Data Build Tool), and Power BI.

Hackathons involving Amazon Bedrock/ChatGPT for image recognition and web-scraping data analysis.


  • Role

    Lead Data Engineer

  • Years of Experience

    12 years

Skillsets

  • Terraform - 5 Years
  • Behavior Detection
  • Snowflake Administration
  • Pipeline Management
  • Performance Metrics
  • Feature Engineering
  • ETL Management
  • Data Handling
  • Data Extraction
  • Big Data
  • AWS Glue
  • Automation Testing
  • Data Analytics
  • Ansible - 2 Years
  • GitHub - 8 Years
  • Python - 8 Years
  • Java
  • ETL Jobs
  • Data Warehouse
  • Data Transformation
  • Data Orchestration
  • Data Lake
  • Data Ingestion

Professional Summary

12 Years
  • Jan, 2023 - Present · 2 yr 11 months

    Lead Data Engineer

    Mindera
  • Dec, 2021 - Present · 4 yr

    Data Engineering Consultant

    Encore Data Intelligence Service
  • Jun, 2021 - Nov, 2021 · 5 months

    Data Engineer II

    Amazon Web Services (AWS)
  • Jul, 2018 - Jun, 2021 · 2 yr 11 months

    Data Engineer

    Meredith Corporation
  • Jun, 2014 - Jul, 2018 · 4 yr 1 month

    Software Test Engineer, Data Services

    Sonos Inc
  • Aug, 2012 - May, 2014 · 1 yr 9 months

    Research Assistant (Data Mining and Software Development)

    Worcester Polytechnic Institute (WPI)
  • Mar, 2010 - Jan, 2012 · 1 yr 10 months

    Performance Testing Engineer

    CGI Group Inc.

Applications & Tools Known

  • AWS Lambda
  • DynamoDB
  • Fivetran
  • Snowflake
  • Power BI
  • Python
  • Apache Airflow
  • Amazon Athena
  • Delta Lake
  • Amazon Redshift
  • Apache Spark
  • AWS Glue
  • Amazon EMR
  • Amazon Bedrock
  • Amazon S3
  • AWS CLI
  • Terraform
  • PagerDuty
  • Jupyter Notebook
  • Postman
  • MySQL
  • Ansible
  • Docker
  • GitHub
  • PyCharm
  • Jenkins
  • Perforce
  • Eclipse
  • Zeppelin
  • Splunk
  • RapidMiner
  • Weka
Work History

12 Years

Lead Data Engineer

Mindera
Jan, 2023 - Present · 2 yr 11 months
    Building and maintaining a modern data stack built using Fivetran, Airflow (MWAA), Snowflake, dbt, and Power BI. Implemented configuration-driven data ingestion, data quality, error logging, and data orchestration frameworks leveraging AWS services. Designing, developing, deploying, and maintaining data ingestion frameworks for CDC, streaming, and batch data using Fivetran, Amazon Data Firehose, AWS Lambda, DynamoDB, S3, and AWS Glue. Orchestrating workflows with Airflow, deploying via Terraform, and integrating PagerDuty for real-time failure alerts. Managing Snowflake with RBAC, optimizing cost and performance via query-monitoring dashboards, and enforcing guardrails on query execution. Participated in hackathons involving Amazon Bedrock/ChatGPT for image recognition and web-scraping data analysis.
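    A minimal sketch of the configuration-driven ingestion pattern described above, assuming a hypothetical DynamoDB config table and S3 landing bucket; the names and schema are illustrative placeholders, not the actual framework:

```python
import json
import boto3

# Hypothetical resource names -- the real framework's tables and buckets are
# not part of this profile, so these are placeholders.
CONFIG_TABLE = "ingestion_source_config"
LANDING_BUCKET = "example-landing-zone"

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")


def handler(event, context):
    """Look up the source's ingestion config in DynamoDB, then land the
    incoming payload in S3 under the prefix that the config dictates."""
    source_id = event["source_id"]

    # Each source is described by one configuration item (target prefix,
    # enabled flag, ...), so new feeds are onboarded by adding a row,
    # not by writing new code.
    config = dynamodb.Table(CONFIG_TABLE).get_item(
        Key={"source_id": source_id}
    )["Item"]

    if not config.get("enabled", True):
        return {"status": "skipped", "source_id": source_id}

    key = f'{config["target_prefix"]}/{context.aws_request_id}.json'
    s3.put_object(
        Bucket=LANDING_BUCKET,
        Key=key,
        Body=json.dumps(event["records"]).encode("utf-8"),
    )
    return {"status": "ingested", "s3_key": key}
```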

Data Engineering Consultant

Encore Data Intelligence Service
Dec, 2021 - Present · 4 yr
    Interacted with various startup teams to understand their challenges and provided consultation on building data pipelines. Brainstormed and evaluated potential product ideas, identifying opportunities to develop them into viable product companies. Collaborated with a hospital management team to address their data challenges by developing solutions using Python.

Data Engineer II

Amazon Web Services (AWS)
Jun, 2021 - Nov, 2021 · 5 months
    Handled, maintained, and supported ETL data processes for Redshift, transitioning to a Data Lake architecture and managing thousands of ETL jobs, ensuring data integrity and reliability. Offloaded and transformed data from Redshift to Parquet format using Apache Spark on EMR, loading it into Redshift Spectrum to implement a Data Lakehouse solution. Worked on an in-house built AWS Glue catalog sharing infrastructure to minimize data duplication and enhance data consistency, leveraging AWS Glue, Lake Formation, Resource Access Manager (RAM), and Redshift Spectrum. Managed and supported a centralized data warehouse handling 45-50 petabytes of data.
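    A rough PySpark sketch of the Redshift-to-Parquet offload pattern mentioned above, assuming hypothetical S3 paths and a made-up four-column schema; the real EMR jobs and table layouts are not part of this profile:

```python
from pyspark.sql import SparkSession

# Illustrative paths and partition column only.
UNLOADED_PATH = "s3://example-bucket/redshift-unload/orders/"
PARQUET_PATH = "s3://example-bucket/lakehouse/orders/"

spark = SparkSession.builder.appName("redshift-offload-to-parquet").getOrCreate()

# Read the pipe-delimited files produced by a Redshift UNLOAD ...
df = (
    spark.read
    .option("header", "false")
    .option("delimiter", "|")
    .csv(UNLOADED_PATH)
    .toDF("order_id", "customer_id", "order_date", "amount")  # hypothetical schema
)

# ... and rewrite them as partitioned Parquet so Redshift Spectrum (via an
# external schema over the Glue Data Catalog) can query them in place.
(
    df.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet(PARQUET_PATH)
)
```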

Data Engineer

Meredith Corporation
Jul, 2018 - Jun, 2021 · 2 yr 11 months
    Designed and developed a comprehensive Data Lake on AWS using AWS Glue Jobs, Glue Crawler, Data Catalog, Athena, and EMR. Created PySpark code with RDDs and UDFs for complex data transformations. Implemented and managed computational workflows and data processing pipelines with Apache Airflow. Built an in-house data deletion framework for CCPA/GDPR compliance within the Data Lake environment using Athena's CTAS feature, after a proof of concept comparing Apache Hudi, Delta Lake, and Snowflake.
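    As an illustration of the Athena CTAS-based deletion approach, a hedged boto3 sketch with placeholder database, table, and bucket names (not the actual framework):

```python
import boto3

athena = boto3.client("athena")

# Hypothetical names for illustration only.
DATABASE = "datalake"
SOURCE_TABLE = "user_events"
REWRITTEN_TABLE = "user_events_ctas"
OUTPUT_LOCATION = "s3://example-athena-results/"


def rewrite_without_users(user_ids):
    """Use an Athena CTAS statement to materialise a copy of the table that
    excludes users who requested deletion; the old data files can then be
    removed and the new table swapped in."""
    id_list = ", ".join(f"'{uid}'" for uid in user_ids)
    ctas = f"""
        CREATE TABLE {REWRITTEN_TABLE}
        WITH (format = 'PARQUET',
              external_location = 's3://example-bucket/clean/user_events/')
        AS SELECT * FROM {SOURCE_TABLE}
        WHERE user_id NOT IN ({id_list})
    """
    return athena.start_query_execution(
        QueryString=ctas,
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": OUTPUT_LOCATION},
    )["QueryExecutionId"]
```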

Software Test Engineer, Data Services

Sonos Inc
Jun, 2014 - Jul, 2018 · 4 yr 1 month
    Designed and deployed data pipelines using Python, PySpark, MySQL, the AWS platform, Ganglia, Ansible, NiFi, Docker, and GitHub. Conducted comprehensive testing of Spark processing using PySpark, Zeppelin, PyCharm, Jenkins, Perforce, and the AWS platform. Utilized Python in Jupyter Notebook, Splunk, and the AWS CLI for end-to-end validation of big data pipelines. Pitched, implemented, and presented five new ideas involving testing tools, Android integration, voice integration using Android and Python, and PySpark at internal hackathons.
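    A small, hypothetical example of the kind of end-to-end pipeline validation described above, using PySpark with placeholder S3 paths and a made-up mandatory column:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Illustrative locations; actual pipeline inputs and outputs are not listed here.
SOURCE_PATH = "s3://example-bucket/raw/events/"
OUTPUT_PATH = "s3://example-bucket/curated/events/"

spark = SparkSession.builder.appName("pipeline-validation").getOrCreate()

raw = spark.read.json(SOURCE_PATH)
curated = spark.read.parquet(OUTPUT_PATH)

# Row-count reconciliation: the curated output should not drop records.
assert curated.count() == raw.count(), "row counts diverge between raw and curated"

# Key-column completeness: a hypothetical mandatory field must never be null.
null_keys = curated.filter(F.col("event_id").isNull()).count()
assert null_keys == 0, f"{null_keys} curated rows are missing event_id"

print("pipeline validation passed")
```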

Research Assistant (Data Mining and Software Development)

Worcester Polytechnic Institute (WPI)
Aug, 2012 - May, 2014 · 1 yr 9 months
    Developed structured Java code using RapidMiner and Weka APIs to automate machine learning detector building and engineered complex features from foundational data sets. Created Java code to extract features from log files and relational databases for intelligent tutoring systems (e.g., ASSISTments ITS) and configured automated detectors to identify student behaviors such as boredom, frustration, confusion, and concentration.
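    The original detectors were built in Java against the RapidMiner and Weka APIs; purely as an illustration of the same feature-then-classifier pattern, here is an analogous sketch in Python with scikit-learn, using hypothetical feature and file names:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split

# Hypothetical engineered features aggregated per student clip from ITS logs.
clips = pd.read_csv("assistments_clip_features.csv")  # placeholder file name
feature_cols = ["attempt_count", "mean_response_time", "hint_rate", "error_streak"]

X = clips[feature_cols]
y = clips["observed_behavior"]  # e.g. bored / frustrated / confused / concentrating

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Train a simple detector and report Cohen's kappa against held-out labels.
detector = RandomForestClassifier(n_estimators=200, random_state=0)
detector.fit(X_train, y_train)
print("kappa:", cohen_kappa_score(y_test, detector.predict(X_test)))
```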

Performance Testing Engineer

CGI Group Inc.
Mar, 2010 - Jan, 2012 · 1 yr 10 months
    Involved in requirements gathering and test plan preparation. Developed scripts using Web (HTTP/HTML), Web (Click and Script), and Ajax (Click and Script) protocols in HP LoadRunner, embedding C and Java code for error handling. Set up monitors using SiteScope to collect performance metrics on application and web servers. Conducted various performance tests, analyzed metrics and test results, and prepared reports. Performed tuning activities to identify and eliminate bottlenecks, enhancing overall performance.

Achievements

  • Pat on the Back award, recognizing and appreciating commitment at work, at CGI Group Inc.
  • Corona award, recognizing the 'most successful team' for a quarter of FY11, at CGI Group Inc.

Education

  • Master of Science in Learning Sciences and Technologies

    Worcester Polytechnic Institute (WPI) (2014)
  • Bachelor of Engineering in Computer Science

    Visvesvaraya Technological University (VTU) (2009)

Certifications

  • Sun Certified Programmer for the Java 2 Platform (SCJP) 5.0