
Supreeth Gowda

Build and maintain data solutions for enterprise-grade applications using various AWS services and data engineering tools.

Building and maintaining configuration-driven data ingestion and data orchestration frameworks using AWS services such as Lambda, DynamoDB, and MWAA, with Python.

Building and maintaining a modern data stack built on Fivetran, Airflow, Snowflake (enterprise data platform), dbt (Data Build Tool), and Power BI.

Hackathons involving Amazon Bedrock/ChatGPT for image recognition and web-scraping data analysis.


  • Role

    Lead Data Engineer

  • Years of Experience

    12 years

Skillsets

  • Terraform - 5 Years
  • Behavior Detection
  • Snowflake Administration
  • Pipeline Management
  • Performance Metrics
  • Feature Engineering
  • ETL Management
  • Data Handling
  • Data Extraction
  • Big Data
  • AWS Glue
  • Automation Testing
  • Data Analytics
  • Ansible - 2 Years
  • GitHub - 8 Years
  • Python - 8 Years
  • Java
  • ETL Jobs
  • Data Warehouse
  • Data Transformation
  • Data Orchestration
  • Data Lake
  • Data Ingestion

Professional Summary

12 Years
  • Jan, 2023 - Present · 2 yr 11 months

    Lead Data Engineer

    Mindera
  • Dec, 2021 - Present · 4 yr

    Data Engineering Consultant

    Encore Data Intelligence Service
  • Jun, 2021 - Nov, 2021 · 5 months

    Data Engineer II

    Amazon Web Services (AWS)
  • Jul, 2018 - Jun, 2021 · 2 yr 11 months

    Data Engineer

    Meredith Corporation
  • Jun, 2014 - Jul, 2018 · 4 yr 1 month

    Software Test Engineer, Data Services

    Sonos Inc
  • Aug, 2012 - May, 2014 · 1 yr 9 months

    Research Assistant (Data Mining and Software Development)

    Worcester Polytechnic Institute (WPI)
  • Mar, 2010 - Jan, 2012 · 1 yr 10 months

    Performance Testing Engineer

    CGI Group Inc.

Applications & Tools Known

  • AWS Lambda
  • DynamoDB
  • Fivetran
  • Snowflake
  • Power BI
  • Python
  • Apache Airflow
  • Amazon Athena
  • Delta Lake
  • Amazon Redshift
  • Apache Spark
  • AWS Glue
  • Amazon EMR
  • Amazon Bedrock
  • Amazon S3
  • AWS CLI
  • Terraform
  • PagerDuty
  • Jupyter Notebook
  • Postman
  • MySQL
  • Ansible
  • Docker
  • GitHub
  • PyCharm
  • Jenkins
  • Perforce
  • Eclipse
  • Zeppelin
  • Splunk
  • RapidMiner
  • Weka
Work History

12 Years

Lead Data Engineer

Mindera
Jan, 2023 - Present · 2 yr 11 months
    Building and maintaining a modern data stack built using Fivetran, Airflow (MWAA), Snowflake, dbt, and Power BI. Implemented configuration-driven data ingestion, data quality, error logging, and data orchestration frameworks leveraging AWS services. Designing, developing, deploying, and maintaining data ingestion frameworks for CDC, streaming, and batch data using Fivetran, Amazon Data Firehose, AWS Lambda, DynamoDB, S3, and AWS Glue. Orchestrating workflows with Airflow, deploying via Terraform, and integrating PagerDuty for real-time failure alerts. Managing Snowflake with RBAC, optimizing cost and performance via query-monitoring dashboards, and enforcing guardrails on query execution. Participated in hackathons involving Amazon Bedrock/ChatGPT for image recognition and web-scraping data analysis.
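    A minimal sketch of the configuration-driven ingestion pattern described above, assuming a hypothetical DynamoDB config table and S3 landing bucket; the names and schema are illustrative placeholders, not the actual framework:

```python
import json
import boto3

# Hypothetical resource names -- the real framework's tables and buckets are
# not part of this profile, so these are placeholders.
CONFIG_TABLE = "ingestion_source_config"
LANDING_BUCKET = "example-landing-zone"

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")


def handler(event, context):
    """Look up the source's ingestion config in DynamoDB, then land the
    incoming payload in S3 under the prefix that the config dictates."""
    source_id = event["source_id"]

    # Each source is described by one configuration item (target prefix,
    # enabled flag, ...), so new feeds are onboarded by adding a row,
    # not by writing new code.
    config = dynamodb.Table(CONFIG_TABLE).get_item(
        Key={"source_id": source_id}
    )["Item"]

    if not config.get("enabled", True):
        return {"status": "skipped", "source_id": source_id}

    key = f'{config["target_prefix"]}/{context.aws_request_id}.json'
    s3.put_object(
        Bucket=LANDING_BUCKET,
        Key=key,
        Body=json.dumps(event["records"]).encode("utf-8"),
    )
    return {"status": "ingested", "s3_key": key}
```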

Data Engineering Consultant

Encore Data Intelligence Service
Dec, 2021 - Present · 4 yr
    Interacted with various startup teams to understand their challenges and provided consultation on building data pipelines. Brainstormed and evaluated potential product ideas, identifying opportunities to develop them into viable product companies. Collaborated with a hospital management team to address their data challenges by developing solutions using Python.

Data Engineer II

Amazon Web Services (AWS)
Jun, 2021 - Nov, 2021 · 5 months
    Handled, maintained, and supported ETL data processes for Redshift, transitioning to a Data Lake architecture and managing thousands of ETL jobs, ensuring data integrity and reliability. Offloaded and transformed data from Redshift to Parquet format using Apache Spark on EMR, loading it into Redshift Spectrum to implement a Data Lakehouse solution. Worked on an in-house built AWS Glue catalog sharing infrastructure to minimize data duplication and enhance data consistency, leveraging AWS Glue, Lake Formation, Resource Access Manager (RAM), and Redshift Spectrum. Managed and supported a centralized data warehouse handling 45-50 petabytes of data.
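    A rough PySpark sketch of the Redshift-to-Parquet offload pattern mentioned above, assuming hypothetical S3 paths and a made-up four-column schema; the real EMR jobs and table layouts are not part of this profile:

```python
from pyspark.sql import SparkSession

# Illustrative paths and partition column only.
UNLOADED_PATH = "s3://example-bucket/redshift-unload/orders/"
PARQUET_PATH = "s3://example-bucket/lakehouse/orders/"

spark = SparkSession.builder.appName("redshift-offload-to-parquet").getOrCreate()

# Read the pipe-delimited files produced by a Redshift UNLOAD ...
df = (
    spark.read
    .option("header", "false")
    .option("delimiter", "|")
    .csv(UNLOADED_PATH)
    .toDF("order_id", "customer_id", "order_date", "amount")  # hypothetical schema
)

# ... and rewrite them as partitioned Parquet so Redshift Spectrum (via an
# external schema over the Glue Data Catalog) can query them in place.
(
    df.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet(PARQUET_PATH)
)
```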

Data Engineer

Meredith Corporation
Jul, 2018 - Jun, 2021 · 2 yr 11 months
    Designed and developed a comprehensive Data Lake on AWS using AWS Glue Jobs, Glue Crawler, Data Catalog, Athena, and EMR. Created PySpark code with RDDs and UDFs for complex data transformations. Implemented and managed computational workflows and data processing pipelines with Apache Airflow. Built an in-house data deletion framework for CCPA/GDPR compliance within the Data Lake environment using Athena's CTAS feature, after a proof of concept comparing Apache Hudi, Delta Lake, and Snowflake.
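    As an illustration of the Athena CTAS-based deletion approach, a hedged boto3 sketch with placeholder database, table, and bucket names (not the actual framework):

```python
import boto3

athena = boto3.client("athena")

# Hypothetical names for illustration only.
DATABASE = "datalake"
SOURCE_TABLE = "user_events"
REWRITTEN_TABLE = "user_events_ctas"
OUTPUT_LOCATION = "s3://example-athena-results/"


def rewrite_without_users(user_ids):
    """Use an Athena CTAS statement to materialise a copy of the table that
    excludes users who requested deletion; the old data files can then be
    removed and the new table swapped in."""
    id_list = ", ".join(f"'{uid}'" for uid in user_ids)
    ctas = f"""
        CREATE TABLE {REWRITTEN_TABLE}
        WITH (format = 'PARQUET',
              external_location = 's3://example-bucket/clean/user_events/')
        AS SELECT * FROM {SOURCE_TABLE}
        WHERE user_id NOT IN ({id_list})
    """
    return athena.start_query_execution(
        QueryString=ctas,
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": OUTPUT_LOCATION},
    )["QueryExecutionId"]
```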

Software Test Engineer, Data Services

Sonos Inc
Jun, 2014 - Jul, 2018 · 4 yr 1 month
    Designed and deployed data pipelines using Python, PySpark, MySQL, the AWS platform, Ganglia, Ansible, NiFi, Docker, and GitHub. Conducted comprehensive testing of Spark processing using PySpark, Zeppelin, PyCharm, Jenkins, Perforce, and the AWS platform. Utilized Python in Jupyter Notebook, Splunk, and the AWS CLI for end-to-end validation of big data pipelines. Pitched, implemented, and presented five new ideas involving testing tools, Android integration, voice integration using Android and Python, and PySpark at internal hackathons.
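    A small, hypothetical example of the kind of end-to-end pipeline validation described above, using PySpark with placeholder S3 paths and a made-up mandatory column:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Illustrative locations; actual pipeline inputs and outputs are not listed here.
SOURCE_PATH = "s3://example-bucket/raw/events/"
OUTPUT_PATH = "s3://example-bucket/curated/events/"

spark = SparkSession.builder.appName("pipeline-validation").getOrCreate()

raw = spark.read.json(SOURCE_PATH)
curated = spark.read.parquet(OUTPUT_PATH)

# Row-count reconciliation: the curated output should not drop records.
assert curated.count() == raw.count(), "row counts diverge between raw and curated"

# Key-column completeness: a hypothetical mandatory field must never be null.
null_keys = curated.filter(F.col("event_id").isNull()).count()
assert null_keys == 0, f"{null_keys} curated rows are missing event_id"

print("pipeline validation passed")
```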

Research Assistant (Data Mining and Software Development)

Worcester Polytechnic Institute (WPI)
Aug, 2012 - May, 2014 · 1 yr 9 months
    Developed structured Java code using RapidMiner and Weka APIs to automate machine learning detector building and engineered complex features from foundational data sets. Created Java code to extract features from log files and relational databases for intelligent tutoring systems (e.g., ASSISTments ITS) and configured automated detectors to identify student behaviors such as boredom, frustration, confusion, and concentration.
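    The original detectors were built in Java against the RapidMiner and Weka APIs; purely as an illustration of the same feature-then-classifier pattern, here is an analogous sketch in Python with scikit-learn, using hypothetical feature and file names:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split

# Hypothetical engineered features aggregated per student clip from ITS logs.
clips = pd.read_csv("assistments_clip_features.csv")  # placeholder file name
feature_cols = ["attempt_count", "mean_response_time", "hint_rate", "error_streak"]

X = clips[feature_cols]
y = clips["observed_behavior"]  # e.g. bored / frustrated / confused / concentrating

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Train a simple detector and report Cohen's kappa against held-out labels.
detector = RandomForestClassifier(n_estimators=200, random_state=0)
detector.fit(X_train, y_train)
print("kappa:", cohen_kappa_score(y_test, detector.predict(X_test)))
```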

Performance Testing Engineer

CGI Group Inc.
Mar, 2010 - Jan, 2012 · 1 yr 10 months
    Involved in requirements gathering and test plan preparation. Developed scripts using Web (HTTP/HTML), Web (Click and Script), and Ajax (Click and Script) protocols in HP LoadRunner, embedding C and Java code for error handling. Set up monitors using SiteScope to collect performance metrics on application and web servers. Conducted various performance tests, analyzed metrics and test results, and prepared reports. Performed tuning activities to identify and eliminate bottlenecks, enhancing overall performance.

Achievements

  • Pat on the Back award, recognizing and appreciating commitment at work, at CGI Group Inc.
  • Corona award, recognizing the 'most successful team' for a quarter of FY11, at CGI Group Inc.

Education

  • Master of Science in Learning Sciences and Technologies

    Worcester Polytechnic Institute (WPI) (2014)
  • Bachelor of Engineering in Computer Science

    Visvesvaraya Technological University (VTU) (2009)

Certifications

  • Sun Certified Programmer for the Java 2 Platform (SCJP) 5.0