
Akhil Prasad

Skilled in practices around Data Warehousing, Data Marts, Data Models, ETL & ELT for large-scale data management.


Built batch & streaming pipelines using Spark and the Hadoop ecosystem, capable of handling terabytes of data on a daily basis.


Worked on cloud migration of thousands of pipelines (On-Prem to GCP).


Part of the Data Management team, involved in Automation & Support, Platform Engineering, and Data Governance activities.


Built & managed Data Pipeline frameworks (using Python, Scala) used for data ingestion activities across teams/markets.


Expert in writing optimized SQL queries for OLTP & OLAP workloads using indexes, joins, aggregates, window functions, CTEs, and more.


Worked in the Retail, Banking, Financial Services & Insurance domains.


Collaborated on quarterly roadmaps, optimizing resource allocation and ensuring clear stakeholder communication.


Led sprint planning, prioritizing backlog, managing team capacity, and providing technical guidance for timely & high-quality delivery.

  • Role

    Senior Data Engineer

  • Years of Experience

    7 years

  • Professional Portfolio

    View here

Skillsets

  • Azure
  • Spark Streaming
  • Prometheus
  • Oracle
  • MSSQL
  • Looker
  • Kubernetes
  • Kafka Connect
  • Grafana
  • GCP
  • dbt
  • Databricks
  • Bash
  • PySpark - 6 Years
  • AWS
  • Spark SQL
  • Scala
  • Kafka
  • Jenkins
  • Java
  • Hive
  • Hadoop
  • Docker
  • Airflow
  • Python - 6 Years
  • Snowflake - 1 Year

Professional Summary

7 Years
  • Oct, 2023 - Present (1 yr 11 months)

    Senior Data Engineer

    Walmart Global Tech India
  • Feb, 2022 - Oct, 2023 (1 yr 8 months)

    Data Engineer III

    Walmart Global Tech India
  • Sep, 2018 - Jan, 2022 (3 yr 4 months)

    Data Engineer

    Infosys Limited

Applications & Tools Known

  • Spark
  • Hadoop
  • GCP
  • SQL
  • Python
  • Scala
  • Java
  • C
  • C++
  • Bash scripting
  • Hive
  • HDFS
  • Sqoop
  • Kafka
  • Kafka Connect
  • Oracle
  • MSSQL
  • MySQL
  • Dremio
  • PostgreSQL
  • Snowflake
  • Dataproc
  • Batch
  • IAM
  • BigQuery
  • Airflow
  • Looker
  • Power BI
  • Tableau
  • Apache Superset
  • Docker
  • Jenkins
  • Git
  • Maven
  • Excel

Work History

7 Years

Senior Data Engineer

Walmart Global Tech India
Oct, 2023 - Present (1 yr 11 months)

  • Improved data freshness for inbound shipment tracking from an hourly refresh to under 5 minutes (near-real-time).
  • Architected a stateful Spark Streaming application that maintained the real-time status of thousands of active shipments by joining multiple Kafka event streams and upserting enriched results into an Apache Hudi lakehouse.
  • Cut annual cloud spend by over $50K through workload profiling and optimization of Spark and BigQuery jobs.
  • Developed a cloud cost dashboard that achieved 100% accurate cost attribution by engineering a Python scraper to pull ownership metadata from internal Git repositories.
  • Led migration of 4K legacy Hive/Spark 2 pipelines to Spark 3 with a zero-downtime rollout and data integrity validation.
  • Developed a near-real-time (NRT) CDC solution using Kafka Connect to stream data from operational MSSQL databases to a lakehouse, including a Spark-based reconciliation process.
  • Established real-time observability for all streaming data pipelines using Prometheus and Grafana.

Data Engineer III

Walmart Global Tech India
Feb, 2022 - Oct, 2023 (1 yr 8 months)

  • Collaborated on development of an e-commerce fraud detection platform following the Medallion Architecture to process terabytes of multi-source batch data in Spark, providing cleansed, feature-rich datasets for ML model training and analyst dashboards. All workflows were orchestrated using Apache Airflow.
  • Designed and implemented a data quality framework with automated checks for record count anomalies, and improved join accuracy using fuzzy string matching logic.
  • Built a config-driven ingestion framework in Python using custom Airflow operators and GCP hooks, reducing new pipeline onboarding time by 35%.

Data Engineer

Infosys Limited
Sep, 2018 - Jan, 2022 (3 yr 4 months)

  • Developed a scalable batch pipeline using Spark and Sqoop to ingest and process 2 TB of daily HL7 data from a legacy EHR system into a Hive-based data lake, reducing data availability latency by 80%.
  • Developed a Spark-based ETL framework to parse, validate, and transform millions of daily X12 EDI 837 claim files, improving data quality by 99.5% and enabling faster downstream financial reconciliation.

Achievements

  • Achieved a 70% reduction in pipeline creation effort.
  • Empowered L1 support personnel for independent initial failure analysis.

Education

  • B.C.A

    GGSIPU (Delhi) (2018)
  • 12th

    KV JNU (Delhi) (2015)
  • 10th

    KV Vasant Kunj (Delhi) (2012)

Certifications

  • AZ-900 - Microsoft Certified: Azure Fundamentals

  • DP-900 - Microsoft Certified: Azure Data Fundamentals