Senior Data Engineer
Wipro
Jun 2024 - Present (1 yr 10 months)
Architected scalable, analytics-ready ETL pipelines on GCP using PySpark and Python, transforming multi-terabyte financial datasets into curated BigQuery tables optimized for enterprise dashboards, reporting, and ad-hoc analytics across 10+ business teams.
Developed performance-tuned dimensional data models using advanced SQL and PL/SQL-style logic (CTEs, window functions, stored procedures), supporting delta loads and SCD Type-2 historization and yielding a ~30% improvement in data accuracy and reporting consistency.
Automated end-to-end GCP data workflows with Apache Airflow (Cloud Composer), integrating GitLab CI/CD to orchestrate BigQuery pipelines, reducing manual operational effort by ~60% and improving the reliability of analytics data delivery.
Optimized ingestion and transformation logic through parallel processing and multithreading, achieving a 35-45% reduction in pipeline runtimes and improving data freshness SLAs for downstream dashboards and reports.
Led and coordinated mainframe-to-cloud data migration initiatives on GCP, converting business requirements into technical solutions, reviewing deliverables, validating data quality and schema parity, and ensuring on-time delivery of modernized BigQuery-based reporting platforms.