Experienced Senior Data Engineer with a strong track record of building and optimizing large-scale data pipelines across cloud platforms like GCP and Azure. Proficient in PySpark, Spark, SQL, Hive, Databricks, and Databricks SQL for developing robust ETL/ELT workflows that support marketing attribution, site traffic analysis, and business intelligence reporting.
Hands-on experience with Airflow for orchestration, Power BI and Tableau for visualization support, and PostgreSQL for downstream reporting needs. Skilled in using Hadoop, BigQuery, Dataproc, and Sqoop for data processing, with expertise in integrating source systems like Traffic360 into unified analytics layers.
Experienced in implementing DevOps practices using GitHub, Looper Pro, and Concord to enable automated CI/CD workflows and seamless deployments to Google Cloud Storage. Familiar with Azure Data Factory, Azure SQL, and Data Lake architecture, with a background in using Medallion architecture and ingestion frameworks for scalable data management.
Strong understanding of data quality, pipeline monitoring, and incident alerting via Slack and email integrations. Certified in Databricks (Associate, Professional, Spark Developer), Microsoft Azure Data Engineer (DP-203), and Snowflake (SnowPro Core), with a focus on reliability, performance tuning, and delivering business-impacting data solutions.
Senior Software Engineer
Tredence Inc.
Associate Consultant
Celebal Technologies
Data Engineer
Futurense Technologies
Spark

SQL

Hive
Databricks

Python

Power BI

Hadoop
Azure
Utilized Databricks and Azure Data Factory for ETL operations, handling data from diverse sources including Qlik files, SAP, Bizom, and SQL Server.
- Implemented Medallion architecture in Databricks, ensuring structured data processing from raw to gold layers, enhancing data reliability and accuracy.
- Configured a monthly refreshed GST report using Power BI, providing stakeholders with insightful analytics.
- Processed nested JSON data using explode-based queries to flatten arrays and nested structures.
- Addressed data skew using salting to balance data distribution across partitions and improve query performance.
- Employed broadcast joins to optimize performance and improve query execution efficiency.
- Used the QUALIFY clause to filter on window function results without nested subqueries.
- Applied repartitioning and coalescing techniques to optimize memory usage and mitigate out-of-memory issues.
- Applied the OPTIMIZE command with Z-ordering to compact small files, improving storage layout and query performance.
- Overall, focused on delivering reliable, efficient, and scalable data solutions tailored to meet client needs.
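
For illustration only, the salting technique mentioned in the list above can be sketched in plain Python. This is not the project's code: the key names, row counts, salt count, and the toy byte-sum partitioner are all assumptions made up for the example, and Spark's actual hash partitioner behaves differently.

```python
import random
from collections import Counter

NUM_PARTITIONS = 8  # assumed partition count for the illustration
NUM_SALTS = 8       # assumed number of salt buckets


def partition_of(key: str) -> int:
    # Toy deterministic partitioner (byte sum modulo partition count).
    # Hypothetical stand-in; Spark's real hash partitioner differs.
    return sum(key.encode()) % NUM_PARTITIONS


def partition_loads(keys):
    # Count how many rows land in each partition.
    counts = Counter(partition_of(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


rng = random.Random(0)

# Hypothetical skewed data: 9,000 rows share one hot key,
# 1,000 rows are spread over 100 other keys.
fact_keys = ["hot"] * 9000 + [f"k{i % 100}" for i in range(1000)]

# Large side: append a random salt so the hot key becomes NUM_SALTS sub-keys.
salted_fact_keys = [f"{k}_{rng.randrange(NUM_SALTS)}" for k in fact_keys]

# Small side: replicate each row once per salt value so every salted key
# on the large side still finds its match.
dim = {"hot": "Hot item", **{f"k{i}": f"Item {i}" for i in range(100)}}
salted_dim = {f"{k}_{s}": v for k, v in dim.items() for s in range(NUM_SALTS)}

unsalted_loads = partition_loads(fact_keys)
salted_loads = partition_loads(salted_fact_keys)

# The join on the salted key returns one match per fact row.
joined = [(k, salted_dim[k]) for k in salted_fact_keys]

print("rows per partition without salting:", unsalted_loads)
print("rows per partition with salting:   ", salted_loads)
```

The trade-off in this sketch is that the small side grows by a factor of `NUM_SALTS`, so salting suits joins where one side is small or where only the skewed keys are salted.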
Python (Programming Language)
Microsoft Power BI
DAX
Data Processing
Azure Databricks
Hive
Shell Scripting
DWH
Microsoft SQL Server
Data Warehousing
SQL
Microsoft Azure
Amazon Web Services (AWS)
Query Writing
Microsoft Excel
Apache Spark
GitHub
Data Visualization
Stored Procedures
Distributed Computing
Data Lineage
Data Ingestion
Sqoop
Airflow
Hadoop