
I am an innovative and results-driven Data Engineer with a passion for designing and implementing robust data pipelines. With a strong background in real-time data processing, Kafka, PySpark, and Azure Databricks, I excel at integrating and analyzing massive volumes of semi-structured data to derive valuable insights.
Key highlights of my career include:
Designed and implemented a real-time data pipeline that integrated over 150 million raw records from more than 30 data sources, using Kafka and PySpark on Azure Databricks for efficient and reliable data processing.
Leveraged Spark in Python to distribute data processing across large streaming datasets, resulting in a 67% improvement in ingestion speed. This optimization enhanced overall system performance and accelerated data-driven decision-making.
Created Airflow DAGs to trigger Databricks notebooks at scheduled intervals. This workflow automation saved time and improved the efficiency of the data processing tasks.
Senior Software Engineer
Optum Global Solutions
Python Developer
IoTech Designs Pvt. Ltd.
Python

Java

SQL

MySQL
Docker

Postman

Zookeeper

Git

Azure Databricks
Snowflake

Django