Sr Data Engineer
Wipro Ltd | AON | Jun 2023 - Present (2 yrs 2 months)
Wrote ETL jobs using Spark data pipelines to process data from multiple sources and transform it for delivery to multiple targets.
Designed and implemented data processing pipelines on GCP using Cloud Dataflow with Apache Beam to ingest, transform, and analyze large volumes of data.
Developed and optimized ETL processes to extract data from various sources, including databases, APIs, and streaming platforms, and load it into BigQuery for analysis.
Implemented real-time data streaming solutions using GCP Pub/Sub and Dataflow for continuous data ingestion and processing.
Collaborated with data scientists to deploy machine learning models on GCP using TensorFlow and integrated them into data pipelines for predictive analytics.
Designed and maintained data warehouses on GCP, optimizing performance and scalability for analytical queries.
Designed data warehousing solutions with BigQuery, implementing partitioning and clustering for enhanced query performance.
Created a test automation framework in Python.
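The streaming-ingestion work above (Pub/Sub into Dataflow into BigQuery) centers on a per-record parse/validate/reshape step. A minimal stdlib-only sketch of that kind of transform, with all field names and the `transform_event` helper hypothetical rather than taken from the actual pipelines:

```python
import json
from datetime import datetime, timezone

# Hypothetical per-record transform of the kind a Dataflow/Beam pipeline
# applies between a Pub/Sub subscription and a BigQuery sink.
def transform_event(raw):
    """Parse a raw JSON event, drop invalid records, reshape to the target schema."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError:
        return None  # malformed payloads are filtered (dead-lettered in practice)
    if "user_id" not in event or "amount" not in event:
        return None  # schema validation: required fields must be present
    return {
        "user_id": str(event["user_id"]),
        "amount_usd": round(float(event["amount"]), 2),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

row = transform_event(b'{"user_id": 42, "amount": "19.999"}')
```

In a real Beam pipeline this logic would sit inside a `DoFn` or `Map` step; the sketch isolates only the record-level contract.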
Hands-on cloud data engineering experience across the big data/Hadoop ecosystem (HDFS, Hive, Spark, BigQuery, Databricks, Kafka, YARN) on AWS cloud services and cloud relational databases.
Created streaming jobs in Spark, processed real-time data into RDDs and DataFrames, and built analytics using Spark SQL.
Created an ETL framework using Spark on AWS EMR in Scala/Python.
Designed a Redshift-based data delivery layer enabling business intelligence tools to operate directly on data in AWS S3.
Implemented Kinesis Data Streams to read real-time data and loaded it into S3 for downstream processing.
Experienced in writing Spark Applications in Scala and Python.
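The ETL-framework bullets describe a common pattern: pluggable extract, transform, and load steps composed by a driver. A plain-Python sketch of that composition (Spark/EMR specifics omitted; every name and sample record here is hypothetical, not the framework's actual code):

```python
# Minimal sketch of a pluggable ETL driver: extract, transform, and load
# callables are composed by run_job. In a Spark framework these would be
# DataFrame reads, transforms, and writes instead of plain functions.
def run_job(extract, transforms, load):
    data = extract()
    for transform in transforms:
        data = transform(data)
    return load(data)

def extract():
    # stand-in for a Spark source read
    return [{"id": 1, "amt": "3.5"}, {"id": 2, "amt": "bad"}]

def cast_amt(rows):
    # keep only rows whose amount parses as a number, cast to float
    out = []
    for r in rows:
        try:
            out.append({**r, "amt": float(r["amt"])})
        except ValueError:
            pass
    return out

sink = []

def load(rows):
    # stand-in for a Spark sink write; returns the row count loaded
    sink.extend(rows)
    return len(rows)

loaded = run_job(extract, [cast_amt], load)
```

The design point is the registration-and-composition seam: new sources, transforms, or targets plug in without touching the driver.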
Developed and optimized ETL processes to load and transform data from various sources into Teradata, ensuring data quality and consistency.
Collaborated with business stakeholders to identify data analytics requirements and deliver actionable insights using Teradata's advanced analytics capabilities.
Created interactive dashboards and reports using Google Data Studio to visualize insights and facilitate data-driven decision-making.
Collaborated with business analysts and stakeholders to gather requirements and translate them into technical specifications.
Created a framework for data profiling.
Created a framework for data encryption.
Designed 'Data Services' to mediate data exchange between the Data Clearinghouse and the Data Hubs.
Prepared high-level design documentation for approval.
Provided 24x7 on-call support.
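As one illustration of the data-profiling framework mentioned above, such tools typically compute per-column statistics over sampled rows. A stdlib-only sketch (the `profile` helper, columns, and sample data are hypothetical, not the framework itself):

```python
# Hypothetical column profiler: per-column null count, distinct count,
# and min/max computed over an iterable of row dicts.
def profile(rows):
    stats = {}
    for row in rows:
        for col, val in row.items():
            s = stats.setdefault(col, {"nulls": 0, "values": set()})
            if val is None:
                s["nulls"] += 1
            else:
                s["values"].add(val)
    return {
        col: {
            "nulls": s["nulls"],
            "distinct": len(s["values"]),
            "min": min(s["values"]) if s["values"] else None,
            "max": max(s["values"]) if s["values"] else None,
        }
        for col, s in stats.items()
    }

report = profile([
    {"age": 34, "city": "NYC"},
    {"age": None, "city": "NYC"},
    {"age": 29, "city": "LA"},
])
```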