
Danish Mushtaq

Big Data Consultant with 8+ years of experience, strong theoretical skills, and a passion for data platforms, machine learning, and deep learning.

Skilled in both data engineering and DevOps, with experience on large projects and heterogeneous infrastructures.

Customer-oriented, with a structured way of working focused on quality and maintainability. Highly motivated to work in a team, and comfortable in big companies as well as small teams.

  • Role

    Data Engineer

  • Years of Experience

    11.5 years

Skillsets

  • AWS
  • Azure
  • PySpark
  • Data Engineering
  • Open source tools
  • Snowflake
  • Spark
  • Git
  • Terraform
  • AWS CDK
  • SQL
  • AWS Redshift
  • AWS Glue
  • SAS

Professional Summary

11.5 Years
  • Jul, 2022 - Present 3 yr 10 months

    Founder

    Insights Factory
  • Jan, 2024 - Dec, 2024 11 months

    Lead Data Engineer

    Ras Al Khaimah Economic Zone (RAKEZ)
  • Jan, 2023 - Dec, 2023 11 months

    Senior Data Engineer

    Qualcomm
  • Apr, 2019 - Jan, 2020 9 months

    Lead Engineer

    Social Media
  • Aug, 2021 - Jul, 2022 11 months

    Lead Data Engineer

    GDS Link
  • Jan, 2022 - Dec, 2022 11 months

    Senior Data Engineer

    Mars
  • Dec, 2015 - Apr, 2019 3 yr 4 months

    Consultant for Fortune 500 Companies

    Pfizer, Gilead Life Sciences

Applications & Tools Known

  • AWS (Amazon Web Services)

  • Microsoft Azure

Work History

11.5 Years

Founder

Insights Factory
Jul, 2022 - Present 3 yr 10 months

Lead Data Engineer

Ras Al Khaimah Economic Zone (RAKEZ)
Jan, 2024 - Dec, 2024 11 months

Senior Data Engineer

Qualcomm
Jan, 2023 - Dec, 2023 11 months

Senior Data Engineer

Mars
Jan, 2022 - Dec, 2022 11 months

Lead Data Engineer

GDS Link
Aug, 2021 - Jul, 2022 11 months

Lead Engineer

Social Media
Apr, 2019 - Jan, 2020 9 months

    Responsible for the design and development of a data platform for a social media marketing company:

    • Data ingestion from all major social media platforms
    • Serverless data platform
    • Concurrent data load for 100 clients daily
    • Containerized code for attribution of sales to marketing channels

    Technologies include:

    • AWS as the cloud platform
    • Serverless AWS services (S3, Lambda, Step Functions, Aurora Serverless DB)
    • PySpark for data transformation
    • AWS Fargate for execution of containerized code hosted in Amazon Elastic Container Registry

Consultant for Fortune 500 Companies

Pfizer, Gilead Life Sciences
Dec, 2015 - Apr, 2019 3 yr 4 months

    Large pharma data management projects with the goal of establishing platforms with a modern architecture. The main focus was the migration of legacy data, assuring data quality and transforming it into various formats:

    • Customer consulting with regard to loading/unloading interfaces
    • Definition of requirements for transformation of legacy data
    • Implementation of algorithms for data transformation
    • Tool development for secure data transport
    • Tool development for tests of data quality/interface implementation

    Technologies include:

    • Standard Linux tools, such as awk, sed, grep, ...
    • Python for in-depth data analysis
    • AWS Redshift for data storage

Achievements

  • 1) Food Industry
    • End-to-end data management solution built using Azure Data Factory and Azure Databricks
  • 2) Fin Tech
    • Developed 10+ Lambda Functions integrated with API Gateway to serve data to Power BI and other apps
    • Developed 5+ event-driven Lambda Functions triggered by S3 and integrated with Amazon Managed Streaming for Apache Kafka (MSK)
  • 3) E-commerce Aggregation
    • Designed the solution for data ingestion, analytics, and warehousing
    • Implemented Step Functions and Lambda Functions for data ingestion from NetSuite
  • 4) Social Media
    • Implemented orchestration of the end-to-end data pipeline using AWS Step Functions
    • Defined the template for ETL scripts in Glue
    • Implemented container execution on AWS Fargate
  • 5) Pfizer, Gilead Life Sciences
    • Documentation of legacy processes written in SQL and SAS
    • Design of a configurable data load framework to handle the Adult Ped Split for the Pfizer Prevnar 20 drug

Major Projects

3 Projects

Data Lake on Azure

Jun, 2022 - Apr, 2023 10 months

    A large data management project that required multi-company collaboration to enable data transfer and analytics from multiple sources to multiple destinations.

    • Ingestion of data from multiple sources such as SQL DB, SQL Data Warehouse, SFTP, and APIs
    • Implementation of new business rules in Azure Databricks using Python, SQL, and Spark (Python, Scala)
    • Development of Hive assets for use in Dremio

Modern Data Platform for Marketing Attribution

Jun, 2021 - Jun, 2022 1 yr

    Responsible for the design and development of a data platform for a social media marketing company:

    • Data ingestion from all major social media platforms
    • Serverless data platform
    • Concurrent data load for 100 clients daily
    • Containerized code for attribution of sales to marketing channels

Modern Data Platform for E-commerce

Jan, 2021 - Jun, 2021 5 months

    Development of data platform from scratch.

    • Design and implementation of data platform on Amazon Web Services
    • Data ingestion from a variety of data sources such as Amazon, Shopify, and internal data sources like ERP

Education

  • B. Tech. Computer Science

    National Institute of Technology Srinagar (2011)

Certifications

  • AWS Certified - Data Analytics Specialty

  • AWS Certified - Solutions Architect

  • Snowflake SnowPro

  • Microsoft Azure Fundamentals