Samit Ranajn Jena

Experienced Senior Data Engineer with over 11 years in IT, specializing in BI and data engineering products. Possesses extensive technical skills and expertise in Azure Cloud and Microsoft Business Intelligence (MSBI).
  • Role

    Senior Data Engineer

  • Years of Experience

    11 years

Skillsets

  • MSBI
  • T-SQL
  • Star Schema
  • SSIS
  • SSAS
  • SQL
  • Spark
  • Snowflake Schema
  • Python
  • PySpark
  • Power BI
  • Data Modeling
  • MDX
  • Logic App
  • DAX
  • Azure SQL DW
  • Azure SQL DB
  • Azure Databricks
  • Azure Data Lake Store
  • Azure Data Factory
  • Azure Blob Storage
  • Data Engineering

Professional Summary

11 Years
  • Feb, 2022 - Present (3 yr 11 months)

    Senior Data Engineer

    FARFETCH
  • Jun, 2018 - Feb, 2022 (3 yr 8 months)

    Senior Data Engineer

    MAERSK
  • Nov, 2016 - May, 2018 (1 yr 6 months)

    Senior Software Developer

    AccionLabs
  • Jan, 2012 - Oct, 2016 (4 yr 9 months)

    Senior Engineer

    IMS Health

Applications & Tools Known

  • Azure Databricks
  • Python
  • Spark
  • Azure Blob Storage
  • T-SQL
  • MSBI
  • SSIS
  • SSAS
  • Power BI
  • DAX
  • PowerApps

Work History

11 Years

Senior Data Engineer

FARFETCH
Feb, 2022 - Present (3 yr 11 months)
  • Led end-to-end data integration solutions using Azure Data Factory (ADF) and Azure Databricks to orchestrate data movement and transformation from various sources, including Azure Blob Storage, Azure SQL Database, Azure Data Lake Storage, and on-premises systems.
  • Implemented error handling and logging mechanisms within Azure Data Factory, reducing data processing errors.
  • Developed Azure Databricks notebooks in PySpark to handle large volumes of data efficiently and execute complex data transformations.
  • Implemented optimizations in Azure Databricks, including partitioning, broadcast joins, caching strategies, and Spark cluster configuration adjustments, which improved Spark job performance by 30%.
  • Collaborated with stakeholders and product owners to analyze requirements and design appropriate solutions, following the Agile/Scrum methodology.
  • Analyzed PySpark Directed Acyclic Graph (DAG) execution plans in depth and resolved bottlenecks in Spark SQL queries, improving the efficiency of PySpark job execution by 40% and reducing unnecessary data shuffling.
  • Implemented multiple pipelines to process influencer data from a third-party API.
  • Contributed to business insights by providing information on marketing investments and ROI.

Senior Data Engineer

MAERSK
Jun, 2018 - Feb, 2022 (3 yr 8 months)
  • Designed and implemented efficient Extract, Transform, Load (ETL) pipelines to read and process data using Azure Data Factory (ADF) and Azure Databricks, ensuring timely execution and delivery of critical data.
  • Integrated Azure Databricks with Azure Data Factory, optimizing data processing workflows and improving overall pipeline efficiency.
  • Implemented dynamic parameterization in ADF pipelines.
  • Migrated legacy data workflows to Azure Data Factory, converting SQL code to Databricks notebooks.
  • Used Azure Key Vault to mask sensitive data.
  • Optimized Spark jobs by analyzing Directed Acyclic Graph (DAG) execution plans and applying targeted optimization techniques, resulting in a significant improvement in Spark job execution efficiency.
  • Developed an SSAS Tabular model and wrote DAX queries to implement various KPIs in the cube.
  • Automated the SSAS model role mechanism for adding and removing users, reducing administrative workload by 25% and ensuring timely user access updates.
  • Migrated Excel-based forecasting reports to Power BI and integrated PowerApps to automate forecasting data preparation and enable a write-back feature for customers, saving $50K annually.
  • Migrated several on-premises ETL processes to ADF, improving overall data quality and ETL efficiency by 30%.

Senior Software Developer

AccionLabs
Nov, 2016 - May, 2018 (1 yr 6 months)
  • Created complex SSAS cubes with multiple fact measure groups and multiple dimension hierarchies, and implemented Time Intelligence functions in SSAS cubes.
  • Wrote MDX and DAX queries, implementing a dynamic cube role security mechanism and cell-level security in cubes using MDX expressions to enforce user restrictions.
  • Performed ETL from database and flat-file sources using SSIS packages and implemented custom logging in SSIS.
  • Contributed to building data marts and multidimensional models such as Star Schema and Snowflake Schema.
  • Wrote T-SQL scripts, dynamic SQL, complex stored procedures, functions, and triggers; scheduled and maintained SSIS packages on a daily, weekly, and monthly basis using SQL Server Agent in SSMS.
  • Created multiple Power BI reports, developed custom calculations using DAX, and used the Power BI query editor and data modeling features.
  • Implemented row-level security (RLS) in Power BI.
  • Automated the code deployment process in the production environment, streamlining deployments.

Senior Engineer

IMS Health
Jan, 2012 - Oct, 2016 (4 yr 9 months)
  • Formulated and documented detailed business rules and guidelines.
  • Created and maintained 30+ SSAS cubes with MDX, including complex cubes with multiple fact measure groups and dimensions.
  • Wrote complex MDX queries and created SSIS packages for reading and processing daily files.
  • Provided application support to improve quality and troubleshoot business issues in a timely manner.
  • Developed a master SSAS cube from which all similar cubes could be created and deployed, reducing maintenance work by 40%.
  • Automated jobs using SSIS and MDX to create multidimensional cube comparison reports between the current and previous versions of cubes, streamlining cube deployment in the production system.

Achievements

  • Reduced data processing errors by implementing error handling in ADF
  • Improved PySpark job performance by 40%
  • Processed influencer data from a third-party API
  • Saved $50K annually by migrating Excel reports to Power BI and integrating PowerApps
  • Optimized code deployment process in production environment

Education

  • Bachelor's degree in Electronics and Telecommunication Engineering

    BIJU PATNAIK UNIVERSITY OF TECHNOLOGY (BPUT) (2010)

Certifications

  • DP-200: Implementing an Azure Data Solution

  • 70-778: Analyzing and Visualizing Data with Microsoft Power BI

  • 70-761: Querying Data with Transact-SQL