Ayush Rastogi

I currently work at Nykaa as a Senior Data Engineer. I completed a Bachelor of Technology (B.Tech) at The LNM Institute of Information Technology, Jaipur, in 2017. I work with technologies such as ArangoDB (graph database), Elasticsearch (NoSQL), Spark, Hadoop, Java, and Scala. I have good knowledge of the C++, Java, and Node.js programming languages, as well as data structures and algorithms.

  • Role

    Data Platform Engineer

  • Years of Experience

    7.8 years

Skillsets

  • Kafka - 7 Years
  • C++
  • Data Warehousing - 7 Years
  • Data Visualization - 7 Years
  • Airflow - 4 Years
  • SQL - 7 Years
  • NoSQL - 7 Years
  • PySpark - 2 Years
  • ETL - 7 Years
  • Big Data Technology - 7 Years
  • ArangoDB - 2 Years
  • Hive - 4 Years
  • MySQL - 7 Years
  • Data Structures - 7 Years
  • Apache Flink - 4 Years
  • Docker - 5 Years
  • Scala - 7 Years
  • Git - 7 Years
  • Elasticsearch - 5 Years
  • Azure - 2 Years
  • Embedded Linux - 7 Years
  • Apache Spark - 7 Years
  • Redshift - 4 Years
  • Python - 2 Years
  • AWS - 5 Years
  • Java - 7 Years

Professional Summary

7.8 Years
  • Jan, 2023 - Present (2 yr 4 months)

    Senior Software Engineer - Data Platform

    Nykaa
  • Jan, 2021 - Dec, 2022 (1 yr 11 months)

    Principal Software Engineer - Data Platform

    Wynk (Airtel Digital)
  • Apr, 2019 - Jan, 2021 (1 yr 9 months)

    Senior Data Engineer

    PaisaBazaar
  • Jul, 2017 - Jan, 2019 (1 yr 6 months)

    Software Engineer (Data & ML Engineer)

    Formcept

Applications & Tools Known

  • Redash
  • AWS Lake Formation
  • Apache Flink
  • Apache Spark
  • Redshift
  • Airflow
  • Hadoop
  • Elasticsearch
  • InfluxDB
  • Prometheus
  • Hive
  • Ambari
  • OrientDB
  • Scala
  • Dialogflow
  • Git
  • Docker
  • Jenkins
  • AWS
  • Azure
  • Kafka

Work History

7.8 Years

Senior Software Engineer - Data Platform

Nykaa
Jan, 2023 - Present (2 yr 4 months)
    Developed an ingestion and compute framework for batch data pipelines. Implemented a comprehensive data governance layer for the external Delta Lake via AWS Lake Formation. Integrated Redash for data visualization and alerting, enhancing data accessibility. Created stateful real-time data pipelines using Apache Flink. Utilized big data technologies including Spark, Redshift, AWS Lake Formation, and Apache Airflow.

Principal Software Engineer - Data Platform

Wynk (Airtel Digital)
Jan, 2021 - Dec, 2022 (1 yr 11 months)
    Developed a batch data pipeline framework and backend microservices, reducing dataset onboarding time from days to minutes. Enabled real-time data ingestion from Kinesis and Kafka, allowing for flexible querying. Took ownership of building the API-based Partner Payout Product, aka Thanos, in which actual and scenario-based contracts can be configured and executed in real time. Worked with big data technologies such as Spark, Hadoop, Redshift, AWS, Elasticsearch, and Azkaban. Worked with time-series databases such as InfluxDB and Prometheus.

Senior Data Engineer

PaisaBazaar
Apr, 2019 - Jan, 2021 (1 yr 9 months)
    Led the data team and built the real-time data pipeline from scratch. Worked with big data technologies such as Spark, Hadoop, Hive, Kafka, Elasticsearch, Tez, and Ambari. Used Elasticsearch to build a search engine for the Customer360 view. Worked on search features such as autocomplete, fuzzy search, geolocation search, and prefix-match search. Used Dialogflow to train on data and identify the intent of a query, then optimized the result set accordingly to help the user complete the query.

Software Engineer (Data & ML Engineer)

Formcept
Jul, 2017 - Jan, 2019 (1 yr 6 months)
    Explored graph databases such as OrientDB and ArangoDB. Created a knowledge base using the graph database ArangoDB, Elasticsearch, and Apache Jena. Worked with RDDs and DataFrames in Scala. Handled unstructured client data by converting it into a different form and storing it in Elasticsearch. Analyzed data behavior using ML algorithms such as clustering, RFMC, and forecasting.

Achievements

  • Google Kick Start 2020 Round G coding rank: 1623 (worldwide)
  • Google APAC 2017 ranks: Round B 1291, Round E 886
  • Participated in ACM-ICPC 2015
  • Shortlisted in the top 500 contestants worldwide in SnackDown 2015, organized by CodeChef
  • Won a gold medal for chess during Sports Meet 15 at the college level

Major Projects

3 Projects

Thanos - The Partner Payout Product

    Came up with the idea of building an API-based Partner Payout Product in which actual and scenario-based contracts can be configured and executed in real time. Besides the actual partner bill report, users can run any payout-related query via scenario-based contracts (with or without a rate card). Users can define their own user types (free, paid, free > 100) based on multiple filters, stream count range, etc., and later merge them using an alias name. Supports multiple rate card calculations based on slab, user streams, and subscription type. Onboarding a new partner is relatively easy.
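
    As an illustration of a slab-based rate card calculation, here is a minimal Python sketch. It is not the Thanos implementation: the `Slab` type, the `slab_payout` function, the sample `RATE_CARD` values, and the choice of marginal slabs (each slab's rate applies only to the streams that fall inside it) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slab:
    """One tier of a hypothetical rate card: streams in [lower, upper) pay `rate` each."""
    lower: int
    upper: Optional[int]  # None means the slab has no upper bound
    rate: float


def slab_payout(streams: int, slabs: List[Slab]) -> float:
    """Return the payout for `streams`, charging each slab only for the streams inside it."""
    total = 0.0
    for slab in slabs:
        if streams <= slab.lower:
            continue  # this slab was never reached
        top = streams if slab.upper is None else min(streams, slab.upper)
        total += (top - slab.lower) * slab.rate
    return total


# Hypothetical rate card, for illustration only.
RATE_CARD = [
    Slab(0, 100, 0.0),       # first 100 streams are unpaid
    Slab(100, 1000, 0.5),    # next 900 streams
    Slab(1000, None, 0.25),  # everything above 1000 streams
]
```

    With this sample card, `slab_payout(1400, RATE_CARD)` charges 900 streams at 0.5 and 400 streams at 0.25. A flat scheme, where the highest slab reached sets the rate for all streams, would need a different function.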

Customer360

    One Customer View is built for data consolidation from multiple business units. Synchronizes data from multiple data sources, i.e. MySQL, Cassandra, MongoDB, SFTP, Kafka, and third-party APIs. Automates the process of adding new databases or tables to Hive. Generates hourly lag status reports for every database and table (more than 500 tables). Uses Elasticsearch to build a search engine for internal queries. Creates an Elasticsearch schema with custom analyzers for different search features.

Text Enrichment using Knowledge Base

    Parses the RDF input data with Apache Jena. Loads the open-source database into a graph database and queries it. Pre-processes the graph and stores the result in a defined index schema for spotting entities and categories and disambiguating them based on semantic links. Extracts noun phrases from the input data. For each noun phrase, searches its n-grams in Elasticsearch, then spots the entities and categories and disambiguates them.
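
    The n-gram step can be sketched as follows. This is a hedged Python illustration rather than the project's code: the function name `word_ngrams`, the `max_n` limit, the lowercasing, and the longest-first ordering are assumptions. The Elasticsearch lookup itself is left out, so each returned string is only a candidate to be searched against the index.

```python
from typing import List


def word_ngrams(phrase: str, max_n: int = 3) -> List[str]:
    """Return all contiguous word n-grams of `phrase`, longest first."""
    tokens = phrase.lower().split()
    grams = []
    # Start with the longest n-grams so a multi-word entity is tried
    # before its individual words.
    for n in range(min(max_n, len(tokens)), 0, -1):
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams
```

    For the noun phrase "New York City", the candidates are "new york city", "new york", "york city", "new", "york", and "city", in that order.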

Education

  • B.Tech

    The LNM Institute of Information Technology, Jaipur (2017)

Interests

  • Chess
  • Travelling