Vetted Talent

Bibek Rauniyar

As a Senior DevOps Engineer at Honest, I manage the production Kubernetes cluster that runs over 100 microservices, ensuring optimal performance through scaling and resource allocation. I also successfully optimized costs on Confluent Cloud, achieving a notable 48% reduction in total billing.


With a Bachelor of Technology in Mechanical Engineering from GLA University and multiple certifications in AWS, Python, and Microsoft DevOps, I have 6+ years of experience in pioneering technological advancements. I have proven expertise in cloud computing, CI/CD, monitoring and analytics, data governance, security, and cost optimization. I have designed and implemented HA infrastructure for mobile and web applications, and contributed to the organization's technological excellence through innovative solutions.

  • Role

    Senior DevOps Engineer

  • Years of Experience

    7.6 years

  • Professional Portfolio

    View here

Skillsets

  • Grafana
  • Terraform
  • SOC 2
  • Serverless
  • Python
  • Prometheus
  • PCI-DSS
  • Multi-cloud architecture
  • Microservices Architecture
  • Istio service mesh
  • Kubernetes - 6 Years
  • Go
  • GitHub Actions
  • GCP
  • Bash
  • Azure
  • ArgoCD
  • Ansible
  • AWS - 7 Years

Vetted For

14 Skills
  • Senior Kubernetes Support Engineer (Remote), AI Screening
  • Result: 61%
  • Skills assessed: CI/CD Pipelines, Excellent problem-solving skills, Kubernetes architecture, Strong communication skills, Ansible, Azure Kubernetes Service, Grafana, Prometheus, Tanzu, Tanzu Kubernetes Grid, Terraform, Azure, Docker, Kubernetes
  • Score: 55/90

Professional Summary

7.6 Years
  • Feb, 2023 - Present 3 yr 3 months

    Senior DevOps Engineer

    Honest Technologies
  • Jul, 2021 - Feb, 2023 1 yr 7 months

    Site Reliability Engineer

    Dkatalis Labs
  • Feb, 2021 - Jul, 2021 5 months

    DevSecOps Consultant

    FPL Technologies
  • Feb, 2020 - Feb, 2021 1 yr

    DevOps Consultant

    CloudCover Consultancy
  • Dec, 2017 - Feb, 2020 2 yr 2 months

    Senior System Engineer

    Infosys Ltd

Applications & Tools Known

  • GCP
  • AWS
  • Azure
  • Kubernetes
  • Terraform
  • Docker
  • GitHub Actions
  • ArgoCD
  • Jenkins
  • GitLab
  • Harness
  • Spinnaker
  • Ansible
  • Terragrunt
  • Kyverno
  • Falco
  • DefectDojo
  • Istio
  • IAM
  • Prometheus
  • Grafana
  • Loki
  • ELK Stack
  • Datadog
  • Dynatrace
  • Bash
  • Python
  • GoLang
  • Cloudflare
  • Promtail
  • Fortinet
  • Wazuh
  • Superset
  • Lambda
  • SIEM
  • GKE
  • EKS
  • AKS
  • Helm
  • Logstash
  • Elasticsearch
  • Kibana
  • Looker Studio
  • Vault
  • Snyk
  • SonarQube
  • Trivy
  • TypeScript
  • Confluent Kafka
  • RabbitMQ
  • Git
  • Bitbucket
  • PagerDuty
  • Linux Administration
  • Networking
  • SRE
  • Microservices
  • API Development

Work History

7.6 Years

Senior DevOps Engineer

Honest Technologies
Feb, 2023 - Present 3 yr 3 months
    Designed and optimized a Kubernetes platform managing 200+ microservices with strategic scaling and resource tuning. Automated CI/CD pipelines, increasing deployment frequency. Enhanced the security posture to pass PCI-DSS and SOC 2 compliance audits.

Site Reliability Engineer

Dkatalis Labs
Jul, 2021 - Feb, 2023 1 yr 7 months
    Led cross-functional team to design secure cloud infrastructure supporting 10M+ users. Extended Istio service mesh across the organization enabling Zero Trust Security.

DevSecOps Consultant

FPL Technologies
Feb, 2021 - Jul, 2021 5 months
    Deployed enterprise SIEM using Amazon OpenSearch. Conducted PCI-DSS Level 1 compliance audits and orchestrated application containerization.

DevOps Consultant

CloudCover Consultancy
Feb, 2020 - Feb, 2021 1 yr
    Executed multi-cloud migration of 22 microservices and developed CI/CD pipeline for payment gateway processing. Established disaster recovery strategies achieving 99.9% data recovery success.

Senior System Engineer

Infosys Ltd
Dec, 2017 - Feb, 2020 2 yr 2 months
    Contributed to cloud migration of PostgreSQL database infrastructure. Enhanced database performance through optimization and indexing improvements.

Achievements

  • Led successful cloud migrations
  • Achieved a 48% reduction in cloud costs
  • Tenfold improvement in deployment frequency
  • Reduced Prometheus RAM usage by 80%
  • Zero-downtime migration from Alibaba Cloud to Google Cloud
  • Implemented centralized SIEM for a USA-based enterprise across 80+ AWS accounts
  • Awarded the Rising Star award for exceptional collaboration on projects with Infosys and MBRDI, recognizing outstanding performance and contributions.
  • Demonstrated superior problem-solving skills by effective query tuning and indexing leading to a significant database performance boost

Major Projects

3 Projects

Mastering Cloudflare Ruleset Engine with Terraform

    Published article with 2K+ views on Dev.to platform.

AI-powered job matching platform

    Built platform using GenAI, vector search, and LLM technologies.

Reusable Terraform modules

    Created modules enabling rapid deployment of AWS Glue pipelines.

Education

  • B.Tech: Mechanical Engineering

    GLA University, Mathura (2017)

Certifications

  • AWS Certified Solutions Architect Associate (SAA-C02)

  • AZ-400: Designing and Implementing Microsoft DevOps Solutions

AI-interview Questions & Answers

I am a DevOps and cloud professional with more than 6.5 years of experience in this field. I have worked for MNCs and a lot of start-ups, and I help start-ups grow. I have worked with almost all the widely used tools and technologies, including cutting-edge ones, and with cloud technology. I have worked with almost all the big cloud providers, such as AWS, GCP, and Azure, and I'm certified in AWS and Azure. Regarding cutting-edge technologies, I have worked with Terraform as an expert. I have good experience in core.NET and with end-to-end CI/CD pipeline design. Apart from that, I'm a team player. I can lead a project alone, and I can also prove myself as a good team player because I'm currently working in a very versatile team. I also do weekly on-call rotations and maintain our Git repositories, more than 450 of them, which covers maintaining, designing, and supporting the code. So that's pretty much all about myself.

To expose a Kubernetes service to the Internet, I use either an ingress or a load balancer provided by the cloud provider. I can also run my own self-hosted ingress controller instead of the cloud load balancers. The first thing to decide is whether we're going to do domain-based routing or path-based routing. Once the approach is confirmed, we need to finalize whether we're going to expose a load balancer directly and control traffic there, or put a proxy in front and control it there. For example, we can place the NGINX ingress between the application and the Internet and route the traffic according to the chosen approach. So, in simple words, we can use the NGINX ingress or a load balancer to expose the service to the Internet.
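As an illustration of the domain-based and path-based routing mentioned in this answer, here is a minimal sketch that builds a `networking.k8s.io/v1` Ingress manifest as a Python dict. The host `shop.example.com`, the service names, and the `nginx` ingress class are assumptions made for the example, not details from the interview.

```python
import json


def build_ingress(name, host, routes, ingress_class="nginx"):
    """Return a networking.k8s.io/v1 Ingress manifest as a dict.

    `routes` maps a URL path prefix to a (service_name, port) pair.
    The host gives domain-based routing; the paths give path-based routing.
    """
    paths = [
        {
            "path": path,
            "pathType": "Prefix",
            "backend": {"service": {"name": svc, "port": {"number": port}}},
        }
        for path, (svc, port) in routes.items()
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # Assumes an ingress controller registered under this class name.
            "ingressClassName": ingress_class,
            "rules": [{"host": host, "http": {"paths": paths}}],
        },
    }


if __name__ == "__main__":
    # Hypothetical services: an API backend and a web frontend.
    manifest = build_ingress(
        "shop",
        "shop.example.com",
        {"/api": ("api-svc", 8080), "/": ("web-svc", 80)},
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML.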

What are the main components of Tanzu Kubernetes Grid, and how did it... I'm really sorry. I have never worked with Tanzu, so I have no idea about this. I would like to skip this question.

Coming to the life cycle of a pod: it can be in a running status, a terminated status, or a back-off status. Running is when the pod is up and everything is good; we are serving traffic. Terminated is when the pod is totally wiped off, that is, deleted. We also have what is called a crash loop, where the pod is not able to get the correct configuration, or is not able to point to the configuration or config map allocated to it. We can also get a pod with an evicted status, which means the pod doesn't have enough resources on the node. When there is a resource constraint on the node and no node is available to schedule the pod, it will be evicted every time. So that, basically, is the life cycle of a Kubernetes pod.
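For reference, the Kubernetes API reports a pod's `status.phase` as one of Pending, Running, Succeeded, Failed, or Unknown. CrashLoopBackOff appears as a container waiting reason, and an evicted pod shows phase Failed with `status.reason` set to Evicted. The sketch below classifies a status dict shaped like the `.status` field of `kubectl get pod -o json`; the helper name `summarize` is hypothetical.

```python
from enum import Enum


class PodPhase(Enum):
    """The five values of status.phase defined by the Kubernetes API."""
    PENDING = "Pending"
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    UNKNOWN = "Unknown"


def summarize(pod_status):
    """Return a short label for a pod's .status dict."""
    phase = PodPhase(pod_status["phase"])  # raises ValueError on an invalid phase
    # Eviction is recorded as a reason on a Failed pod, not as its own phase.
    if phase is PodPhase.FAILED and pod_status.get("reason") == "Evicted":
        return "Evicted"
    # CrashLoopBackOff is a per-container waiting reason.
    for container in pod_status.get("containerStatuses", []):
        waiting = container.get("state", {}).get("waiting")
        if waiting and waiting.get("reason") == "CrashLoopBackOff":
            return "CrashLoopBackOff"
    return phase.value


if __name__ == "__main__":
    print(summarize({"phase": "Failed", "reason": "Evicted"}))
```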

A namespace in Kubernetes is like a container: a logical boundary inside the cluster where we deploy the application along with its life cycle and dependencies. Think of it this way: if Kubernetes is one big room, we divide it into partitions to manage it. A namespace is a virtual space, so we can define namespaces as virtual clusters inside the cluster. We create different namespaces in which we define the life cycle of a pod, how it behaves, and how it works. It's virtual clustering in GKE. This eases the management of applications and lets you control numerous factors, because many Kubernetes configurations can be controlled at the namespace level. You can also define how you want to control the entire life cycle of a pod and of the deployments in that namespace. That is the namespace, and it makes it easy to manage applications on GKE by virtually clustering them.
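One concrete example of namespace-level control is a ResourceQuota, which caps what all workloads in a namespace may request. The following is a minimal sketch that builds a Namespace and a matching ResourceQuota as Python dicts; the namespace name `payments`, the `team` label, and the quota figures are assumptions for illustration.

```python
import json


def namespace_with_quota(name, cpu_requests, memory_requests, max_pods):
    """Return [Namespace, ResourceQuota] manifests for one virtual partition."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": {"team": name}},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{name}-quota", "namespace": name},
        "spec": {
            "hard": {
                # Total requests allowed across every pod in the namespace.
                "requests.cpu": cpu_requests,
                "requests.memory": memory_requests,
                "pods": str(max_pods),
            }
        },
    }
    return [namespace, quota]


if __name__ == "__main__":
    for manifest in namespace_with_quota("payments", "8", "16Gi", 50):
        print(json.dumps(manifest, indent=2))
```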

While setting up a CI/CD pipeline for a Kubernetes application, we first need to figure out how we're going to roll out the deployment: what the controller for the deployment will be, and whether we're going to manage it via a third-party application like Argo CD, which will do the job. Once that's decided, we need a package manager to package all the Kubernetes manifests. To deploy a pod, we need all the config maps, secrets, code, database, and other things, right? To make it work, we need to package all the service templates and Kubernetes manifests into a single package, so we would probably use Helm for that. Once this is done, we can create the CI/CD pipeline from scratch. We build the CI by creating an image, publishing it, tagging it correctly with the stages, and making sure the published image can be pushed to our registry and used in the deployment. Once the CI is good, we need a CD for the same, where we deploy the workload on Kubernetes using Argo CD, another third-party controller, or Helm directly. As for the key components, I'm not sure what is meant by this question or whether I'm on the right track; it is a bit vast, so I'm not able to answer it fully. If you're talking about the manifests we need, those are config maps, secrets, services, deployments, Prometheus rules, PDBs, replica sets, and so on.
And if you want to design the CI/CD pipeline for this, you can write a GitHub workflow where we pull the images built by the CI, configure the credentials for Argo CD, Helm, or Kubernetes, and deploy. If you want to use Helm, just use the Helm commands, helm install and helm upgrade: first install the release, then upgrade it depending on the revision. Or, if you want to go for Argo CD, create a new repository where you push the state files and point Argo CD at that state. It depends on how you want to design the CI part and the CD part. I hope this answers your question.
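The image tagging and Helm deployment steps described above can be sketched as two small helpers that a pipeline job might call. This is only an illustration: the registry host, the chart path, and the `image.tag` values key are assumptions, since the values key depends on how a given chart is written. `helm upgrade --install` installs the release if it does not exist yet and upgrades it otherwise.

```python
import shlex


def image_reference(registry, name, git_sha, stage):
    """Tag an image with its stage and short commit SHA for traceability."""
    return f"{registry}/{name}:{stage}-{git_sha[:7]}"


def helm_deploy_command(release, chart, namespace, image_tag):
    """Build the Helm command a CD job would run for this release."""
    args = [
        "helm", "upgrade", "--install", release, chart,
        "--namespace", namespace,
        "--create-namespace",
        # Assumes the chart exposes the tag as `image.tag` in its values.
        "--set", f"image.tag={image_tag}",
        "--wait",  # block until the rolled-out resources report ready
    ]
    return shlex.join(args)


if __name__ == "__main__":
    ref = image_reference("registry.example.com", "payments", "1a2b3c4d5e6f", "staging")
    print(ref)
    print(helm_deploy_command("payments", "./charts/payments", "staging", ref.split(":")[1]))
```

With Argo CD, the pipeline would instead commit the new tag to the Git repository that Argo CD watches and let the controller reconcile the cluster.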

While setting up network policies, we first need to think about how a pod will communicate with another pod. How will we maintain authentication if needed? How will we authorize a particular pod? Can one pod get access to a particular workload or pod? You can design along those lines. You can apply network policies at multiple levels, so first we need to decide which level we are targeting: the global level or the namespace level. Then we need to check whether we plan to use the policy for authentication or for authorization; in other words, what is the purpose of this network policy? After that, we can check which CRUD operations, services, endpoints, and methods we want a particular pod to have access to. Those are the network policies. Coming to the global part, the factor to consider is what access should be allowed in the Kubernetes cluster at the global level. For example, if you want to block traffic from a particular source IP, you can set up a network policy at the global level to exclude traffic from that IP. That's what I can think of right now for this question.
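To make the pod-to-pod and source-IP cases concrete, here is a hedged sketch that builds two `networking.k8s.io/v1` NetworkPolicy manifests as Python dicts. Note that the built-in NetworkPolicy resource is namespace-scoped and works at the IP and port level; cluster-wide or method-level rules need a CNI-specific or service-mesh policy. The labels, namespace, and the example address `203.0.113.7/32` are assumptions.

```python
import json


def allow_from_pods(namespace, name, target_labels, source_labels, port):
    """Allow ingress to the target pods only from pods with source_labels."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"podSelector": {"matchLabels": source_labels}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


def allow_all_except(namespace, name, target_labels, blocked_cidrs):
    """Allow ingress from any IP except the listed CIDR ranges."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"ipBlock": {"cidr": "0.0.0.0/0", "except": blocked_cidrs}}],
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(allow_from_pods("orders", "api-from-web", {"app": "api"}, {"app": "web"}, 8080), indent=2))
    print(json.dumps(allow_all_except("orders", "block-ip", {"app": "web"}, ["203.0.113.7/32"]), indent=2))
```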

What are the benefits of using Helm charts in Kubernetes, and how do we manage the dependencies? Correct. The benefit of using Helm charts is that they allow you to package or bundle all the Kubernetes manifests into one single unit; we can call it a file or a template, whatever. You can then deploy that single bundle using a Helm command. You can install it in the cluster, and you can also make it idempotent because of values files: you provide a generic chart and give it different values depending on the application's requirements. As for how to manage the dependencies, it depends on what kind of dependencies you want to manage. If your application needs a database or needs NGINX, then you can define those dependencies in the chart's configuration: this chart depends on this one, and that chart depends on that one. If you define the chart and its dependencies, Helm will download the dependency charts and spin them up when you install or upgrade. So there are a lot of benefits to using Helm to manage applications. It is very idempotent, right? You can have multiple versions of a chart running. Rollback is very easy; you can roll back to any of the versions if you need to. So, yeah, these are the perks of using Helm.
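In Helm 3, chart dependencies are declared under `dependencies` in `Chart.yaml`, fetched into the `charts/` directory with `helm dependency update`, and configured or toggled through the values file. The sketch below builds that structure as a Python dict; the chart names, version numbers, and repository URL are placeholders rather than real published charts.

```python
import json


def chart_with_dependencies(name, version, dependencies):
    """Return a dict mirroring a Helm 3 Chart.yaml with subchart dependencies.

    `dependencies` is a list of (chart_name, version_range, repository_url).
    """
    return {
        "apiVersion": "v2",  # Chart API version used by Helm 3
        "name": name,
        "version": version,
        "dependencies": [
            {
                "name": dep_name,
                "version": dep_version,
                "repository": repo,
                # Lets values.yaml switch the subchart on or off,
                # e.g. `database.enabled: false`.
                "condition": f"{dep_name}.enabled",
            }
            for dep_name, dep_version, repo in dependencies
        ],
    }


if __name__ == "__main__":
    chart = chart_with_dependencies(
        "my-app",
        "0.1.0",
        [("database", "1.x.x", "https://charts.example.com")],  # placeholder repo
    )
    # JSON is valid YAML, so this output can be saved as Chart.yaml.
    print(json.dumps(chart, indent=2))
```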

As a container orchestration system, Kubernetes relies on underlying technologies to isolate and manage containers. This is quite a good one. If you go back to the history, Kubernetes was first developed by Google, who used an application called Borg to manage their containers. If we talk about the architecture and what's inside it, Kubernetes follows a master-slave architecture, where the master (the control plane) is the control center in which many components manage the life cycle of the cluster. Inside the Kubernetes cluster, there is a component called the controller manager, there is a node controller, and there is the etcd database. The job of the controller manager is to manage the cluster components on a particular node. The job of the node controller is to provide the node and maintain communication between nodes. The etcd database stores all cluster information, such as the current state of the cluster, and we query it with the kubectl client. There are many moving components in the cluster that orchestrate applications. There are also controllers, like the admission controller, the replication controller, and the admission webhook controller. All these components combine as a unit and provide the functionality to orchestrate the system. On the node, there are a few components that help route traffic correctly, like the proxy, which maintains communication between nodes and pods. So there are many moving components working as a unit to support this system.

In Kubernetes, there is a resource called a Secret, and we can manage secrets using that object. However, it only stores the data in base64-encoded format. If, unfortunately, any attackers get access to our Kubernetes cluster, or hack into it, they will get access to all the secrets. Since the data is simply base64 encoded, they can decode it right away using the default base64 command. So I don't think this is the best way to manage secrets. To manage secrets better, we have to rely on third-party applications. Recently, we have been using Vault, which is a very good mechanism for handling secrets. It handles both static and dynamic credentials, so we can use the functionality of Vault to manage the secrets. Vault has a lot of good features, such as dynamic secret rotation. It lets us avoid storing the secrets in plain text at the cluster level; instead, it stores them in its own vault in encrypted format, on whatever storage backend you choose. Then you can inject a sidecar container to fetch the value from the vault. That's how it works with dynamic secrets in Vault. So, to manage secrets efficiently, we have to rely on third-party applications. I am familiar with HashiCorp, but I think there are many more managed services provided by the big cloud providers. We can use AWS Secrets Manager, and we can also use the equivalent GCP service, though I may not be remembering its name. But, yeah, we definitely have to rely on third-party applications to manage the secrets inside the cluster properly.
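The point that base64 is an encoding and not encryption can be shown in a few lines. This sketch mirrors how a Secret's `data` field stores values; the password used here is made up for the example.

```python
import base64


def encode_secret_value(plaintext):
    """Encode a value the way it appears in a Secret's `data` field."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Recover the plaintext; no key is needed because nothing is encrypted."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    stored = encode_secret_value("admin")  # made-up example value
    print(stored)
    print(decode_secret_value(stored))
```

Anyone who can read the Secret object can reverse the encoding, which is why access control, encryption at rest, or an external secret manager is needed on top.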

How do you approach performance testing for deployments in Kubernetes? How do you carry out capacity planning? To approach performance testing for a deployment in Kubernetes, we have to consider open source tools that can generate load and help us test at the application level. We use tools like k6 to test the application running inside the Kubernetes cluster, which gives us insight into how to manage resources inside the cluster. It's very important to know our environment and how the application is behaving. We need to know the accurate amount of memory and CPU to allocate for a particular pod. If we're overprovisioning, we're losing, and if we're underprovisioning, we're also losing because it impacts the performance of the application. k6 helps generate load and run load tests against any application. The results help us predict and plan the resources we allocate to the pods or to any application running inside the Kubernetes cluster. I think performance testing is a very important part of getting to know the applications running in the cluster and setting the appropriate amount of resources for them. It saves cost by preventing overprovisioning, and it gives you good debugging knowledge. If there's a memory leak and you've already fine-tuned the configuration, you can tell whether it's an issue with the setup or a bug in the code that you have to fix. Eventually, proper performance testing helps us better understand the application and the environment.
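As a rough illustration of turning load-test measurements into resource settings, the sketch below takes CPU and memory samples observed for a pod under load, picks the 95th percentile, and adds headroom. The choice of the 95th percentile and 20% headroom is an assumption for this example, not a rule from the interview, and the sample values are invented.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of integer samples (pct is a whole number 1-100)."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceiling division
    return ordered[rank - 1]


def recommend_requests(cpu_millicores, memory_mib, pct=95, headroom_pct=20):
    """Suggest pod resource requests from samples gathered during a load test."""
    def with_headroom(value):
        # Integer ceiling of value * (1 + headroom_pct / 100).
        return (value * (100 + headroom_pct) + 99) // 100

    return {
        "cpu": f"{with_headroom(percentile(cpu_millicores, pct))}m",
        "memory": f"{with_headroom(percentile(memory_mib, pct))}Mi",
    }


if __name__ == "__main__":
    # Invented samples: CPU in millicores, memory in MiB.
    cpu = [200, 250, 300, 400, 500]
    mem = [256, 300, 310, 320, 512]
    print(recommend_requests(cpu, mem))
```

Requests set well below the observed peak risk throttling or eviction under load, while requests far above it waste node capacity, which is the trade-off described in the answer.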