Kaustubh Kunjibettu

Vetted Talent

A DevOps/Site Reliability Engineer with hands-on experience automating, optimizing, and supporting mission-critical deployments, leveraging automation, IaC, and Kubernetes, and implementing effective cloud solutions using CI/CD and DevOps processes.

  • Role

    Site Reliability Engineer

  • Years of Experience

    6 years

Skillsets

  • Prometheus
  • PowerShell
  • GKE
  • GCS
  • DevOps
  • Ansible
  • Kubernetes - 5 Years
  • Kustomize
  • Kyverno
  • Terraform - 5 Years
  • ArgoCD
  • Grafana
  • GitLab
  • GCP
  • Docker - 4 Years
  • Bash
  • Azure DevOps
  • Azure - 5 Years
  • AWS - 4 Years

Vetted For

15 Skills
  • Role: Senior Software Engineer, DevOps (AI Screening)
  • Result: 56%
  • Skills assessed: Infrastructure as Code, Terraform, AWS, Azure, Docker, Kubernetes, Embedded Linux, Python, AWS (SageMaker), GCP Vertex, Google Cloud, Kubeflow, ML architectures and lifecycle, Pulumi, Seldon
  • Score: 50/90

Professional Summary

6 Years
  • Dec 2022 - Present (2 yr 9 months)

    Site Reliability Engineer

    Infracloud Technologies
  • Jan 2022 - Dec 2022 (11 months)

    DevOps/Technical Consultant

    OneTrust
  • May 2018 - Dec 2022 (4 yr 7 months)

    Associate DevOps/Cloud Support Engineer

    Accenture

Applications & Tools Known

  • AWS
  • Azure
  • GCP
  • Azure DevOps
  • ArgoCD
  • Docker
  • K8s
  • Terraform
  • Prometheus
  • Grafana
  • Bash
  • Kyverno
  • Kustomize

Work History

6 Years

Site Reliability Engineer

Infracloud Technologies
Dec 2022 - Present (2 yr 9 months)
    Worked as a Cloud Platform Engineer for a client in the data analytics platform domain. Developed and maintained golden base images for various Node.js versions, implementing security best practices to enhance system integrity and protection. Configured Renovate bot in the golden image repositories to automatically detect dependencies and bump software versions. Developed Terraform modules for provisioning GCP services such as projects, VPC, GCS, and GKE, and analyzed and reconfigured existing modules to meet specific Renovate requirements. Implemented GitLab pipelines to automate the deployment of Terraform modules for resource creation in different clouds, as well as Docker builds and deployments. Created Kyverno policies and used the Kyverno CLI for managing and enforcing policies in Kubernetes clusters, enhancing security and compliance. Utilized Kustomize to manage and customize Kubernetes configurations, ensuring consistent application of configurations across different environments. Deployed ArgoCD for GitOps, enabling automated and consistent application deployments to Kubernetes.

DevOps/Technical Consultant

OneTrust
Jan 2022 - Dec 2022 (11 months)
    Implemented the OneTrust Privacy Management solution across various cloud-based Kubernetes platforms, including AKS, EKS, and OpenShift. Collaborated with a client to design and deploy a fully private AKS cluster, tailored to their specific customization needs using various Azure services. Developed and automated the provisioning of Development and QA environments for Dev/Test purposes on Azure and AWS using Terraform. Automated the provisioning and de-provisioning of VMs and other cloud services on Azure and AWS within Dev/QA environments, resulting in approximately $2,500 in monthly cloud cost savings. Gained extensive hands-on experience with major public cloud platforms, including AWS and Azure. Trained new joiners and interns on Kubernetes, cloud, Terraform, and other DevOps technologies.

Associate DevOps/Cloud Support Engineer

Accenture
May 2018 - Dec 2022 (4 yr 7 months)
    Automated CI/CD pipelines utilizing Azure DevOps, Git, Ansible, and Terraform. Developed and deployed POC environments on Azure Cloud for on-premises-to-cloud migration projects using Terraform. Automated repetitive tasks on Azure Cloud through PowerShell and Azure CLI scripting. Collaborated closely with clients and the R&D team to resolve open DevOps tickets and provide effective solutions. Managed and resolved customer issues in alignment with SLA requirements, ensuring timely and effective resolution. Monitored production environments, implementing alerts for high availability and proactive monitoring, and resolved outstanding tickets based on severity in compliance with SLA standards.

Achievements

  • Developed and maintained golden base images for various Node.js versions, ensuring implementation of security best practices to enhance system integrity and protection
  • Configured Renovate bot in different golden image repositories to automatically detect dependencies and bump software versions
  • Developed Terraform modules for provisioning GCP services such as projects, VPC, GCS, and GKE; analyzed and reconfigured existing modules to meet specific requirements
  • Implemented GitLab pipelines to automate the deployment of Terraform modules for resource creation in different clouds, as well as Docker builds and deployments
  • Created Kyverno policies and used the Kyverno CLI for managing and enforcing policies in Kubernetes clusters, enhancing security and compliance
  • Utilized Kustomize to manage and customize Kubernetes configurations, ensuring consistent application of configurations across different environments
  • Deployed ArgoCD for GitOps, enabling automated and consistent application deployments to Kubernetes
  • Implemented the OneTrust Privacy Management solution across various cloud-based Kubernetes platforms, including AKS, EKS, and OpenShift
  • Collaborated with a client to design and deploy a fully private AKS cluster, tailored to their specific customization needs using various Azure services. This project garnered high visibility and recognition
  • Developed and automated the provisioning of Development and QA environments for Dev/Test purposes on Azure and AWS using Terraform
  • Automated the provisioning and de-provisioning of VMs and other cloud services on Azure and AWS within Dev/QA environments, resulting in approximately $2500 in monthly cloud cost savings
  • Trained new joiners and interns on K8s, Cloud, Terraform and other DevOps technologies
  • Automated CI/CD pipelines utilizing Azure DevOps, Git, Ansible, and Terraform
  • Developed and deployed POC environments on Azure Cloud for on-premises-to-cloud migration projects using Terraform
  • Automated repetitive tasks on Azure Cloud through PowerShell and Azure CLI scripting
  • Collaborated closely with clients and the R&D team to resolve open DevOps tickets and provide effective solutions
  • Managed and resolved customer issues in alignment with SLA requirements, ensuring timely and effective resolution
  • Monitored production environments, implementing alerts for high availability and proactive monitoring
  • Resolved outstanding tickets based on severity, ensuring compliance with SLA standards

Education

  • Bachelor of Engineering, Computer Science Engineering

    NMAMIT, Nitte (2018)

AI Interview Questions & Answers

Hi, everyone. I'm Kaustubh, and I've been working as a DevOps and Site Reliability Engineer for almost five to six years now. I've been part of three companies, working on both the services side and the product side. My main experience is in DevOps, and I've predominantly worked with Kubernetes and the different cloud-native Kubernetes stacks: AKS on Azure, EKS on AWS, and GKE on GCP. Along with that, I've used various cloud provider services and integrated them with the managed Kubernetes offerings. I've also worked on infrastructure as code across different clouds with Terraform, which I've used for almost four to five years. I've done more shell scripting than Python; I haven't had much chance to use Python. Apart from that, I've worked with Kustomize, Helm, and Kubernetes tooling like Argo CD for GitOps, and I've built CI/CD with GitLab CI/CD and Azure DevOps, integrating Azure services with Azure DevOps and using Terraform alongside them in different stacks. I've also worked on the customer side: at one organization, I worked with customers in different geographical locations, helping them set up our product, which is deployed on Kubernetes, on their infrastructure. That covers the professional side. I'm from Karnataka, India, and I currently work remotely as a Site Reliability Engineer. My hobbies include playing musical instruments, working out, jogging, and going for bike rides.

When it comes to the AWS development kit, I haven't had a chance to use AWS CDK for infrastructure as code; I've used Terraform and a little bit of CloudFormation. I can explain how infrastructure as code works for a use case like network provisioning. Basically, you set up your account, and with Terraform you first configure the backend storage where you want to store the Terraform state file for any resources you create, such as network resources. The first thing is to create a VPC, and then you create the different subnets you require. Once you've written the Terraform code for the VPC and subnets, you place resources like RDS in private subnets and put the VMs that need to be externally accessible in public subnets. You write modules for the VPC, RDS, and subnets, or you can use reusable modules that somebody else has written. Then you run terraform init, terraform validate, terraform plan, and finally terraform apply if everything looks good. You can also do broader network provisioning with Terraform: you can create multiple VPCs, place resources in different VPCs, and connect them with VPC peering, or use a transit gateway if there are a lot of VPCs involved in your account. But no, I haven't had a chance to use AWS CDK.
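A minimal sketch of the Terraform workflow described above, assuming an S3 backend; the bucket, key, and region values are hypothetical placeholders:

    #!/usr/bin/env bash
    set -euo pipefail

    # Point Terraform at remote backend storage for the state file
    # (hypothetical bucket/key/region).
    terraform init \
      -backend-config="bucket=my-tf-state-bucket" \
      -backend-config="key=network/terraform.tfstate" \
      -backend-config="region=us-east-1"

    terraform validate          # check the VPC/subnet code for errors
    terraform plan -out=tfplan  # preview the network resources to be created
    terraform apply tfplan      # create them if the plan looks good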

If you're using Python as your programming language, the first thing to do is write a Dockerfile. You keep the Dockerfile in one place and use CI/CD to build it into an image and push that image to Docker Hub or any Docker registry; on AWS you can push it to ECR, and you can also use Harbor, which we use in our project here. So you write your code in Python, then write a Dockerfile: pick any base image of your choice, copy the code in, expose any ports you need, and set the CMD; you can also write a multi-stage Dockerfile. Then you write a CI/CD pipeline, for example with AWS CodePipeline, with stages that build the Docker image, run a Snyk scan to check the image for vulnerabilities, and then do a docker push into ECR. You can also wire ECR-related variables into AWS CodePipeline or whatever GitLab pipeline you use. Separately, you create an EKS cluster where the image will run as pods in Kubernetes; for production you'd create a private rather than public-facing cluster. Once the cluster exists, you can write Kustomize configurations that can be reused across the development and production environments. When creating the different clusters with Terraform, you can use separate variable files, changing just the variables such as the names of the dev and prod clusters. To deploy to the different clusters, you can use Argo CD, and Kustomize works with Argo CD as well. The way Kustomize works is through bases and overlays: the base holds the manifest files that are common across environments, and the overlays hold the per-environment pieces, so the overlay code separates what is common from what applies only to development or only to production. That's how I would explain it.
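A minimal sketch of the build-scan-push stage described above, using the Docker, AWS, and Snyk CLIs; the account ID, region, repository name, and tag are hypothetical:

    #!/usr/bin/env bash
    set -euo pipefail

    ACCOUNT_ID="123456789012"      # hypothetical
    REGION="us-east-1"             # hypothetical
    REPO="my-python-app"           # hypothetical
    IMAGE="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/${REPO}:1.0.0"

    # Authenticate Docker against ECR.
    aws ecr get-login-password --region "$REGION" |
      docker login --username AWS --password-stdin \
        "${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"

    docker build -t "$IMAGE" .     # build the image from the Dockerfile
    snyk container test "$IMAGE"   # scan the image for vulnerabilities
    docker push "$IMAGE"           # push the scanned image to ECR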

I'm not very sure about automating the security patching itself, but when you're doing any kind of upgrade to your Kubernetes cluster, the best way to do it is a rolling, blue-green-style approach at the node level. Say there are two Linux nodes running in your Kubernetes cluster. First you add a new node, then you cordon and drain one of the existing nodes so the pods running on it get rescheduled onto the new node. Then you do the security patching on that drained node. Next you move to the second node, cordon and drain it the same way so its pods get scheduled onto the other available nodes, and patch it; you repeat this node by node, and at the end you can delete the extra node. This way you can do it without any downtime on your Kubernetes cluster. Also, when you're using a managed cluster, I think security patching can be handled on the AWS side itself. But otherwise, for upgrades or security patching, you can use blue-green or canary-style rollouts like the example I described.
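A minimal sketch of the cordon-and-drain cycle described above for one node; the node name is hypothetical:

    #!/usr/bin/env bash
    set -euo pipefail

    NODE="worker-1"   # hypothetical node name

    kubectl cordon "$NODE"         # stop new pods from landing on this node
    kubectl drain "$NODE" \
      --ignore-daemonsets \
      --delete-emptydir-data       # evict pods so they reschedule elsewhere

    # ... apply OS security patches on the node and reboot if required ...

    kubectl uncordon "$NODE"       # make the node schedulable again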

What would you include? A variable file. Yes, a variable file, where you can use the same names across all the different multi-cloud infrastructures, or locals, or any prefix or suffix you want to add to your resource names. Basically, any kind of variable file will help you: one that carries the names of the clusters you want to create, or the counts you want for the different services and components you're running on the different clouds. I think that should help.
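A minimal sketch of that idea, assuming a shared variable file plus hypothetical per-cloud variable files layered on top:

    #!/usr/bin/env bash
    set -euo pipefail

    # Shared names and prefixes live in common.tfvars; per-cloud counts and
    # overrides live in aws.tfvars, azure.tfvars, gcp.tfvars (hypothetical).
    for CLOUD in aws azure gcp; do
      terraform plan \
        -var-file="common.tfvars" \
        -var-file="${CLOUD}.tfvars" \
        -out="${CLOUD}.tfplan"
    done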

Sorry, I've never heard about this. Terraform configuration, how does it impact the capability of the... sorry, I'm not familiar with this one.

Here, you're using the key directly; you're not using any Terraform resource block to create the key. Ideally, SSH keys and the like should be created through Terraform resource blocks, or created on your command line and then referenced here. Instead, the key value that can be used to log in to your AWS instance is hardcoded as the variable's default, and that's a security risk. It also ends up stored in your state file, so if the state file is breached, the key can be read directly and used to break into your EC2 instance. So you need to use a Terraform resource block, or create the SSH keys on your machine, and the key material should be kept encrypted; hardcoding it like this is not the way.
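A minimal sketch of the safer pattern described above: generate the key pair locally and hand AWS only the public half, so the private key never appears in Terraform code or state (file and key names are hypothetical):

    #!/usr/bin/env bash
    set -euo pipefail

    # Generate an SSH key pair locally; keep the private key out of Git
    # and out of Terraform variables.
    ssh-keygen -t ed25519 -f ./deploy_key -N "" -C "deploy@example"

    # Import only the PUBLIC key into AWS for EC2 logins.
    aws ec2 import-key-pair \
      --key-name deploy-key \
      --public-key-material fileb://./deploy_key.pub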

Here's the issue in the snippet from the build stage: it builds the Docker image with latest as the tag. You shouldn't use latest as the tag here; it's mutable and doesn't tell you which build is actually deployed, so that's not a good practice.
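A minimal sketch of the fix, tagging the image with something immutable such as the short git commit SHA; the registry and image names are hypothetical:

    #!/usr/bin/env bash
    set -euo pipefail

    TAG="$(git rev-parse --short HEAD)"   # immutable, traceable tag
    docker build -t "registry.example.com/my-app:${TAG}" .
    docker push "registry.example.com/my-app:${TAG}"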

Basically, you can have three tiers: one for your database, one for your application or back-end tier, and one for the front-end tier. To ensure high availability, we can deploy this across multiple availability zones in both AWS and Azure, route traffic with something like Azure Traffic Manager, and also use a content delivery network to keep latency low.
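A minimal sketch of the multi-AZ piece on the AWS side: an application load balancer spanning subnets in two availability zones so the front-end tier survives a zone outage (subnet and security group IDs are hypothetical):

    #!/usr/bin/env bash
    set -euo pipefail

    aws elbv2 create-load-balancer \
      --name web-tier-alb \
      --subnets subnet-aaaa1111 subnet-bbbb2222 \
      --security-groups sg-0123456789abcdef0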

I'm not very sure about this.