Udit Agarwal

DevOps Engineer with experience in designing and implementing automated deployment pipelines, disaster recovery plans, and configuration management systems. Skilled in GCP, Terraform, Ansible, Docker, Kubernetes, GitOps, monitoring, and CI/CD pipelines. Proven ability to collaborate effectively in teams, communicate technical concepts clearly, and solve complex problems. Track record of reducing deployment time by 25%, improving pipeline reliability by 30%, and reducing security incidents by 30%. Committed to staying current with emerging technologies and industry best practices.
  • Role

    DevOps Engineer

  • Years of Experience

    3.2 years

Skillsets

  • IAM
  • Kafka
  • ArgoCD
  • AWS Lambda
  • AWS S3
  • EC2
  • ECS
  • ELK
  • Go
  • GitLab
  • n8n
  • Secret Manager
  • Trivy
  • VPC
  • Git
  • Incident Response
  • SDLC
  • Zero-trust architecture
  • AWS
  • Jenkins - 3 Years
  • Kubernetes - 3 Years
  • Ansible - 3 Years
  • Terraform - 3 Years
  • Docker - 3 Years
  • Grafana - 2 Years
  • Prometheus - 2 Years
  • CI/CD - 3 Years
  • Python - 3 Years
  • Elasticsearch
  • Google Cloud
  • Linux
  • MongoDB
  • MySQL
  • Redis
  • Datadog
  • GitHub

Professional Summary

3.2 Years
  • Jan, 2023 - Present (3 yr 2 months)

    DevOps Engineer

    Purplle.com

Applications & Tools Known

  • Google Cloud
  • Grafana
  • Prometheus
  • PagerDuty
  • Jenkins
  • Python
  • Docker
  • GitHub
  • Terraform
  • Ansible
  • Kubernetes
  • Linux
  • CI/CD
  • Git
  • MongoDB
  • Redis
  • Nginx
  • Bash Scripting
  • Elasticsearch
  • AWS

Work History

3.2 Years

DevOps Engineer

Purplle.com
Jan, 2023 - Present (3 yr 2 months)
  • Automated GCP infrastructure provisioning using Terraform and Ansible, cutting manual setup time from 2 hours to under 10 minutes and ensuring identical environments across GCP and AWS.
  • Engineered agentic MongoDB provisioning and lifecycle management using Ansible, Terraform, Jenkins, and n8n, reducing manual setup time from ~2 hours to under 15 minutes and standardizing deployments across 10+ environments.
  • Hardened container security by implementing Kubernetes RBAC, Secure Boot, Python-automated IAM role minimization, and SAST/DAST scanning to detect and remediate vulnerable code, adhering to zero-trust architecture and DevSecOps principles and resulting in a 70% reduction in vulnerabilities.
  • Migrated 15+ legacy MySQL 5.7 databases to 8.0 in production with minimal downtime, improving query performance by 20%.
  • Optimized Jenkins CI/CD pipelines for applications serving 5M+ users, cutting production deployment times by 25% and enabling 50+ weekly zero-downtime rollouts.
  • Reduced GCP monthly spend by 30% by automating resource scaling and implementing cost-optimized storage strategies for 100+ microservices.
  • Owned on-call and incident response for production Kubernetes workloads supporting 5M+ users, leading P0/P1 incident triage, RCA, and remediation, and implementing automated recovery and preventive fixes that reduced MTTR by 35%.
  • Engineered a scalable API-based secret management system (Go/Gin) handling 7M+ monthly requests for 250K+ active users, reducing access latency by 40% and eliminating unauthorized breaches.
  • Built real-time monitoring solutions with Grafana, Prometheus, and Datadog, processing 2M+ daily metrics and reducing MTTR by 30% for incidents impacting 5M+ users.
  • Deployed security automation in CI/CD using Trivy and GCP Security Command Center, reducing container and cloud vulnerabilities by 30% across production workloads.
  • Collaborated with 20+ developers to standardize CI/CD pipelines and monitoring practices, reducing build failures by 30% and improving deployment consistency across projects.
  • Implemented Single Sign-On (SSO) authentication and IP whitelisting for multiple internal URLs, centralizing access control and improving security posture.

Achievements

  • Set up monitoring and alerting on multiple channels using Grafana, Prometheus, and Datadog, reducing incident response time by 30%.
  • Collaborated with development teams to implement security measures, resulting in a 30% reduction in security incidents and vulnerabilities.
  • Deployed cost optimization strategies on GCP, achieving a 50% reduction in monthly cloud expenses.
  • Led on-call rotations and incident management, resolving 95% of urgent production incidents within SLAs.
  • Worked with Product Engineering teams on key projects, reducing deployment failures by 30% and improving software performance.
  • Implemented Infrastructure as Code (IaC) with Ansible and Terraform, automating cloud infrastructure provisioning and reducing manual intervention by 40%.
  • Developed and managed CI/CD pipelines with Jenkins, optimizing the software delivery process and reducing deployment times by 25%.
  • Automated build and deployment process with Jenkins and Maven, eliminating 85% of manual work.
  • Partnered with the leadership team to improve team-building skills, resulting in a 25% increase in team productivity and a 15% boost in employee satisfaction.
  • Designed and executed an automated deployment pipeline, reducing deployment time by 25% and increasing deployment frequency by 75%.
  • Developed and maintained monitoring and alerting systems, improving system uptime by 25% and reducing mean time to resolution (MTTR) by 40%.
  • Implemented monitoring and logging solutions with Prometheus and ELK, enhancing incident detection and resolution time by 35%.
  • Conducted security assessments on Docker images, identifying and addressing 200+ vulnerabilities, improving overall security by 40%.
  • Migrated a Python-based monolithic app to Kubernetes, utilizing Helm, CI/CD pipelines, and GitOps, reducing deployment times by 60% and increasing deployment efficiency by 50%.

Major Projects

1 Project

End-to-End DevOps Platform

Jan, 2025 - Mar, 2025 (2 months)
    Built a self-healing Kubernetes platform with ArgoCD, Prometheus, and Terraform, enabling zero-touch deployments and improving cluster uptime by 45%. Developed Python-based auto-remediation scripts triggered by Alertmanager webhooks, cutting mean time to recovery (MTTR) from 20 minutes to under 8 minutes during simulated failures. Ran chaos tests with Chaos Mesh, validating pod recovery and achieving a 98% self-recovery success rate under fault conditions.

Education

  • Bachelor of Technology (B.Tech), Computer Science Engineering

    JK Lakshmipat University

Certifications

  • Winner of the Planet Wise SOS Hackathon

  • Architecting with Google Cloud

  • 45+ skill badges in GCP

  • Introduction to OpenShift (DO101)

  • Red Hat Ansible (RH294)

  • Cisco: Fundamentals of Networking

  • GitHub Code Innovation Series

  • Associate Cloud Engineer (ACE, in progress)