Himanshu Dahiya

Experienced Cloud Technical Lead with a proven record of building and optimizing cloud infrastructure for India's Largest EV Charging Network. Skilled in leading teams, optimizing AWS infrastructure, and implementing robust monitoring and security measures. Proficient in developing scalable microservices and architecting efficient inter-service communication. Dedicated to driving success in cloud architecture and technology initiatives.

Role
Site Reliability Engineer | MEAN, MEVN stack developer
Years of Experience
7.3 years

Skillsets

Node.js
HAProxy
Helm
IAM
Java
Kafka
Kustomize
MongoDB
MySQL
New Relic
Go
PostgreSQL
Prometheus
Redis
S3
Sentry
Terraform
VPC
WireGuard
Bitbucket
Kubernetes
Docker
Grafana
TypeScript
Aerospike
Alertmanager
Ansible
ArgoCD
AWS
Python
Cilium
Cloudflare
CloudWatch
Druid
EC2
EKS
Elasticsearch
GitOps

Professional Summary

7.3 Years

Jul, 2024 - Feb, 2026 1 yr 7 months
Site Reliability Engineer
the COOL Co
Jul, 2021 - Jun, 2024 2 yr 11 months
Technical Lead - Cloud Infrastructure
Bolt.Earth
Jul, 2020 - Jun, 2021 11 months
Senior Software Development Engineer
Bolt.Earth
May, 2018 - Aug, 2018 3 months
Software Developer Intern
Nov, 2018 - May, 2019 6 months
Co-Founder
Jun, 2019 - Jun, 2020 1 yr
Software Development Engineer
Bolt.Earth
May, 2017 - Jul, 2017 2 months
Web and app developer Intern

Applications & Tools Known

EKS
EC2
Cloud9
VPC
S3
Lambda
Route53
Firebase
Kubernetes
Docker
Helm
Cloudflare
Nginx
Redis
MongoDB
Elasticsearch
Rancher
ArgoCD
Prometheus
Grafana
Sentry
New Relic

Work History

7.3 Years

Site Reliability Engineer

the COOL Co

Jul, 2024 - Feb, 2026 1 yr 7 months

Optimized Linux kernel and OS-level parameters to improve resource utilization, increasing application performance by 25% under production traffic.
Implemented a K3s cluster across 50+ dedicated Leaseweb servers, improving compute utilization by 30%.
Developed an observability stack (Prometheus, Grafana, Alertmanager) integrated with Slack, PagerDuty, and email; defined SLIs/SLOs for latency (<200ms p95), availability (99.9%), and error budgets.
Reduced mean time to detection (MTTD) by 40% and mean time to recovery (MTTR) by 35% by implementing alerting tied to SLIs/SLOs and standardized runbooks.
Designed a high-availability PostgreSQL cluster with ZFS-backed storage, streaming replication, and pg_auto_failover; achieved zero data loss and recovery in under 60s during failover drills.
Optimized Kafka brokers to sustain 500MB/s throughput; fine-tuned partitioning, replication factor, and JVM GC settings to keep produce and consume latencies under 30ms.
Tuned the Apache Druid ingestion pipeline to handle 500K events/sec with consistent query latency under 300ms.
Automated rolling updates for PostgreSQL, HAProxy, and K3s clusters via Ansible, reducing operator toil by 70%.

Technical Lead - Cloud Infrastructure

Bolt.Earth

Jul, 2021 - Jun, 2024 2 yr 11 months

Senior Software Development Engineer

Bolt.Earth

Jul, 2020 - Jun, 2021 11 months

Software Development Engineer

Bolt.Earth

Jun, 2019 - Jun, 2020 1 yr

Drove the development of client Software as a Service (SaaS) solutions for Electric Vehicle (EV) inventory, sales, post-sales management, and EV fleet monitoring, deepening domain expertise. Initiated and led development of the company's portfolio of SaaS products from proof of concept to implementation.

Co-Founder

Nov, 2018 - May, 2019 6 months

Software Developer Intern

May, 2018 - Aug, 2018 3 months

Web and app developer Intern

May, 2017 - Jul, 2017 2 months

Achievements

Built the entire cloud infrastructure from scratch for India's Largest EV Charging Network
Developed a microservice capable of scaling to accommodate over a million TCP connections
Led initiatives to optimize AWS infrastructure resulting in a 30% cost reduction
Accelerated deployment timelines, with average deployment time per service measured in minutes and rollback time of less than a minute
Proactively identified and addressed security threats including DDoS attacks, bot-driven network bombardment, and client-side vulnerabilities such as credential leakage

Major Projects

2 Projects

Near Real-Time Ingestion Pipelines (Kafka, Druid)

Implemented near real-time ingestion pipelines using Kafka and Druid capable of handling 300K events/sec with minimal downtime, establishing clear SLIs and SLOs for optimal performance.

Automated Infrastructure Provisioning

Automated infrastructure provisioning using Terraform and Ansible, achieving a 25% faster deployment time and reducing manual errors, leading to a 15% reduction in rollback rates.

Education

B.Tech Computer Science
Indian Institute of Technology, Ropar (2019)
