Designed a monitoring framework with Prometheus, Grafana, Rancher, and Sentry, reducing customer-facing issues by 30%. Integrated alerting across Slack, Microsoft Teams, and email channels, enabling swift incident response and sustaining a 99.9% SLA for microservices. Reinforced system security with Cloudflare by applying rate limiting, mitigating traffic anomalies, and defending against bot and DDoS attacks, thwarting 9 DDoS incidents. Optimized AWS infrastructure by implementing database cold storage, EKS cluster sizing, and auto-scaling, reducing costs by 30%. Led the migration from EC2 to an EKS cluster, enabling seamless deployment and management of MongoDB, Apache Kafka, Redis, and 30+ microservices, resulting in a 30% infrastructure improvement. Engineered a microservice to manage 2 million TCP and IoT connections for the BOLT.EARTH product line, streamlining operational processes and enhancing scalability.