DevOps Engineer
Deutsche Telekom Digital Labs
Sep 2024 - Apr 2025 (7 months)
Designed multi-cluster, multi-region Kubernetes deployments to improve high availability and disaster recovery.
Used Velero for backup and restore, ensuring data resilience and disaster recovery readiness.
Integrated Prometheus, Grafana, and CloudWatch for proactive monitoring, alerting, and troubleshooting, reducing incident resolution time by 30%.
Strengthened Kubernetes resilience with the Istio service mesh, implementing retries, circuit breakers, and traffic shifting.
Conducted chaos engineering tests with Chaos Monkey to simulate failures and validate system resilience and Kubernetes fault tolerance.
Achieved 99.99% uptime by optimizing scaling, resilience strategies, and automated recovery mechanisms.
Developed Python scripts to automate infrastructure provisioning, configuration management, and CI/CD workflows, reducing manual effort.
Automated password rotation, daily backups, and log management with Python and shell scripts, improving security and data reliability.
Implemented ArgoCD for automated Kubernetes deployments from Helm charts, enabling continuous deployment and consistency across environments.
Created custom Helm charts to standardize Kubernetes resource management and enable scalable, repeatable deployments.
Automated infrastructure provisioning, maintenance, and upgrades with Terraform and Ansible, simplifying migration, scaling, and infrastructure management.
Collaborated with cross-functional teams to gather requirements, plan project timelines, and deliver high-quality software within budget.