Anudeep Nunna

An experienced Site Reliability Engineer (SRE) with expertise in cloud infrastructure, automation, DevOps, and security compliance. With over a decade of experience at companies including Sprinto, Khumbu Systems, and Google, he has a strong track record of building scalable, resilient, and highly available cloud systems.

  • Role

    Senior Site Reliability Engineering

  • Years of Experience

    10 years

Skillsets

  • EC2
  • Google Cloud Platform
  • AWS CloudFormation
  • Security
  • CICD
  • On
  • Bash
  • automation
  • testing
  • Troubleshooting
  • PCI
  • MicroServices - 10 Years
  • DynamoDB
  • Gradle
  • CI/CD - 10 Years
  • DevOps - 6 Years
  • MongoDB - 4 Years
  • Infrastructure as Code (IaC) - 6 Years
  • Distributed Systems - 10 Years
  • MEAN Stack - 3 Years
  • Infrastructure as Code (IaC) tools - 7 Years
  • Leadership & Team Scaling - 3 Years
  • SaaS/PaaS Platform - 6 Years
  • AU or UK market
  • Node.js
  • AWS - 6 Years
  • Python - 8 Years
  • Terraform - 5 Years
  • Docker - 6 Years
  • C
  • Data Processing
  • ECS
  • Datadog
  • GCP
  • Lambda
  • Site Reliability
  • Java
  • Azure
  • Web Services
  • Cloud - 10 Years
  • BigQuery
  • CloudWatch
  • Jenkins
  • Containers

Professional Summary

10 Years
  • Jan, 2022 - Dec, 2024 (2 yr 11 months)

    Senior Site Reliability Engineering

    SPRINTO
  • Jan, 2019 - Dec, 2022 (3 yr 11 months)

    Manager, Site Reliability Engineering

    KHUMBU SYSTEMS
  • Jan, 2014 - Dec, 2016 (2 yr 11 months)

    Developer, Trust and Safety

    GOOGLE
  • Jan, 2012 - Dec, 2014 (2 yr 11 months)

    Analyst, Product and Quality operations

    GOOGLE

Applications & Tools Known

  • Terraform
  • CloudFormation
  • EC2
  • Lambda
  • S3
  • ECS
  • RDS
  • DynamoDB
  • BigQuery
  • CloudSQL
  • GKE

Work History

10 Years

Senior Site Reliability Engineering

SPRINTO
Jan, 2022 - Dec, 2024 (2 yr 11 months)
  • Automated all infrastructure provisioning with Terraform while expanding the existing setup to multiple regions to increase availability.
  • Conducted regular disaster recovery drills to test system resiliency and data backup processes.
  • Developed automated monitoring and alerting using Datadog and CloudWatch, ensuring early detection and resolution of system issues.
  • Improved system performance and resource utilisation by 25% through proactive performance tuning and capacity planning on ECS Fargate.
  • Implemented production-like ephemeral environments for pull requests to enable quick testing.
  • Developed and implemented a scalable, maintainable, object-oriented Python monitoring system to track Terraform deployments and send alerts.
  • Enhanced SSL/TLS certificate automation with AWS Certificate Manager (ACM) using custom Terraform modules.

Manager, Site Reliability Engineering

KHUMBU SYSTEMS
Jan, 2019 - Dec, 2022 (3 yr 11 months)
  • Oversaw the operation of the production environment, including monitoring and troubleshooting, review and scheduling of planned changes, and managing outages.
  • Served as information security and network subject matter expert; provided advisory and consulting services as needed for various projects in the organization.
  • Authored configuration management procedures and playbooks for resolving incidents.
  • CI/CD:
      o Eliminated manual release cycles by automating the build and deployment process with Azure Pipelines and Gradle, reducing release time by ~40%.
      o Automated the entire deployment of infrastructure across several environments by implementing CI/CD pipelines.
      o Implemented static code analysis and security scans via SonarCloud in CI pipelines.
  • IaC:
      o Transformed existing manual infrastructure into composable, reusable Terraform modules, along with continuous testing of IaC code.
      o …, Serverless, DynamoDB, Route 53 and other AWS services to AWS SAM and AWS CloudFormation.
  • Observability:
      o Defined and captured metrics such as latency, traffic, and errors via AWS CloudFormation logs, exported to Datadog and AWS ELK.
      o Implemented chaos engineering to proactively detect potential failure points and identify bottlenecks.
  • Continuous monitoring:
      o Implemented monitoring for all mission-critical production resources by defining SLI and SLO metrics.
      o Modernised the on-call system by migrating to PagerDuty based on AWS alarms and defining rules, communication methods, and response plans.
  • Compliance:
      o Owned product-related security compliance initiatives such as SOC 2, ISO 27001, and PCI compliance, as well as annual assessments with external audit firms.
      o Developed and implemented continuous compliance in AWS via a pipeline using the CloudFormation Guard and Terraform-Compliance frameworks.
  • Reduced AWS billing by 30% by identifying ways to utilise resources more efficiently.
  • Architected scaling to handle up to a 15x increase in requests with 99.99% availability.
  • Modernised disaster recovery by implementing AWS central backup and multi-site active-active models for mission-critical services.

Developer, Trust and Safety

GOOGLE
Jan, 2014 - Dec, 2016 (2 yr 11 months)
  • Took over ownership of a business-critical review tool from the ops team and maintained it.
  • Gradually migrated the legacy ads review tool to a revamped, optimised version, which reduced review time by ~20%.
  • Used Dremel to distribute data processing on large streaming datasets, improving ingestion and processing speed of that data by 90%.

Analyst, Product and Quality operations

GOOGLE
Jan, 2012 - Dec, 2014 (2 yr 11 months)
  • Enforced AdWords policy for ad creatives and landing pages to prevent abuse.
  • Focused primarily on automating operational flows while working with the engineering and operations teams.
  • Developed several Chrome browser plugins to improve the efficiency of the manual ad review workflow.
  • Wrote automation scripts using MapReduce and Flume jobs to streamline insights on the incoming volume of ads for manual review.

Major Projects

1 Project

Analyst, Product and Quality operations

Jan, 2024 - Jan, 2024

Education

  • Bachelor of Engineering in Computer Science

    CHAITANYA BHARATHI INSTITUTE OF TECHNOLOGY (2024)