Poojasree S J

Experienced Site Reliability Engineer with a strong background in managing AWS and Azure cloud platforms. Skilled in designing and operating monitoring and alerting systems for performance and reliability. Develops and implements automated cloud-based solutions that reduce system outages and improve customer satisfaction. Proven expertise in diagnosing and resolving issues in cloud-based applications, and in evaluating and recommending cost-effective ways to improve system performance on emerging cloud technologies.
  • Role

    Site Reliability Engineer

  • Years of Experience

    3.7 years

Skillsets

  • Load balancers
  • PostgreSQL
  • AKS
  • AlertSite
  • Apache
  • Application gateways
  • Azure Monitor
  • CloudHealth
  • Confluent Cloud
  • Consul
  • HashiCorp Vault
  • Jira
  • Kubernetes
  • Prisma Cloud
  • Redpanda
  • ServiceNow
  • Shell
  • Sophos
  • Tomcat
  • VNet
  • VPC
  • VPN
  • YAML
  • Linux
  • Grafana - 2.0 Years
  • Terraform - 3.0 Years
  • PowerShell
  • Python
  • Ansible
  • ArgoCD
  • Azure DevOps
  • Datadog
  • Docker
  • Jenkins
  • Dynatrace - 2.0 Years
  • MinIO
  • PagerDuty
  • Portworx
  • Redis
  • SQL
  • AWS
  • Azure
  • GCP
  • GitHub Actions
  • JFrog Artifactory

Professional Summary

3.7 Years
  • Feb 2022 - Present (4 yr 1 month)

    Site Reliability Engineer

    Hitachi Vantara

Applications & Tools Known

  • Terraform
  • Ansible
  • Docker
  • Azure Kubernetes Service
  • Consul
  • Vault
  • Portworx
  • MinIO
  • Redis
  • Azure DevOps
  • ArgoCD
  • Jenkins
  • Datadog
  • Dynatrace
  • PagerDuty
  • Grafana
  • Jira
  • ServiceNow
  • SQL

Work History

3.7 Years

Site Reliability Engineer

Hitachi Vantara
Feb 2022 - Present (4 yr 1 month)

  • Designed, deployed, and managed multi-cloud infrastructure across Microsoft Azure, Amazon Web Services, Google Cloud Platform, and Alibaba Cloud, including virtual machines, networking, storage, security, and Kubernetes clusters.
  • Planned and executed Kubernetes cluster upgrades (AKS, GKE, EKS) manually and via Terraform, along with in-place upgrades of dependent tools and add-ons, ensuring minimal downtime and platform stability.
  • Implemented and maintained Infrastructure as Code (IaC) using Terraform and Ansible to standardize provisioning, upgrades, and decommissioning workflows across environments.
  • Performed SSL/TLS certificate renewals, Service Principal (SPN) and credential rotations, and security hardening activities to ensure compliance and uninterrupted service access.
  • Led cloud resource lifecycle management, including VM, storage, and service decommissioning, cleanup of unused resources, and subscription hygiene to improve cost efficiency and governance.
  • Automated operational tasks such as Jira ticket creation, Kubernetes pod log collection, and health checks using Python, reducing manual effort by ~60%.
  • Managed and optimized CI/CD pipelines using Azure DevOps, Jenkins, and GitHub Actions, supporting application teams with build, release, and deployment troubleshooting.
  • Designed and maintained monitoring and alerting frameworks using Azure Monitor, Datadog, Grafana, Dynatrace, Wormly, and Prisma Cloud, improving incident detection and reducing MTTR.
  • Integrated and managed secrets, storage, and data services including Azure Key Vault, Vault, Portworx, Redis, MinIO, and AWS Secrets Manager for secure and scalable workloads.
  • Supported identity and access management (IAM) across cloud platforms by enforcing RBAC, MFA, and secure access policies, and assisting with cross-cloud authentication integrations.
  • Configured high-availability and scalability components such as load balancers, application gateways, availability sets, auto-scaling groups, and Kubernetes scaling policies.
  • Conducted disaster recovery planning, failover testing, and upgrade validations to ensure business continuity for critical workloads.
  • Participated in 24x7 on-call rotations, handled incidents and service requests via Jira, CMP, and ServiceNow, and led root cause analysis (RCA) while meeting strict SLAs/SLOs.
  • Delivered monthly operational and uptime reviews, documented runbooks and SOPs in Confluence, and ensured smooth handovers and efficient incident triaging across shifts.

Major Projects

1 Project

Johnson Controls India

    Developed and maintained automation scripts for Azure Kubernetes Service (AKS) clusters, reducing manual effort and speeding up task completion. Designed and implemented cloud infrastructure monitoring and alerting systems.

Education

  • B.Tech/B.E.

    Anna University Regional Campus, Madurai

Certifications

  • Microsoft Azure Fundamentals (AZ-900)

  • Datadog

  • Python for Data Science - NPTEL

  • Beginners Guide Towards Python Programming

  • Python data science in Python

  • Programming in Java

  • Diploma in Computer Application (DCA)