Vetted Talent

Ravali Muttana

Senior Cloud DevOps engineer with 6+ years of experience in cloud (Azure, AWS), DevOps, configuration management, infrastructure automation, and continuous integration and delivery (CI/CD). Can implement effective strategies for application development in both cloud and on-premises environments. Experienced in Unix/Linux and Windows server administration. Expertise in architecting and implementing Azure service offerings such as Azure Cloud Services, Azure Storage, Blob Storage, IIS, Azure Active Directory (AD), Azure Resource Manager (ARM), Azure Synapse Analytics, Azure VMs, SQL Database, Azure Functions, Azure Service Fabric, Azure Monitor, Azure Service Bus, and Cosmos DB. Hands-on experience with backup and restore of Azure services and with designing and configuring Azure Virtual Networks (VNets), subnets, Azure network settings, DHCP address blocks, DNS settings, security policies, and routing.

  • Role

    DevOps Engineer

  • Years of Experience

    7.6 years

Skillsets

  • Data Analytics
  • Bash
  • AWS - 2 Years
  • Apache
  • IIS
  • VMware
  • Python - 1 Year
  • Groovy
  • Windows
  • Puppet
  • CI/CD
  • DevOps - 6 Years
  • Nagios
  • Scripting
  • IAC - 4 Years
  • Azure SQL - 3 Years
  • Azure Storage - 4 Years
  • Azure DevOps - 5 Years
  • Infrastructure as Code - 5 Years
  • Containerization - 4 Years
  • Identity and Access Management - 3 Years
  • Security - 2 Years
  • AWS Services - 3 Years
  • Networking - 2 Years
  • EKS
  • Azure - 6 Years
  • Terraform - 3 Years
  • Kubernetes - 4 Years
  • Shell Scripting
  • Git
  • Jira
  • Azure Synapse
  • GitHub
  • Docker
  • Office 365
  • Confluence
  • New Relic
  • PowerShell - 2 Years
  • Helm
  • Chef
  • Ansible
  • Maven
  • Jenkins
  • Bitbucket

Vetted For

11 Skills
  • Senior DevOps Engineer (Remote) AI Screening
  • 82%
  • Skills assessed: CI/CD, CloudWatch, Datadog, Terraform, AWS, GCP, Jenkins, Kubernetes, Embedded Linux, Problem Solving Attitude, TypeScript
  • Score: 74/90

Professional Summary

7.6 Years
  • May, 2023 - Present (3 yr)

    Cloud DevOps Engineer

    Anlage Infotech (India) P Ltd
  • Jan, 2022 - May, 2023 (1 yr 4 months)

    Senior Infrastructure Associate

    Publicis Sapient
  • Jun, 2021 - Jan, 2022 (7 months)

    DevOps Engineer

    XT Global Azure
  • Jun, 2017 - Jun, 2018 (1 yr)

    Senior Process Associate

    Cognizant
  • Jul, 2018 - May, 2021 (2 yr 10 months)

    Cloud/DevOps Engineer

    AFF Soft IT Solutions Pvt limited

Applications & Tools Known

  • Azure Cloud Services
  • Azure Active Directory
  • Azure Resource Manager
  • Azure Synapse Analytics
  • Azure DevOps
  • Git
  • Terraform
  • Kubernetes
  • OpenShift
  • Docker
  • Prometheus
  • Grafana
  • Ansible Tower
  • Azure Data Factory
  • Azure Databricks
  • Azure Stream Analytics
  • Azure Logic Apps
  • Azure Key Vault

Work History

7.6 Years

Cloud DevOps Engineer

Anlage Infotech (India) P Ltd
May, 2023 - Present (3 yr)
    Implemented Azure CI/CD pipelines using Azure DevOps. Managed application deployments with GitOps principles. Utilized Terraform for IaC and Kubernetes for container orchestration.

Senior Infrastructure Associate

Publicis Sapient
Jan, 2022 - May, 2023 (1 yr 4 months)

DevOps Engineer

XT Global Azure
Jun, 2021 - Jan, 2022 (7 months)

Cloud/DevOps Engineer

AFF Soft IT Solutions Pvt limited
Jul, 2018 - May, 2021 (2 yr 10 months)

Senior Process Associate

Cognizant
Jun, 2017 - Jun, 2018 (1 yr)

Certifications

  • AZ-900

  • AZ-104

  • AZ-400

  • AWS Cloud Practitioner

AI-interview Questions & Answers

Hi, team, this is Ravi. I have a total of seven years of experience. I started my career as an infrastructure associate and later moved into Azure DevOps. As an infrastructure associate, my responsibilities included granting team members access to the Azure resources I created, as well as to Active Directory and other resources. After that, I moved fully into Azure DevOps, where I have six years of experience. In those six years, I gained good exposure to CI/CD pipelines, Kubernetes, Terraform, and Ansible. I also had the chance to work with monitoring tools like New Relic, Prometheus, and Grafana, and I integrated them with Azure Pipelines using service principals. I also have good exposure to the AWS environment, where I worked on almost all the services; it was a multi-cloud platform. In AWS, I created storage accounts, pipelines, and other services as required within the pipelines, and I worked on EC2, storage, and database creation. I also have good exposure to Jenkins, where I created pipelines with different jobs and deployed them to other environments as well. I have a good understanding of pipeline models covering both the CI process and the CD process: building pipelines and deploying to an AKS cluster, as well as deploying apps to app services and as web services in AWS. That is a brief overview of my AWS expertise. Beyond that, I have good exposure to Terraform, where I have written Terraform code for both AWS and Azure environments. That's it from me.

To design a monitoring solution with CloudWatch and Datadog that flags deviations from performance benchmarks for an AWS application, the first step is to set up CloudWatch metrics. I will use CloudWatch to collect metrics from AWS services such as EC2 instances, Lambda functions, and RDS databases, and I'll define custom metrics if needed to monitor specific aspects of application performance. Next, I will create CloudWatch alarms and configure them to trigger when performance metrics deviate from defined thresholds. For example, we can set alarms for CPU utilization, memory utilization, latency, and error rates. After that, I will integrate with Datadog. I'll connect Datadog to my AWS account to collect additional metrics and logs beyond what CloudWatch provides, which gives a more comprehensive view of application performance. Then I will create custom dashboards in Datadog to visualize metrics from both CloudWatch and Datadog, and customize them to display the relevant performance benchmarks as key indicators for the application. After that, I'll configure alerts in Datadog to notify us when a deviation in performance is detected. I'll also use Datadog's anomaly detection feature to automatically identify unusual patterns. Finally, I'll define automated remediation actions that can be triggered when performance deviations are detected. For example, we can automatically scale up resources, restart instances, or deploy code changes to address performance issues. By combining CloudWatch and Datadog, we can create a robust monitoring solution that provides real-time visibility into the performance of AWS-hosted applications and helps us respond quickly. That's it. Thank you.
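
The CloudWatch alarm step described in this answer could be sketched roughly as follows. This is a minimal illustration, not the candidate's actual setup: the function name, the alarm naming scheme, and the default threshold, period, and evaluation count are assumptions. The dictionary keys are the parameter names of CloudWatch's PutMetricAlarm API.

```python
from typing import Any, Dict


def cpu_alarm_params(instance_id: str, threshold_percent: float = 80.0) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's PutMetricAlarm API.

    The alarm name format and the default numbers are illustrative assumptions.
    """
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds covered by each datapoint
        "EvaluationPeriods": 3,   # consecutive breaching datapoints before alarming
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }
```

With boto3 installed and AWS credentials configured, these arguments could be passed as `boto3.client("cloudwatch").put_metric_alarm(**cpu_alarm_params(instance_id))`. Keeping the parameter construction separate from the API call makes the thresholds easy to review and test.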

When troubleshooting a Linux server with high load using CloudWatch and Datadog indicators, we can follow a few steps. First, we check the CloudWatch metrics for the server: CPU utilization, memory usage, disk I/O, and network traffic, looking for spikes or unusually high values. Then we review the Datadog dashboards to get a more detailed view of the server's performance, including custom metrics for our application or environment, and look at correlations between metrics to identify anomalies. Next, we identify resource bottlenecks by analyzing the CloudWatch and Datadog metrics to pinpoint the issue: CPU-bound processes, memory leaks, disk saturation, or network congestion. We check the system logs for error messages, warnings, or other relevant information. We can also use Datadog tracing to inspect API calls, database queries, or external service dependencies. We check process activity with tools like top, htop, or atop to monitor CPU and memory utilization in real time, and we analyze disk usage patterns with tools like df and iostat to spot long queue lengths. We can tune the configuration by reviewing server settings such as kernel parameters and network settings. Based on our analysis, we implement remediation actions that address the underlying cause, which may involve scaling up resources. After that, we monitor the impact of our actions.
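
As a rough companion to the on-host checks mentioned above, the 1-minute load average can be compared with the number of CPUs. This is a minimal sketch; the helper names and the limit of 1.0 load per CPU are assumptions, not a universal rule.

```python
import os


def load_per_cpu(load_1min: float, cpu_count: int) -> float:
    """Normalize the 1-minute load average by the number of CPUs."""
    return load_1min / max(cpu_count, 1)


def is_overloaded(load_1min: float, cpu_count: int, limit: float = 1.0) -> bool:
    """Return True when the load per CPU exceeds the limit (1.0 is an assumed default)."""
    return load_per_cpu(load_1min, cpu_count) > limit


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one_minute, _, _ = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"load/cpu = {load_per_cpu(one_minute, cpus):.2f}, overloaded = {is_overloaded(one_minute, cpus)}")
```

A high load per CPU alongside low CPU utilization usually points toward I/O wait, which is where df and iostat become the next tools to reach for.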

To enable tracing of configuration changes applied through Terraform, we can set up a systematic approach to track those changes. There are several methods. First, we can use version control with Git to manage the Terraform configurations. Each developer works on a feature branch and submits a pull request for review before the changes are merged into the main branch. We should enforce descriptive commit messages that clearly explain the purpose of each change, including the resources being modified and the impact of the change. Second, we need centralized Terraform state management, for which we can use Amazon S3 or HashiCorp Consul. This ensures consistency and allows collaboration among multiple developers. We can also use Terraform Cloud or Terraform Enterprise for state management and collaboration features. Third, we should enable Terraform debugging by setting the TF_LOG environment variable to DEBUG to capture detailed logs during Terraform operations. This helps diagnose issues and trace the execution flow of configuration changes. We also need an infrastructure change management process to review and approve changes before applying them to production, validating them first in the lower environments. After that, we need monitoring and alerting. We can use a tool like CloudWatch or Datadog to track infrastructure changes and monitor their impact on system performance and availability, and set up alerts to notify stakeholders of any unexpected behavior. Finally, we need to implement automated testing and maintain comprehensive documentation for the Terraform configurations, including architecture diagrams. So that's it from my end. Thank you.
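
The TF_LOG step could be wired into an automation script along these lines. This is a sketch under assumptions: the function name and the default log file name are hypothetical. TF_LOG and TF_LOG_PATH are the environment variables Terraform reads for log level and log destination.

```python
import os
from typing import Dict, Optional


def terraform_debug_env(
    base_env: Optional[Dict[str, str]] = None,
    level: str = "DEBUG",
    log_path: str = "terraform-debug.log",  # hypothetical file name
) -> Dict[str, str]:
    """Return a copy of the environment with Terraform debug logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["TF_LOG"] = level          # log verbosity, e.g. TRACE, DEBUG, INFO, WARN, ERROR
    env["TF_LOG_PATH"] = log_path  # file the log output is written to
    return env
```

A pipeline step could then run `subprocess.run(["terraform", "plan"], env=terraform_debug_env(), check=True)` and archive the resulting log file next to the plan for later review.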

To fine-tune a Horizontal Pod Autoscaler (HPA) for cost-efficient scaling in Kubernetes, we can follow some procedures. First, we set an appropriate target resource utilization for the HPA, based on an analysis of the application's workload patterns, so that utilization levels balance performance and cost-effectiveness, and we adjust the target thresholds accordingly. Second, we define conservative scale-in and scale-out policies to avoid unnecessary scaling actions that would inflate cost; we should consider average utilization over a longer window rather than short-term spikes. Third, we choose utilization metrics for scaling decisions that align with cost optimization. For example, if CPU-bound workloads are more cost-sensitive, we prioritize scaling on CPU utilization rather than memory; we can also define custom metrics and external metrics. We should also evaluate using the Vertical Pod Autoscaler (VPA) in conjunction with the HPA to dynamically adjust resource requests and limits for individual pods based on utilization patterns. Finally, we can look at predictive scaling by studying usage, capacity, and expected spikes. By doing this, we can ensure that horizontal pod autoscaling is cost-efficient. That's it. Thank you.
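
To see how the target utilization drives replica counts, the scaling rule documented for the HPA, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), can be modeled in a few lines. This is a simplified sketch: the real controller also handles missing metrics, pod readiness, and stabilization windows, and the replica bounds used as defaults here are assumptions.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified model of the HPA replica calculation."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within the tolerance band the controller leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured minimum and maximum replica counts.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, 4 replicas at 90% CPU against a 60% target gives a ratio of 1.5 and therefore 6 replicas. Raising the target utilization lowers the ratio for the same load, which is why the target is the main cost lever.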

To deploy to AKS using Jenkins and Terraform, we need to configure a CI/CD pipeline. First, we install Jenkins on a server or a cloud instance and install the necessary plugins; since we are integrating with Terraform, we install the Terraform plugin. Next, we set up version control: we create a Git repository to store the application code and the Terraform configuration, and ensure that Jenkins has access to it. Then we define the Jenkins pipeline by creating a Jenkinsfile in the repository with stages for building, testing, and deploying the application. We can also use a Jenkins shared library for reusable pipeline code if needed. After that, we install Terraform on the Jenkins server, or use a Docker image in the pipeline that provides it, ensuring Terraform is available on the Jenkins PATH. Then we define the Terraform backend configuration and set up authentication and access to the backend services. Next, we write the Terraform scripts that define the infrastructure as code for deploying the application, organized into modules so that the code is reusable and easy to maintain. Coming to the pipeline stages: a checkout stage pulls the latest code from the repository; a build stage compiles the application code and runs the unit tests; then comes Terraform initialization, which configures the backend; then Terraform plan, which generates an execution plan to preview the changes; then Terraform apply, which creates or updates the infrastructure. After that, we deploy the application and run the integration tests, and if any cleanup is needed, we add a stage for it. We then configure the pipeline triggers, and for credential management we can use API tokens and SSH keys. Finally, we do testing and validation, and set up monitoring and logging for the CI/CD pipelines, including performance metrics. So, that's it from my end.
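
The init, plan, and apply stages described above map onto three Terraform CLI invocations. The sketch below only builds the command lines, so it can be inspected without Terraform installed; the function name and the plan file name are assumptions.

```python
from typing import List


def terraform_stage_commands(plan_file: str = "tfplan") -> List[List[str]]:
    """Return the Terraform commands for the init, plan, and apply stages, in order."""
    return [
        # Initialize providers and the configured backend without prompting.
        ["terraform", "init", "-input=false"],
        # Save the execution plan so that apply runs exactly what was previewed.
        ["terraform", "plan", "-input=false", f"-out={plan_file}"],
        # Apply the saved plan; a saved plan is applied without an approval prompt.
        ["terraform", "apply", "-input=false", plan_file],
    ]
```

Each list could be passed to `subprocess.run(command, check=True)` from a script, or the equivalent strings used in `sh` steps of a Jenkinsfile. Applying a saved plan file keeps the reviewed plan and the applied changes identical.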

In the Python code for AWS Lambda, there are two mistakes that will prevent the function from running: an incorrect import statement and a syntax error in the function definition. The provided code attempts to import a module named botos, which is likely a typo; the correct module for interacting with AWS in Python is boto3, so the import statement should be import boto3. The syntax error is in the function definition line, where backslashes are used instead of parentheses around the function parameters. The backslashes should be replaced with parentheses so that the parameters are defined correctly. With these two changes, the Lambda function will work correctly and list the objects in the S3 bucket when it is triggered.
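
The original snippet from the question is not shown on this page, so the following is only a guess at what a corrected handler might look like. The bucket name, the helper function, and the response shape are assumptions; `list_objects_v2` is the boto3 S3 client call for listing objects.

```python
from typing import Any, Dict, List

BUCKET_NAME = "example-bucket"  # hypothetical bucket name


def list_object_keys(s3_client: Any, bucket: str) -> List[str]:
    """Return the object keys from one list_objects_v2 call (at most 1000 keys)."""
    response = s3_client.list_objects_v2(Bucket=bucket)
    # "Contents" is absent from the response when the bucket is empty.
    return [obj["Key"] for obj in response.get("Contents", [])]


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Lambda entry point: parameters in parentheses, module imported as boto3."""
    import boto3  # imported here so the helper above can be exercised without AWS

    keys = list_object_keys(boto3.client("s3"), BUCKET_NAME)
    return {"statusCode": 200, "keys": keys}
```

Passing the client into the helper keeps the listing logic testable with a stand-in object; buckets with more than 1000 objects would need pagination on top of this.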

In the given Terraform snippet, if I modify the instance type attribute and apply the changes, Terraform will detect the modification and plan to recreate the AWS EC2 instance with the newly specified instance type. The principle reflected here is immutable infrastructure. Immutable infrastructure is an approach where infrastructure components such as servers or virtual machines are never modified after they are created. Instead, changes are made by replacing the entire component with a new instance that incorporates the desired changes. When we modify the instance type attribute in Terraform, we are effectively changing the configuration of the EC2 instance. Terraform first detects the change in configuration. It then creates an execution plan that includes destroying the existing EC2 instance and creating a new one with the changed configuration. When we apply the changes, Terraform carries out the execution plan, which includes terminating the existing instance and provisioning a new instance with the new instance type. Once the new instance is successfully provisioned, Terraform updates its state to reflect the changes. By following this immutable infrastructure principle, Terraform ensures that infrastructure changes are predictable and consistent. That is the main thing that will happen here.

To securely manage secrets and sensitive configuration across the AWS environment, we can follow some steps. We can use AWS Secrets Manager or AWS Systems Manager Parameter Store to store sensitive configuration securely, leveraging features such as encryption, rotation policies, and access controls. We implement the IAM roles and policies that Jenkins needs to retrieve the secrets, following the principle of least privilege to restrict access to specific secrets. To configure AWS credentials for Jenkins, we create IAM roles or IAM users for the project with appropriate permissions and configure Jenkins to use IAM roles; we can also use access keys or AWS Security Token Service (STS). We also need to store the Jenkins credentials securely and integrate Jenkins with AWS services using plugins or integrations. Secrets must be encrypted both in transit and at rest; we can use TLS/SSL for secure communication. We also need to implement secret rotation and audit monitoring, and work on continuous improvement and compliance.
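
Retrieving a secret at build time, rather than storing it in Jenkins, could look roughly like this. The helper name and the assumption that the secret is stored as a JSON string are illustrative; `get_secret_value` with `SecretId` is the boto3 Secrets Manager client call, and its response carries the text value under `SecretString`.

```python
import json
from typing import Any, Dict


def read_json_secret(secrets_client: Any, secret_id: str) -> Dict[str, Any]:
    """Fetch a secret and parse it, assuming it was stored as a JSON string."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])
```

With boto3 available, the client would be `boto3.client("secretsmanager")`, running under an IAM role that is allowed to read only the specific secret the job needs. The parsed values should be kept out of build logs.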

To optimize log management for microservices applications, mainly using AWS-native tools, we can follow some methods. We can centralize logging with Amazon CloudWatch Logs, aggregating logs from all the microservices into CloudWatch for centralized storage and analysis. We can use the CloudWatch Logs agent or the AWS SDKs to push logs from containers, Lambda functions, and so on. Log group organization lets us arrange logs into logical log groups based on microservice boundaries and environments. We also need proper naming conventions and tagging so that log groups are easy to identify and filter. For retention and storage management, we configure retention policies in CloudWatch Logs to keep logs for an appropriate duration based on compliance requirements or operational needs, and we can use log group lifecycle policies to automatically archive or delete logs. For real-time monitoring and alerting, we set up CloudWatch alarms to monitor log metrics such as error rates, latency, and exceptions, and configure alerts to send notifications, for example via Amazon SNS. We can also use log analytics and insights, which offer enhanced log visualization, and we should build dashboards as well. We can integrate with AWS X-Ray for distributed tracing. We can also add additional monitoring solutions such as Datadog, Splunk, or the ELK Stack (Elasticsearch, Logstash, Kibana). So that's it from me.

To leverage Ansible or Chef for automated configuration in this case, we need to develop Ansible playbooks that define the desired state for the EC2 instance configurations. Playbooks can include tasks for installing packages, configuring services, managing users, and applying security settings. For dynamic inventory management, we can use Ansible dynamic inventory plugins or scripts to discover and manage the AWS instances. Ansible can query AWS APIs to generate the inventory dynamically based on EC2 instance attributes such as tags, regions, instance types, and other metadata. Integration with the AWS modules lets us extend the automation as well. We should write idempotent configuration, which ensures consistency and predictability in configuration management. Parameterization and templating allow flexible configuration for each specific environment. We can also integrate with the cloud so that the Ansible playbooks and shell scripts are executed and the orchestration process is automated. For error handling and reporting, we need error handling and logging mechanisms within the Ansible playbooks to capture errors or failures during configuration tasks. These are some approaches we can follow to automate the configuration of newly launched AWS EC2 instances.