
A passion-driven and versatile individual, prepared to add value to the organization by aligning with its vision and core values.
Around 5+ years of experience in the IT industry as a DevOps Engineer, covering professional development and the automation of build, deployment, release management, and change management activities on AWS.
Cloud Engineer, Blueberry Digital Labs Pvt Ltd
Senior Cloud Engineer, HCL Technologies
Cloud Engineer, ValueLabs Global Solutions Pvt Ltd
Maven
AWS services
Git
Jenkins
Docker
Apache Tomcat
ServiceNow
SonarQube
Jira
BMC Remedy
Nexus
WebSphere
Apache
Terraform
Kubernetes
Ansible
Grafana
ELK Stack
Amazon EC2
Amazon EKS
VPC
RDS
ELB
S3
SNS
CloudWatch
EBS
IAM
CloudFormation
GitHub
BitBucket
Helm
VSTS
Splunk
Kibana
Hi, my name is Sai Kumar. I have 5 years of experience in AWS and DevOps. On the AWS side, I have worked with services across compute, storage, database, networking, management and governance, application integration, security and identity, and containerization. On the DevOps side, I have worked with Bitbucket administration, Jenkins, Docker, Kubernetes, and Terraform.

In compute, I have worked with EC2 instances, launch templates, AMIs, volumes, snapshots, target groups, security groups, load balancers, and Auto Scaling. I have also worked with Lambda, where we built serverless processes using API Gateway with Lambda functions, and with Elastic Beanstalk: whenever our development teams needed a server for development activities, we would stand one up on Elastic Beanstalk as a proof of concept (POC).

For storage, I have worked with S3 buckets, objects, and the different storage classes, and integrated EFS and FSx for data sharing across machines through a jump host. I have also used AWS Backup to take frequent backups ahead of any deployments. For databases, I have worked with MySQL on RDS.

On the networking side, I automated the network build with Terraform, creating VPCs, route tables, public and private subnets, and internet gateways, and used Cloudflare as the content delivery network. I also have good knowledge of establishing a VPN from on-premises to AWS using IPsec, and of troubleshooting NAT gateways and transit gateways.

For monitoring, I have worked with CloudWatch, and for auditing we used CloudTrail to investigate any network issues in the VPCs. I have worked with IAM, creating roles, users, and policies, and enforced multi-factor authentication for all users across the project. I have also worked with Secrets Manager, KMS, and Certificate Manager for SSL certificates, supplying details such as the CSR, key files, and certificates, including SHA-256 certificates.

I have also worked with Docker and Kubernetes, including ECR and EKS at the orchestration level, and with Terraform to create resources such as VPCs and subnets. On the CI/CD side, we used pipeline stages to pull the source code, build, test, and deploy, with different deployment types such as container-based deployments.
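The transcript above mentions automating VPC, subnet, route table, and internet gateway creation with Terraform. As a hedged sketch of that kind of network automation, shown here as CloudFormation YAML for consistency with the other examples (CIDR ranges and resource names are assumptions, not from the transcript):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  AppVpc:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16          # assumed CIDR range
      EnableDnsSupport: true
      EnableDnsHostnames: true
  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref AppVpc
      CidrBlock: 10.0.1.0/24
      MapPublicIpOnLaunch: true
  InternetGateway:
    Type: AWS::EC2::InternetGateway
  GatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref AppVpc
      InternetGatewayId: !Ref InternetGateway
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref AppVpc
  PublicRoute:
    Type: AWS::EC2::Route
    DependsOn: GatewayAttachment
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0      # default route out through the internet gateway
      GatewayId: !Ref InternetGateway
  PublicSubnetAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet
      RouteTableId: !Ref PublicRouteTable
```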
In Kubernetes, we create Pods, Deployments, StatefulSets, and ReplicaSets. We use StatefulSets where the workload is stateful and database-related, for example banking-related workloads. For services, we use the ClusterIP, NodePort, and LoadBalancer types. For configuration management we use Ansible, writing playbooks and scripts to patch the Linux OS and scheduling automated jobs to apply updates regularly, including kernel updates that require a reboot. For cluster-level patching we use DaemonSets for OS-level work beyond unattended upgrades, and we drain and cordon/uncordon nodes during patching, taking one node out of scheduling at a time, with cluster autoscaling in place. On the CI/CD side, we integrate Jenkins and Git to create the jobs that trigger OS patching. For monitoring and alerts we use Prometheus and Grafana, with notifications on patch status and security vulnerabilities, and we make sure regular backups are taken before patching.
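A minimal sketch of the kind of Ansible patching playbook described above (the inventory group name is an assumption, and Debian/Ubuntu hosts are assumed for the reboot check):

```yaml
# patch-linux.yml - hypothetical playbook; "linux_servers" is an assumed inventory group
- name: Apply OS patches on Linux hosts
  hosts: linux_servers
  become: true
  tasks:
    - name: Upgrade all packages (Debian/Ubuntu assumed)
      ansible.builtin.apt:
        upgrade: dist
        update_cache: true

    - name: Check whether a kernel update requires a reboot
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: reboot_flag

    - name: Reboot the host when required
      ansible.builtin.reboot:
        reboot_timeout: 600
      when: reboot_flag.stat.exists
```

Run against one drained node at a time (for example with a limited inventory or `--limit`) so the uncordon step can follow the patching, as described above.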
Suggest an automated approach to scaling Kubernetes deployments in response to increased resource usage, such as CPU and memory.

We configure the Horizontal Pod Autoscaler (HPA) to scale on CPU and memory and on custom metrics such as web traffic. First we set resource requests and limits in the Deployment manifest, so the HPA has CPU and memory figures to work from. Then we create the HPA object, defining the horizontal pod autoscaler based on CPU utilization or custom metrics: we write a YAML file with the apiVersion, kind, metadata, and spec sections and apply it with kubectl. The Cluster Autoscaler is also available; it automatically adjusts the size of the Kubernetes cluster based on the resources requested by pods, scaling out when pods are pending due to a lack of resources. To set it up, we install the Cluster Autoscaler in EKS (it may already be available and only needs to be enabled) and configure the cluster with parameters such as the minimum and maximum number of nodes. The node autoscaling behavior follows the HPA: when the HPA increases the number of pods and the existing nodes are insufficient, the Cluster Autoscaler adds nodes, and when the HPA decreases the number of pods, it removes unused nodes. We can also scale on custom metrics with the HPA, for example web traffic, HTTP requests per second (RPS), or latency. To expose these, we use Prometheus and Grafana for the graphical view and configure a Kubernetes metrics adapter, setting up Prometheus metrics such as http_requests_total to drive the autoscaling decision. With an ingress controller such as NGINX Ingress, we monitor the number of requests hitting the ingress and scale the pods out accordingly.
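A minimal HPA sketch matching the approach above (the Deployment name and the 70% threshold are assumptions):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app                # hypothetical Deployment; it must define CPU/memory requests
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU utilization crosses 70%
```

Applied with `kubectl apply -f hpa.yaml`; custom metrics such as requests per second would go under a `Pods` or `External` metric entry once a metrics adapter is in place.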
Propose a CI/CD pipeline design for an application deployed across AWS and Azure that integrates end-to-end testing.

For the continuous integration pipeline, the source is the SCM, and the pipeline is triggered when code is pushed to a specific branch such as main or master. The build stage builds the application and the Docker images, setting the image name and version, and pushes them to a container registry such as AWS Elastic Container Registry. Unit tests run during the CI stage to confirm basic application correctness before moving forward, using tools such as JUnit and PyTest, alongside static code analysis and security scanning. Artifacts, test reports, and other outputs are stored in a central location such as AWS S3. The continuous deployment pipeline then deploys through the environments to container platforms such as Amazon Elastic Kubernetes Service or Amazon ECS, with the infrastructure provisioned as code using CloudFormation or Terraform. After deployment to a stage environment, the end-to-end (E2E) tests run to validate the application's behavior in a realistic environment, using an E2E testing framework integrated into the pipeline after the deployment stage. We also have approval workflows: if the tests pass, notifications are sent and a manual approval gates promotion to the next environment.
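A hedged sketch of the CI stage above as an AWS CodeBuild buildspec (the registry variable, image name, and test layout are assumptions; it also assumes the build image has Docker, Python, and pytest available):

```yaml
version: 0.2
phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin $ECR_REGISTRY
  build:
    commands:
      # build and tag the image with the commit that triggered the pipeline
      - docker build -t $ECR_REGISTRY/my-app:$CODEBUILD_RESOLVED_SOURCE_VERSION .
      # run unit tests before anything is pushed
      - python -m pytest tests/unit --junitxml=reports/unit.xml
  post_build:
    commands:
      - docker push $ECR_REGISTRY/my-app:$CODEBUILD_RESOLVED_SOURCE_VERSION
artifacts:
  files:
    - reports/**/*        # test reports archived centrally (e.g. to S3) as described above
```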
For backing up and restoring stateful applications in Kubernetes, the components involved are PersistentVolumes, PersistentVolumeClaims, StatefulSets, and the Kubernetes configuration. The overall approach is to back up the persistent volume data, back up the configuration (the YAML manifests and Secrets), add monitoring and automation, and define restore procedures so that data and configuration can be restored if a failure happens. For the data, we mostly back up the persistent volumes using cloud-native tooling, for example creating EBS snapshots of the volumes backing the PVs with the AWS CLI, supplying the volume ID in the command. Kubernetes also provides a VolumeSnapshot API that can be used to take snapshots of persistent volumes when the storage provider supports it. Applications can also back up data via sidecar containers that copy it to external backup storage such as S3. For the Kubernetes configuration, we back up ConfigMaps and Secrets (the secrets and keys kept in YAML) and, beyond individual PVCs, the wider Kubernetes resources such as StatefulSets and Secrets, along with the CI/CD configuration: if the stateful applications depend on external tools such as Jenkins or GitLab, we make sure the pipelines and their secrets are backed up as well. For automating and monitoring the backups, we use CronJobs to schedule the backup process at a particular time, and we monitor the backup jobs and volume snapshots with Prometheus and Grafana. Restores for the stateful applications then come from the cloud provider side, for example restoring from the EBS snapshots.
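A minimal sketch of the VolumeSnapshot API mentioned above (requires a CSI driver with snapshot support; the names and snapshot class are assumptions):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: app-data-snapshot               # hypothetical snapshot name
spec:
  volumeSnapshotClassName: csi-aws-vsc  # assumed VolumeSnapshotClass backed by the EBS CSI driver
  source:
    persistentVolumeClaimName: app-data # hypothetical PVC used by the StatefulSet
```

Applied with `kubectl apply -f snapshot.yaml`; a restore would then create a new PVC with this snapshot as its `dataSource`.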
For deploying Lambda function updates through CI/CD pipelines in AWS, we use AWS CodePipeline with source, build, test, and deploy stages. The source stage is the SCM; the build stage packages the Lambda function and is run by CodeBuild; testing at this point is optional, for example unit tests with JUnit. For deployment, after the build completes and the tests pass, we update the Lambda function code using the AWS CLI or AWS CloudFormation, so the CodePipeline setup consists of a source stage, a build stage, and a deploy stage. Lambda supports versioning, which allows you to roll back to a previous version if issues come up, and canary and linear deployments can be used for the rollout. For monitoring and alerts, we use CloudWatch metrics with SNS notifications integrated into the pipeline.
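One way to get the versioned, canary-style Lambda rollout described above is AWS SAM (a CloudFormation extension); this is a hedged sketch rather than the exact setup from the transcript, and the function name, handler, runtime, and code path are assumptions:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  MyFunction:                           # hypothetical function
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.12
      CodeUri: ./src
      AutoPublishAlias: live            # publishes a new version and shifts the alias on each deploy
      DeploymentPreference:
        Type: Canary10Percent5Minutes   # canary rollout: 10% of traffic for 5 minutes, then the rest
```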
The SSH key name and path were hard-coded, which is a security risk: hard-coding SSH key names or paths leaves them without any encryption or access control. Instead, the required SSH key should be passed in as a variable with no default value and managed securely through a secrets store.
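As one illustration of the "managed through secrets" point, a minimal sketch of mounting an SSH key from a Kubernetes Secret instead of hard-coding it (the names are hypothetical and this is only one of several possible secret stores):

```yaml
# Secret created out of band, never committed to Git, e.g.:
#   kubectl create secret generic deploy-ssh-key --from-file=id_rsa=<path supplied by the operator>
apiVersion: v1
kind: Pod
metadata:
  name: deploy-job                  # hypothetical pod that needs the key
spec:
  containers:
    - name: deployer
      image: alpine:3.20
      command: ["sleep", "3600"]
      volumeMounts:
        - name: ssh-key
          mountPath: /etc/ssh-key   # key is injected at runtime, not baked into the image or repo
          readOnly: true
  volumes:
    - name: ssh-key
      secret:
        secretName: deploy-ssh-key
```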
How would you design a system to autoscale concurrent simulation and learning workloads in a hybrid setup, using the new parameters?
For methodologies, we align with SOC 2 (System and Organization Controls) and the GDPR (General Data Protection Regulation). We use security by design and threat modeling, a compliance-first design, and data classification to identify classified data, for example sensitive data, so the appropriate security measures can be applied, encrypting it with TLS to protect sensitive data in transit in line with SOC 2 and GDPR requirements. For secure development practices we use dynamic application security testing and code reviews, and for DevSecOps integration we add security gates to the CI/CD pipelines, for example using Trivy for container vulnerability scanning and Open Policy Agent for policy checks. For data handling and privacy controls we apply data retention policies, data subject access requests, and audit logging. For continuous monitoring and auditing we use a security information and event management (SIEM) tool such as Splunk. For access control and identity we use role-based access control and multi-factor authentication, enforcing MFA for privileged actions such as user creation to reduce the risk of unauthorized access. For vulnerability management we have regular patching, vulnerability scanning, and penetration testing, along with incident response planning and data breach notifications. For data governance and third parties we have vendor management, data processing agreements, and regular audits and compliance reviews. In short, we follow a secure SDLC process and apply these standards whenever we deploy an application.
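A hedged sketch of the Trivy security gate mentioned above, written as a GitLab CI job fragment (the stage name, image tag variables, and severity threshold are assumptions; it assumes the image has already been pushed and registry credentials are configured):

```yaml
# .gitlab-ci.yml fragment: fail the pipeline if HIGH/CRITICAL vulnerabilities are found
container_scan:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    - trivy image --severity HIGH,CRITICAL --exit-code 1 "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```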
For optimizing a Kubernetes cluster deployment for a computer vision model: first, optimize the node configuration, making sure the Kubernetes nodes have GPUs (for example NVIDIA) and that GPU scheduling and resource allocation are set up by enabling the device plugin (for example the NVIDIA device plugin) on the cluster, so Kubernetes can schedule pods onto GPUs. Second, containerize the computer vision model efficiently: optimize the Python Docker images with multi-stage builds, include only the necessary Python dependencies, and package the model in a portable format such as ONNX (Open Neural Network Exchange). At the cluster level, use the Horizontal Pod Autoscaler together with the Vertical Pod Autoscaler, which automatically adjusts the CPU and memory requests of the pods based on actual usage, optimizing resource usage, avoiding over-provisioning, and keeping deployments efficient. For data and model storage, use persistent storage for large datasets, such as Amazon S3 or a distributed storage solution, and implement model caching in the pods to reduce load times. Optimize model serving with load balancing and inference scaling, and control latency with node affinity rules and tolerations so pods land on the right nodes. For monitoring and logging, set up GPU utilization monitoring and centralized logging, for example with Elasticsearch. For security and compliance, apply Kubernetes network policies, secrets management, data privacy controls, and pod security policies. For model versioning and continuous deployment, use CI/CD for model updates with canary or blue/green deployment methods and explicit model versioning. For cost optimization, use spot instances for non-critical tasks, right-size the resources, and scale the load efficiently.
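A minimal sketch of the GPU scheduling piece described above (it assumes the NVIDIA device plugin is installed; the image, node label, and taint are assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cv-inference                # hypothetical computer vision inference service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cv-inference
  template:
    metadata:
      labels:
        app: cv-inference
    spec:
      nodeSelector:
        accelerator: nvidia-gpu     # assumed label on the GPU node group
      tolerations:
        - key: nvidia.com/gpu       # assumed taint keeping non-GPU pods off these nodes
          operator: Exists
          effect: NoSchedule
      containers:
        - name: model-server
          image: my-registry/cv-model:1.0   # hypothetical image
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
            limits:
              nvidia.com/gpu: 1     # GPU requested via the device plugin's extended resource
              memory: 4Gi
```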