16+ years of IT experience in software configuration, design, development, and cloud implementation using AWS Cloud, DevOps tools, ETL, and Cognos BI; 6 years of extensive hands-on experience with AWS Cloud services and DevOps tools and technologies such as Kubernetes, OpenShift, Jenkins, Maven, Terraform, Ansible, Docker, ELK for application logging, Prometheus/Grafana for monitoring, Azure DevOps, Unix, and Bash scripting, along with SRE activities.
Experienced in designing and deploying highly available, cost-effective, automated, and fault-tolerant environments for application and database servers using OpenShift, Kubernetes, AWS EKS, ECS, ELB, Auto Scaling, RDS, CloudFront, Route 53, Secrets Manager, DMS, SCT, CloudWatch, SQS, Lambda, and SNS.
Actively involved in the OpenShift v3 to v4 migration and responsible for the non-prod OpenShift cluster.
Executed various cost-saving measures and modifications to achieve a substantial reduction in cloud expenditure and stay within budget.
Experienced in database migration using AWS DMS and AWS SCT for IBM DB2 and MySQL.
Extensively used Docker/Kubernetes (AWS EKS) and Helm for containerization and for deploying and scaling applications securely across clusters of hosts, speeding up build/release engineering through different deployment strategies.
Involved in DB administration such as backup and restore and DB cluster setup; Oracle to PostgreSQL migration.
Experienced in writing Python scripts for automating backups across multiple AZs and an AppScan utility to check image vulnerabilities.
Actively involved in the migration from Jenkins to Azure DevOps (ADO) and various tool integrations.
Implemented a CI/CD pipeline involving ADO, Jenkins, Ansible, and Terraform for complete automation from commit to deployment.
Implemented the necessary controls to ensure the security of the cloud infrastructure where all applications and client data are hosted, using services such as WAF, Shield, GuardDuty, Secrets Manager, Checkov, and AppScan.
Good experience in handling errors and exceptions in large-scale applications.
Created and managed SSL/TLS certificates for securing the OpenShift cluster and applications.
Responsible for managing the non-prod OpenShift cluster and the various stakeholders using our cluster.
Set up observability tools such as ELK, Prometheus, Grafana, Istio, Jaeger, and Kiali for AWS services and the Kubernetes cluster to ensure the integrity and reliability of the system.
Strong knowledge of BI tools such as Cognos (Report Studio, Framework Manager, Cube Designer (Dynamic Cubes), IBM Workspace Advanced, Query Studio, Cognos 10, 10.2.2, and IBM Cognos Analytics) and Qlik Sense, and ETL tools such as Informatica.
DevOps Lead, Standard Chartered Global Business Services
Technical Lead, IBM India Private Limited
Software Engineer, CSC India Private Limited
Associate Trainee, Birlasoft India Private Ltd
MySQL
AWS Security Hub
AWS (Amazon Web Services)
Ubuntu
PostgreSQL
Microsoft SQL Server
Amazon Redshift
R
Linux Admin
Apache
Nginx
Jenkins
GitHub
Azure
Git
Kubernetes
Code Review
Ansible
Terraform
Bash
Helm
AWS Lambda
AWS Cloud
Amazon EKS
Amazon EC2
Docker
Azure DevOps Server
Project 1: Risk View
Domain: Banking
Role: DevOps Lead
Tools/Environment: AWS, EKS, Terraform, OpenShift, Kubernetes, Git, Shell Script, Python, Jenkins, Azure DevOps, ELK
Project Description:
Riskview is a cloud-native application platform under the Risk and CFCC function, comprising multiple workflow management applications and APIs. It is built on a micro frontend and microservice architecture and deployed on Kubernetes.
Roles and Responsibilities:
Project 1:
Client: Genomics England Ltd, UK
Domain: Healthcare
Role: AWS DevOps Engineer
Tools/Environment: AWS, GIT, Terraform, Ansible, Python, Jenkins, EKS, AWS Systems Manager
Project Description:
Genomics England is a British company set up and owned by the United Kingdom Department of Health and Social Care to run the 100,000 Genomes Project. GEL has partnered with the innovative British deep-tech company Lifebit and uses the global cloud provider Amazon Web Services (AWS) to power the platform, relying on AWS's scalable and secure cloud computing and storage infrastructure in the UK and enabling access, analysis, and collaboration through Lifebit's unique technology platform.
Roles and Responsibilities:
Project 2:
Client: Pitney Bowes AMS, USA
Domain: E-commerce
Role: AWS DevOps, SRE Engineer
Tools/Environment: AWS ECS, DMS, SCT, AWS CodePipeline, ELK, Prometheus, Grafana, Docker
Project Description:
The company provides mailing and shipping services, global e-commerce logistics, and financial services to approximately 750,000 customers globally, and Pitney Bowes is a certified "work-share partner" of the United States Postal Service. Pitney Bowes selected AWS based on three critical factors: cost-effective infrastructure, reliability, and productivity.
Roles and Responsibilities:
Roles and Responsibilities:
Skills: Informatica, Cognos, DB2
Roles and Responsibilities:
Skills: Manual Testing
Yes. I have around 16 years of IT experience, with close to 5 years in AWS DevOps. I have experience with AWS database migration using DMS and SCT. Post migration, we used AWS ECS for deployment and AWS CodePipeline for CI/CD. Later we migrated from ECS to EKS due to scaling issues, and from AWS CodePipeline to Jenkins. As part of the AWS cloud implementation I have exposure to different AWS services: setting up the infrastructure with VPCs and subnets on the networking side, along with EC2 for compute and EBS and EFS for the volume part. When it comes to security I have exposure to IAM and AWS Config, and CloudWatch for monitoring; for security purposes I have also used AWS WAF, AWS GuardDuty, and Amazon Inspector. I have experience using Route 53 for DNS and ACM, Amazon Certificate Manager, for certificates, and for patch management we are currently using AWS Systems Manager. That covers the AWS part briefly. Coming to DevOps, I have exposure to Kubernetes and OpenShift, and for CI/CD pipelines I have exposure to Jenkins; we are also using Azure DevOps. For scripting we mainly use Bash and Python. I missed two points: we use Terraform to provision our cloud infrastructure, along with configuration management using Ansible. So that is the overall journey, around 6 years of cloud and DevOps so far.
How do you diagnose and fix a bottleneck in the CI/CD pipeline that is causing slow deployment times to AWS ECS? Okay. First I would look at how the ECS task and service are configured behind the load balancer. Suppose you are running a compute-heavy application but not on the right underlying EC2 instance type; in that case you can migrate to a compute-optimized (C-series) instance type. For the volume type, we can move from gp2 to gp3, which gives better throughput and IOPS. On the networking side, check how the load balancing is set up; it should follow a multi-AZ design. Coming to the CI/CD pipeline itself, we can check how the code is managed in GitHub, or in AWS CodeCommit if you are using that, and remove unnecessary files from the repository. Another bottleneck is vulnerabilities in the pipeline: if you commit secrets to the repo, the pipeline will fail when that check is part of it, and the same applies if image scanning is not done properly. Or suppose you push your image to AWS ECR and do not have the permission; that will definitely block the pipeline, so the underlying IAM permissions should be granted before the pipeline is implemented. Check that all these security controls are in place up front so they do not become a bottleneck later or turn into a production issue. Also, for builds, build once in the lower environment; there is no need to rebuild the application in the higher environments. Instead, leverage Helm charts and manage the values for each environment properly.
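As an illustration of the image-scanning gate mentioned above, a minimal boto3 sketch might look like the following; the repository name, tag, and region are hypothetical, and ECR image scanning is assumed to be enabled on the repository:

    import boto3

    # Hypothetical repository, tag, and region, for illustration only.
    REPO = "riskview-api"
    TAG = "1.4.2"

    ecr = boto3.client("ecr", region_name="eu-west-2")

    # Pull the scan findings that ECR recorded for this image tag.
    resp = ecr.describe_image_scan_findings(
        repositoryName=REPO,
        imageId={"imageTag": TAG},
    )

    counts = resp["imageScanFindings"].get("findingSeverityCounts", {})
    blocking = counts.get("CRITICAL", 0) + counts.get("HIGH", 0)

    if blocking:
        # Fail fast in the pipeline instead of discovering this at deploy time.
        raise SystemExit(f"{blocking} critical/high findings in {REPO}:{TAG}; blocking promotion")
    print("Image scan clean enough to promote")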
What would be your solution for implementing a multi-region database architecture for high availability and cost optimization using Terraform? Okay. For high availability you can maintain a minimum of two or three availability zones for a particular region, and in Terraform you specify this while creating the VPC. Ideally three availability zones are recommended, because if you are using an AWS EKS cluster for deployment the control plane is automatically provisioned across them as part of the cluster provisioning through Terraform. Multi-region also gives high availability on the database side: you can have replicas of the database, so if you are using an RDS database you can have read replicas across different regions, which keeps the application highly available. Coming to cost, it depends on the resources you are using for EC2. If it is a long-term project you can use Reserved Instances, or you can change to a Savings Plan, which gives the flexibility to use EC2 across instance types and regions; if you may want to change the instance type in between, a Savings Plan is the better choice for cost optimization. Through Terraform you can also create lifecycle rules, for example sending old data and logs to S3 Glacier Deep Archive, which is cheap. For EBS volumes you can use gp3 in place of gp2, which is also cheaper and performs better than gp2. While writing the Terraform configuration you can specify all of this, the gp3 volumes and the EC2 instance type appropriate to your usage. Also, to minimize the EKS cluster cost, check the node costs and specify how much resource is actually required at the pod level; that is another way of minimizing the cost.
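The lifecycle rule described above would normally be declared in Terraform; as a rough sketch of the same idea in Python with boto3 (the bucket name, prefix, and retention periods are hypothetical):

    import boto3

    s3 = boto3.client("s3")

    # Hypothetical bucket and prefix; in the project this would live in Terraform.
    s3.put_bucket_lifecycle_configuration(
        Bucket="riskview-app-logs",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "archive-old-logs",
                    "Filter": {"Prefix": "logs/"},
                    "Status": "Enabled",
                    # After 90 days, push objects to the cheapest archival tier.
                    "Transitions": [
                        {"Days": 90, "StorageClass": "DEEP_ARCHIVE"},
                    ],
                    # Optionally expire them entirely after a year.
                    "Expiration": {"Days": 365},
                }
            ]
        },
    )
    print("Lifecycle rule applied")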
We can use Python scripts for monitoring and alerting on application health metrics. For example, in Kubernetes we have health checks, basically the readiness probe and the liveness probe. We can write a script using Python and boto3 so that whenever a threshold is crossed, say CPU usage above 80%, it sends a notification. In place of the metrics server we can write a script so that another pod is scaled out whenever horizontal pod autoscaling is required. For monitoring we can use a tool like Prometheus: we install the agents on all the EC2 nodes to scrape the metrics from the instances and keep them as a time-series database in Prometheus; that is one use case alongside Python. Also, as part of monitoring and reporting we can use Python libraries like pandas and NumPy, and we can schedule a Python job to take backups of Kubernetes resources, taking a backup of all the objects in case no other cluster backup is being taken. That can also be done.
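A minimal sketch of the threshold-and-notify idea described above, assuming a hypothetical instance ID, SNS topic, and 80% CPU threshold, could look like this with boto3:

    from datetime import datetime, timedelta, timezone
    import boto3

    # Hypothetical instance ID and SNS topic; replace with real resources.
    INSTANCE_ID = "i-0123456789abcdef0"
    TOPIC_ARN = "arn:aws:sns:us-east-1:111122223333:cpu-alerts"
    THRESHOLD = 80.0

    cloudwatch = boto3.client("cloudwatch")
    sns = boto3.client("sns")

    now = datetime.now(timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
        StartTime=now - timedelta(minutes=10),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )

    # Take the latest datapoint, if any, and alert when it crosses the threshold.
    points = sorted(stats["Datapoints"], key=lambda p: p["Timestamp"])
    if points and points[-1]["Average"] > THRESHOLD:
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject="High CPU alert",
            Message=f"{INSTANCE_ID} CPU at {points[-1]['Average']:.1f}% (threshold {THRESHOLD}%)",
        )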
We can use Lambda layers. A layer packages the common modules so the code can be reused across different Lambda functions, and that is one way to manage dependencies in Python. We can also use the Lambda concurrency controls: suppose multiple Lambda functions are running, then we can use provisioned concurrency when it is required, or reserved concurrency, so that a fixed number of concurrent executions is allocated for that particular Lambda function and it does not impact the other Lambda functions running concurrently.
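A small boto3 sketch of the two controls mentioned above, attaching a shared layer and reserving concurrency; the function name, layer ARN, and concurrency value are hypothetical:

    import boto3

    lam = boto3.client("lambda")

    # Hypothetical function and layer ARN, for illustration only.
    FUNCTION = "order-processor"
    LAYER_ARN = "arn:aws:lambda:us-east-1:111122223333:layer:common-deps:3"

    # Attach a shared layer so common Python dependencies are reused across functions.
    lam.update_function_configuration(FunctionName=FUNCTION, Layers=[LAYER_ARN])

    # Reserve a slice of account concurrency so this function cannot starve the others.
    lam.put_function_concurrency(
        FunctionName=FUNCTION,
        ReservedConcurrentExecutions=50,
    )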
When given an AWS infrastructure with scaling issues, how do you apply Terraform to improve scalability? We can make use of AWS Auto Scaling through Terraform. There are different scaling policies, like scheduled scaling and step scaling, and based on those you can scale the number of instances up or down according to the usage and the traffic coming into the Auto Scaling group.
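The scaling policies mentioned above would normally be defined in Terraform; a rough boto3 equivalent, with a hypothetical Auto Scaling group name, schedule, and target value, might be:

    import boto3

    asg = boto3.client("autoscaling")
    ASG_NAME = "web-asg"  # hypothetical Auto Scaling group name

    # Scheduled scaling: add capacity ahead of a known weekday traffic spike.
    asg.put_scheduled_update_group_action(
        AutoScalingGroupName=ASG_NAME,
        ScheduledActionName="scale-up-weekday-mornings",
        Recurrence="0 8 * * MON-FRI",  # cron expression, in UTC
        MinSize=4,
        MaxSize=12,
        DesiredCapacity=6,
    )

    # Target tracking: keep average CPU around 60%, scaling out and in automatically.
    asg.put_scaling_policy(
        AutoScalingGroupName=ASG_NAME,
        PolicyName="cpu-target-60",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
            "TargetValue": 60.0,
        },
    )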
So here, going through the Dockerfile: the RUN npm install step should first install all the dependencies, and that should be on top; the COPY of the application directory should come last, otherwise it adds extra layers when building the image. The EXPOSE instruction should come before the CMD instruction, so that the port is declared and used when the image is built and run. Also, there are two COPY commands; if required we can combine them into a single command to reduce the extra layer in the Docker image. That is one more point. So there are three to four issues in this Dockerfile.
We have used the AWS boto3 library. One AWS automation task was to find the number of inactive IAM users, those who have not been used for more than 90 days. With boto3 we create the IAM client, boto3.client('iam'), and call the list users API; using a paginator, all the IAM users come back page by page. Then we calculate the number of days from the current date back to the last activity, and where the last activity is more than 90 days ago we deactivate those users. That is one use case using Python. One more use case: to take a backup of all the OpenShift objects, we used the os module and the YAML module. With those we log in to the OpenShift cluster using a token, then run commands like oc get for any object and redirect the output into YAML format. That was one use case, to get the list of objects and take a backup from the OpenShift cluster.
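A minimal sketch of the inactive-user check described above, using boto3 with a paginator; the 90-day cutoff matches the answer, while deactivating access keys (rather than deleting users) is an assumption:

    from datetime import datetime, timedelta, timezone
    import boto3

    iam = boto3.client("iam")
    CUTOFF = datetime.now(timezone.utc) - timedelta(days=90)

    def last_activity(user):
        # Best-effort last activity: console password use plus access-key use.
        times = []
        if user.get("PasswordLastUsed"):
            times.append(user["PasswordLastUsed"])
        for key in iam.list_access_keys(UserName=user["UserName"])["AccessKeyMetadata"]:
            used = iam.get_access_key_last_used(AccessKeyId=key["AccessKeyId"])
            when = used["AccessKeyLastUsed"].get("LastUsedDate")
            if when:
                times.append(when)
        return max(times) if times else None

    # Page through every IAM user in the account.
    for page in iam.get_paginator("list_users").paginate():
        for user in page["Users"]:
            seen = last_activity(user)
            if seen is None or seen < CUTOFF:
                print(f"Inactive >90 days: {user['UserName']} (last seen {seen})")
                # Deactivate the user's access keys rather than deleting anything.
                for key in iam.list_access_keys(UserName=user["UserName"])["AccessKeyMetadata"]:
                    iam.update_access_key(
                        UserName=user["UserName"],
                        AccessKeyId=key["AccessKeyId"],
                        Status="Inactive",
                    )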
For the disaster recovery plan: for any stateful application we can take a backup of all our data. Suppose we are using RDS; then we can take either manual or automated backups, and we can also copy that backup to another region to keep as a backup. Suppose we are using us-east-1; then we can copy it to us-east-2, so if all the data centers in us-east-1 are down we can recover from there. Using Route 53 we can configure failover DNS records, so if the primary endpoint is down Route 53 will automatically redirect the traffic to the standby. For a high-traffic web application we can use RDS Proxy, or maybe ElastiCache in AWS, which keeps data in the cache, keeps downtime to a minimum, and maintains consistency. The RDS cluster should also be multi-region. If you are using stateful applications in Kubernetes, we can make use of the PV and PVC concepts so data is not lost, and we should take backups of the PVCs to another region as well. In that way we can plan better disaster recovery, and according to the AWS disaster recovery strategies we can choose backup and restore, pilot light, warm standby, or active-active, whichever plan fits the budget. Keeping an eye on cost, active-active means another instance of our database running all the time in a different region.
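As a sketch of the cross-region backup copy mentioned above, assuming hypothetical region and database names, a boto3 script could copy the latest automated RDS snapshot into the DR region:

    import boto3

    # Hypothetical regions and DB instance name, for illustration only.
    SOURCE_REGION = "us-east-1"
    DR_REGION = "us-east-2"
    DB_INSTANCE = "riskview-db"

    src = boto3.client("rds", region_name=SOURCE_REGION)
    dst = boto3.client("rds", region_name=DR_REGION)

    # Find the most recent completed automated snapshot of the primary database.
    snaps = src.describe_db_snapshots(
        DBInstanceIdentifier=DB_INSTANCE, SnapshotType="automated"
    )["DBSnapshots"]
    available = [s for s in snaps if s.get("Status") == "available"]
    if not available:
        raise SystemExit("No completed automated snapshots found")
    latest = max(available, key=lambda s: s["SnapshotCreateTime"])

    # Copy it into the DR region; boto3 signs the cross-region request when SourceRegion is set.
    dst.copy_db_snapshot(
        SourceDBSnapshotIdentifier=latest["DBSnapshotArn"],
        TargetDBSnapshotIdentifier=f"{DB_INSTANCE}-dr-copy",
        SourceRegion=SOURCE_REGION,
    )
    print(f"Copying {latest['DBSnapshotIdentifier']} to {DR_REGION}")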
So suppose you are using AWS tools; then it is better to avoid Jenkins for CI/CD. What happened in our recent project is that we moved from Jenkins to Azure DevOps because of exactly that: when we committed code into Bitbucket, Bitbucket and Jenkins were not in sync, so whenever a developer committed code the changes were not reflected. So one option is to use AWS CodePipeline with CodeCommit so that this sync issue does not happen. That is one point. Another is scaling: suppose your application grows and thousands of pods are running; then you can use AWS EKS, so you do not have to manage the control plane, and you can integrate your AWS CodePipeline with EKS. For a Python-based application the build phase is not heavy; you do not have builds like Java-based applications with Maven or Gradle, so it should not take much time. You can also check for vulnerabilities in the pipeline so that it does not fail and there is no need to roll back after a production deployment. Also, in the code repository, do not keep unnecessary Python libraries and files.
Integrate and use NoSQL databases like MongoDB, along with database, cloud, and Python skills.