Vetted Talent

Sirish Kumar J

Cloud Solutions Architect with over 17 years of hands-on infrastructure experience, including about 7 years designing and managing resilient public cloud solutions. Demonstrated leadership in optimizing infrastructure, driving scalability, and achieving cost efficiency. Adept at aligning technology initiatives with business objectives.

  • Role

    Senior Cloud Solutions Architect/Platform Engineering

  • Years of Experience

    17 years

Skillsets

  • Cost optimization
  • Windows
  • Team leadership
  • PowerShell
  • Linux
  • Golang programming
  • Data analytics
  • CloudFormation
  • Strategic IT planning
  • ML services
  • Cloud security and compliance
  • IaC - 2 Years
  • Hybrid cloud solutions
  • Performance optimization
  • Generative AI
  • Disaster recovery planning
  • Cloud migration strategies
  • AI
  • Python - 3 Years
  • Cloud architecture design - 7 Years
  • CI/CD pipelines - 5 Years
  • DevOps - 6 Years

Vetted For

8 Skills
  • AWS Solutions Architect (Remote), AI Screening
  • 68%
  • Skills assessed: .NET, CI/CD, AWS Services, IaC, Networking, Docker, Kubernetes, Security
  • Score: 61/90

Professional Summary

17 Years
  • Apr, 2020 - Present (6 yr 2 months)

    Senior Cloud Solutions Architect/Platform Engineering

    GoDigit General Insurance
  • Jun, 2018 - Apr, 2020 (1 yr 10 months)

    Architect Cloud

    Microland Limited
  • Jan, 2013 - May, 2018 (5 yr 4 months)

    Infra Consultant

    NTT Data Global Delivery Services Limited
  • Jul, 2011 - Jan, 2013 (1 yr 6 months)

    Technical Lead

    Cognizant Technology Solutions
  • Aug, 2007 - Jun, 2011 (3 yr 10 months)

    Resource Specialist

    ASAP-Y Sourcing Solutions Pvt. Ltd

Applications & Tools Known

  • Imperva
  • Akamai
  • Kong
  • Rancher
  • AWS Lambda
  • Gradle
  • GitHub
  • Docker
  • Kubernetes
  • Terraform
  • Route53
  • S3
  • RDS
  • DynamoDB
  • SNS
  • SQS
  • AWS CodePipeline
  • CloudFormation
  • CloudWatch
  • AWS IAM
  • EKS
  • AKS
  • Google Cloud Platform (GCP)
  • Golang
  • Jenkins
  • AWS QuickSight
  • Amazon Q
  • Python
  • Windows
  • Linux
  • AWS
  • Azure
  • Dynatrace
  • Google Maps
  • SageMaker
  • Redshift
  • AWS CLI
  • ServiceNow
  • EC2
  • VMware

Work History

17 Years

Senior Cloud Solutions Architect/Platform Engineering

GoDigit General Insurance
Apr, 2020 - Present (6 yr 2 months)
    Led the design and deployment of multi-cloud solutions on AWS and Azure, enhancing scalability and flexibility. Architected comprehensive cloud solutions, implemented Kubernetes, enhanced observability, designed disaster recovery strategies, and automated cloud infrastructure.

Architect Cloud

Microland Limited
Jun, 2018 - Apr, 2020 (1 yr 10 months)
    Designed and administered AWS cloud platforms, executed automation solutions, performed cost analysis, migrated systems to ServiceNow, and managed containerization with Docker and Kubernetes.

Infra Consultant

NTT Data Global Delivery Services Limited
Jan, 2013 - May, 2018 (5 yr 4 months)
    Implemented self-healing infrastructure, designed monitoring solutions, automated AWS environments, managed hybrid cloud environments, and facilitated serverless computing solutions.

Technical Lead

Cognizant Technology Solutions
Jul, 2011 - Jan, 2013 (1 yr 6 months)
    Managed VMware ESXi deployments and administered and troubleshot virtual environments.

Resource Specialist

ASAP-Y Sourcing Solutions Pvt. Ltd
Aug, 2007 - Jun, 2011 (3 yr 10 months)
    Worked on staffing solutions for direct clients and in-house projects, and handled Windows administration and troubleshooting.

Achievements

  • Architected comprehensive AWS and Azure cloud solutions, ensuring optimal performance and scalability.
  • Implemented Kubernetes, DevOps, and DevSecOps practices, enhancing deployment efficiency and security.
  • Enhanced system observability by integrating advanced monitoring and logging tools.
  • Used Google Maps and GCP for basic cloud services and API integration.
  • Designed and executed multi-cloud disaster recovery strategies using cloud-agnostic tools and solutions, ensuring business continuity.
  • Established a greenfield Azure cloud environment for a subsidiary company, providing a robust and scalable infrastructure.
  • Developed self-healing cloud infrastructure, automating recovery processes to maintain uptime and reliability.
  • Implemented Dynatrace for advanced application performance monitoring and optimization.
  • Deployed security tools such as Web Application Firewalls (WAF) and Network Firewalls, along with API Gateway configurations, to safeguard cloud environments.
  • Led multi-cloud cost optimization initiatives, leveraging AWS and Azure cost management tools to reduce expenditures.
  • Managed FinOps and team activities, fostering collaboration and ensuring the successful delivery of cloud projects.
  • Managed cloud budgets, optimizing costs and aligning expenditures with organizational goals.
  • Automated cloud infrastructure using Golang, including CI/CD deployment automation and automated Kubernetes cluster upgrades and migrations.

Major Projects

5 Projects

AWS and Azure Cloud Platform Solutions

    Architected and deployed multi-cloud infrastructure, implemented Kubernetes and DevOps practices, integrated observability, and executed disaster recovery strategies.

Sentiment Analysis Solution Redesign

    Redesigned solutions for optimal response time and cost-effective infrastructure, collaborated with AI & ML teams, and provisioned necessary infrastructure.

SERCO, UK

Jun, 2018 - Apr, 2020 (1 yr 10 months)
    IaaS (Infrastructure as a Service) for an AWS cloud environment with 200+ AWS instances across multiple Availability Zones.

Forethought Financial Group Acquisition by Global Atlantic

Jan, 2013 - May, 2018 (5 yr 4 months)
    Proficient in automating, configuring, and deploying instances on AWS, with expertise in designing and deploying a variety of applications using the AWS stack, including EC2, Route53, S3, RDS, DynamoDB, SNS, SQS, and Lambda, focusing on high-availability, fault tolerance, and auto-scaling in AWS CloudFormation.

Comcast Converged Products (CCP)

Jul, 2011 - Jan, 2013 (1 yr 6 months)
    Installed, configured, administered, and troubleshot VMware ESXi 4.x and 5.x and Virtual Center.

Education

  • Bachelor of Technology in Electronics & Communication Engineering

    JNTU Hyderabad (2005)

Certifications

  • AWS Certified Solutions Architect Associate (AWS SAA)

  • AWS Certified Solutions Architect Professional (AWS SAP)

  • ITIL Foundation Certificate in IT Service Management (ITIL)

AI-interview Questions & Answers

Yes, of course. I have around 17 years of experience in infrastructure. I started my career as an administrator and then moved into virtualization, where VMware was the key component. A couple of years later, following client requirements and the technology shift, I moved to cloud infrastructure. For the past 7 years, I have been managing and architecting public cloud infrastructure. The organization I currently work for is a product-based fintech company, and I manage its complete cloud infrastructure. We have a multi-cloud environment. Recently, we introduced a new company, and for it we created a complete end-to-end greenfield infrastructure, based on our knowledge, our experience, and the drawbacks we had accumulated. I've been working with AWS and Azure, and we had a multi-cloud requirement, so I completed the research, prepared a multi-cloud assessment, and presented it. We are going in that direction. The plan is this: we have two companies, one in AWS and the other in Azure. We are preparing a DR (disaster recovery) plan in which one company's DR sits in the other cloud and vice versa, so it's a completely cross-cloud DR. We also have a near-DR for smaller applications or quick high availability. In AWS, or in any public cloud, high availability comes from using multiple Availability Zones.
So we are creating a near-DR for a few applications, and for a few applications that we found to be very critical, and for compliance purposes, we created a multi-cloud DR. As for skills, as I said, I've been working with AWS for almost 7 years and with Azure for almost 3 to 4 years. I've initiated and worked on many projects, such as implementing self-healing mechanisms for an application, creating a complete monitoring setup for the company, and creating DR for the companies based on budget and compliance requirements, among many others.

What strategy would you employ to ensure disaster recovery for critical AWS applications?
There are multiple strategies, depending on the business need and the company budget. One of them is a near-DR. For example, in India we have Mumbai as one available region, and there is a newly created region, the Hyderabad region. Before, we didn't have a second region, so we never had a near-DR or a DR for our infrastructure in AWS, because company or country rules say our data should not reside in another region. So instead, we did a multi-cloud DR for one of our critical applications, our core application and its database. Now that the new region has come up and is fully functional, we plan to do a DR there for our core application. To do this, we listed all the critical applications and services we use in AWS, and based on that we created a setup in that region with similar applications. For example, we have an RDS instance running PostgreSQL, and we created a multi-region setup for it. There are two options. One is to have a read replica in another region and use it as the DR instance: the primary is in one region and the secondary is in another. There will be some latency, because all read calls have to go to the other region. What we did instead was keep a primary and a secondary in one region and run another standby instance in the other region. When we have to fail over for DR, we promote the standby instance in the other region. It becomes the primary, and the other two become the secondary and standby.
That is how we planned for the database and the applications. It's a completely loosely coupled architecture. About 80% of our workload is in a microservice environment, and the rest is in a scaling mechanism. We transferred the images from one region to the other and created the infrastructure setup there. In case of a DR, we just deploy there, since the S3 bucket and CodePipeline are already set up. Whenever we need a DR, we scale up the servers in the other region, and they become available to the application. The other strategy is multi-cloud DR. To achieve it, we mostly worked on removing dependencies on the public cloud provider's native services and moving them to third-party tools. Based on that, we had a multi-cloud setup.
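
The cross-region standby described above can be sketched as two request builders, a minimal sketch rather than the setup actually used. It assumes the keyword arguments would be passed to the boto3 RDS calls `create_db_instance_read_replica` and `promote_read_replica` from a client created in the DR region. The instance identifiers and the account number in the ARN are hypothetical, and the region codes for Mumbai and Hyderabad are filled in for illustration.

```python
from typing import Any, Dict

# Assumed regions for illustration: Mumbai as primary, Hyderabad as DR.
PRIMARY_REGION = "ap-south-1"
DR_REGION = "ap-south-2"


def replica_request(source_arn: str, replica_id: str) -> Dict[str, Any]:
    """Keyword arguments for rds.create_db_instance_read_replica.

    The call is meant to be issued by a client created in DR_REGION.
    A cross-region replica references its source by ARN.
    """
    return {
        "DBInstanceIdentifier": replica_id,
        "SourceDBInstanceIdentifier": source_arn,
        "SourceRegion": PRIMARY_REGION,
    }


def promote_request(replica_id: str) -> Dict[str, Any]:
    """Keyword arguments for rds.promote_read_replica during a failover."""
    return {"DBInstanceIdentifier": replica_id}


if __name__ == "__main__":
    # Hypothetical identifiers, not taken from the interview.
    arn = "arn:aws:rds:ap-south-1:123456789012:db:core-db"
    print(replica_request(arn, "core-db-dr"))
    print(promote_request("core-db-dr"))
```

Keeping the requests as plain dictionaries lets the failover runbook be reviewed and tested without touching a live database.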

How do you ensure data encryption compliance for data at rest and in transit within AWS?
For encryption, we use SSL/TLS certificates at the load balancer level. Our servers are in a scaling mechanism: we use Auto Scaling, the code is in Bitbucket, and we have a launch template for it. Most of the applications are in a scaling mechanism; we don't have standalone, independent servers each running an application. For encryption, we have multiple things. We have a domain hosted in AWS, and we use the ACM service, where we upload our domain certificate. From there, it's integrated with all our services. That is one level of encryption available in AWS. Another is at the EBS volume level: all our volumes are encrypted, which gives us encryption at rest. For in transit, we use our SSL certificates, so every communication between those services and from the public is encrypted. This is how we achieved encryption for data at rest and in transit.
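
As a hedged illustration of checking the "all volumes are encrypted" claim, the sketch below scans a response shaped like the output of the EC2 `describe_volumes` call and lists volumes whose `Encrypted` flag is false. The sample data and volume IDs are made up, and the function name is hypothetical.

```python
from typing import Any, Dict, List


def unencrypted_volume_ids(describe_volumes_response: Dict[str, Any]) -> List[str]:
    """Return the IDs of volumes that are not encrypted at rest."""
    return [
        volume["VolumeId"]
        for volume in describe_volumes_response.get("Volumes", [])
        if not volume.get("Encrypted", False)
    ]


if __name__ == "__main__":
    # Made-up sample in the shape of an EC2 describe_volumes response.
    sample = {
        "Volumes": [
            {"VolumeId": "vol-0aaa", "Encrypted": True},
            {"VolumeId": "vol-0bbb", "Encrypted": False},
        ]
    }
    print(unencrypted_volume_ids(sample))  # ['vol-0bbb']
```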

We have many security services in AWS for audit or compliance purposes; to name a few. We have VPC Flow Logs, which cover all the activity at the VPC or network level: traffic in and out, what kind of requests we are getting, which requests are blocked or rejected, and where they are failing. We can track all of that from VPC Flow Logs. We have the CloudTrail service, which is enabled by default by AWS. Every API call to the cloud and every management activity is stored in CloudTrail. We have services like Config, which tracks and records configuration changes to each service. All these logs are accumulated and can be brought into Security Hub, where they can be audited. It gives us a score based on the compliance standard we selected. For example, AWS has its own compliance standard. We can enable it and see which of our workloads fall under it. For ISO 27001 or similar, it will be audited as well. We can enable standards based on our client, our workload, or our company requirements and get an automatic security audit. Another tool is AWS Trusted Advisor, where for each service you can find security loopholes or findings, such as public IPs left open. It is a very good, must-have service, and it is already enabled in AWS. With the combination of all these services, we can identify our security findings, audit them, and take action.

How do you maintain historical records and audit changes to AWS instances? How do you leverage AWS CloudTrail?
CloudTrail is a service offered by AWS, which is enabled by default. You can enable a stream that sends events to CloudWatch log groups. Another stream can be configured for S3, where you can store historical data for audit or compliance requirements for as long as required, such as three months, six months, or one year. To monitor CloudTrail, you can use CloudWatch, where you can see security findings, breaches, or other issues you want to audit, or find the root cause of failures. Config is a service that records configuration changes made to a particular service. For example, if you enable Config for the EC2 service, it will record all changes, such as who stopped the instance, who took a snapshot, or who added tags. These logs can be forwarded to S3 for storage, and from there you can use Athena for querying, or a QuickSight dashboard that integrates with Athena, to create a security dashboard.
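
The retention periods mentioned above could be expressed as an S3 lifecycle rule. This is a minimal sketch, assuming the dictionary would be passed as `LifecycleConfiguration` to the boto3 S3 call `put_bucket_lifecycle_configuration`; the `AWSLogs/` prefix and the day counts chosen for each period are assumptions for illustration.

```python
from typing import Any, Dict

# Assumed day counts for the retention periods named in the answer.
RETENTION_DAYS = {"three_months": 90, "six_months": 180, "one_year": 365}


def retention_lifecycle(prefix: str, retain_days: int) -> Dict[str, Any]:
    """Build a lifecycle configuration that expires objects under a prefix."""
    if retain_days <= 0:
        raise ValueError("retain_days must be positive")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{retain_days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": retain_days},
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical prefix for delivered audit logs.
    print(retention_lifecycle("AWSLogs/", RETENTION_DAYS["one_year"]))
```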

What is the process to securely manage sensitive configuration?
Sensitive configuration includes very important secrets for applications, which need to be managed in a highly secure environment. AWS offers secret management services, such as AWS Key Management Service (KMS), where you can store secrets for any application; with an API call you can retrieve the data, and your application can use it. There are different kinds of services. KMS is one of them, and it is for applications that can integrate with it. If you really need a secure environment for audit purposes or other government requirements, you can have a dedicated Hardware Security Module (HSM) as well. It is a bit costly, but it gives you high-level security: hardware devices that are dedicated to you. Now, sensitive configuration. From my experience, what we did was remove all secrets from the application code. Take the example of our microservices, which run on an AKS infrastructure. The dev team had stored the application secrets in the code itself, which created a security loophole. So we removed all the secrets from the code, and for communication purposes we integrated with AWS Secrets Manager. Recently, we got an endpoint to use it, but before that they were using a server and a third-party service, and the secrets were stored in the code. That was a security risk we found, and we moved completely to using a role and endpoints, a VPC endpoint and an S3 endpoint, to store the data. The same applies not only to S3 but to any other application that needs to communicate with or use a third-party service that the dev team requires: we prefer to store the secrets separately.
So we moved our strategy to a secret config service in the EKS environment, and we stored the secrets in EKS. Every time the application requires that information, it makes an API call to the secret service, gets the secret from there, and carries the request forward. This is the setup we have implemented.
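
The pattern of fetching a secret with an API call instead of keeping it in code can be sketched as follows. This is an illustration under assumptions, not the team's implementation: `SecretReader` is a hypothetical helper, the client is expected to behave like the boto3 Secrets Manager client (`get_secret_value(SecretId=...)` returning a `SecretString`), and `FakeSecretsClient` stands in for it so the sketch runs offline.

```python
import json
from typing import Any, Dict


class SecretReader:
    """Reads JSON secrets through a Secrets Manager-style client and caches them."""

    def __init__(self, client: Any) -> None:
        self._client = client
        self._cache: Dict[str, Dict[str, Any]] = {}

    def get(self, secret_id: str) -> Dict[str, Any]:
        # Only call the secret service the first time a secret is requested.
        if secret_id not in self._cache:
            response = self._client.get_secret_value(SecretId=secret_id)
            self._cache[secret_id] = json.loads(response["SecretString"])
        return self._cache[secret_id]


class FakeSecretsClient:
    """Offline stand-in for a real Secrets Manager client (illustration only)."""

    def __init__(self) -> None:
        self.calls = 0

    def get_secret_value(self, SecretId: str) -> Dict[str, str]:
        self.calls += 1
        return {"SecretString": json.dumps({"username": "app", "password": "example-only"})}


if __name__ == "__main__":
    client = FakeSecretsClient()
    reader = SecretReader(client)
    print(reader.get("orders/db")["username"])
    reader.get("orders/db")
    print("API calls made:", client.calls)
```

Caching keeps the number of API calls down, at the cost of needing a refresh strategy when secrets rotate, which this sketch leaves out.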

This policy is written in JSON. There is a mistake that prevents it from executing as intended. Can you spot and explain the error?
So there is a Version, which is fine, and a Statement. Effect is Allow, and the action is s3, s3 for all the resources. The action is complete full access that we have given to a particular resource called my bucket and its child objects. Got it. And the condition is StringNotEquals on the s3 prefix: none, home slash, and home slash with the AWS username. This statement might be good by design. I think the condition is wrong: StringNotEquals on the s3 prefix with none. It should not be none, actually. Do I need to write here? The policy is written in JSON, and there is a mistake that prevents it from executing, isn't it? Can I spot and explain the error. First of all, the version element should be "Version" with a capital V. Counting the lines, 1, 2, 3, 4, 5, 6, 7: on the 8th line there is a bracket which should not be there. It is not part of the JSON structure. The closing bracket should come after the "Condition" line. The JSON itself is correct, but the way it is formatted is wrong, actually. The indentation is wrong, I think.
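
The policy under discussion is not reproduced in the transcript, so the sketch below does not reconstruct it. It only illustrates, under assumptions, how the two kinds of mistakes mentioned (a stray bracket and a misnamed element such as "Conditions") could be caught mechanically. The function `policy_problems`, the sample policy, and the choice to require exact element names are all hypothetical.

```python
import json
from typing import List

# Element names this sketch accepts; names are matched exactly (an assumption).
TOP_LEVEL_KEYS = {"Version", "Id", "Statement"}
STATEMENT_KEYS = {
    "Sid", "Effect", "Principal", "NotPrincipal",
    "Action", "NotAction", "Resource", "NotResource", "Condition",
}


def policy_problems(policy_text: str) -> List[str]:
    """Return a list of structural problems found in a policy document."""
    try:
        policy = json.loads(policy_text)
    except json.JSONDecodeError as exc:
        # Unbalanced brackets and similar syntax errors end up here.
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(policy, dict):
        return ["policy must be a JSON object"]

    problems = [f"unknown top-level element {key!r}" for key in policy if key not in TOP_LEVEL_KEYS]
    statements = policy.get("Statement")
    if statements is None:
        return problems + ["missing 'Statement'"]
    if isinstance(statements, dict):
        statements = [statements]
    for index, statement in enumerate(statements):
        for key in statement:
            if key not in STATEMENT_KEYS:
                problems.append(f"statement {index}: unknown element {key!r}")
        if statement.get("Effect") not in ("Allow", "Deny"):
            problems.append(f"statement {index}: Effect must be 'Allow' or 'Deny'")
    return problems


if __name__ == "__main__":
    # Hypothetical policy with a misnamed "Conditions" element.
    bad = json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Conditions": {},
        }],
    })
    print(policy_problems(bad))
```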

Given this AWS Lambda function written in Python, can you identify what exceptions it might throw during execution?
Okay. Define lambda_handler with event and context. s3_client equals the S3 client. response equals get_object with Bucket equals my_bucket and Key equals event. Return the response body. I'm sorry, what is it saying? Given this Lambda function written in Python, can I identify what it might throw during execution? What exception might it throw? response equals s3_client.get_object(Bucket='my_bucket', Key=event['key']). It goes to event. There should be a value in the key, actually. Key equals event['key']. It's a key-value pair. Return response['Body'].read(). What we are trying to do here is get a bucket listing. There is an S3 bucket called my_bucket, and we are trying to list the objects stored in that bucket. So there is a key-value pair in the bucket, and we are trying to get the value from it. I think the issue is that we need to provide a value here, which is not mentioned. We are providing only a key. That's the issue. We can correct it. Okay.
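
For reference, a defensive variant of the handler read out above might look like the sketch below. It is an illustration under assumptions: the S3 client is injected so the example runs offline with a fake, and `read_object` and `FakeS3` are hypothetical names. It guards against two failure modes: `event['key']` raising `KeyError` when the event has no `key`, and the raw bytes from `response['Body'].read()` not being JSON-serializable as a return value. With a real boto3 client, `get_object` can also raise `botocore.exceptions.ClientError` (for example when the object does not exist or access is denied), which this sketch does not handle.

```python
import io
from typing import Any, Dict


def read_object(event: Dict[str, Any], s3_client: Any, bucket: str = "my_bucket") -> Dict[str, Any]:
    """Fetch one object named by event['key'] and return its text content."""
    key = event.get("key")
    if not key:
        # Indexing event["key"] directly would raise KeyError in this case.
        return {"statusCode": 400, "body": "event is missing 'key'"}
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # Body is a byte stream; decode it so the result can be serialized.
    return {"statusCode": 200, "body": response["Body"].read().decode("utf-8")}


class FakeS3:
    """Offline stand-in for an S3 client (illustration only)."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = objects

    def get_object(self, Bucket: str, Key: str) -> Dict[str, Any]:
        return {"Body": io.BytesIO(self._objects[Key])}


if __name__ == "__main__":
    client = FakeS3({"report.txt": b"hello"})
    print(read_object({"key": "report.txt"}, client))
    print(read_object({}, client))
```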

When scaling an application using EC2 Auto Scaling and Spot Instances, how would you optimize cost?
EC2 Auto Scaling is a mechanism used to scale an application based on hardware usage. For example, say I have an Auto Scaling group with a minimum of 3 servers running, and it gets many requests. As requests increase, the servers use more hardware, so I scale the application based on CPU and memory. I increase the number of servers: for example, if average CPU goes above 60% or 70%, I add one more server. That is the scaling mechanism. Now, if you add Spot to it: Spot is a type of reserved instance. No, it's not a reserved instance. It's spare hardware at the AWS level, and they offer it at one-third of its cost. They'll give you Spot Instances at 60 or 70 percent of the cost. You can configure the Auto Scaling group to run On-Demand Instances for the minimum you want in a production environment. For example, my Auto Scaling group requires 3 servers, and at peak hours it goes up to 4 or 5. So the minimum of 3 servers will be On-Demand, and anything added by scaling will use Spot Instances. With this, you can reduce your cost. But your application has to be able to recover from failure, because there is no guarantee that a Spot Instance will stay for 30 minutes, 1 hour, or 2 hours.
So your application has to be designed so that sessions survive the failure of the hardware: it should not use sticky sessions or hold any session at the instance level. If your application has that loosely coupled architecture, you can use Spot Instances in production as well. That is one use case, for production. For development or lower environments, we can simply go ahead and use Spot Instances for scaling servers, for the EKS or Kubernetes infrastructure, everything, and get a massive cost saving.
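
The "On-Demand baseline, Spot for anything above it" arrangement described above can be sketched as an Auto Scaling mixed instances policy. This is a minimal sketch, assuming the dictionary would be passed as `MixedInstancesPolicy` when creating or updating an Auto Scaling group through boto3; the launch template name and instance types are hypothetical, and `capacity_split` is an illustrative helper, not an AWS API.

```python
from typing import Any, Dict, List, Tuple


def mixed_instances_policy(template_name: str, instance_types: List[str],
                           on_demand_base: int) -> Dict[str, Any]:
    """Keep a fixed On-Demand baseline and serve all extra capacity from Spot."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,
                "Version": "$Latest",
            },
            # Several instance types give Spot more capacity pools to draw from.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": 0,  # everything above base is Spot
            "SpotAllocationStrategy": "capacity-optimized",
        },
    }


def capacity_split(desired: int, on_demand_base: int) -> Tuple[int, int]:
    """Return (on_demand, spot) instance counts for a desired capacity."""
    on_demand = min(desired, on_demand_base)
    return on_demand, desired - on_demand


if __name__ == "__main__":
    policy = mixed_instances_policy("web-template", ["m5.large", "m5a.large"], 3)
    print(policy["InstancesDistribution"])
    print(capacity_split(5, 3))  # (3, 2): three On-Demand, two Spot at peak
```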

How would you deploy an application on AWS considering both CI/CD pipelines and AWS security best practices?
We have many services for containers, or for anything you want to have as a CI/CD setup. You can have CodeDeploy for deploying code to servers or to a microservices environment, and you have ECR for storing Docker images. We can create a Docker image and store it in ECR. The public cloud has the tools required for configuring CI/CD deployments. For security purposes, you can have services like Inspector, where you can scan the server or the image in ECR to find out if there are any vulnerabilities. Before deploying code or creating an image, you can have a scanning mechanism to find vulnerabilities. If you find anything, you can stop the deployment, or fix the vulnerabilities and retrigger the deployment. There are many services that can be integrated to achieve this kind of setup, scanning for security findings before deploying and rectifying them. You can have CloudWatch for monitoring, to find any irregularities or security vulnerabilities. Inspector is a service on the security side, used for scanning. You can also have the Detective service, which can detect security vulnerabilities, code vulnerabilities, or platform vulnerabilities. By combining all these services, you can have a CI/CD setup with security.
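
The "scan first, stop the deployment if anything is found" step could be sketched as a small gate in the pipeline. This is an illustration under assumptions: the input is shaped like the response of the ECR `describe_image_scan_findings` call, the choice of CRITICAL and HIGH as blocking severities is a hypothetical policy, and `deployment_allowed` is not an AWS API.

```python
from typing import Any, Dict, Tuple

# Assumed policy: these severities block a deployment.
BLOCKING_SEVERITIES: Tuple[str, ...] = ("CRITICAL", "HIGH")


def deployment_allowed(scan_response: Dict[str, Any]) -> bool:
    """Return False when the image scan reports any blocking finding."""
    counts = scan_response.get("imageScanFindings", {}).get("findingSeverityCounts", {})
    return not any(counts.get(level, 0) > 0 for level in BLOCKING_SEVERITIES)


if __name__ == "__main__":
    # Made-up scan summaries for illustration.
    clean = {"imageScanFindings": {"findingSeverityCounts": {"LOW": 4}}}
    risky = {"imageScanFindings": {"findingSeverityCounts": {"HIGH": 1, "LOW": 2}}}
    print(deployment_allowed(clean))  # True
    print(deployment_allowed(risky))  # False
```

A pipeline stage would call this after the scan completes and fail the build when it returns False, so the image is fixed and the deployment retriggered.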

In which scenario would you choose AWS Fargate versus an AWS EC2 instance for running a container?
Fargate is a managed service from AWS where you just need to put your code in, and it will create the container and everything and provide a service for you. An EC2 instance is a server that you create and deploy code to. That means it's a self-managed infrastructure where you have to manage the resources. You have to manage monitoring, the deployment, the code, security scanning, patches, and everything on the servers. It's an infrastructure that you have to manage. Fargate, on the other hand, is a managed service from AWS. You just put your code there, and it will create the servers, manage the scaling, and scan. Since it is a managed service, it does all the infrastructure-related work for you. So it's very easy for any small application to be implemented in a dockerized environment, and it takes care of the infrastructure for you. This is mostly used with ECS and Kubernetes. We have services like EKS, the Elastic Kubernetes Service. It's the open-source Kubernetes, managed by AWS: the masters are managed by AWS. ECS is another AWS service where you can run containers. It's not an open-source service; it's a service AWS built as a managed service for container or microservices environments. Similarly, with this managed service from AWS, you don't need to create any containers or anything. Just run your code, and it will provision the infrastructure for you, and you can use that service. That's it. Thank you.