Vetted Talent

SURAJ KUMAR DAS

To leverage my extensive experience in DevOps and Cloud Infrastructure, aiming to contribute to an organization's innovation and efficiency through expert orchestration of AWS, Kubernetes, and CI/CD pipelines, while growing into more strategic roles and d
  • Role

    DevOps Consultant / Architect

  • Years of Experience

    13.5 years

Skillsets

  • Splunk
  • cloud transformation
  • NoSQL
  • DevOps
  • CI/CD - 9 Years
  • AWS Cloud - 8 Years
  • Security - 5 Years
  • IAC - 6 Years
  • AWS Services - 8 Years
  • Apica
  • CloudFormation
  • AWS - 8 Years
  • Grafana
  • Docker - 6 Years
  • EKS
  • Terraform - 5 Years
  • Ansible
  • Kubernetes - 3 Years
  • Git
  • Jenkins
  • SQL
  • Python - 8 Years

Vetted For

  • AWS Solutions Architect (Remote), AI Screening: 66%
  • Skills assessed: .NET, CI/CD, AWS Services, IAC, Networking, Docker, Kubernetes, Security
  • Score: 59/90

Professional Summary

13.5 Years
  • Nov, 2024 - Present (1 yr 7 months)

    Senior Technical Lead

    Coforge
  • Dec, 2022 - Aug, 2024 (1 yr 8 months)

    Senior Associate

    JPMorganChase
  • Oct, 2020 - Dec, 2022 (2 yr 2 months)

    Assistant Consultant

    Tata Consultancy Services
  • May, 2017 - Feb, 2018 (9 months)

    Development Support Professional

    Kofax/Hyland
  • Apr, 2018 - Aug, 2019 (1 yr 4 months)

    Senior Software Development Engineer

    Euclid Innovations
  • Aug, 2019 - Jul, 2020 (11 months)

    Senior Implementation Engineer

    Thoughtworks
  • Jan, 2015 - May, 2017 (2 yr 4 months)

    Associate Consultant

    Virtusa (Polaris Consulting & Services Limited)
  • Jan, 2012 - Dec, 2014 (2 yr 11 months)

    Software Developer

    Tech Mahindra

Applications & Tools Known

  • AWS
  • Jenkins
  • Ansible
  • Python
  • Bash
  • Git
  • Docker
  • ECS
  • ECR
  • Kubernetes
  • EKS
  • SQL
  • NoSQL
  • Linux
  • Splunk
  • Grafana
  • Terraform
  • CloudFormation
  • DataDog

Work History

13.5 Years

Senior Technical Lead

Coforge
Nov, 2024 - Present (1 yr 7 months)

Senior Associate

JPMorganChase
Dec, 2022 - Aug, 2024 (1 yr 8 months)
  • Managed EC2 Instance Refresh for Unbound (Onyx): Directed application refresh activities, including the replacement of EC2 instances, ensuring seamless operation and system stability for client Unbound (Onyx).
  • Automated SSL Renewals & Workflows (Python): Developed Python scripts to automate SSL certificate renewals and streamline team workflows through proof-of-concept automation, enhancing security, compliance, and efficiency.
  • Resolved Customer Data Flow Issues: Addressed critical customer issues related to data flow and errors, providing high-level escalation support to maintain operational continuity.
  • Streamlined Operations & Client Solutions: Developed and maintained Ansible playbooks that streamline complex operational workflows and improve deployment efficiency and system reliability; tailored and maintained Ansible jobs to meet specific client requirements.
  • Provided On-Call Support: Delivered on-call support, ensuring prompt and effective resolution of issues outside regular business hours and maintaining high levels of service availability.
  • AWS EC2 Instance Repaving: Directed the comprehensive repaving of AWS EC2 instances, supporting the stability and performance of the Onyx blockchain's mission-critical systems.
  • Enhanced Security with Encryption: Instituted robust encryption protocols, safeguarding sensitive customer information and reinforcing the security framework against potential threats.
  • Enhanced Monitoring & Performance (Grafana & APICA): Developed and maintained Grafana dashboards for real-time system health insights and APICA checks for application dependability and consistent performance, enabling immediate response to potential issues.
  • Developed RCA Documents: Created detailed RCA documents that provided actionable insights, driving process enhancements and preventing future issues.
  • Debugged APIs for Reliability: Diagnosed and resolved API issues, ensuring smooth and reliable API functionality that underpinned critical business applications.

Assistant Consultant

Tata Consultancy Services
Oct, 2020 - Dec, 2022 (2 yr 2 months)
  • Designed Multi-environment Infrastructures: Designed, maintained, and scaled infrastructures across production, QA, and other environments, ensuring robust performance and scalability.
  • Streamlined Deployments & Scalability (Kubernetes & CI/CD): Implemented Kubernetes and CI/CD tools to automate deployments, accelerate application delivery, and enhance microservice scalability and system reliability through optimized resource allocation and performance.
  • Automated AWS Operations for Ericsson: Developed automation scripts using Jenkins to streamline the management of EC2 instances, S3 operations, load balancers, and database installations, significantly improving operational efficiency for Ericsson.
  • Led Automated Builds and Deployments: Spearheaded initiatives to automate builds, deployments, and validations for client servers, enhancing deployment speed and reliability.
  • Enhanced Security and Automation for Ericsson: Improved company codes and automated security systems to mitigate risks for Ericsson, ensuring robust protection of critical systems and data.
  • Optimized Cloud Infrastructure with AWS and Jenkins: Played a key role in optimizing and monitoring AWS and Jenkins cloud infrastructure, enhancing operational efficiency and ensuring high availability of services.
  • Facilitated Knowledge Transfer: Led knowledge transfer sessions for new team members, ensuring smooth onboarding.
  • Managed Escalation for Job Failures: Acted as the primary point of contact for managing escalations related to job failures, swiftly resolving issues to maintain uninterrupted service delivery.
  • Contributed to Agile Development: Played an integral role in an agile development environment, collaborating closely with cross-functional teams to deliver high-quality solutions and meet sprint goals.

Senior Implementation Engineer

Thoughtworks
Aug, 2019 - Jul, 2020 (11 months)
  • Led COVID-19 Hospital System POC for Odisha Govt: Spearheaded Proof of Concept (POC) initiatives for a COVID-19 hospital system in collaboration with the Odisha Government, using technology to support public health initiatives.
  • Orchestrated AWS, Jenkins, and Python CI/CD Pipelines: Led the development team in orchestrating automated CI/CD pipelines using AWS services, Jenkins, and Python, ensuring streamlined software delivery and enhanced efficiency.
  • AWS Infrastructure: Managed and optimized AWS infrastructure, including EC2, S3, and RDS, by implementing efficient backups, patches, and scaling strategies, resulting in monthly cost savings of $3,000.
  • Architected Bahmni Implementations: Designed scalable architectures for Bahmni implementations in hospital management.
  • Containerized Monolithic App with Docker: Transformed a monolithic application into a microservices architecture using Docker, significantly improving scalability and operational speed for Bahmni.
  • Automated Build/Deployment with Jenkins: Reduced errors and accelerated workflows, enhancing development efficiency and deployment reliability.
  • Custom Solutions for Bahmni: Developed tailored solutions to meet specific customer requirements for the Bahmni product.
  • Managed AWS Deployments for Bahmni: Supported the Bahmni product by handling customer onboarding and AWS deployments, optimizing infrastructure for efficient operations.
  • Optimized AWS Billing for PSI Zimbabwe: Implemented cost-saving measures and optimized AWS billing for PSI Zimbabwe.

Senior Software Development Engineer

Euclid Innovations
Apr, 2018 - Aug, 2019 (1 yr 4 months)

Development Support Professional

Kofax/Hyland
May, 2017 - Feb, 2018 (9 months)
    Designed AWS architecture for migrations; guided client setup; developed Python solutions; optimized costs by removing unnecessary servers and databases.

Associate Consultant

Virtusa (Polaris Consulting & Services Limited)
Jan, 2015 - May, 2017 (2 yr 4 months)
    P+ rating for outstanding service; Spot Excellency Award for resolving critical issues in the CITI Bank SMART II project.

Software Developer

Tech Mahindra
Jan, 2012 - Dec, 2014 (2 yr 11 months)
    Boosted Oracle ELP-999 training performance by 90%; named Star Performer and recognized for impactful contributions to training.

Achievements

  • Significantly improved the stability and reliability of a vendor's AWS environment by conducting a thorough audit and optimizing the use of EC2, S3, Route53, DynamoDB, and RDS services. Analyzed usage patterns, implemented targeted improvements, and established robust monitoring systems. As a result, minimized disruptions, enhanced performance and cost-efficiency, and ensured that project timelines were met.
  • Successfully implemented an automated certificate renewal process using a Python script and CKMS/vendor APIs, replacing manual renewals. The automation ensured timely updates, reduced manual effort, and maintained security compliance, significantly improving system reliability and uptime. Integrating the solution into the CI/CD pipelines streamlined operations and eliminated the risk of service disruptions due to expired certificates.
  • Led migration of critical workloads to AWS, achieving significant infrastructure cost reductions.

Major Projects

3 Projects

Migration of Critical Workloads to AWS

    Led migration of critical workloads to AWS, achieving significant infrastructure cost reductions.

AWS Environment Optimization

Jan, 2023 - Dec, 2023 (11 months)
    Significantly improved the stability and reliability of a vendor's AWS environment by conducting a thorough audit and optimizing the use of EC2, S3, Route53, DynamoDB, and RDS services. Analyzed usage patterns, implemented targeted improvements, and established robust monitoring systems. As a result, minimized disruptions, enhanced performance and cost-efficiency, and ensured that project timelines were met.

Automated Certificate Renewal Process

Jan, 2022 - Dec, 2022 (11 months)
    Successfully implemented an automated certificate renewal process using a Python script and CKMS/vendor APIs, replacing manual renewals. The automation ensured timely updates, reduced manual effort, and maintained security compliance, significantly improving system reliability and uptime. Integrating the solution into the CI/CD pipelines streamlined operations and eliminated the risk of service disruptions due to expired certificates.

Education

  • B. Tech - Computer Science and Engineering

    JITM, Biju Patnaik University of Technology, Odisha (2011)

Certifications

  • CKA: Certified Kubernetes Administrator | The Linux Foundation | Jun 2024

  • AWS Certified Developer - Associate | Amazon Web Services (AWS) | Apr 2024

  • AWS Certified Solutions Architect - Associate | Amazon Web Services (AWS) | Feb 2023

Interests

  • Cricket
  • Badminton

AI-interview Questions & Answers

    Hello. I have a total of 13 years of experience in IT, and I have been working in AWS and infrastructure for the last 8 years. Initially, I was in a Linux environment and worked for Tech Mahindra and Polaris. After that, I moved to Kofax and Thoughtworks, maintaining the infrastructure and implementing solutions for customers who use our products. Whenever new instances, VMs, or servers are required, we manage the infrastructure and provide the details to them. We also manage anything related to applications, CSUs, or databases in RDS, DynamoDB, or any other AWS services. We are the first point of contact and work on resolving those issues. Additionally, when decommission requests come in, such as old servers or projects being decommissioned, we clean up the systems. Everything happens in an agile development process, and we work on a sprint-by-sprint basis. Besides that, we write new Terraform scripts and CloudFormation templates to automate the infrastructure as required by the clients. In one of my previous organizations, I wrote Terraform scripts for the Onyx client, a major client for Bitcoin mining and related activities. We deploy instances through Jenkins, which is also known as JUULES.

    In AWS, when data is in transit, we can use SSL/TLS with X.509 certificates to encrypt it. For keys there are two options, customer-managed or AWS-managed. With customer-managed keys, the customer has full control over encryption and decryption and only uses the AWS services for data storage. With KMS keys managed by AWS, AWS takes care of rotating the keys automatically every once in a while, and we can configure that. At rest, for example in S3, we can enable KMS when storing data. S3 encryption is enabled by default, but we can switch it to KMS, which is managed by AWS. Another option is SSE-C, where the customer manages the encryption keys and AWS has nothing to do with managing them.

    VPC peering is nothing but communication between two VPCs. Suppose we have VPC A and VPC B; we can establish a connection between these two VPCs by setting up VPC peering. Security can be provided in two ways. One is using security groups, which are stateful and only allow the traffic defined for them. For additional security, we can add protection at the subnet level by using a network access control list (NACL), which is stateless. This allows us to define both allow and deny rules based on our requirements. Whatever traffic we want inbound and outbound can be defined and checked. So, in VPC peering, communication between two different VPCs can be secured by using security groups and by applying NACLs at the subnet level.
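
The stateful versus stateless distinction in this answer can be illustrated with a toy model. This is a minimal sketch in plain Python, not AWS code: the class names `SecurityGroup` and `NetworkAcl`, the rule format, and the IP address are hypothetical, and real reply traffic uses ephemeral ports, which the model ignores for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityGroup:
    """Toy stateful filter: replies to allowed inbound traffic pass automatically."""
    inbound_ports: set
    _tracked: set = field(default_factory=set, init=False)

    def allow_inbound(self, src: str, port: int) -> bool:
        if port in self.inbound_ports:
            self._tracked.add((src, port))  # remember the connection
            return True
        return False

    def allow_reply(self, dst: str, port: int) -> bool:
        # No outbound rule is consulted: the tracked connection is enough.
        return (dst, port) in self._tracked


@dataclass
class NetworkAcl:
    """Toy stateless filter: each direction is evaluated on its own rules."""
    inbound_allow: set
    outbound_allow: set

    def allow_inbound(self, src: str, port: int) -> bool:
        return port in self.inbound_allow

    def allow_reply(self, dst: str, port: int) -> bool:
        # A reply must be explicitly allowed outbound, even for accepted requests.
        return port in self.outbound_allow


sg = SecurityGroup(inbound_ports={443})
nacl = NetworkAcl(inbound_allow={443}, outbound_allow=set())
sg_result = (sg.allow_inbound("10.1.0.5", 443), sg.allow_reply("10.1.0.5", 443))
nacl_result = (nacl.allow_inbound("10.1.0.5", 443), nacl.allow_reply("10.1.0.5", 443))
```

In this model the security group lets the reply through because it tracked the request, while the NACL blocks it until an outbound rule is added.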

    In CodePipeline, we can use CodeDeploy. With CodeDeploy we can have two instances to deploy to, and there are several ways to avoid downtime, for example the blue-green deployment type, rolling in batches, or a rolling update. In those cases, we won't have any deployment downtime. With blue-green deployment, we have another instance to which part of the traffic can be redirected. Once everything is sorted out, the full traffic can be redirected to the newly created instance, and the older one can be terminated. The other option is rolling updates. In a rolling update, another set of instances is created, and once those are updated with the latest code, we can terminate the existing ones and move all the traffic over. If we implement a zero-downtime deployment this way, there will be no downtime.

    What CloudTrail does is store information about users accessing the APIs. It is best suited for figuring out which user has accessed an API and for trailing that information, for example whether anyone is trying to get access who is not supposed to. That can be done using CloudTrail. As for AWS Config, I don't recall what it does right now.

    Version control can be set up in either Git or Bitbucket, whichever we want to use. As for reusable modules: suppose one resource is being created, like an EC2 instance, by using CloudFormation or Terraform. We don't need to rewrite that resource every time we want an EC2 instance. Instead, we can have one small module for it, which gets the right set of permissions, and that can be reused. That's what a reusable module is. The same thing applies in CloudFormation.

    But in any IaC, the database, especially a stateful one, should not be part of the IaC. It should be handled externally, and the connection endpoint should be provided to the IaC, regardless of whether it is CloudFormation or Terraform. The username and password should not be hardcoded in the template itself. There is a concept called parameters, where the username and password can be provided as parameters when supplying the template. The password can also be stored in Secrets Manager and accessed through Secrets Manager or SSM Parameter Store. Wherever we store it, we can make sure that it is KMS encrypted. So the username and password should not be hardcoded in this case.
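
The idea of passing in the endpoint and fetching credentials from a secret store, rather than hardcoding them, could look roughly like the sketch below. The environment variable names `DB_ENDPOINT` and `DB_SECRET_ID`, the JSON shape of the secret, and the `fetch_secret` callable are all assumptions made for illustration; in practice `fetch_secret` would wrap a call to a secret store such as AWS Secrets Manager.

```python
import json
import os
from typing import Callable, Mapping


def load_db_config(
    fetch_secret: Callable[[str], str],
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Build database settings without any credential appearing in code or templates."""
    endpoint = env["DB_ENDPOINT"]    # externally managed database endpoint (assumed name)
    secret_id = env["DB_SECRET_ID"]  # identifier of the stored secret (assumed name)
    creds = json.loads(fetch_secret(secret_id))  # secret assumed to be a JSON document
    return {
        "host": endpoint,
        "user": creds["username"],
        "password": creds["password"],
    }


def fake_secret_store(secret_id: str) -> str:
    """Stand-in for a real secret store, used only to exercise the function."""
    return json.dumps({"username": "app_user", "password": "from-secret-store"})


config = load_db_config(
    fake_secret_store,
    {"DB_ENDPOINT": "db.example.internal", "DB_SECRET_ID": "app/db"},
)
```

Keeping the lookup behind a callable means the same function works with a real secret store in production and a stub in tests.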

    The effect is Allow and the actions are S3 actions. The resource is my bucket with a wildcard, which means everything inside my bucket should be accessible. Then in the condition section, the check is whether the string is not home/aws:username. aws:username resolves to the specific user who is trying to access the bucket, so that he can only access what he is storing; he cannot see what others store. So the effect is Allow, and the condition is string-not-equal, so it should not be home but instead my bucket and then the AWS username. Or, in the resource section, we can have home/.
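
The policy under discussion is not reproduced in the transcript, so the following is only a sketch of the per-user pattern the answer ends on, where the resource is scoped to a home/ prefix per user. The bucket name `my-bucket`, the two actions, and the helper `resources_for` are assumptions for illustration; `${aws:username}` is the IAM policy variable for the calling user's name.

```python
import json

# Hypothetical policy: each IAM user may read and write only under their own prefix.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::my-bucket/home/${aws:username}/*",
        }
    ],
}


def resources_for(username: str) -> list:
    """Show what each Resource becomes once the policy variable is substituted."""
    return [
        statement["Resource"].replace("${aws:username}", username)
        for statement in POLICY["Statement"]
    ]


policy_json = json.dumps(POLICY, indent=2)
alice_scope = resources_for("alice")
```

Substituting a user name makes the scoping visible: `alice` is limited to objects under `home/alice/` and has no access to another user's prefix.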

    Well, for monolithic applications, the approach would be as follows. First, we need the connectivity, whether we want to use SSO login or SAML 2.0, and we need to figure that out first. Once we have that, the monolithic application first of all requires some measures. We should have microservices instead of putting everything in a single instance, with numerous instances behind the load balancer and multiple AZs coming into the picture. We can use Kubernetes and Docker containerization for that, which will definitely require major development changes. If we want to use the monolithic application directly, and the application is serving static content, we can redirect the user to CloudFront and from there send the request directly to S3. If not, the request goes to the API gateway, then the load balancer, and underneath that the EC2 instances, where each EC2 instance has an individual, standalone component. So, in order to have a microservices approach, we need to make the development changes, and all individual components can be deployed into pods, meaning pods in EKS.

    Well, when I'm unsure about how many EC2 instances will be created in ECS: the EC2 instances get created based on the task definition. Suppose we don't want to manage the underlying resources and we don't know the load or how many containers will be created. In that case, it's better to hand it to AWS. AWS will take care of all the underlying resources, and we don't need to worry about the number of EC2 instances or how many resources we want for our purpose. AWS will take care of the auto-scaling and the underlying resources. If more is required, it will scale up, and based on the requests it will scale down. Everything happens in the backend. We won't be managing the resources; AWS will take care of it.

    For Lambda, we can send the Lambda logs to CloudWatch, and CloudWatch is the best place to see log information, unless there's another way, like using Kinesis Data Streams to send data to third-party observability tools like Splunk and Datadog. That's one way to figure it out. We also need to see whether the Lambda function is invoked synchronously or asynchronously. If synchronous, we know right away if we're not getting a 200 or success message. In asynchronous invocation, for example when something is being sent to an S3 bucket, we don't wait for the result. We do get a response right away, but underneath it is still working, putting data into S3 or sending it to an SQS queue. Whenever an issue happens, it retries with exponential backoff and then sends the data to the dead letter queue. If things are still not working and everything has failed, we need to look at CloudWatch and the CloudWatch logs. We also need to test the Lambda, for example with another Lambda function, to check whether everything is working fine. Also, Lambda is usually used for short intervals of time. We don't write a big, complex program in it. So mostly tasks that can be handled within 15 minutes should be handled using Lambda, and testing can be done in the Lambda application itself. So yeah.
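
The retry-then-dead-letter behaviour mentioned in this answer follows a general pattern that can be sketched in a few lines. This is not Lambda's own retry mechanism, which the service manages itself; the function names, default values, and the list used as a stand-in dead letter queue are all hypothetical.

```python
import time
from typing import Callable, List, Optional


def backoff_delays(base: float, factor: float, attempts: int, cap: float) -> List[float]:
    """Delay for each attempt: base, base*factor, base*factor^2, ... capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def call_with_retries(
    func: Callable[[], object],
    attempts: int = 3,
    base: float = 0.5,
    factor: float = 2.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    dead_letter: Optional[list] = None,
):
    """Retry `func` with exponential backoff; record the last error if all attempts fail."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(base, factor, attempts, cap)
    last_error = None
    for attempt, delay in enumerate(delays, start=1):
        try:
            return func()
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            if attempt < len(delays):
                sleep(delay)  # wait longer before each new attempt
    if dead_letter is not None:
        dead_letter.append(repr(last_error))  # stand-in for a dead letter queue
    raise last_error
```

Passing `sleep` as a parameter keeps the function testable without real waiting, and the `dead_letter` list marks the point where a failed event would be handed off for later inspection.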