With over 17 years of experience, I have proven expertise in building enterprise applications using Java. My skills extend to cloud-enabled enterprise architectures, build automation, and infrastructure deployment. I am well-versed in microservices enablement, Spring, the Java Stream API, multithreading, messaging
APIs, enterprise caching solutions, full-text search APIs, and testing. Furthermore, I possess in-depth knowledge of both relational and non-relational databases and have a keen eye for performance optimization at both the application and database levels.
Over the past 5+ years, I have successfully managed and led strategic initiatives at both the department and organizational levels. I am well-versed in agile methodologies and have hands-on experience managing Scrum: sprint planning, sprint management, sprint retrospectives, and sprint reviews.
Staff Engineer, VMware
Assistant Vice President, Credit Suisse
Technical Lead, LocationGuru Systems
System Engineer, IBM
Module Lead, Persistent Systems Ltd
Java
Python
Spring Boot
AWS
Linux
Docker
Kubernetes
Terraform
Ansible
Jenkins
Git
Okay. Hi, I'm Anand Buble. I'm an engineer by profession, and I started my career back in November 2006. From then onwards, I've been a Java developer. I essentially started with IBM, then went to multiple organizations like Persistent, LocationGuru, and Credit Suisse. And for the last three and a half years, in fact more than three and a half years, I've been working with VMware as a product developer. My basic day-to-day work involves mostly coding, along with guiding my junior team members and mentoring them on different technical topics. That's the basic introduction about me. About the product I'm working on: it's Carbon Black Workload Protection, which is essentially an on-prem product used by our customers to do multiple things, like installing the sensor agents on their on-prem workloads and then having those sensor agents communicate with our Carbon Black Cloud. That's the major part of what I'm doing in my current role.
Okay. Well, when we say without disrupting the current operation, I would go with a cloud-based approach where the deployment is done in a rolling way. What I would do is slowly start decommissioning my legacy systems and, part by part, move the APIs onto the latest versions, or convert the system into multiple microservices. With that, I get the advantage of migrating one service at a given point in time without disrupting all the other services, and I can also maintain the balance between my old customers and new customers. So that's how, essentially, I would go about migrating a legacy system into a more advanced system, as the routing sketch below illustrates.
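As a minimal sketch of that "migrate one service at a time" idea, here is a strangler-style routing facade in Java. The names (OrderHandler, MigrationRouter, the operation strings) are hypothetical, not from the transcript; the point is only that already-migrated operations go to the new microservice while everything else still hits the legacy system.

```java
import java.util.Set;

// Common contract that both the legacy system and the new microservice satisfy.
interface OrderHandler {
    String handle(String operation, String payload);
}

// Routes each operation either to the legacy system or to the new microservice,
// so individual services can be migrated without disrupting the rest.
class MigrationRouter implements OrderHandler {
    private final OrderHandler legacySystem;
    private final OrderHandler newMicroservice;
    private final Set<String> migratedOperations;

    MigrationRouter(OrderHandler legacy, OrderHandler modern, Set<String> migrated) {
        this.legacySystem = legacy;
        this.newMicroservice = modern;
        this.migratedOperations = migrated;
    }

    @Override
    public String handle(String operation, String payload) {
        if (migratedOperations.contains(operation)) {
            return newMicroservice.handle(operation, payload);   // already migrated
        }
        return legacySystem.handle(operation, payload);          // still on the legacy path
    }
}
```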
Okay. The major problem with idempotency in a payment system is that if a customer double-clicks the submit button, the request is submitted again, or because of network fluctuations the request reaches the payment system more than once. In that case, whenever the request goes through more than one time, the customer's account would be debited multiple times. It's not intentional, but it happens because of the system's lack of idempotency. So what I would do is, whenever I'm creating a request to the payment system, I mean whenever I'm going to the payment gateway, I'll make sure I'm generating a unique random number and associating that unique number with my payment request. Then every time the payment request goes to the payment system, the payment system will check that number and validate whether that particular ID is already in process, that is, in flight. If the process is in flight, the payment system will reject the next request until the process associated with that session ID or number reaches a final state. And that's how I would deal with the idempotency part of the payment processing system; a rough sketch of that check follows.
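A hedged Java sketch of the idempotency-key check described above. The in-memory ConcurrentHashMap stands in for whatever shared store (database, Redis, etc.) a real payment system would use, and the class and method names are illustrative.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PaymentService {
    enum Status { IN_FLIGHT, COMPLETED }

    // Maps the client-generated idempotency key to the request status.
    private final ConcurrentMap<String, Status> requests = new ConcurrentHashMap<>();

    /** The client generates this once and reuses it on every retry of the same payment. */
    static String newIdempotencyKey() {
        return UUID.randomUUID().toString();
    }

    boolean processPayment(String idempotencyKey, long amountInPaise) {
        // putIfAbsent is atomic: only the first request with this key wins.
        Status previous = requests.putIfAbsent(idempotencyKey, Status.IN_FLIGHT);
        if (previous != null) {
            return false;   // duplicate (double click or network retry): reject it
        }
        try {
            debitAccount(amountInPaise);                      // actual debit happens exactly once
            requests.put(idempotencyKey, Status.COMPLETED);
            return true;
        } catch (RuntimeException e) {
            requests.remove(idempotencyKey);                  // allow a clean retry after failure
            throw e;
        }
    }

    private void debitAccount(long amountInPaise) { /* gateway call elided */ }
}
```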
Okay. To answer this question, I'll just give it a try. UPI is a very new way of payment in the payment ecosystem, right? Now here is the thing: in the earlier payment gateways, we used to have only two or three ways, internet banking, credit card, or debit card. And now, with the newer options, we have wallets and UPI payments, as well as some QR-code-based payments. So in order to maintain the balance, I will first of all make sure that the payment gateway I'm designing is an extensible product by design. The next part is that it should be easy to hook up the multiple new ways of payment that keep coming. For example, if tomorrow I want to hook up a digital wallet in my application, I should be able to hook it up quickly, with minimal changes to the existing design. That is how I would maintain the balance between the technical and the business objectives of new payment methods; the sketch below shows the kind of extensibility I mean.
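One way to read "extensible by design" is a strategy-style interface per payment mode, so adding UPI or a wallet later means adding a class rather than changing the existing flow. This is a sketch under that assumption; the names (PaymentMethod, PaymentGateway, the mode strings) are illustrative.

```java
import java.util.Map;

// Each payment mode implements the same contract.
interface PaymentMethod {
    boolean pay(String accountRef, long amountInPaise);
}

class CardPayment implements PaymentMethod {
    public boolean pay(String accountRef, long amountInPaise) { /* card flow */ return true; }
}

class UpiPayment implements PaymentMethod {
    public boolean pay(String accountRef, long amountInPaise) { /* UPI flow */ return true; }
}

// The gateway only knows the interface, so new modes plug in without design changes.
class PaymentGateway {
    private final Map<String, PaymentMethod> methods;

    PaymentGateway(Map<String, PaymentMethod> methods) {
        this.methods = methods;   // e.g. Map.of("CARD", new CardPayment(), "UPI", new UpiPayment())
    }

    boolean checkout(String mode, String accountRef, long amountInPaise) {
        PaymentMethod method = methods.get(mode);
        if (method == null) {
            throw new IllegalArgumentException("Unsupported payment mode: " + mode);
        }
        return method.pay(accountRef, amountInPaise);
    }
}
```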
Okay. So there are a couple of very important ways, I would say, to handle transactions across distributed services. One is two-phase commit and the other is three-phase commit. Two-phase commit is a protocol where we do a prepare stage and then a commit stage, and to do that we need some sort of coordination service in between. For instance, say I'm trying to buy a product on a shopping site, so I have to do two different things: one, make sure the inventory is decremented by whatever quantity I'm buying, and two, the payment part. During this transaction, both things should happen or both should be rolled back. So in the case of two-phase commit, I'll have an inventory service, a payment service, and a coordination service in between. The coordination service will first prepare the things, meaning it will lock the inventory at a given state and lock the order I'm placing. Now whenever somebody else tries to grab the same inventory, with the same or a different quantity, that person will see a locked state. In that case my transaction goes through first: it does the payment, decrements the inventory, and then finally releases the things for others to buy. That is how I would handle a distributed transaction in a high-load condition; a simplified coordinator is sketched below. An improvement on this is the three-phase commit, which is a slightly refined version of the two-phase commit, but honestly I have not worked on it, so conceptually it's just a refinement on top of the two-phase commit design. That is how I would take it.
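A very simplified two-phase commit coordinator in Java, assuming hypothetical participants such as the inventory and payment services mentioned above. A production coordinator would also need timeouts, durable logging of the decision, and recovery, which are omitted here.

```java
import java.util.List;

// Contract each participant (inventory service, payment service, ...) implements.
interface Participant {
    boolean prepare(String txId);   // lock resources and vote yes/no
    void commit(String txId);
    void rollback(String txId);
}

class TwoPhaseCoordinator {
    boolean execute(String txId, List<Participant> participants) {
        // Phase 1: ask every participant to prepare (e.g. lock the inventory row).
        for (Participant p : participants) {
            if (!p.prepare(txId)) {
                participants.forEach(x -> x.rollback(txId));   // any "no" vote aborts everything
                return false;
            }
        }
        // Phase 2: everyone voted yes, so commit on all participants and release the locks.
        for (Participant p : participants) {
            p.commit(txId);
        }
        return true;
    }
}
```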
Yeah. So high availability and fault tolerance are definitely two different major pillars of a cloud system. High availability can be taken care of by horizontal scaling, where, based on the CPU usage, the load, the RAM usage, or the threads running in my system, I can spawn multiple instances of a given service. For instance, if I'm dealing with a Black Friday sale or a World Cup final match on a platform like Hotstar, then based on the number of customers streaming that match, I should be able to spawn multiple instances of a given service and let my users have a smooth streaming experience. Now, fault tolerance is something that can be achieved by replicating across different physical locations. For instance, take the example of Netflix or YouTube: whenever a customer uploads a video, the video is written to some distributed file system, but at the same time it is replicated across different geographical locations, like CDNs. The same video gets replicated over the CDNs, and that's how the nearest CDN is able to serve the users.
Okay. Here, as discussed in the very first question, the idempotency part is missing, because it might happen that this process-payment API is called multiple times due to network glitches or the user clicking the submit button twice by mistake. In that case, we're not validating the actual processing request, and we're allowing the request to process the payment multiple times, which may result in the amount being deducted from the user's account multiple times, which is not a good user experience. So that's the very important thing missing here. And then, okay, the authorization and authentication parts are also missing, but I hope those are taken care of by aspect-oriented programming. So, yeah.
Yeah. The logger is not initialized here. That is the mistake, and it will definitely lead to a potential NullPointerException. So if processing the transaction throws an exception, which we're catching in the catch block, the logger is not initialized, and that's why it will simply terminate the program, because the NullPointerException is a runtime exception and it's not caught here. That's the mistake in this code; a corrected version is sketched below.
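Since the original snippet isn't shown here, this is only a sketch of how the logger would typically be initialized so the catch block cannot dereference null; SLF4J is assumed as the logging API and the class name is illustrative.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class TransactionProcessor {
    // Initialized once per class instead of being left null.
    private static final Logger logger = LoggerFactory.getLogger(TransactionProcessor.class);

    void process(String txId) {
        try {
            // ... process the transaction ...
        } catch (Exception e) {
            // Safe to use: the logger is guaranteed to be non-null here.
            logger.error("Failed to process transaction {}", txId, e);
        }
    }
}
```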
Yeah, okay. So I can relate to this, but I'm not sure if I'm a hundred percent correct here. The Unified Payments Interface is an example of a payment system that puts a threshold of 100,000 rupees that can be transferred in a window of 24 hours, and that is basically based on real-time analytics. It keeps track of how much money has been transferred from a given UPI ID or bank account through UPI, whether from one UPI ID or from multiple UPI IDs, but it has to be from a single account through the UPI mode. So that is a good example of dynamic adaptation of payment processing based on thresholds. Now, the way it can be implemented is that NPCI is a centralized body that keeps track of payment processing through the UPI interface, so no matter what, NPCI will monitor each and every request coming in as a UPI payment for a given account. It has the information about the UPI ID, the account number, and the transfer being made from that account, and that is how the transfers can be monitored with real-time analytics. And this is true not only for outgoing transfers but also for acceptance: there is also a limit on accepting payments onto a UPI ID through a UPI gateway. So that's the process; we have to have a centralized mechanism for the payment systems. For credit cards, it could be gateways like Visa, Mastercard, or Amex, and for UPI it could be NPCI monitoring all these kinds of things. For net banking, I believe the bank's interface should be able to monitor how much is being deducted from a given account and maintain the threshold based on that. A rough sketch of such a rolling-window check follows.
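A hedged Java sketch of a per-account rolling 24-hour limit check, in the spirit of the 100,000-rupee UPI cap mentioned above. The in-memory map stands in for whatever store a central body like NPCI would actually use, and the class and field names are invented for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class TransferLimitChecker {
    private static final long DAILY_LIMIT_PAISE = 100_000L * 100;        // 1 lakh rupees, in paise
    // account -> recent transfers as [epochMilli, amountInPaise] pairs
    private final Map<String, Deque<long[]>> history = new HashMap<>();

    synchronized boolean allow(String accountId, long amountInPaise, Instant now) {
        Deque<long[]> entries = history.computeIfAbsent(accountId, k -> new ArrayDeque<>());
        long windowStart = now.minus(Duration.ofHours(24)).toEpochMilli();

        // Drop transfers that fell out of the 24-hour window.
        while (!entries.isEmpty() && entries.peekFirst()[0] < windowStart) {
            entries.pollFirst();
        }

        long usedInWindow = entries.stream().mapToLong(e -> e[1]).sum();
        if (usedInWindow + amountInPaise > DAILY_LIMIT_PAISE) {
            return false;                                                 // over the threshold, reject
        }
        entries.addLast(new long[]{now.toEpochMilli(), amountInPaise});   // record the accepted transfer
        return true;
    }
}
```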
Okay. The best strategy for maintaining backward compatibility of a given REST API is adding a new attribute to the request but not making it mandatory. If we make it mandatory, then obviously backward compatibility will be broken, so we have to make sure we're not making it mandatory. And if at all we do make it mandatory, then we'll have to release a new version of the API and make sure the older clients keep using the older version while the newer clients use the newer version. Now there are multiple strategies we can use around this. For example, I can put the API version, like v1 or v2, in the API URL, or I can make the version a parameter in the request itself and then, based on that version parameter, check which particular parameters are really mandatory or optional. So there are multiple strategies we can play with for maintaining the backward compatibility of a payment API, or any API for that matter; a small versioned-controller sketch follows.
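A sketch of URL-based versioning using Spring Boot (which appears in the skills list above); the controller, request classes, and field names are illustrative assumptions, not an actual API from the transcript.

```java
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
class PaymentController {

    // v1 keeps the old contract untouched, so existing clients never break.
    @PostMapping("/api/v1/payments")
    String payV1(@RequestBody PaymentRequestV1 request) {
        return "accepted";
    }

    // v2 carries the new attribute; only new clients opt in to it.
    @PostMapping("/api/v2/payments")
    String payV2(@RequestBody PaymentRequestV2 request) {
        return "accepted";
    }
}

class PaymentRequestV1 {
    public String accountId;
    public long amountInPaise;
}

class PaymentRequestV2 extends PaymentRequestV1 {
    // New attribute is optional and has a default, so old payloads still bind cleanly.
    public String preferredMode = "CARD";
}
```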
Asynchronous communication is a very important pillar, I would say, in distributed systems, and it can be achieved in multiple ways. A very brute-force approach is spawning a separate thread to perform something that is not really critical for the main-line process. How you spawn a thread depends on the technology you're using: for example, in Java you can use the concurrency utilities, in C++ you have a threading framework you can use, and Go and Kotlin are again good languages that handle concurrency in a very efficient way. Another way of handling asynchronous communication is using middleware like messaging queues, so you have RabbitMQ, ActiveMQ, or, for that matter, WebSphere MQ. They support two different approaches: one is point-to-point communication, and the other is a publisher-subscriber kind of model, where the publisher publishes the data and multiple consumers who have subscribed to a given topic process it accordingly. And then finally we have something like Kafka, which is a distributed messaging queue with its own design, where the broker has partitions and can handle larger and larger volumes at a given point in time. Based on the partitions, there would be some coordination service like ZooKeeper which keeps track of where to send or store each message, and on the consumer side, the consumer coordinates with ZooKeeper or the coordination service to consume the message from a given partition. That's how asynchronous communication reduces latency. One more major advantage of asynchronous communication is not only reducing latency but also decoupling the production of the message from its consumption: one or more workers produce the things, and multiple workers consume and process them. That way nothing is in a blocking state, and obviously things will be faster, with very low latency in a distributed system. A small thread-based example is sketched below.
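A small example of the "spawn a separate worker" approach described above, using Java's own concurrency utilities. The class and method names are illustrative; a broker like RabbitMQ or Kafka would replace the executor once producer and consumer live in different services.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CheckoutService {
    private final ExecutorService worker = Executors.newFixedThreadPool(4);

    String placeOrder(String orderId) {
        // Non-critical work (e.g. sending a confirmation email) runs asynchronously,
        // so the main request returns without waiting for it.
        CompletableFuture.runAsync(() -> sendConfirmationEmail(orderId), worker)
                .exceptionally(ex -> {
                    System.err.println("Email failed for " + orderId + ": " + ex);
                    return null;
                });
        return "order " + orderId + " accepted";
    }

    private void sendConfirmationEmail(String orderId) {
        // ... call the notification system ...
    }
}
```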