Creative software professional offering over 11 years of experience. Enthusiastic about developing forward-thinking solutions to solve tomorrow's productivity problems.
Project Lead, R Systems International
Technical Lead, Sopra Banking Software
Senior Software Engineer, R Systems International
Software Developer, Instant Systems Inc
Developer, Instant Systems Inc
Consultant, Capgemini
Software Engineer, Instant Systems Inc
Developer, National Informatics Centre (Contractual)
Software Developer, National Informatics Centre (Contractual)
Software Engineer, National Informatics Centre
IntelliJ IDEA
Jenkins

Microservices

REST APIs

Spring Boot

Spring MVC

Spring Security

Spring Data JPA

Hibernate

Mockito
Docker

Kubernetes

Apache Kafka

Apache Storm

Apache Camel

Microsoft Azure

Google Cloud Platform

AWS S3

AWS CloudFront

AWS Lambda

SQL Server

Apache Solr

Apache Cassandra

Scrum
So, currently, I'm working as a project lead at R Systems International, and I'm based out of Noida. I have 11 years of experience overall. Here, my roles and responsibilities include designing and developing microservices using Java and the associated tech stack. I also help fellow team members if they get stuck anywhere in their respective tasks. As an individual contributor, I pick up tasks myself, get clarification from the business if needed, design and develop the solution, and then deploy it. For example, I work on ETLs using Apache Storm and Kafka. We use Kafka to ingest live streaming data, the ETLs process it, and once it's processed, we insert it into the databases, which include both SQL and NoSQL. In SQL, we have MySQL, and in NoSQL, we have Cassandra and Apache Solr. Once the data is inserted, we develop microservices on top of it using Java and the associated tech stack. There are two ways of consuming these microservices. One is using Apache Camel, where we have a single Camel endpoint that can be backed by multiple microservices working behind the scenes. The other way, which I'm currently working on, is a data pipeline in GCP. I'm working on a cloud-based data pipeline project as of now, and I've designed it as well. We ingest data from multiple types of sources, such as files or other inputs; the inputs are vehicles, since we work in the automobile domain. Based on the input, Python tasks in this data pipeline call some microservices that we've already developed. Once we get the data back from the microservices, we write it to BigQuery to structure the data a bit more. Then, in the next phase, we populate around 50 tables from the data we got from these microservices.
Once the SQL part is done, in the next phase, we export that data from the cloud to an on-prem SQL Server, where we run around 50 stored procedures sequentially. I've designed the entire system with the approval of an onshore architect. So, I'm working on these kinds of things right now.
So currently, I'm doing this. As I told you in a previous answer, I'm working on a cloud-based data pipeline solution in which we make use of GCP. Here, we have a workflow, which is nothing but a combination of multiple pipelines arranged in a logical order. In one of those pipelines, we write Python tasks that call Java-based microservices and get data back from them. The potential pitfalls are these. If the service is down, we don't get any data back in those Python tasks, and the data frame that we create can end up empty, which might lead to pipeline failures. If the Java-based microservice takes a long time to respond to a particular request, concurrent requests to that microservice can pile up, and the entire pipeline slows down. Another thing I've felt is that error handling in this kind of setup is not very graceful; it isn't really built in, I would say. Overall, the availability of these microservices and the time they take to return data over the network can definitely slow the pipeline down.
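The two pitfalls mentioned above, a service that is down and an empty result flowing into the next stage, can be guarded against on the caller side. The following is only a rough sketch of that idea, written in Java to match the microservice side (the pipeline tasks in the answer are Python). `ResilientFetch`, `EmptyResultException`, and the retry count are hypothetical names and values, not taken from the project, and a real caller would also set a per-request timeout on its HTTP client.

```java
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical helper; the names are illustrative and not from the project. */
final class ResilientFetch {

    /** Raised when the service answered but returned no rows. */
    static final class EmptyResultException extends RuntimeException {
        EmptyResultException(String message) {
            super(message);
        }
    }

    /**
     * Calls the service up to maxAttempts times. Any exception from the call
     * counts as a failed attempt and is retried. An empty result is reported
     * explicitly instead of being handed downstream as an empty data set.
     */
    static <T> List<T> fetch(Callable<List<T>> serviceCall, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                List<T> rows = serviceCall.call();
                if (rows == null || rows.isEmpty()) {
                    throw new EmptyResultException("service returned no rows");
                }
                return rows;
            } catch (EmptyResultException e) {
                throw e; // the service is up; retrying would not help
            } catch (Exception e) {
                lastFailure = e; // e.g. connection refused or a timeout
            }
        }
        throw new IllegalStateException(
                "service unavailable after " + maxAttempts + " attempts", lastFailure);
    }
}
```

With something like this, the pipeline stage fails with a clear message at the call site rather than later, when an empty data frame reaches a step that cannot handle it.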
I have not done this yet. I have not integrated any Python-based AI model into a Java microservice architecture. I could think of something like this: you have an API where you want to train on some data. So you write a normal Java-based microservice and expose an endpoint. Behind the scenes, we can have a Java library configured as a dependency in the project's POM file. This library gives us a handle to the AI model that we want to integrate. Once we have that handle, we can call the methods available on that model, provide our data, and get insights from it. That's a pretty basic take. There could be multiple things involved in this, but as of now, what I can think of is getting some library added to the Java project through the POM, creating an object of the handler class that gives us the capability of using the model, and then passing appropriate data to the relevant function to get the result back, whether it's an insight, a pattern, a prediction, or anything else.
Zero-downtime deployments in microservices using Spring Boot can be achieved by deploying these microservices on Kubernetes. Let's say you have one microservice, deployed on just one pod, and you're getting a lot of load. What we can do using Kubernetes is, based on the amount of traffic that's coming in, increase the number of pods for this particular microservice, and the load balancer would then distribute the load across the pods. This will ensure zero-downtime deployment of the microservice. Another way of handling this is, if there is a version change in the microservice, say
A distributed caching mechanism, okay. So, we'll have a Redis server that is distributed, so there would be multiple nodes of it. In the application.properties file, we can specify all the nodes that we want to connect to. We would define a property, and in that property, we would specify the addresses of all the nodes that have caching implemented. We would get a connection from a connection pool for a particular caching node. And whatever data we want to cache, we can simply make use of Jedis as the client for it. We can dump the data into the cache and read it back until a certain time, because in caching we do have the ability to set a time to live. We don't want everything to be cached forever, so we would specify that some data be cached for a certain amount of time. I'll give you an example of what we're doing. We are working on ETLs using Apache Storm. In one of the bolts, let's say bolt number 1, we make a call to an API. The response we get back from that API is very large. It's a logic tree; basically, it's an XML response, and we don't want to carry that response from one bolt to another, because then the throughput of the Apache Storm ETL would be very low. So, to boost performance, we implemented a cache: when we get that XML back from the service, we dump it into the Redis cache and just acknowledge that particular tuple. Once we move to the second bolt, we fetch that XML from the cache, read data from it, do whatever the business process requires, and then move along. This way, the ETL throughput is better.
And, yeah, that's how we have implemented caching in a distributed manner. It's not just on one node; it's distributed across 3 nodes in our project, so we specify all 3 nodes in the properties file. In the case of ETLs, we do that in the YAML file, and in microservices, you can do that in application.properties.
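The hand-off between the two bolts can be sketched roughly as follows. To keep it self-contained, `TtlCache` is an in-memory stand-in for the Redis cluster, and `BoltSketch`, the key prefix, and the 60-second time to live are all assumptions made for illustration. It uses no Storm or Jedis APIs; with a real Redis client, the `put` would presumably become a set-with-expiry call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** In-memory stand-in for the Redis cache; every name here is hypothetical. */
final class TtlCache {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final LongSupplier clock; // injected so expiry can be exercised without sleeping

    TtlCache(LongSupplier clock) {
        this.clock = clock;
    }

    /** Stores a value that stays readable for ttlMillis milliseconds. */
    void put(String key, String value, long ttlMillis) {
        entries.put(key, new Entry(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the cached value, or null if it is missing or has expired. */
    String get(String key) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key, entry); // drop the stale entry
            return null;
        }
        return entry.value;
    }
}

/** Sketch of the two bolts: only the cache key travels between them. */
final class BoltSketch {
    static final long XML_TTL_MILLIS = 60_000L; // assumed time to live

    /** Bolt 1: cache the large XML response and pass on only its key. */
    static String firstBolt(TtlCache cache, String tupleId, String xmlFromApi) {
        String key = "xml:" + tupleId;
        cache.put(key, xmlFromApi, XML_TTL_MILLIS);
        return key;
    }

    /** Bolt 2: look the XML up again by key before processing it. */
    static String secondBolt(TtlCache cache, String key) {
        return cache.get(key);
    }
}
```

The point of the design is that the tuple carries a short key instead of the full XML, and the time to live keeps entries from accumulating if the second bolt never reads them.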
Yeah. So, the same thing. A load balancer would be very helpful in this case. We can say that we only want to entertain 100 requests per pod of a particular microservice. As soon as the requests pass 100, another pod should be spun up at that instant, or if there's a pod already available, the request is just redirected to that second pod. By doing this, we can manage the load during peak. That's on the infrastructure and deployment side. If you ask how, technically, we can minimize this while building the microservice itself, we have to ensure that we are not creating any unnecessary lists or collections iteratively inside a loop. Right? That would be one thing to check in the code. Second, if there are any computations, we try to do them using primitive ints instead of Integer objects, if we are adding two variables or something like that. Third, don't create a lot of new objects iteratively inside a loop. I'd check whether those objects really need to be created inside the loop or whether we can just move them out, because as long as an object stays in memory, it keeps that memory occupied, and then there would be a peak in the memory size of that particular microservice. Maybe I'm deviating from the topic a little bit here, but yeah, these are the things that I can think of.
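The point about primitives versus wrapper objects can be illustrated with a small sketch. `LoopAllocation` and its method names are made up for this example. Both methods compute the same sum, but the boxed version unboxes and re-boxes its accumulator on every iteration, which generally produces short-lived objects the primitive version never allocates.

```java
/** Hypothetical illustration of boxed versus primitive accumulation in a loop. */
final class LoopAllocation {

    /** Boxed accumulator: each += unboxes, adds, and boxes the result again. */
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    /** Primitive accumulator: same result, no wrapper objects involved. */
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }
}
```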
So what can happen in this case is, one thread, let's say t1, comes in and tries to access the get-service-instance method, and we check whether the service instance is null. Let's say this is the very first invocation, so it's null. Okay. So t1 gets a new service instance, and that's fine. But if two more threads, say the second and the third, come in at the exact same time, each of them could also get a new service instance rather than the earlier service instance that t1 had got. So this is basically a kind of singleton pattern that we're trying to implement here, but concurrent access to this method would break the pattern. The idea of the singleton pattern is to provide one single object throughout the application, but in this case, when multiple threads come in, there's a possibility that the accessing threads get different service instances. We can do two things here. First, we can synchronize the entire method, so that only one thread at a time can acquire the lock on this method and get the instance. That would solve the problem, but it would be very slow, as the lock would be acquired on the whole method. The second approach is applying a double check. We would declare the service instance variable as volatile, and we would put the "if service instance equals null" check inside a synchronized block. Before that block, there would be another check for "if service instance equals null". So there would be two checks, which is why it's called the double-checked singleton pattern. By doing this, we can solve the problem, since we apply the lock only on the part where we create the service instance. In this case, the cost of the operation would be less, and again, only one thread would be able to create the instance at a time.
So the problem would be resolved with better performance.
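The double-checked approach described above might look roughly like this. The original snippet under discussion is not part of this transcript, so the class name `ServiceInstance` and the method name `getServiceInstance` are assumptions based on how the method is referred to in the answer.

```java
/** Sketch of a double-checked singleton; the names are assumed, not from the original code. */
final class ServiceInstance {

    // volatile ensures that a fully constructed instance is visible to all threads
    private static volatile ServiceInstance instance;

    private ServiceInstance() {
        // private constructor: instances are only created via getServiceInstance()
    }

    static ServiceInstance getServiceInstance() {
        ServiceInstance local = instance;          // first check, without locking
        if (local == null) {
            synchronized (ServiceInstance.class) { // lock only on the creation path
                local = instance;                  // second check, under the lock
                if (local == null) {
                    local = new ServiceInstance();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Once the instance exists, callers take the fast path with a single volatile read and never touch the lock, which is where the gain over synchronizing the whole method comes from.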
So first of all, for every exception that occurs in this REST API, it will always return an internal server error. We are losing the ability to return any custom exceptions from the user service. Let's say there are no users in the system as of now; we would simply want to return an empty 200. I don't think we would be able to do that in this case. Even if we try to throw a custom exception, it would just go to the catch block for Exception and would be returned as an internal server error. And one more thing: it's just the error status code that we are returning. There is no exception trace or exception message or anything that we are returning or logging, so we would never get to know from the logs what went wrong. You see, it's just passing HttpStatus.INTERNAL_SERVER_ERROR. The exception object that was holding the details in the catch block is gone. So we are losing track of what happened here.
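One way to address both concerns, distinct status codes and not losing the exception, is sketched below in plain Java so that it runs without Spring. `UserEndpointSketch`, `Response`, and `UserNotFoundException` are hypothetical stand-ins. In a Spring controller, the same idea would more likely be expressed with `ResponseEntity` and an `@ExceptionHandler`, but the code under review is not shown here, so this is only an illustration of the shape.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Plain-Java sketch; every type here is a hypothetical stand-in. */
final class UserEndpointSketch {
    private static final Logger LOG = Logger.getLogger(UserEndpointSketch.class.getName());

    /** Hypothetical custom exception thrown by the user service. */
    static final class UserNotFoundException extends RuntimeException {
        UserNotFoundException(String message) {
            super(message);
        }
    }

    /** Minimal stand-in for an HTTP response: a status code plus a body. */
    static final class Response {
        final int status;
        final Object body;

        Response(int status, Object body) {
            this.status = status;
            this.body = body;
        }
    }

    static Response getUsers(Supplier<List<String>> userService) {
        try {
            // An empty list is a valid answer and is still returned as 200.
            return new Response(200, userService.get());
        } catch (UserNotFoundException e) {
            // The custom exception keeps its own status and message.
            LOG.log(Level.WARNING, "user lookup failed", e);
            return new Response(404, e.getMessage());
        } catch (RuntimeException e) {
            // Unexpected failures are still 500, but the exception is logged first.
            LOG.log(Level.SEVERE, "unexpected failure in getUsers", e);
            return new Response(500, "internal server error");
        }
    }
}
```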
A backup strategy for a microservices ecosystem that spans multiple data stores. I'm assuming we are talking about configuring multiple types of databases, or multiple databases, in microservices. I'm not sure I'm fully understanding this question; that's one thing. If that's not what's being asked here, then is it related to distributed transactions? Because in that case, if there's a transaction that spans multiple databases, we would implement a pattern like Saga to roll back the transactions in case there's a failure at one of the microservices. I'm really not able to understand what this question is, so I'm giving different answers. I hope one of them is what you're looking for. A backup strategy for different data stores could be that we don't work on just a single database but have a shared database spread across our network. Maybe use a cluster of database nodes, so that even if one database is down, the application can connect to the other databases on the different nodes. So you must configure your application against a cluster instead of just one data store, so that if one of the databases in that cluster goes down, your application is not impacted. Plus, there would be additional replication of the data, which ensures data availability at all times and data safety as well. So, yeah, we have to ensure that the data is available at all times, that it's not lost, and that there is integrity in the data. It shouldn't happen that, in a cluster with three nodes, we update the data on one node and the other two nodes don't reflect it. That we have to manage. All three nodes in a particular cluster should have identical data. I think, yeah, that would be the best answer to this question.