
I am a Python back-end specialist with more than 6 years of experience in designing, developing, testing, deploying, and maintaining web applications using Django and ETL pipelines with Python on the back end. My engineering practice covers database design, working with MySQL and MongoDB, fixing back-end bugs, adding functionality and API endpoints, configuring unit tests, and setting up deployments. I architect, design, and develop advanced back-end services for an AI-powered market research platform using core Python, advanced Python, and frameworks such as Flask MVC, Django REST API, and FastAPI. My technical expertise includes data warehousing, Django, microservices, Flask, pandas, SQL, PySpark, AWS Lambda, and AWS Glue jobs. I also have months of experience setting up AWS infrastructure using Terraform, and I am familiar with DevOps tools like Jenkins, Docker, and Kubernetes.
Senior Software Engineer, Coforge (On the payroll of IT Hub)
Senior Software Engineer, Ajackus
Senior Software Engineer, StanceCode Technology
Python
REST API
Flask

MongoDB
FastAPI

MySQL

Environment: Python, MySQL, Pandas, HTML, CSS, JavaScript, Minikube, Docker, Kubernetes, microservices, Flask/FastAPI, Django, ETL, AWS, Linux and Windows.
Environment: Python, MySQL, HTML, CSS, JavaScript, Docker, Kubernetes, ETL, AWS, Linux and UNIX.
The project is about extracting data from a client database, processing it with a Python data library such as pandas, and storing it in a database for BI tool developers. It includes a web application built with the Flask framework that lets users upload files to the web server, processes the data using pandas, and loads it into the database.
Roles and Responsibilities:
The project is about transporting goods, monitoring their environment in real time, and providing that information to users. Customers can log in from a phone or computer to check the status of their goods. It uses the API of an IoT-based third-party service provider to build a user-facing application that provides real-time data. It used data analysis tools like pandas and geopandas with an AI algorithm.
Roles and Responsibilities:
The project is about certifying vehicles against different test cases whenever a new vehicle or vehicle part comes to market. Before launch, it must be certified against all the rules and regulations specified by the government. The project objective was to generate a certificate for vehicles as specified by the standards. The certification process is completed through various tests, which ensure compliance with standards aimed at improving safety, environmental protection, and the quality of products and production processes.
Roles and Responsibilities:
Hello. Yes, I'd be happy to tell you more about my background. I am a Python developer with more than 6 years of experience. I develop web applications using Django and build ETL pipelines. I am well-versed in the Django and Flask frameworks, along with core Python. I also develop ETL jobs using AWS Glue and Airflow, where we write Python code. I have good experience understanding requirements and working with clients to design architecture, because I am also involved in creating and deploying AWS services for web applications and defining them using Terraform. So I understand the end-to-end functionality of a web application, which helps me understand how requests flow. I have experience developing REST APIs, connecting to databases, and processing data. I process small amounts of data, like 1 to 2 GB files, using pandas, and high-volume data using PySpark. Along with that, I have experience processing data using SQL: I used to write stored procedures and process data using temporary tables.
How do you handle ingestion of a large amount of data in a system?
The largest amounts of data I have handled so far were in PySpark. I missed the second question. So, with a large amount of data, we can manage it using PySpark. PySpark actually handles big-sized data. The challenge is that PySpark is not like pandas. In pandas, we can process and transform data very easily because it has a lot of functionality to work with and filter data. PySpark has more limited functionality in that respect. We can process our data there, but as the data is huge, operations are sometimes quite slow. These are the challenges I face, because processing sometimes takes a lot of time. To overcome this, PySpark provides partitioning: the data is split into partitions that are processed in parallel, so the speed increases. So, to overcome the time issue in PySpark, I use partitioning. They need to improve the performance of the data using the
What are some of the biggest challenges you faced in developing different machine learning models?
Yeah. So I have worked on machine learning models. I had to aggregate the data because the project I was working on dealt with IoT data, which included temperature and humidity readings. The data was huge and arrived every second, so I had to aggregate it. To do this, I used a machine learning model for aggregation and the pandas library to process the data. I applied the aggregation algorithm to the data using pandas, with scikit-learn providing the algorithm, and used it to process my data and generate output. One of the issues I faced while developing the model was choosing and tuning the clusters. I had to find the right aggregation count to provide, and this required some trial and error. It took time to settle on the number to use for the next step. The aggregation algorithm requires this number as input, which was the challenge while developing the machine learning model.
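The aggregation step described above can be sketched in plain Python. This is only an illustration of reducing per-second readings to per-minute averages; the record layout (timestamp, temperature, humidity), the one-minute window, and the function name are assumptions, and it does not reproduce the pandas or scikit-learn code used in the project.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_per_minute(readings):
    """Average temperature and humidity per minute.

    `readings` is an iterable of (timestamp, temperature, humidity) tuples;
    this layout is an assumption made for the example.
    """
    buckets = defaultdict(list)
    for ts, temperature, humidity in readings:
        # Truncate the timestamp to the start of its minute.
        buckets[ts.replace(second=0, microsecond=0)].append((temperature, humidity))
    return {
        minute: {
            "temperature": mean(t for t, _ in values),
            "humidity": mean(h for _, h in values),
            "count": len(values),
        }
        for minute, values in sorted(buckets.items())
    }


# Hypothetical sample readings arriving every few seconds.
readings = [
    (datetime(2024, 1, 1, 10, 0, 1), 20.0, 40.0),
    (datetime(2024, 1, 1, 10, 0, 2), 22.0, 42.0),
    (datetime(2024, 1, 1, 10, 1, 5), 30.0, 50.0),
]
summary = aggregate_per_minute(readings)
```

The window size plays a role similar to the input number mentioned above: it has to be chosen up front, and a different choice changes the output.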
So, to optimize Python code for better performance, first, we can follow the PEP 8 standard. Then we should avoid repeated or unused variables and remove them. We can also follow the SOLID principles: each piece of functionality should have its own function, and code should not be written without functions. Repeated logic for the same functionality should live in one function that we call wherever it is needed, so we do not write repeated code. We should also not use too many if statements; for that, we can use a dictionary. And where we use a for loop, we can use a Python list comprehension. So these are the ways we can optimize Python code for better performance. Another part is that we can check how many for loops there are and optimize the code toward something like O(1) or O(log n). In that way, we can work out the time and space complexity of the code and try to improve both.
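A few of the points above can be shown in a short sketch: a dictionary lookup in place of a long if/elif chain, a list comprehension in place of a loop with append, and a set-based check that avoids nested loops. The function names and the sample operations are hypothetical, chosen only for illustration.

```python
import operator

# Dictionary lookup replaces a chain of if/elif branches.
OPERATIONS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
}


def calculate(name, left, right):
    """Apply the named operation, failing clearly on unknown names."""
    try:
        operation = OPERATIONS[name]
    except KeyError:
        raise ValueError(f"unknown operation: {name}") from None
    return operation(left, right)


def squares_of_even(numbers):
    # List comprehension instead of an explicit loop with append().
    return [n * n for n in numbers if n % 2 == 0]


def has_duplicates(items):
    # One pass with a set is O(n); comparing every pair in nested loops is O(n^2).
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

For example, `calculate("mul", 3, 4)` dispatches through the dictionary, and adding a new operation only needs a new dictionary entry rather than another branch.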
For this, in our project, we were using third-party API endpoints. As I pointed out, we were using IoT data, which was coming in every second, so we had to hit that API endpoint every five minutes. We got the data from that API endpoint: we wrote Python code to fetch it and then processed it using PySpark. As per our requirements, we created our columns after processing, so we could use the clean data with the machine learning model. Apart from this, I worked in the banking domain. There, we used to get a CSV file from the client in an S3 bucket, and I worked on Airflow. Airflow was set up so that when a file arrived in the S3 bucket, our PySpark job was triggered by Airflow and took the data from the S3 bucket. The data was processed, and the output was given to the client. That information was then stored in the database system, where we use Django. To generate a report, we used Django to create REST API endpoints, and through those endpoints we provided the report to the client, based on the data created after Python processing. This is the design.
So we are importing the Python package. We are reading the CSV file, the data CSV file, stored locally. After reading the data, we get a DataFrame, and we add a new column to it. In the CSV, there are two columns. We add column 1 and column 2 and assign the result to the new column. Then we convert the DataFrame back into a CSV file, and for that, we have given the path of the new data CSV file here. We set index to False, so the output will not add row numbers like 0, 1, 2, 3, 4, 5 to each row in the CSV.
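The script being walked through is not shown, so the following is only a guess at its shape, assuming it uses pandas. The column names and the sample data are hypothetical, and an in-memory buffer stands in for the local CSV files so the sketch runs on its own.

```python
import io

import pandas as pd

# In-memory stand-in for the local data CSV file; column names are hypothetical.
source = io.StringIO("col1,col2\n1,10\n2,20\n3,30\n")

# Read the CSV into a DataFrame.
df = pd.read_csv(source)

# Add a new column holding the row-wise sum of the two existing columns.
df["total"] = df["col1"] + df["col2"]

# Write the result back out as CSV; index=False leaves out the 0, 1, 2, ... row labels.
output = io.StringIO()
df.to_csv(output, index=False)
csv_text = output.getvalue()
```

With real files, `source` and `output` would be replaced by file paths passed to `pd.read_csv` and `DataFrame.to_csv`.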
The following Python code uses the Kubernetes client. Can I explain this? Okay. So for Kubernetes, we are importing client and config. We load the configuration here with config.load. Then, from the client, we use the API endpoint, which lists the pods with their IPs. Then we list all the namespaces: whatever namespace each pod belongs to, it will be listed. We print these out in the for loop, as the result is a list, sorry, a dictionary, and we take the keys and values. So we get the pod IP and its namespace. That is what the script does.
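The script under discussion is not shown; a sketch along the lines described, using the official `kubernetes` Python client, might look like this. The helper names `build_core_v1_api` and `list_pods` are made up for the example. In that client, `list_pod_for_all_namespaces` returns a pod list object whose `items` attribute is a list of pods, rather than a dictionary.

```python
def build_core_v1_api():
    """Create a CoreV1Api client from the local kubeconfig.

    Needs the `kubernetes` package and a reachable cluster; the import is
    kept inside the function so `list_pods` has no hard dependency on it.
    """
    from kubernetes import client, config

    config.load_kube_config()
    return client.CoreV1Api()


def list_pods(api):
    """Return (pod IP, namespace, pod name) tuples for every pod in the cluster."""
    pods = api.list_pod_for_all_namespaces(watch=False)
    return [
        (pod.status.pod_ip, pod.metadata.namespace, pod.metadata.name)
        for pod in pods.items
    ]


# Usage against a real cluster:
#     for ip, namespace, name in list_pods(build_core_v1_api()):
#         print(ip, namespace, name)
```

Passing the API object in as a parameter is a design choice for the sketch: it lets the listing logic be exercised with a stub when no cluster is available.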
So one of the projects I worked on used an AWS Glue job. We used a scheduler that triggered the Glue job after one hour. In that project, we wrote Python and PySpark code. We were using data from the mainframe. That file had binary data, so we wrote logic in Python that could read it. Then we stored that file in S3, and we planned to store the output in S3 as well. We fetched the data from S3 and processed it. We had two files, and depending on those two files, we applied an inner join or an outer join. After transformation, we created a new file and got the output. It was a complex problem, so we designed a solution in steps. First, we used plain Python code to unpack the data and store it in the bucket. Then, from the S3 bucket, we used a Glue job to process the data. We had a scheduler, so the file from the client came to the S3 bucket daily, we processed it using Python code, and we got the output in this way. In another project, we used Airflow, so there was no need for a separate scheduler. Using Airflow, we ran the code, and it did not involve an AWS Glue job. The code was on a Linux machine. So that is how we designed this solution.
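The two steps described above, unpacking binary records and then joining two data sets, can be sketched with the standard library. The record layout, field names, and sample data are all hypothetical; real mainframe files often use EBCDIC text and packed decimals, which this sketch does not handle, and the project itself ran the join in a Glue job rather than in plain Python.

```python
import struct

# Hypothetical fixed-width layout: 4-byte big-endian ID, 8-byte ASCII name,
# 4-byte big-endian amount. The real file layout is not given in the source.
RECORD = struct.Struct(">I8sI")


def unpack_records(blob):
    """Decode back-to-back fixed-width binary records into dictionaries."""
    rows = []
    for record_id, raw_name, amount in RECORD.iter_unpack(blob):
        rows.append(
            {"id": record_id, "name": raw_name.decode("ascii").rstrip(), "amount": amount}
        )
    return rows


def inner_join(left, right, key):
    """Keep only rows whose key appears in both inputs, merging their fields."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Build two sample binary records, then decode and join them.
blob = RECORD.pack(1, b"alice   ", 250) + RECORD.pack(2, b"bob     ", 75)
accounts = unpack_records(blob)
branches = [{"id": 1, "branch": "north"}, {"id": 3, "branch": "south"}]
joined = inner_join(accounts, branches, "id")
```

An outer join would differ only in keeping the unmatched rows from one or both sides instead of dropping them.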
How do you ensure high scalability, availability, and durability of the data in the SQL database?
To ensure high availability and durability of the data in the SQL database, while designing a web application from a cloud point of view, we can use a proxy server. We will have multiple replicas of the database, and the data will be replicated to them. Whenever a request comes to the Python application, it goes from the proxy server to the least busy database, so data is always available to the user. Another thing is recovery of the application: we can add database snapshots, so a snapshot of the database is taken every hour or every day. In this way, availability and recovery can be provided for the database.
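The "least busy database" routing mentioned above can be illustrated with a toy model. This is not how any particular proxy is implemented; the class name, replica names, and the use of an active-request counter as the measure of load are assumptions made for the sketch.

```python
class ReplicaRouter:
    """Toy stand-in for the proxy: sends each read to the replica
    with the fewest requests currently in flight."""

    def __init__(self, replica_names):
        self.active = {name: 0 for name in replica_names}

    def acquire(self):
        # min() returns the first name with the lowest count,
        # so ties go to the replica listed earliest.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        # Call when the request on that replica has finished.
        self.active[name] -= 1


router = ReplicaRouter(["replica-a", "replica-b"])
first = router.acquire()
second = router.acquire()
router.release(first)
third = router.acquire()
```

In practice this decision is usually delegated to a managed load balancer or database proxy, and writes still go to the primary database rather than to a replica.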