
I am a Python back-end specialist with more than 6 years of experience in designing, developing, testing, deploying, and maintaining web applications using Django, as well as ETL pipelines built in Python. My engineering practice covers database design, working with MySQL and MongoDB, fixing backend bugs, adding functionality and API endpoints, configuring unit tests, and setting up deployments. I architect, design, and develop backend services for our AI-powered market research platform using core and advanced Python and frameworks such as Flask (MVC), Django REST Framework, and FastAPI. My technology stack includes data warehousing, Django, microservices, Flask, pandas, SQL, PySpark, AWS Lambda, and AWS Glue jobs. I also have several months of experience setting up AWS infrastructure using Terraform, and I am familiar with DevOps tools such as Jenkins, Docker, and Kubernetes.
Senior Software Engineer, Coforge (on the payroll of IT Hub)
Senior Software Engineer, Ajackus
Senior Software Engineer, StanceCode Technology
Python
REST API
Flask
MongoDB
FastAPI
MySQL
Environment: Python, MySQL, pandas, HTML, CSS, JavaScript, Minikube, Docker, Kubernetes, microservices, Flask/FastAPI, Django, ETL, AWS, Linux and Windows.
Environment: Python, MySQL, HTML, CSS, JavaScript, Docker, Kubernetes, ETL, AWS, Linux and UNIX.
The project involves extracting data from a client database, processing it with a Python data library such as pandas, and storing the results in a database for BI tool developers. It includes a web application built with the Flask framework that lets users upload files to the web server; the files are processed with pandas and loaded into the database.
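For illustration, a minimal sketch of this upload-process-load flow, assuming a CSV upload, a MySQL target reached through SQLAlchemy, and placeholder endpoint, table, and connection names (none of these are the actual project code):

```python
# Minimal sketch of the upload-and-load flow; endpoint name, table name,
# and connection string are illustrative assumptions.
import pandas as pd
from flask import Flask, request, jsonify
from sqlalchemy import create_engine

app = Flask(__name__)
engine = create_engine("mysql+pymysql://user:password@localhost/reporting")  # placeholder DSN

@app.route("/upload", methods=["POST"])
def upload():
    # Read the uploaded CSV straight into a DataFrame
    uploaded = request.files["file"]
    df = pd.read_csv(uploaded)

    # Example transformation step before handing the data to BI developers
    df = df.dropna()

    # Load the processed frame into a staging table
    df.to_sql("staging_upload", engine, if_exists="append", index=False)
    return jsonify({"rows_loaded": len(df)})

if __name__ == "__main__":
    app.run(debug=True)
```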
Roles and Responsibilities:
The project is about transporting goods, monitoring their environment in real time, and providing that information to users. Customers can log in from a phone or a computer to check the status of their goods. It consumes the API of an IoT-based third-party service provider to build a user-facing application that delivers real-time data, and it uses data analysis tools such as pandas and geopandas together with AI algorithms.
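As a rough illustration of consuming such an IoT API with pandas and geopandas, the sketch below assumes a hypothetical endpoint returning shipment readings with latitude/longitude fields; the URL and field names do not reflect the actual provider:

```python
# Illustrative sketch only: the endpoint URL and field names are assumptions
# standing in for the third-party IoT provider's actual API.
import requests
import pandas as pd
import geopandas as gpd

def fetch_shipment_readings(api_url: str, token: str) -> gpd.GeoDataFrame:
    """Pull the latest sensor readings and return them as a GeoDataFrame."""
    resp = requests.get(api_url, headers={"Authorization": f"Bearer {token}"}, timeout=30)
    resp.raise_for_status()
    df = pd.DataFrame(resp.json())  # assumed columns: shipment_id, temperature, humidity, lat, lon

    # Attach point geometry so positions can be mapped or spatially joined later
    return gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["lon"], df["lat"]), crs="EPSG:4326")
```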
Roles and Responsibilities:
The project is about certifying vehicles against different test cases whenever a new vehicle, or a part of a vehicle, comes to market for launch. Before launching, it must be certified against all the rules and regulations specified by the government. The project objective was to generate a certificate for vehicles as specified by the standards. The certification process is completed through various tests, which ensure compliance with standards aimed at improving safety, environmental protection, and the quality of products and the production process.
Roles and Responsibilities:
Hello. Can you help me understand more about your background? Hello, hi. I am a Python developer with more than 6 years of experience. I'm involved in developing web applications using Django and Flask and in developing ETL pipelines, so I'm very comfortable with the Django and Flask frameworks, and I also work with core Python. I've developed ETL jobs using AWS Glue and Airflow, which is mostly about writing Python code. I have good experience working with clients to design the architecture as well, because I'm also involved in deployment: creating the AWS services for the web application and writing the infrastructure with Terraform. So I understand the end-to-end functionality of a web application, which helps me understand how a request flows. I have experience developing REST APIs, connecting to databases, and processing data. Smaller data, like 1 to 2 GB files, I process using pandas, and high-volume data I process using PySpark. Along with that, I also have experience processing data with SQL, writing stored procedures, and processing data using temporary tables.
How does your system handle ingestion of large amounts of data?
For large amounts of data we have PySpark. PySpark is not like pandas: in pandas we can process and transform the data very easily because it has a lot of functionality for dealing with and filtering data, whereas PySpark has comparatively limited functionality. We can still process the data there, but because the data is huge, the operations sometimes take a lot of time. That was the challenge I faced, that processing sometimes took too long. To overcome this, PySpark provides partitioning: we can partition the data so it is processed in parallel, which increases the speed. So to overcome the time issue I used partitioning in PySpark to improve the performance of the data processing.
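A hedged sketch of the partitioning idea, assuming Parquet input on S3 and a made-up partition key, partition count, and filter:

```python
# Sketch of repartitioning for parallelism; paths, column names, and the
# partition count are assumptions chosen for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ingestion-example").getOrCreate()

df = spark.read.parquet("s3://example-bucket/raw/events/")  # placeholder path

# Repartition by a well-distributed key so downstream transformations run in
# parallel across executors instead of piling up in a few oversized partitions.
df = df.repartition(200, "customer_id")

filtered = df.filter(df["temperature"] > 30).groupBy("customer_id").count()
filtered.write.mode("overwrite").parquet("s3://example-bucket/curated/hot_events/")
```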
What are some of the biggest challenges you faced developing machine learning models? Yes, I have worked on a machine learning model. I was working on a project that dealt with IoT data, all the temperature and humidity readings, and the data was huge and coming in every second, so I had to aggregate it. For that I used a machine learning model for the aggregation: I prepared the data using pandas and fed it to a clustering algorithm from scikit-learn. The issue while developing it was that we had to tune the cluster count we provide to the algorithm, and that tuning was mostly trial and error. It took some time, because only by testing did I reach the conclusion that a particular number was the right one to use. The aggregation algorithm needs that input, so that was the main challenge of developing with that machine learning model.
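To make the tuning point concrete, here is a small sketch assuming the scikit-learn algorithm was KMeans and the readings have temperature and humidity columns; both are assumptions for illustration:

```python
# Illustrative clustering/aggregation sketch; file name and column names are hypothetical.
import pandas as pd
from sklearn.cluster import KMeans

readings = pd.read_csv("sensor_readings.csv")  # assumed columns: temperature, humidity
features = readings[["temperature", "humidity"]]

# Trial-and-error tuning of the cluster count: inspect inertia across a range of k values
for k in range(2, 10):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(features)
    print(k, model.inertia_)

# Fit the chosen model and use per-cluster means as the aggregated representation
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42).fit(features)
readings["cluster"] = kmeans.labels_
aggregated = readings.groupby("cluster")[["temperature", "humidity"]].mean()
print(aggregated)
```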
To optimize Python code for better performance, first we can follow the PEP 8 standard. We should remove repeated or unused variables. We can follow the SOLID principles: write a function for each piece of functionality rather than leaving code outside functions, and if the same functionality repeats, keep it in one function and call that function instead of writing duplicated code. If there are too many if statements we can replace them with a dictionary lookup, and instead of building results in a for loop we can use a list comprehension. These are the main ways we can optimize Python code for better performance. Beyond that, we can check how many nested loops the code has and estimate its time and space complexity, like O(n) versus O(n log n), and try to optimize the code for better time and space.
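A short sketch of two of the optimizations mentioned, dictionary dispatch in place of an if/elif chain and a list comprehension in place of a loop with append; the function and values are made up:

```python
# Before: a growing chain of comparisons
def shipping_rate_branchy(region: str) -> float:
    if region == "north":
        return 5.0
    elif region == "south":
        return 4.5
    elif region == "east":
        return 6.0
    else:
        return 7.0

# After: dictionary dispatch, a single lookup with a default
SHIPPING_RATES = {"north": 5.0, "south": 4.5, "east": 6.0}

def shipping_rate(region: str) -> float:
    return SHIPPING_RATES.get(region, 7.0)

# List comprehension instead of an explicit loop with .append()
orders = [12.5, 80.0, 41.3, 7.9]
discounted = [price * 0.9 for price in orders if price > 10]
```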
For this, in one project we were using a third-party API endpoint. We were consuming IoT data that was arriving every second, and we had to hit that API endpoint every 5 minutes. So we hit the API endpoint, fetch the data with Python code, and then process it using PySpark; after processing we create the columns we need as per our requirement, so the machine learning model sees clean data. In another project, in the banking domain, we used to get CSV files from the client in an S3 bucket. There I worked with Airflow: when a file arrived in the S3 bucket, Airflow triggered our PySpark job, which took the data from the S3 bucket, processed it, and produced the output for the client. That information was then stored in the database, and on top of it we used Django: to generate a report from it we created REST API endpoints with Django, and through those endpoints we provide the report to the client. So that is how the data produced by the Python processing is designed to flow.
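A minimal sketch of the poll-every-5-minutes pattern, with a placeholder API URL, column names, and S3 path rather than the real project configuration:

```python
# Hedged sketch: URL, schema, and output path are assumptions for illustration.
import time
import requests
import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iot-poller").getOrCreate()
API_URL = "https://iot.example.com/readings"  # placeholder third-party endpoint

def poll_once() -> None:
    payload = requests.get(API_URL, timeout=30).json()
    df = spark.createDataFrame(pd.DataFrame(payload))  # list of records -> Spark DataFrame

    # Derive the columns the downstream model expects (column name assumed)
    cleaned = df.dropna().withColumnRenamed("temp", "temperature")
    cleaned.write.mode("append").parquet("s3://example-bucket/iot/cleaned/")

if __name__ == "__main__":
    while True:
        poll_once()
        time.sleep(300)  # wait 5 minutes between API calls
```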
Here is the CSV example, and the purpose of each line. We are importing the pandas package, then reading the CSV data file stored locally. After reading it we get a DataFrame, and we add a new column to the DataFrame. The CSV has two columns, so we take the values from column 1 and column 2 and assign their combination to the new column. Then we convert the DataFrame back into a CSV file; for that we give the path of the new CSV file and pass index=False, so the output will not add the 0, 1, 2, 3 index for each row of the CSV.
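A reconstruction of the snippet being described, as far as the description allows; the file names and column names are assumptions:

```python
# Reconstruction of the described snippet; names are hypothetical.
import pandas as pd

# Read the CSV stored on the local machine into a DataFrame
df = pd.read_csv("data.csv")  # assumed to contain two columns: col1, col2

# Add a new column built from the first two columns
df["combined"] = df["col1"].astype(str) + df["col2"].astype(str)

# Write the result to a new CSV; index=False keeps the 0, 1, 2, ... row index out of the file
df.to_csv("data_with_combined.csv", index=False)
```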
Can you explain this Python code, which uses the Kubernetes client? Okay. For Kubernetes we are importing client and config. We load the configuration with config.load_kube_config(). Then from the client we use the CoreV1 API, which talks to the Kubernetes API server, and we call the method that lists pods across all namespaces. So whatever namespaces exist, it will list the pods in all of them. Then we print them out with a for loop over the returned items, taking each pod's name and its namespace. That is what the script does.
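The script as described maps onto a standard kubernetes-python-client pattern; the sketch below assumes a local kubeconfig is available:

```python
# Standard pattern for listing pods across all namespaces with the Kubernetes Python client.
from kubernetes import client, config

# Load cluster address and credentials from the local kubeconfig file
config.load_kube_config()

v1 = client.CoreV1Api()

# List pods across every namespace in the cluster
pods = v1.list_pod_for_all_namespaces(watch=False)
for pod in pods.items:
    print(pod.metadata.namespace, pod.metadata.name, pod.status.pod_ip)
```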
Explain a complex data processing problem you solved with Python. What was the problem and how did you design the solution? One of the projects I worked on used an AWS Glue job. We used a scheduler that triggered the Glue job every hour, and in it we had written Python code plus PySpark. We were using mainframe data: the files contained binary data, so we wrote logic in plain Python that could read that file. The unpacked file was stored in S3, then the Glue job fetched the data from S3 and processed it. We had two files, and depending on what data we wanted we applied an inner join or an outer join; after the transformation we created a new file, and that was the output. It was complex, and we designed the solution like this: first, using normal Python code, unpack the data and store it in the S3 bucket; then, from the S3 bucket, process the data using the Glue job. For that we used a scheduler, because the file from the client comes to the S3 bucket daily, and from S3 we process it with the Python code. That is how we designed this solution. In another project we used Airflow, so there was no need for a separate scheduler; it was not an AWS Glue job there, the code was on a Linux machine.
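A hedged sketch of the join step, written with plain PySpark rather than the Glue-specific APIs; the bucket, paths, and join key are assumptions:

```python
# Illustrative join of two files that were unpacked from a mainframe extract into S3.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mainframe-join").getOrCreate()

accounts = spark.read.csv("s3://example-bucket/unpacked/accounts.csv", header=True)
transactions = spark.read.csv("s3://example-bucket/unpacked/transactions.csv", header=True)

# Inner join keeps only accounts that have transactions; switch to "outer" when the
# requirement is to keep unmatched rows from both sides.
joined = accounts.join(transactions, on="account_id", how="inner")

joined.write.mode("overwrite").parquet("s3://example-bucket/curated/account_transactions/")
```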
How do you ensure high availability and durability of data in a SQL database? While designing the web application, from the cloud point of view we can use a proxy server in front of the database and keep multiple replicas of the data. Whenever a request comes to the Python application, the proxy routes it to the database replica that is least busy, so the data is always available to the user. The other thing is recovery: we can configure snapshots of the database, taken every hour or every day. So for availability we have the replicas, and for recovery we have the snapshot functionality for the database.
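As one possible illustration of the snapshot part, assuming the database were an Amazon RDS instance managed with boto3 and using a placeholder instance identifier:

```python
# Hedged sketch: taking a manual RDS snapshot from a scheduled job.
# The instance identifier is a placeholder, not a real resource.
import datetime
import boto3

rds = boto3.client("rds")

def take_snapshot(instance_id: str = "reporting-db") -> None:
    """Create a manual RDS snapshot, e.g. from an hourly or daily scheduled job."""
    stamp = datetime.datetime.utcnow().strftime("%Y%m%d-%H%M")
    rds.create_db_snapshot(
        DBSnapshotIdentifier=f"{instance_id}-{stamp}",
        DBInstanceIdentifier=instance_id,
    )

if __name__ == "__main__":
    take_snapshot()
```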