Shodhan Pujari

Vetted Talent
I am a passionate database developer with 15+ years of experience in database development technologies, working with clients based in the United States, Hong Kong, Australia, and China. I have good working knowledge of SQL Server development and performance tuning.
  • Role

    Senior SQL Developer & Lead - SQL

  • Years of Experience

    17 years

Skillsets

  • Microsoft SQL
  • Documentation
  • Azure Data Lake
  • Visual Basic
  • SAP
  • Azure DevOps
  • Stored Procedures
  • Azure Data Factory
  • ETL pipelines
  • Testing
  • ETL
  • Troubleshooting
  • SSIS
  • Tableau
  • Cloud
  • SQL Server
  • C#
  • Analytics
  • Performance Tuning
  • Data Integration
  • SQL Server Development
  • DevOps
  • SQL - 17 Years
  • Azure - 3 Years
  • Data Processing
  • Azure Synapse
  • Database
  • Statistics
  • Microsoft SQL Server
  • Data Analytics
  • Telecom
  • Routing
  • Data Warehousing
  • Reporting
  • Big Data
  • Oracle
  • Java
  • Debugging
  • C
  • Marketing
  • Microsoft Azure

Vetted For

7 Skills
  • SQL Server Database Developer (AI Screening) - 76%
  • Skills assessed: Unix, Database Design, ETL Programming, SQL Development, Data Modelling, Python, Shell Scripting
  • Score: 68/90

Professional Summary

17 Years
  • Nov, 2016 - Present (9 yr 1 month)

    Senior SQL SSIS Developer & Lead at

    The Interpublic Group of Companies, Inc. (IPG), Mumbai
  • Oct, 2014 - Jun, 2016 (1 yr 8 months)

    Assistant Manager

    Deloitte, Mumbai
  • Technical Lead-Band 7A

    IBM India Private Limited
  • May, 2005 - Oct, 2007 (2 yr 5 months)

    SQL Programmer

    Tech Services India Private Limited
  • Nov, 2007 - Apr, 2011 (3 yr 5 months)

    SQL Developer

    Dun & Bradstreet, Mumbai

Applications & Tools Known

  • Microsoft SQL Server
  • SQL Server Reporting Services (SSRS)
  • Azure DevOps
  • Azure Data Factory
  • Microsoft BizTalk Server
  • Tableau Desktop

Work History

17 Years

Senior SQL SSIS Developer & Lead

The Interpublic Group of Companies, Inc. (IPG), Mumbai
Nov, 2016 - Present (9 yr 1 month)
    Development project using Microsoft SQL Server Integration Services (SSIS) ETL packages to integrate various systems with the IPG US network of companies.

Assistant Manager

Deloitte, Mumbai
Oct, 2014 - Jun, 2016 (1 yr 8 months)

SQL Developer

Dun & Bradstreet, Mumbai
Nov, 2007 - Apr, 2011 (3 yr 5 months)

SQL Programmer

Tech Services India Private Limited
May, 2005 - Oct, 2007 (2 yr 5 months)

Technical Lead-Band 7A

IBM India Private Limited
    Worked for a UK client, Honda UK Private Limited, in the automobile and manufacturing domain.

Achievements

  • Received the Quarterly Recognition Award from Nordea and L&T Infotech for contributions to the project. Automated manual processes using SQL stored procedures and agent jobs. Created stored procedures for customized reports delivered to customers. Performed daily log checks for import jobs written as stored procedures.
  • Received Star Performer Award from Dun & Bradstreet

Education

  • Executive Masters in Business Administration with Specialization in IT

    Institute of Technology & Management (ITM), Mumbai / Southern New Hampshire University (2015)
  • Bachelor of Management Studies

    University of Mumbai (2005)
  • Diploma in Advanced Software Technology

    CMC Limited, Subsidiary of TATA Sons, Mumbai (2005)

Certifications

  • Microsoft Certification

  • MCTS: SQL Server 2008 (70-433)

  • SQL Server 2012 Database Development (70-461)

AI-interview Questions & Answers

I am Shodhan Pujari. I have around 15 years of experience in SQL Server development and have worked on various projects, including banking, finance, and HR. Currently I work for Interpublic Group, where I lead a team as a Senior SQL/SSIS developer. I am strong in SQL and in ETL with SSIS packages, I have experience with a cloud SAP ETL tool, and I have also worked on Tableau reporting. Overall, I would rate myself around 9 out of 10 in SQL Server. I started my career with SQL Server back in 2005, then worked for Dun & Bradstreet and L&T Infotech, and after that for Deloitte and IBM, before joining my current company, Interpublic Group. It is a major advertising group with many agencies around the world, and we handle all development activities in-house, such as building middleware integrations involving SSIS packages and BizTalk. Currently we are migrating to SAP Cloud. That is a brief background about me. Thank you.

Can you describe how to automate data validation tests after an ETL process? Data validation can be automated in a few ways. For example, if the ETL process runs via an agent or a scheduling engine, we can run that job with a test file. One approach is regression testing: pass a file with around a million records and see how the ETL package behaves with it. You can also include junk characters such as Unicode and special characters in the test data. If you are targeting specific data-validation test cases, make sure you include more specific data, for example Unicode characters such as Chinese text or special characters, so you can validate whether the data is inserted properly into the table and displayed correctly in the report. That is one automated approach I can think of. We can also do manual testing, such as debugging the ETL packages using a data viewer and checking what data comes out of each step of the flow. In addition, you can use a log file to capture events, for example how many records were processed in a particular transformation. You can also check the execution logs; in SSIS, for instance, there is a monitor where you can see how many records were processed, which step ran, and what warnings were raised, such as explicit or implicit conversion warnings. All of those things can be used as validation tests.
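
To illustrate the logging-style check mentioned above, here is a minimal T-SQL sketch of an automated post-load validation step that could run as a SQL Agent job step after the package completes. The staging.Orders and dbo.Orders tables, the LoadDate column, and the dbo.EtlValidationLog table are hypothetical names used only for this example, not the candidate's actual objects.

```sql
-- Hypothetical audit table for post-load validation results.
IF OBJECT_ID('dbo.EtlValidationLog') IS NULL
    CREATE TABLE dbo.EtlValidationLog (
        RunDate     DATETIME2     NOT NULL DEFAULT SYSDATETIME(),
        CheckName   NVARCHAR(100) NOT NULL,
        SourceCount INT           NULL,
        TargetCount INT           NULL,
        Passed      BIT           NOT NULL
    );

DECLARE @src INT = (SELECT COUNT(*) FROM staging.Orders);       -- rows staged by the package
DECLARE @tgt INT = (SELECT COUNT(*) FROM dbo.Orders
                    WHERE LoadDate = CAST(GETDATE() AS DATE));  -- rows that reached the target today

INSERT INTO dbo.EtlValidationLog (CheckName, SourceCount, TargetCount, Passed)
VALUES (N'RowCountMatch', @src, @tgt, CASE WHEN @src = @tgt THEN 1 ELSE 0 END);

-- Fail the job step (which can then trigger an alert) when the counts do not match.
IF @src <> @tgt
    THROW 50001, 'ETL validation failed: source and target row counts differ.', 1;
```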

Database transactions. In SQL Server, transactions follow the ACID properties: atomicity, consistency, isolation, and durability. There are different isolation levels, such as read committed, read uncommitted, repeatable read, and serializable, and the default isolation level is read committed, where if one transaction is open, another transaction touching the same data waits for it to be committed. So, for example, if I am writing a stored procedure, I follow the proper approach using a TRY...CATCH block: inside it you begin the transaction, and you either commit it or roll it back if an error occurs. That way you can manage transactions in the database. Also try to keep your transactions as small as possible; if you have multiple UPDATE statements, break them into smaller transactions and do not keep a transaction open for more than a minute or two, because other users might be accessing the same table or record at the same time and you are holding locks on those tables. So it is better to use a TRY...CATCH block and close the transaction as soon as your operation is complete.
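
A minimal T-SQL sketch of the TRY...CATCH transaction pattern described above; the dbo.Accounts table and dbo.TransferFunds procedure are hypothetical examples, not code from the candidate's projects.

```sql
CREATE OR ALTER PROCEDURE dbo.TransferFunds
    @FromId INT, @ToId INT, @Amount MONEY
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRY
        BEGIN TRANSACTION;

        UPDATE dbo.Accounts SET Balance = Balance - @Amount WHERE AccountId = @FromId;
        UPDATE dbo.Accounts SET Balance = Balance + @Amount WHERE AccountId = @ToId;

        COMMIT TRANSACTION;        -- keep the transaction as short as possible
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;  -- undo all work if any statement failed
        THROW;                     -- re-raise the original error to the caller
    END CATCH;
END;
```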

Next, a query that must join several tables and function efficiently. If I have joins on several tables, the first thing I would do is use a temporary table: create a smaller dataset and create an index on that temporary table so that it can do an index seek when the temp table is used in the next step. For example, if I have done an inner join and a left outer join in step one, I store the result set in a temporary table, and then in the second step I use that temporary table with one more table as a left outer join. If I want to apply a function within a join, we have to use CROSS APPLY or OUTER APPLY, because normal joins do not support a table-valued function; a table-valued function can be used in a join via CROSS APPLY, which behaves like an inner join, or OUTER APPLY, which behaves like a left outer join. So my approach is: if the record set is very large, try splitting it into multiple small temporary tables, create indexes on them, and work from those instead of writing everything in one statement. Splitting the work into multiple steps usually gives better performance than several joins in a single SELECT statement.
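
A rough T-SQL sketch of the "indexed temp table plus CROSS APPLY" approach described above. The tables dbo.Orders, dbo.Customers, dbo.Shipments and the table-valued function dbo.fn_GetOrderTotals are assumed names for illustration only.

```sql
-- Step 1: materialise a smaller working set into an indexed temp table.
SELECT o.OrderId, o.CustomerId, o.OrderDate
INTO #RecentOrders
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c
        ON c.CustomerId = o.CustomerId
WHERE o.OrderDate >= DATEADD(MONTH, -3, GETDATE());

CREATE CLUSTERED INDEX IX_RecentOrders ON #RecentOrders (OrderId);  -- enables index seeks in step 2

-- Step 2: join the smaller set onward and apply a table-valued function per row.
SELECT r.OrderId, r.OrderDate, s.ShippedDate, t.OrderTotal
FROM #RecentOrders AS r
LEFT JOIN dbo.Shipments AS s
       ON s.OrderId = r.OrderId
CROSS APPLY dbo.fn_GetOrderTotals(r.OrderId) AS t;  -- CROSS APPLY acts like an inner join against the TVF

DROP TABLE #RecentOrders;
```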

A slow-running query, without database access. If someone complains about a slow-running query, first I would ask them about the expectation: what the output is and how long it used to take before. Then, if they are calling a stored procedure, I would ask them for the stored procedure code. If I had database access I would check the execution plan, but since I do not, I will look at the code itself. For example, if they have written cursor logic, I would try to replace it with the LEAD and LAG functions; if they are using a scalar function inside a cursor, we can convert it to a table-valued function and use it in a join. In my earlier experience I worked on performance tuning of various stored procedures and queries. In one scenario we had a stored procedure that took more than an hour to run. It contained multiple pieces of business logic: it called a scalar function, and that scalar function was applied inside a cursor that looped over records one by one. We converted that scalar function to a table-valued function and used it as a CROSS APPLY join, and then we removed the cursor logic. The cursor was fetching the previous and next record from sales data, so instead of a cursor we used the LEAD and LAG functions, which return the previous and next record values and are available in recent versions of SQL Server. So we can try to rewrite the query if possible. We can also check whether any implicit conversion is happening; if you compare an integer column with a varchar value, there will always be an implicit conversion, which we should avoid. And then there is parameter sniffing: if you created a stored procedure with input parameters and, say, you extract data for the last one year, SQL Server caches a plan based on that one year of data, but the same plan will not work well if you then request a different range. That is parameter sniffing, and we can avoid it using OPTION (RECOMPILE).
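
A short T-SQL sketch of the two techniques named above: replacing previous/next-row cursor logic with LAG and LEAD, and adding OPTION (RECOMPILE) to sidestep a sniffed plan. The dbo.Sales table and its columns are assumed for illustration.

```sql
DECLARE @StartDate DATE = '2024-01-01',
        @EndDate   DATE = '2025-01-01';

SELECT
    SaleId,
    SaleDate,
    Amount,
    LAG(Amount)  OVER (ORDER BY SaleDate) AS PrevAmount,   -- previous row without a cursor
    LEAD(Amount) OVER (ORDER BY SaleDate) AS NextAmount    -- next row without a cursor
FROM dbo.Sales
WHERE SaleDate >= @StartDate
  AND SaleDate <  @EndDate
OPTION (RECOMPILE);  -- compile a fresh plan for this parameter range instead of reusing a sniffed one
```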

How do you prevent SQL injection attacks in stored procedures? For SQL injection in stored procedures, we can use sp_executesql: pass the dynamic SQL query to it, and it compiles the code before it runs. SQL injection is when an attacker tries to inject malicious values through your application's inputs, such as input parameters, which can compromise the application or degrade its performance. To avoid that, declare the input variables and values properly and execute the dynamic SQL through sp_executesql rather than concatenating the values into the query string. That is what I can currently think of.
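
A minimal sketch of parameterised dynamic SQL with sp_executesql, as referenced above; the dbo.Customers table and its columns are hypothetical. The point is that the user-supplied value is passed as a parameter, never spliced into the SQL text.

```sql
DECLARE @CustomerName NVARCHAR(100) = N'O''Brien';   -- value coming from the application
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT CustomerId, CustomerName
      FROM dbo.Customers
      WHERE CustomerName = @Name;';

EXEC sys.sp_executesql
     @sql,
     N'@Name NVARCHAR(100)',   -- parameter definition
     @Name = @CustomerName;    -- value passed as data, not concatenated into the string
```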

Question: there is a potential performance issue when dealing with this dataset; can you explain? Here the function GetOrderTotal takes an order ID and returns money. It declares a total, selects SUM(UnitPrice * Quantity) from OrderDetails where the OrderID matches the given order, and returns that total. Since it is selecting from OrderDetails by OrderID, we can check whether an index is present on that OrderID column, because the query does a seek operation on OrderID, and we should see whether UnitPrice and Quantity can be included in that particular index. For example, create a nonclustered index on OrderID and include UnitPrice and Quantity in that index. This will improve the performance.
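
For context, here is a reconstruction of the function as it is paraphrased in the transcript, followed by the covering index suggested in the answer. The exact table and column names (dbo.OrderDetails, UnitPrice, Quantity, OrderID) are taken from the paraphrase and should be treated as assumptions.

```sql
-- Reconstructed form of the function discussed in the question.
CREATE OR ALTER FUNCTION dbo.GetOrderTotal (@OrderId INT)
RETURNS MONEY
AS
BEGIN
    DECLARE @Total MONEY;

    SELECT @Total = SUM(UnitPrice * Quantity)
    FROM dbo.OrderDetails
    WHERE OrderID = @OrderId;

    RETURN @Total;
END;
GO

-- Covering nonclustered index proposed in the answer: the seek on OrderID is satisfied by the
-- index key, and UnitPrice/Quantity are included so no key lookup is needed.
CREATE NONCLUSTERED INDEX IX_OrderDetails_OrderID
ON dbo.OrderDetails (OrderID)
INCLUDE (UnitPrice, Quantity);
```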

So what is the logic error present here, and how might it be rectified? The code is a WHILE loop: while the count of rows selected from Users matching the condition is greater than zero, it deletes the top one row from Users where the flag is set. We are writing a WHILE statement, but the logic that makes the loop terminate is missing: the condition the WHILE checks has to decrease with each iteration. We need to take the count from the Users table after deleting, and when that count reaches zero the loop should stop; otherwise the loop can run forever. I would need to check the exact code, but basically, to rectify it, we need to make sure each DELETE reduces the count that the WHILE condition evaluates. That is what comes to my mind right now.
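
A hedged sketch of a corrected loop along the lines described above. The original snippet is only paraphrased in the transcript, so the dbo.Users table and IsActive column are assumed names; the essential fix is that the rows being deleted must be the same rows the WHILE condition counts, so the condition eventually reaches zero.

```sql
WHILE EXISTS (SELECT 1 FROM dbo.Users WHERE IsActive = 0)
BEGIN
    DELETE TOP (1000)
    FROM dbo.Users
    WHERE IsActive = 0;  -- same predicate as the loop condition; batching keeps each transaction short
END;
```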

An end-to-end pipeline including error handling and data quality checks. First of all, we need to identify the source and the target and create the source connection and the target connection and destination. Then there are the control flow and the data flow. For example, in the control flow you might have a looping activity, looping over files from a folder or from SFTP/FTP locations. If there are multiple files, you loop over the FTP folder, fetch the files incrementally, download them to a local path, and process them in the sequence of the files. In that case we add a Foreach Loop container: an FTP task fetches the files from the SFTP/FTP location, and a second Foreach loop iterates over the files downloaded to the local folder. For error handling, a failure in one file should not fail the whole package, so we set the Propagate property on the error event so that the package skips the error and moves on to the next file. For example, if there are 10 files and the first five are processed but the sixth has a data quality issue, the package logs that file's bad records to a file or a table, which we can do with a data flow, and then continues with the seventh, eighth, and so on. After the package finishes execution, it sends an email alert saying that, out of 10 files, so many were successful and one was not, with the error records attached. In this way we handle both data quality checks and error handling.
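
The SSIS configuration above lives in the package designer, but as a small illustration of the logging side of the error handling described, here is a sketch of the kind of error-log table a data flow's error output might write rejected rows to, and a summary query the package could run before sending the alert email. All object names here are hypothetical.

```sql
-- Hypothetical table that the data flow's error output redirects rejected rows into.
CREATE TABLE dbo.FileLoadErrors (
    ErrorId   INT IDENTITY(1,1) PRIMARY KEY,
    FileName  NVARCHAR(260) NOT NULL,
    ErrorCode INT           NULL,
    RawRecord NVARCHAR(MAX) NULL,
    LoggedAt  DATETIME2     NOT NULL DEFAULT SYSDATETIME()
);

-- Summary the package could query after the Foreach loop finishes, to build the alert email body.
SELECT FileName, COUNT(*) AS RejectedRows
FROM dbo.FileLoadErrors
WHERE LoggedAt >= CAST(GETDATE() AS DATE)   -- today's run
GROUP BY FileName
ORDER BY RejectedRows DESC;
```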

In my recent experience, we had a database for a food-ordering application similar to Swiggy or Zomato. During peak hours the application was under heavy pressure: reads and writes were very high and CPU and memory usage was spiking, specifically on the orders table, because everything touched that table. If a customer wants to check their order history, it reads from the orders table; placing an order reads and writes the orders table; even the delivery person's updates read from and write to the same table. That was causing a lot of issues. So we reviewed the table design. The address ID and the address text were stored in the same table, along with the order quantity, order value, total tax, commissions, and so on. We separated those columns into their own tables: a commission table, a tax table, and a table for the order quantities and invoice totals, while the orders table kept only the address ID and the address text moved to an address table. We also partitioned the orders table using range and list partitioning. The project used MySQL hosted in the AWS cloud, and we partitioned based on brand ID and store ID: every store has a store ID, so, for example, stores 1 to 10 go into the first partition range, stores 11 to 20 into the second, and so on. Whenever a read or write happens, it goes to that particular partition instead of scanning the whole table, and that helped the performance of our application.
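
A minimal sketch in MySQL syntax of the range partitioning described above. The column names, partition boundaries, and the reduced column list are illustrative assumptions, not the actual schema from the project.

```sql
-- Illustrative MySQL DDL for a range-partitioned orders table (assumed columns and boundaries).
CREATE TABLE orders (
    order_id    BIGINT        NOT NULL,
    store_id    INT           NOT NULL,
    brand_id    INT           NOT NULL,
    address_id  INT           NOT NULL,    -- address text lives in a separate address table
    order_total DECIMAL(10,2) NOT NULL,
    order_date  DATETIME      NOT NULL,
    PRIMARY KEY (order_id, store_id)       -- MySQL requires the partitioning column in every unique key
)
PARTITION BY RANGE (store_id) (
    PARTITION p01  VALUES LESS THAN (11),        -- stores 1-10
    PARTITION p02  VALUES LESS THAN (21),        -- stores 11-20
    PARTITION pmax VALUES LESS THAN MAXVALUE     -- catch-all for new stores
);
```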

My approach for documenting SQL code and database design: firstly, we use an Azure DevOps Git repository to store the SQL code. For the database design, before starting any new development we create a visual diagram along with a high-level document and an in-depth document. The high-level document is based on the requirement specification we receive from the client; we then have a call with our architect and, based on that discussion, create a detailed data flow diagram. For example, in Microsoft Visio we lay out the data flows: what the source is, how the source data flows, what the steps and the error handling are, how email notifications are done, and what the transformations will be. It is a detailed diagram, and based on it we build the ETL packages and the SQL queries on the databases. Similarly, if a new database is to be created, we create an ER diagram for it. So, based on the visual diagrams and ER diagrams, I document my SQL code and database designs.