Database Developer
Infosys, Database Developer
SunTrust Mortgage, Database Developer
TELUS MVNE
Microsoft Visual Studio
Business Intelligence
Python
C#
Jira
Hi, I have around 5 to 6 years of experience in SQL, and this was my third company; my last company was Capgemini, and I have worked in the financial, healthcare and telecom industries. My first job was with Infosys as an ETL developer, where I worked on a financial project called SunTrust; I started as an ETL developer and was promoted to senior ETL developer. After that I moved to Accenture as an Application Development Analyst, but I was there for hardly eight months because of personal constraints: I could not move out of the location and had to stay back in Bangalore, so I left the company and moved to my last company, Capgemini. There I worked with Abbott, a healthcare solutions client, mostly as an ETL developer, working on SQL and the ETL processes around it. That is a brief overview of my background, thank you.
Actually, I have done data validations with respect to ETL, mainly after an upgrade or after a change was included in the project. Coming to the validation approach, we used to prepare validation scripts specific to the changes we had made, covering data stability and data validation for the SQL jobs and everything else that ran during the process. Wherever a change was involved, or when monitoring SQL jobs or daily running packages and scripts, we validated the data flow of the change from end to end, along with the time taken. That is my answer on automating data validation.
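For illustration, a minimal validation script along these lines (the table and column names are hypothetical, and SQL Server is assumed since SSAS comes up later) could compare row counts and a simple aggregate between the staging and target tables after a load:

-- Hypothetical end-to-end check after an ETL load: compare row counts
-- and an aggregate checksum between staging (source) and reporting (target).
SELECT
    src.row_count    AS source_rows,
    tgt.row_count    AS target_rows,
    src.amount_total AS source_amount_total,
    tgt.amount_total AS target_amount_total,
    CASE
        WHEN src.row_count = tgt.row_count
         AND src.amount_total = tgt.amount_total
        THEN 'PASS' ELSE 'FAIL'
    END              AS validation_status
FROM
    (SELECT COUNT(*) AS row_count, SUM(amount) AS amount_total
     FROM staging.Transactions)   AS src
CROSS JOIN
    (SELECT COUNT(*) AS row_count, SUM(amount) AS amount_total
     FROM reporting.Transactions) AS tgt;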
How would I handle data conflicts when combining data from separate databases? When combining data from two or more sources, we would first look at what kind of data each source holds, what we are merging it for, and what end result we want. Coming to data conflicts, when there is a change of source they mostly involve data types, where the values are stored, or columns that have to be combined or derived from the source. Depending on the result we want, we either add the data from one source to the other or remove data from a source, and then do the processing in whatever way the required outcome demands.
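As an illustration, a type conflict between two sources could be resolved with an explicit cast inside a MERGE; the tables and columns here are hypothetical, a sketch of the approach rather than the project's actual code:

-- Hypothetical merge of customer records from two sources
-- where the order date is stored as text in the legacy system.
MERGE dbo.Customers AS tgt
USING (
    SELECT CustomerId,
           CAST(LastOrderDate AS date) AS LastOrderDate  -- resolve the type conflict
    FROM legacy.Customers
) AS src
    ON tgt.CustomerId = src.CustomerId
WHEN MATCHED AND src.LastOrderDate > tgt.LastOrderDate THEN
    UPDATE SET tgt.LastOrderDate = src.LastOrderDate     -- keep the newer value
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerId, LastOrderDate)
    VALUES (src.CustomerId, src.LastOrderDate);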
If it requires both OLTP and OLAP operations on the same data set, it depends on the type of system we are going to define: whether the data is accumulated for OLTP and OLAP separately, or whether we use the raw data that is present in the data lake and then run the transformations or analysis on it. So it depends on the type of data we are getting; given a data set, we would decide based on the requirement whether it should serve as an OLTP source or an OLAP result, and then move forward with whatever is required on it.
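If the same table really does have to serve both workloads on SQL Server, one option is a nonclustered columnstore index so that analytical scans do not run against the row store; the table below is hypothetical, a sketch of that approach rather than a specific design:

-- Hypothetical OLTP table that also has to serve analytical queries.
CREATE TABLE dbo.Sales (
    SaleId    bigint IDENTITY PRIMARY KEY,   -- row store serves OLTP lookups
    ProductId int            NOT NULL,
    Quantity  int            NOT NULL,
    UnitPrice decimal(10, 2) NOT NULL,
    SoldAt    datetime2      NOT NULL
);

-- A nonclustered columnstore index lets aggregate queries scan
-- compressed column segments instead of the transactional row store.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales
    ON dbo.Sales (ProductId, Quantity, UnitPrice, SoldAt);

-- OLAP-style query served mostly by the columnstore index.
SELECT ProductId, SUM(Quantity * UnitPrice) AS Revenue
FROM dbo.Sales
GROUP BY ProductId;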
Most of the time we aim for schema stability, so we would avoid major schema changes. If we do proceed with schema changes, we try to maintain stability by propagating the new schema to all the database references of that schema, and then we carry out whatever changes are required by the business needs.
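As a sketch of what a backward-compatible change could look like (all names hypothetical), the new column can go in as nullable with a default, and downstream consumers can keep reading through a view so their references stay stable:

-- Add a new column without breaking existing INSERT statements.
ALTER TABLE dbo.Orders
    ADD Channel varchar(20) NULL
        CONSTRAINT DF_Orders_Channel DEFAULT ('web');

-- Downstream reports reference a view rather than the table,
-- so the table can evolve behind a stable column list.
CREATE OR ALTER VIEW reporting.vOrders
AS
SELECT OrderId, CustomerId, OrderDate, TotalAmount, Channel
FROM dbo.Orders;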
Which indexes are most efficient for query performance? The indexes created on a database or a table depend on how the data is queried and on the type of data we are storing. Indexes are mainly used to reduce search time and avoid full table scans, so the right index cuts down the large amount of time otherwise spent fetching data or operating on it. The primary thing is to use viable indexes that actually improve the performance of the tables, the database, or the process.
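For example, a covering nonclustered index matched to a specific query shape (both hypothetical here) can replace a full table scan with an index seek:

-- Hypothetical query that filters on CustomerId and a date range
-- and only returns a couple of columns.
SELECT OrderId, TotalAmount
FROM dbo.Orders
WHERE CustomerId = 42
  AND OrderDate >= '2024-01-01';

-- A nonclustered index keyed on the filter columns, with the
-- selected columns included, avoids the table scan and key lookups.
CREATE NONCLUSTERED INDEX IX_Orders_Customer_Date
    ON dbo.Orders (CustomerId, OrderDate)
    INCLUDE (TotalAmount);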
In this example, there is a potential performance issue when dealing with large volumes of data; identify what it is and suggest how to optimize it. The main issue in this scenario is that the query searches for a certain order ID, and if that order has multiple unit prices or multiple entries in the same table, it has to multiply the quantity by the unit price for every row and then sum up the totals. With large data sets the cost comes down to that per-row multiplication and the summation of the products, so to improve it I would add a GROUP BY clause and try to minimize the rows being aggregated before the quantity and unit price are combined, and that would improve the performance of the query.
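The query from the question is not reproduced in the transcript, so the sketch below uses hypothetical names to show the shape of the aggregation and an index that supports it:

-- Hypothetical order-total query that multiplies and sums per row.
SELECT OrderId,
       SUM(Quantity * UnitPrice) AS OrderTotal
FROM dbo.OrderDetails
WHERE OrderId = 10248
GROUP BY OrderId;

-- On large tables, an index keyed on OrderId that covers Quantity
-- and UnitPrice lets the engine seek directly to the order's rows
-- and aggregate them without scanning the whole table.
CREATE NONCLUSTERED INDEX IX_OrderDetails_OrderId
    ON dbo.OrderDetails (OrderId)
    INCLUDE (Quantity, UnitPrice);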
Identify the issue that prevents the code from executing successfully. When an alias such as price category is defined in the SELECT list, say as 'Expensive' when the list price is greater than 1000 and 'Affordable' otherwise, we cannot use that alias in the WHERE clause, because the WHERE clause is evaluated before the alias is defined; the query would directly throw an error saying the column is invalid, and that is what causes the issue.
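The exact query from the question is likewise not shown, so this hypothetical version reproduces the described error and one way to fix it:

-- Reproduces the described problem: the alias PriceCategory
-- is referenced in the WHERE clause.
SELECT ProductName,
       ListPrice,
       CASE WHEN ListPrice > 1000 THEN 'Expensive'
            ELSE 'Affordable'
       END AS PriceCategory
FROM dbo.Products
WHERE PriceCategory = 'Expensive';   -- fails: invalid column name

-- Fix: evaluate the expression before filtering, for example via a
-- derived table (or repeat the CASE expression in the WHERE clause).
SELECT ProductName, ListPrice, PriceCategory
FROM (
    SELECT ProductName,
           ListPrice,
           CASE WHEN ListPrice > 1000 THEN 'Expensive'
                ELSE 'Affordable'
           END AS PriceCategory
    FROM dbo.Products
) AS p
WHERE PriceCategory = 'Expensive';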
For applications where recovery is required, we would prefer to have a snapshot of the previous deployment or of the last working state, so that we are able to do a database recovery; alternatively, we can keep a separate disaster recovery database for all the databases or systems, from which we would be able to recover the data. Archival is one more kind of recovery procedure, where you can move vast amounts of older data into, say, a dedicated archive database; the reference stays there, and we can pull whatever is required back from that archival database. If the run time on the main database is too high, we would go for archival, and if there are chances of a crash, we would go for disaster recovery.
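For instance, on SQL Server a point-in-time safeguard before a risky change might be a full backup plus a database snapshot; the names and paths below are hypothetical:

-- Hypothetical pre-change safeguard: a full backup plus a
-- database snapshot that can be reverted to if the change fails.
BACKUP DATABASE SalesDb
    TO DISK = N'D:\Backups\SalesDb_prechange.bak'
    WITH INIT, CHECKSUM;

CREATE DATABASE SalesDb_PreChange_Snapshot
    ON (NAME = SalesDb_Data,                       -- logical data file name
        FILENAME = N'D:\Snapshots\SalesDb_prechange.ss')
    AS SNAPSHOT OF SalesDb;

-- If the change goes wrong, the database can be reverted:
-- RESTORE DATABASE SalesDb
--     FROM DATABASE_SNAPSHOT = 'SalesDb_PreChange_Snapshot';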
The basic thing when it comes to SQL Server Analysis Services, SSAS, in database solutions is to accumulate the data and then analyze it against the business requirements wherever that is needed. Coming to the analysis of the data, it largely depends on how the data is used and on the type of data coming in from the various sources, as well as the data going out to the various downstream systems. Most of the time the data varies from source to destination, or from production down to the lower environments where it is needed for screening, reporting or other uses. So analysis plays a major role in deciding what has to be done and how it can be achieved in a simpler way, and it also helps in creating a predefined structure to analyze the data and process it efficiently.