Vetted Talent

Naveen Sequeira

Experienced Data Analyst with over 5 years of hands-on experience in leveraging Python, SQL, Power BI, and Excel for comprehensive data analysis. Proficient in extracting insights from complex datasets through data manipulation, exploration, and visualization techniques. Adept at conducting statistical analysis to derive actionable conclusions and facilitate informed decision-making. Demonstrated ability to collaborate cross-functionally to understand business requirements and deliver tailored analytical solutions. Committed to continuous learning and staying updated with emerging trends in data analytics to enhance organizational effectiveness.

  • Role

    DATA ANALYST

  • Years of Experience

    5.8 years

  • Professional Portfolio

    View here

Skillsets

  • SQL - 4 Years
  • Python - 4 Years
  • MS Excel - 5 Years
  • PowerBI - 4 Years
  • Reporting - 5 Years
  • NumPy - 3 Years
  • Advanced Excel - 4 Years
  • pandas - 3 Years
  • Seaborn - 3 Years
  • Matplotlib - 4 Years

Vetted For

7 Skills
  • Data Analyst (AI Screening)
  • Score: 63/90 (70%)
  • Skills assessed: Data Extraction, Matplotlib, PowerBI, SPSS, Python, SQL, Tableau

Professional Summary

5.8 Years
  • Jan, 2022 - Dec, 2023 (1 yr 11 months)

    Data Analyst

    Teamlease Services
  • May, 2020 - Dec, 2021 (1 yr 7 months)

    Data Analyst

    SR Intelligent Technologies
  • Jul, 2018 - May, 2020 (1 yr 10 months)

    Finance Executive

    Alten India

Applications & Tools Known

  • MySQL
  • Python
  • Microsoft Power BI
  • PostgreSQL
  • BigQuery
  • Jupyter Notebook
  • Pandas
  • NumPy
  • Seaborn
  • Matplotlib
  • Microsoft Excel

Work History

5.8 Years

Data Analyst

Teamlease Services
Jan, 2022 - Dec, 2023 (1 yr 11 months)
    • Identified non-buying customers for targeted sales promotions
    • Communicated KPIs via reports and dashboards
    • Analyzed sales performance to identify top-selling categories and regions
    • Converted raw data to a report-ready format through an ETL process
    • Automated data extraction and reporting processes using Python scripts
    • Collaborated with the sales team to understand data needs and deliver solutions
    • Delivered on-demand (ad hoc) sales data and reports
    • Tracked sales team performance for incentive calculations

Data Analyst

SR Intelligent Technologies
May, 2020 - Dec, 2021 (1 yr 7 months)
    • Created interactive dashboards using Excel and Power BI to visualize key performance metrics
    • Conducted analysis, identified trends, and produced reports to support decision-making
    • Shared findings with internal teams to identify root causes of IT issues

Finance Executive

Alten India
Jul, 2018 - May, 2020 (1 yr 10 months)
    • Assisted in financial analysis and prepared accurate reports for finance team
    • Supported accounts payable/receivable processes and ensured financial data integrity
    • Maintained financial models to forecast cash flow & revenue projections

Achievements

  • I automated a daily report using a Python script that retrieved data from Azure Storage. After downloading the data, the script performed a few transformations and aggregations and saved the report to an Excel file. It then emailed the report to the stakeholders.
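
The report flow described above (retrieve, transform, aggregate, save, email) could look roughly like the sketch below. It is a standard-library illustration only, not the original script: the sample data, column names, and email addresses are hypothetical, a CSV file stands in for the Excel output, the Azure Storage download is replaced by an in-memory string, and the message is built but not sent.

```python
import csv
import io
from collections import defaultdict
from email.message import EmailMessage

# Hypothetical raw export; the real script downloaded this from Azure Storage.
RAW_CSV = """region,category,amount
North,Grocery,120.50
North,Electronics,300.00
South,Grocery,80.25
South,Grocery,19.75
"""


def aggregate_sales(raw_csv):
    """Transform step: parse the raw rows and sum the amount per region."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(raw_csv)):
        totals[row["region"]] += float(row["amount"])
    return dict(totals)


def build_report_csv(totals):
    """Save step: render the aggregated totals as CSV text (stand-in for Excel)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["region", "total_amount"])
    for region in sorted(totals):
        writer.writerow([region, f"{totals[region]:.2f}"])
    return buffer.getvalue()


def build_email(report_csv, recipients):
    """Email step: attach the report to a message (sending is left out)."""
    message = EmailMessage()
    message["Subject"] = "Daily sales report"
    message["From"] = "reports@example.com"  # hypothetical sender
    message["To"] = ", ".join(recipients)
    message.set_content("Please find the daily sales report attached.")
    message.add_attachment(
        report_csv, subtype="csv", filename="daily_sales_report.csv"
    )
    return message


totals = aggregate_sales(RAW_CSV)
report = build_report_csv(totals)
email = build_email(report, ["stakeholder@example.com"])
```

In a scheduled job, the message would typically be handed to `smtplib.SMTP(...).send_message(email)`, and the script would be triggered daily by a scheduler such as cron or Windows Task Scheduler.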

Major Projects

3 Projects

Beat Program

Flipkart
Jan, 2022 - Dec, 2022 (11 months)
    • Collaborated closely with sales and Business Development teams to communicate key performance indicators (KPIs) effectively, contributing to an increase in overall project revenue.
    • Spearheaded the implementation of the Beat program, which involved introducing an online sales model and recruiting sales personnel.
    • This initiative aimed to bolster memberships and drive up order volumes, resulting in tangible growth outcomes for the project.
    • Continuously engaged with stakeholders to assess the effectiveness of strategies and adapt approaches for optimal results.

Sales Report Automation

Nov, 2022 - Dec, 2022 (1 month)
    • Developed and implemented a streamlined automation solution by crafting a Python script to generate a daily report seamlessly.
    • Using Azure Storage as the source, the script retrieved raw data automatically, applied the required transformations, and performed aggregations.
    • The report was then saved to an Excel file and emailed to stakeholders, which improved efficiency and accuracy.
    • This initiative significantly reduced manual effort and minimized errors, ensuring stakeholders received timely and actionable insights to support informed decision-making.

IT helpdesk ticket analysis

Mashreq Bank
May, 2020 - Dec, 2021 (1 yr 7 months)
    • Worked with internal IT teams to figure out why IT problems were happening.
    • We used helpdesk ticket data directly from CRM to understand what was going wrong.
    • By breaking down the data in a simple way, we could understand the IT issues better.
    • This teamwork helped us find solutions and make our IT systems better.

Education

  • MBA (Finance)

    Symbiosis Centre for Distance Learning (2023)
  • Bachelor of Business Management

    Mangalore University (2023)

Certifications

  • Data Analyst Associate

    DataCamp (Nov, 2023)
    Credential ID: DAA0012627238611
  • Analyze Data with SQL Skill Path

    Codecademy (Oct, 2020)
    Credential ID: 5cafb2d937090210d7df3652
  • Learn Data Analysis with Pandas Course

    Codecademy (Jul, 2020)
    Credential ID: 95dd3ed417d7d6c449afffc6401b310a

AI-interview Questions & Answers

Yeah. So my name is Naveen Sequeira, and I've been working as a data analyst for the past 5 to 6 years. My data analyst experience is with my previous 3 companies. My most recent experience was with a company called Flipkart. Actually, Flipkart was the client; my payroll company was Teamlease Services. At Flipkart, I worked closely with the sales and business management team. My day-to-day responsibilities were showing KPIs, doing reports and dashboards, and also sharing a lot of sales data with the stakeholders. As for my skill set, I'm good at programming languages like Python and SQL, and I'm also very good at using Excel. I could say that my Excel knowledge is at an advanced level. At Flipkart, I used to extract data from SQL Server or the data warehouse, perform some transformations on the data, aggregate the data, and finally share the findings with the stakeholders. I used to do a lot of ad hoc reporting as well. One of the important achievements at Flipkart was that I automated our daily report using a Python script. The report was supposed to be sent every day at a certain time. So the entire process of extracting the data, doing the necessary transformation and aggregation steps, as well as sending the email out to the stakeholders was automated. This saved a considerable amount of time every day for me and the team as well. So that was one mini project that I took up at Flipkart. Other than that, it was mostly reporting, communicating KPIs, and ad hoc tasks. So this is a brief background about myself. Thank you.

So in SQL, we could use techniques like grouping and aggregation to handle high-velocity and voluminous data. SQL is a very versatile language and can be used to handle pretty much every situation. So my approach would be, instead of loading the data at a granular level, to use grouping and aggregation to get the data in the right shape. That way, not all the data is loaded, and only the required amount of data is loaded for analysis. Apart from that, there may be other techniques as well which could be used. But mostly, I use the grouping and aggregation techniques to analyze the data. SQL also has window functions, and you could use those as well to analyze the data. So, as I said, I would mainly use grouping and aggregation techniques in SQL to handle and analyze data from high velocity and volume sources to prevent mortality rates.
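
As a rough illustration of the grouping-and-aggregation approach described in this answer, the sketch below pushes the aggregation into the database so that only one summary row per group is returned instead of every granular row. It uses Python's built-in `sqlite3` module with an in-memory database; the `orders` table and its columns are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical granular orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("North", 100.0),
        ("North", 50.0),
        ("South", 25.0),
        ("South", 75.0),
        ("South", 10.0),
    ],
)

# Aggregate inside the database: one row per region comes back,
# rather than all five granular rows.
summary = conn.execute(
    """
    SELECT region,
           COUNT(*)    AS order_count,
           SUM(amount) AS total_amount
    FROM orders
    GROUP BY region
    ORDER BY region
    """
).fetchall()
conn.close()

for region, order_count, total_amount in summary:
    print(region, order_count, total_amount)
```

The same query shape applies to larger engines; the point of the sketch is only that the result set scales with the number of groups, not the number of source rows.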

In NoSQL, the advantage is that you can store different categories of data in a NoSQL database, such as audio files, video files, etcetera. So the main point would be that in a NoSQL database, you can use structured as well as unstructured data, while in a traditional SQL database, you can only use structured data. That is the biggest advantage of a NoSQL database. And that can be very useful for things like machine learning projects and AI-related work because, traditionally, these types of data were not used so much. For example, audio files and video files: the necessity of these types of data was not that high. But since the advent of machine learning and the things you can do with this type of data, I think a NoSQL database is highly recommended, and many organizations have a mixture of both SQL and NoSQL databases.

So Tableau is a very useful visualization tool, a business intelligence tool. The first step that I would take would be connecting Tableau to the real-time data source. Next, once the data is in Tableau, I would check the data quality: whether the data is in the correct format, and that only the required columns are loaded and used. I would also check the data types. Data quality would be very essential for an interactive dashboard, so that would be my next step. Then, once all that is done and only the required amount of data is selected, I would use different visualizations to present the data to the audience, using charts and graphs like a bar chart, or a line chart to show the trend. I would also try to communicate the data as visually as possible. Once all the visualizations are created, I would do an overall check again to see if the numbers are showing correctly, and do all the necessary quality checks before finalizing the dashboard. So these would be my steps for the Tableau dashboard: first, extracting the data; then checking the data quality; then building all the necessary visualizations; and finally, verifying the numbers in the visualizations and finalizing the dashboard after that.

So, handling time series data: yes, Power BI can handle time series data quite well. But the first step would definitely be the data cleaning, because the dates would have to be in the correct format. If possible, I would try to get the data corrected at the source level and make the formats consistent there. But if that's not possible, then I would use Power Query to get the date formats corrected. Once that is done in Power Query, I would load that data back into the Power BI interface and then do the necessary next steps. That would be creating a matrix visual or a trend line chart or whatever is required to support decision making. So the first step would definitely be to clean the date formats either at the source level or at the Power Query level. If neither of these steps is possible, then I would do it at the Power BI data model level. There are a few options where you can clean the data and do your visualizations. But definitely, the steps would be in that order: first the data cleaning at the source level, if possible, then at the Power Query level, or finally at the data model level inside Power BI. This is how I would handle time series analysis in Power BI when dealing with inconsistent date formats.
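
The date-cleaning step in this answer is described for Power Query, but the same idea can be sketched in Python for illustration: try a short list of known formats and flag anything that does not match. The format list and sample values below are assumptions; in particular, treating slash-separated dates as day/month/year is a choice that has to be confirmed with the data owner, since `03/04/2023` is ambiguous.

```python
from datetime import date, datetime
from typing import Optional

# Assumed set of formats present in the source; extend as needed.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%b-%Y")


def parse_date(text: str) -> Optional[date]:
    """Return a date for the first matching format, or None if none match."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None


raw_values = ["2023-01-15", "16/01/2023", " 17-Jan-2023 ", "not a date"]
parsed = [parse_date(value) for value in raw_values]

# Rows that failed to parse are kept visible so they can be fixed at the source.
unparsed = [value for value, result in zip(raw_values, parsed) if result is None]
```

Collecting the unparsed values separately supports the order of preference in the answer: anything that cannot be normalized downstream is a candidate for correction at the source.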

In machine learning, there are many methods for backtesting. At my workplace, I haven't encountered this backtesting process that much. But from what I have learned, you could use a few techniques which are available in the scikit-learn module for the backtesting process. To answer this question, though, I think I may not have the required amount of knowledge at the present moment. I do know basic machine learning techniques: loading the dataset, data cleaning, feature engineering if required, and then, finally, fitting the model and checking the results. Those things I know, but for this backtesting process, I don't have that much knowledge at the moment. I can definitely say it's very much a required process to make sure that the machine learning model is giving the right output, because getting the wrong results would be worse than getting no results at all. So I would say this process is really important. And, definitely, Python is the preferred language and the industry standard when it comes to machine learning. So we can use libraries that are built in Python especially for machine learning, like scikit-learn or deep learning libraries, which have a few features that can be used for backtesting.
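
For context, one common way to backtest a model on time-ordered data is walk-forward validation: train on everything up to a cut-off, test on the block that follows, then move the cut-off forward. The sketch below is a minimal pure-Python illustration of generating such splits; the function name and parameters are hypothetical, and it is not the scikit-learn API (scikit-learn offers a comparable utility, `TimeSeriesSplit`).

```python
def walk_forward_splits(n_samples, n_splits, test_size):
    """Build expanding-window (train, test) index lists for time-ordered data.

    Each test block follows its training block in time, so the model is
    never evaluated on observations older than the ones it was trained on.
    """
    splits = []
    for i in range(n_splits):
        test_end = n_samples - (n_splits - 1 - i) * test_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("Not enough samples for the requested splits")
        train_indices = list(range(test_start))
        test_indices = list(range(test_start, test_end))
        splits.append((train_indices, test_indices))
    return splits


splits = walk_forward_splits(n_samples=10, n_splits=3, test_size=2)
for train_indices, test_indices in splits:
    print("train:", train_indices, "test:", test_indices)
```

A model would be fitted on each training block and scored on the matching test block, and the scores compared across folds to see whether performance holds up over time.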

Yeah. So it looks like in the select statement, we have selected the module ID, but then we are grouping by a different column, which is not used in the select statement. So we were supposed to use the student ID in the select statement, but here we have used the module ID, which is incorrect. So I would replace the module ID with the student ID, and that would give me the correct result. Oh, no. Let me review this: the SQL query is intended to retrieve the average score per module. Okay. So in the select statement, we do need the module ID. What I would do is add the student ID as well before the module ID, and then group by both the student ID and the module ID to get the average score per module for a student. So the error here is that we need to use the student ID in the select statement, and in the group by, we have to add the module ID. This would give us the correct average score per module for a student.
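
The query under discussion is not reproduced in the transcript, so the sketch below only illustrates the final fix described in this answer: select both identifiers and group by both. It runs against an in-memory SQLite database; the `scores` table, its columns, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema standing in for the table from the interview question.
conn.execute(
    "CREATE TABLE scores (student_id INTEGER, module_id TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        (1, "M1", 80.0),
        (1, "M1", 90.0),
        (1, "M2", 70.0),
        (2, "M1", 60.0),
    ],
)

# Every non-aggregated column in SELECT also appears in GROUP BY.
averages = conn.execute(
    """
    SELECT student_id,
           module_id,
           AVG(score) AS avg_score
    FROM scores
    GROUP BY student_id, module_id
    ORDER BY student_id, module_id
    """
).fetchall()
conn.close()
```

Some engines reject a query whose selected columns are missing from `GROUP BY`, while others return an arbitrary value for them, so keeping the two lists aligned avoids both outcomes.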

It looks like the error is in the k-means variable. The k-means function takes in a parameter called n_clusters, which here is hardcoded as a string. But actually, it accepts a number. So either you don't pass any parameter, or if you're passing a parameter, then it should be an integer, like n_clusters 50 or 100. If it's auto, then you need not pass this parameter; it will take the default value, which is already built in. And also, one more error is that auto is mentioned as a string, which is also incorrect. So that would be the main error in this function. I would replace auto with an integer, which could be from 0 to 100 or 200, etc. That would correct this function and give the right output.

I know that Power BI has an option to integrate Python within its environment. It can be used to build visualizations, like a heat map or any other visualization which is not available by default inside Power BI. It can also be used for time series analysis, but I haven't had the chance to use it in my analysis at the workplace. But definitely, it is possible within Power BI. It has support for both Python and R scripting. And you could use your imagination and creativity to integrate Python inside Power BI to get advanced insights and information. So, yes, it's definitely possible to integrate a custom Python algorithm into a Power BI dashboard for predictive analysis. It can be implemented within the Power BI Desktop environment and can be used to show very advanced predictive or informative insights, which can support decision making.

Yes. So I had already mentioned in a previous question that SQL has some very powerful functionality called window functions, which perform at the row level. We have the normal aggregations like sum and average, which perform at the column level. But these window functions perform at the row level. Some of the very popular window functions are lag and lead. You have the rank functions like rank and dense rank. And we could also use normal aggregation functions like sum and average inside window functions. So what window functions do is they work on a row context. Suppose you have a column with the present month's value, and you want to know the value of the previous month; then you could use the lag window function in this scenario. This would create another column which would give you the value for the previous month. This is one example of a window function. Also, if you have data at a monthly or a daily level, but you still want to have a column which shows you the total sales for the entire year, then you could use the sum function along with the over clause. The syntax is like this: you would use sum of the sales amount, and then you would use the over clause with brackets. Inside the over clause, you could use either partition by or order by to get the type of result that you're looking for. SQL window functions are very convenient and very useful, and can be used for complex data aggregations in analytical tasks.
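
The two examples from this answer, the previous month's value via `LAG` and a yearly total via `SUM(...) OVER (...)`, can be sketched as follows. The example uses an in-memory SQLite database through Python's `sqlite3` module and assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added; the `monthly_sales` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly_sales (sale_year INTEGER, sale_month INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?, ?)",
    [(2023, 1, 100.0), (2023, 2, 150.0), (2023, 3, 120.0)],
)

rows = conn.execute(
    """
    SELECT sale_month,
           amount,
           -- Previous month's amount within the same year (NULL for the first month).
           LAG(amount) OVER (
               PARTITION BY sale_year ORDER BY sale_month
           ) AS previous_month_amount,
           -- Yearly total repeated on every monthly row.
           SUM(amount) OVER (PARTITION BY sale_year) AS year_total
    FROM monthly_sales
    ORDER BY sale_year, sale_month
    """
).fetchall()
conn.close()
```

Unlike `GROUP BY`, the window versions keep every monthly row and add the computed values as extra columns, which is what makes month-over-month comparisons straightforward.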

Python is one of the most versatile and popular programming languages in the world at this moment. The advantage of Python is that its syntax reads almost as if you had written it in English. So it's very readable and not very verbose, and I think it's quite a beginner-friendly language. But though it's beginner-friendly, it has a lot of capabilities, and a lot of development has happened in recent years within Python because it is open source, and anybody can contribute to the Python open source world. So, apart from the standard library, you have a lot of very good packages, which gives it an advantage over SPSS for statistical analysis. SPSS was good at one point in time, but I think now Python has definitely overtaken SPSS in all aspects. Python also has the advantage that it can be used for machine learning, artificial intelligence, and so many other areas. So, definitely, Python would be my choice for any kind of data or statistical analysis. Apart from that, since Python is open source, it's free to use and very easy to learn as well. So anybody can pick up Python, learn it, and use it at their workplace to create meaningful work, which is a big advantage for organizations and can help them grow rapidly in this age of data, because data is the new oil, as they say. So using Python to get the potential out of this data is very beneficial. And I would definitely use Python over SPSS.