
inderpalk/sql_project


Final-Project-Transforming-and-Analyzing-Data-with-SQL

Project/Goals

(To report on total revenue and sales performance; to clean and analyze the data; to find risk areas and identify potential issues; and to understand website visitors' buying behaviour, time spent on site, and product categories.)

Process

(Step 1: Loading CSV Files into Database)

(Step 2: Data Cleaning)

(Step 3: Starting with Questions)

(Step 4: Starting with Data)

(Step 5: QA Process)

(Step 6: Generate ERD)

(Step 7: Results)

(Step 8: Conclusion)

(Step 9: Future Goals)

(Part 1: Loading CSV Files into Database)

I first created an ecommerce database, then created the tables and their columns, adjusting column definitions (ALTER COLUMN) where needed. I then imported the CSV files through the SQL terminal, which let me load all of my tables.
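As a rough illustration of the create-table-then-import flow described above, here is a minimal sketch. It is not the project's actual script: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a server, and the table name `sales_report`, its columns, and the sample rows are all hypothetical. In PostgreSQL itself, the import step would typically be done with psql's `\copy` command instead.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one of the project's CSV files.
SAMPLE_CSV = """product_sku,country,total_ordered
SKU001,USA,10
SKU002,Canada,4
SKU003,USA,7
"""

def load_csv(conn, csv_text):
    """Create the (hypothetical) table and bulk-insert the CSV rows.

    Returns the number of rows loaded.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_report ("
        "product_sku TEXT, country TEXT, total_ordered INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    # Convert the numeric column while reading so the table holds integers.
    rows = [
        (r["product_sku"], r["country"], int(r["total_ordered"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO sales_report VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

# In-memory SQLite database used here in place of PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
loaded = load_csv(conn, SAMPLE_CSV)
total = conn.execute("SELECT SUM(total_ordered) FROM sales_report").fetchone()[0]
print(loaded, total)
```

Checking the row count and a simple aggregate right after the import is a quick way to confirm that every table loaded completely.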

(Part 2: Data Cleaning)

I took these steps to clean the data: removed irrelevant, redundant, or duplicate data; fixed structural issues; converted types; handled missing data; handled outliers; and validated the results.
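The cleaning steps listed above might look like the following sketch. It is an assumption-laden example, not the queries used in this project: it runs on Python's built-in `sqlite3` module rather than PostgreSQL, and the tables `raw_sessions` and `clean_sessions`, their columns, and the sample rows are hypothetical. Replacing a missing revenue with 0 is one possible choice for handling missing data; leaving it as NULL is equally valid.

```python
import sqlite3

# In-memory SQLite database used here in place of PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sessions (visit_id TEXT, country TEXT, revenue TEXT)")
conn.executemany(
    "INSERT INTO raw_sessions VALUES (?, ?, ?)",
    [
        ("1", " usa ", "10.5"),
        ("1", " usa ", "10.5"),   # exact duplicate row
        ("2", "Canada", None),    # missing revenue
        ("3", "", "2.5"),         # empty country
    ],
)

# One pass that deduplicates, fixes structure, converts types,
# and handles missing values.
conn.execute(
    """
    CREATE TABLE clean_sessions AS
    SELECT DISTINCT                                        -- drop duplicate rows
        CAST(visit_id AS INTEGER)            AS visit_id,  -- type conversion
        NULLIF(UPPER(TRIM(country)), '')     AS country,   -- trim, normalize case, empty -> NULL
        COALESCE(CAST(revenue AS REAL), 0.0) AS revenue    -- missing revenue -> 0
    FROM raw_sessions
    """
)

cleaned = conn.execute(
    "SELECT visit_id, country, revenue FROM clean_sessions ORDER BY visit_id"
).fetchall()
for row in cleaned:
    print(row)
```

Writing the cleaned rows to a new table keeps the raw import untouched, so each cleaning step can be validated against the original data.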

(Part 3: Starting with Questions)

I answered the 5 questions and included the query used for each, showing the steps I took to get the answer.

(Part 4: Starting with Data)

I created and answered 3 questions of my own that could help with the database.

(Part 5: QA Your Data)

I used a variety of queries and steps in my QA process to identify risk areas.
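A QA pass of this kind can be sketched as a set of named checks, each counting rows that break a rule. The example below is illustrative only: it uses Python's built-in `sqlite3` module instead of PostgreSQL, and the `products` table, its columns, the sample rows, and the three checks are hypothetical rather than the ones run in this project.

```python
import sqlite3

# In-memory SQLite database used here in place of PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT, name TEXT, ordered_quantity INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("A1", "Mug", 5),
        ("A1", "Mug", 5),    # duplicate key
        ("B2", None, 3),     # missing name
        ("C3", "Pen", -2),   # impossible quantity
    ],
)

# Each check returns the number of offending rows (or keys); 0 means it passes.
QA_CHECKS = {
    "duplicate_skus": (
        "SELECT COUNT(*) FROM "
        "(SELECT sku FROM products GROUP BY sku HAVING COUNT(*) > 1) AS d"
    ),
    "missing_names": "SELECT COUNT(*) FROM products WHERE name IS NULL",
    "negative_quantities": "SELECT COUNT(*) FROM products WHERE ordered_quantity < 0",
}

report = {name: conn.execute(sql).fetchone()[0] for name, sql in QA_CHECKS.items()}
for name, count in report.items():
    status = "OK" if count == 0 else "RISK"
    print(f"{name}: {count} ({status})")
```

Keeping the checks in one dictionary makes it easy to rerun the whole QA pass after each cleaning change and see which risk areas remain.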

(Part 6: Generate the ERD)

I generated an ERD for my database.


Results

(What I discovered: the data was not clean, and it took time to reorganize, transform, clean, and filter it. Manually adding the columns and then adapting the queries to each column was challenging. The USA generated the most revenue, and organic search was the top lead source. There was no clear pattern in time spent on site. How I used the data to answer those questions: I analyzed and interpreted the data.)

Challenges

(Challenges I experienced include null and missing values, having to change time values, having to manually add columns, and numerous transformations and data cleaning steps. There was also inconsistent data that created confusion, incorrect data, values entered in the wrong field, and errors in columns and tables. The pgAdmin app also crashed, and some columns had limited data.)

Future Goals

(What would you do if you had more time? If I had more time, I would go more in-depth with data cleaning, transformations, and troubleshooting, share what I learned with others, and keep learning.)
