This project demonstrates a complete data analytics workflow that includes data extraction, cleaning, transformation, loading, and analysis. It is ideal for data engineers and analysts looking to learn ETL (Extract, Transform, Load) processes and SQL-driven analytics.

theDhanendra/retail-data-analysis-with-Python-and-SQL

🛒 Retail Orders Data Analysis and Visualization 📊

This project analyzes a retail orders dataset to generate actionable business insights. The dataset was cleaned and processed with Python, analyzed with SQL queries in MySQL, and visualized with Matplotlib and Seaborn. The resulting insights can help businesses optimize operations, improve profitability, and enhance customer experience.


✨ Project Highlights

  • 🧹 Data Cleaning and Preprocessing using Python.
  • 🚀 Feature Engineering: Creation of new columns like discount, sale price, and profit.
  • 🛠️ SQL Queries for in-depth analysis and insights.
  • 📈 Advanced Visualizations: Bar charts, column charts, line plots, pie charts, and heatmaps.
  • 🔍 Actionable Insights to drive business decisions.
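
The feature-engineering highlight above can be sketched roughly as follows. This is only an illustration: the column names `list_price`, `discount_percent`, and `cost_price` are assumptions, not taken from the actual dataset, and the notebook may compute these columns differently.

```python
import pandas as pd


def add_price_features(orders: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `orders` with discount, sale_price and profit columns.

    Assumes hypothetical input columns: list_price, discount_percent, cost_price.
    """
    out = orders.copy()
    # Discount amount in currency units, derived from a percentage column.
    out["discount"] = out["list_price"] * out["discount_percent"] / 100
    # Price actually paid after the discount is applied.
    out["sale_price"] = out["list_price"] - out["discount"]
    # Profit per row: what was paid minus what the item cost.
    out["profit"] = out["sale_price"] - out["cost_price"]
    return out


# Tiny made-up sample, only to show the shape of the transformation.
orders = pd.DataFrame(
    {
        "list_price": [200.0, 50.0],
        "discount_percent": [10, 0],
        "cost_price": [150.0, 40.0],
    }
)
featured = add_price_features(orders)
print(featured[["discount", "sale_price", "profit"]])
```

Working on a copy keeps the raw DataFrame unchanged, so the original values remain available for later checks.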

🔑 Key Features

  1. 🧹 Data Cleaning:

    • Replaced invalid or missing values.
    • Standardized column names for consistency.
    • Converted data types for better usability.
  2. 🧮 Data Analysis:

    • SQL queries to analyze top products, regions, and profitability.
    • 📅 Month-over-month and year-over-year comparisons.
    • 🎯 Impact of discounts on sales and profits.
  3. 📊 Visualizations:

    • 🟠 Category-wise Performance (Bar Charts).
    • 🗺️ Top Cities on a Map (Folium).
    • 📉 Monthly Trends (Line Charts).
    • 🌍 Regional Profitability and Product Popularity.
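
As a rough sketch of the three data-cleaning steps listed above (replacing invalid values, standardizing column names, converting data types), something like the following could be used. The placeholder strings, the column names, and the date format are all assumptions made for illustration and may not match the real dataset.

```python
import pandas as pd

# Placeholder strings treated as missing values (an assumed set, for illustration).
MISSING_MARKERS = ["Not Available", "unknown"]


def clean_orders(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the three cleaning steps to a raw orders DataFrame."""
    # 1. Replace invalid placeholder values with missing values (NaN/NA).
    cleaned = raw.mask(raw.isin(MISSING_MARKERS))
    # 2. Standardize column names: trim, lower-case, spaces to underscores.
    cleaned.columns = (
        cleaned.columns.str.strip().str.lower().str.replace(" ", "_")
    )
    # 3. Convert the (assumed) order_date column from text to datetime.
    cleaned["order_date"] = pd.to_datetime(
        cleaned["order_date"], format="%Y-%m-%d"
    )
    return cleaned


# Made-up sample rows, only to demonstrate the function.
raw = pd.DataFrame(
    {
        "Order Date": ["2023-01-05", "2023-02-10"],
        "Ship Mode": ["Not Available", "Standard Class"],
    }
)
cleaned = clean_orders(raw)
print(cleaned.dtypes)
```

Once `order_date` is a real datetime column, month and year values can be extracted directly for the month-over-month and year-over-year comparisons.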

💻 Technologies Used

  • Python Libraries:

    • 🐼 Pandas
    • 📊 Matplotlib
    • 🖌️ Seaborn
    • 🛠️ mysql.connector
    • 📍 SQLAlchemy (for create_engine)
  • SQL: Queries for analysis and insights.

    • Top Revenue Generating Products
    • Best-Selling Products by Region
    • Month-over-Month Sales Growth
    • Category-Wise Best Sales Month
    • Sub-Category Profit Growth
    • Discount Effectiveness
    • Shipping Mode Analysis
  • Database: 🗄️ MySQL for data storage and querying.
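
To give a feel for the "Month-over-Month Sales Growth" query listed above, here is a minimal sketch. It runs against an in-memory SQLite database purely so the example is self-contained; the project itself uses MySQL, and the table name `orders` and the columns `order_month` and `sale_price` are assumptions. The query relies on a CTE and the `LAG` window function, which MySQL 8.0+ also supports (SQLite needs version 3.25 or newer).

```python
import sqlite3

# Month-over-month growth: total sales per month, compared with the prior month.
MOM_GROWTH_SQL = """
WITH monthly AS (
    SELECT order_month, SUM(sale_price) AS sales
    FROM orders
    GROUP BY order_month
)
SELECT order_month,
       sales,
       ROUND((sales - LAG(sales) OVER (ORDER BY order_month)) * 100.0
             / LAG(sales) OVER (ORDER BY order_month), 1) AS growth_pct
FROM monthly
ORDER BY order_month;
"""

# SQLite stands in for MySQL here; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_month TEXT, sale_price REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2023-01", 100.0),
        ("2023-01", 100.0),
        ("2023-02", 250.0),
        ("2023-03", 200.0),
    ],
)
rows = conn.execute(MOM_GROWTH_SQL).fetchall()
conn.close()

for order_month, sales, growth_pct in rows:
    # growth_pct is None for the first month, which has no prior month.
    print(order_month, sales, growth_pct)
```

The first month has no predecessor, so its growth value is NULL rather than zero, which keeps "no data" distinct from "no change".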


⚙️ Setup Instructions

  1. Clone this repository:
    git clone https://github.com/theDhanendra/retail-orders-analysis.git
    
  2. Navigate to the project directory:
    cd retail-orders-analysis
    
  3. Install the required Python libraries:
    pip install -r requirements.txt
    
  4. Set up MySQL and create the database and table using the provided SQL script (retail_orders.sql).
  5. Run the Jupyter Notebook for analysis.

Thank You 😊
