Segment customers based on their purchasing behavior to help a retail company understand different customer groups and tailor marketing strategies.
The dataset contains transactions from a UK-based online retail company from December 2010 to December 2011. It includes details such as invoice number, stock code, quantity, invoice date, unit price, customer ID, and country.
- Data Collection: Loaded the dataset from the UCI Machine Learning Repository.
- Data Cleaning: Removed missing values, handled outliers, and created a TotalPrice column.
- Exploratory Data Analysis (EDA): Analyzed total sales and number of orders by country.
- Feature Engineering: Calculated Recency, Frequency, and Monetary value for each customer.
- Model Selection: Used K-Means clustering to segment customers.
- Model Training and Evaluation: Evaluated clustering using the silhouette score.
- Visualization: Visualized clusters in 2D and 3D plots.
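
The feature engineering, clustering, and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code. The column names (`InvoiceNo`, `Quantity`, `InvoiceDate`, `UnitPrice`, `CustomerID`) are assumed to match the UCI file. Defining `TotalPrice` as quantity times unit price, dropping non-positive quantities and prices, and standardizing the features before K-Means are also assumptions. The demo data is synthetic so the block runs without the dataset.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def build_rfm(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per customer with Recency, Frequency, and Monetary."""
    # Cleaning (assumed rules): drop rows without a customer, keep positive values.
    df = df.dropna(subset=["CustomerID"])
    df = df[(df["Quantity"] > 0) & (df["UnitPrice"] > 0)]
    df = df.assign(TotalPrice=df["Quantity"] * df["UnitPrice"])

    # Recency is measured from the day after the last invoice in the data.
    snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)
    return df.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),
        Frequency=("InvoiceNo", "nunique"),
        Monetary=("TotalPrice", "sum"),
    )


def cluster_rfm(rfm: pd.DataFrame, n_clusters: int = 4, seed: int = 42):
    """Standardize the RFM features, fit K-Means, and score the result."""
    scaled = StandardScaler().fit_transform(rfm)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(scaled)
    return labels, silhouette_score(scaled, labels)


# Synthetic transactions standing in for the real dataset (values are random).
rng = np.random.default_rng(0)
n_rows = 400
demo = pd.DataFrame({
    "InvoiceNo": rng.integers(1000, 1200, size=n_rows).astype(str),
    "CustomerID": rng.integers(1, 41, size=n_rows).astype(float),
    "Quantity": rng.integers(1, 20, size=n_rows),
    "UnitPrice": rng.uniform(0.5, 50.0, size=n_rows).round(2),
    "InvoiceDate": pd.Timestamp("2010-12-01")
    + pd.to_timedelta(rng.integers(0, 365, size=n_rows), unit="D"),
})
demo.loc[::50, "CustomerID"] = np.nan  # simulate missing customer IDs

rfm = build_rfm(demo)
labels, score = cluster_rfm(rfm, n_clusters=4)
print(rfm.head())
print(f"silhouette score: {score:.3f}")
```

The silhouette score ranges from -1 to 1, with higher values indicating better-separated clusters. Rerunning `cluster_rfm` for several values of `n_clusters` and comparing the scores is one common way to choose the number of segments.
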
- Identified 4 distinct customer segments.
- Described the characteristics of each segment.
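
One way to summarize segment characteristics is to average the RFM values per cluster. The sketch below uses a small made-up table with hypothetical column names (`Recency`, `Frequency`, `Monetary`, `Cluster`). The numbers are illustrative only and are not the project's results.

```python
import pandas as pd

# Hypothetical RFM table with cluster labels already assigned (values are made up).
rfm = pd.DataFrame({
    "Recency": [5, 12, 200, 310, 40, 35, 8, 250],
    "Frequency": [20, 15, 1, 2, 6, 5, 30, 1],
    "Monetary": [5000.0, 4200.0, 120.0, 90.0, 900.0, 750.0, 8800.0, 60.0],
    "Cluster": [0, 0, 1, 1, 2, 2, 3, 1],
})

# One row per segment: size plus the average of each RFM feature.
profile = rfm.groupby("Cluster").agg(
    Customers=("Recency", "size"),
    AvgRecency=("Recency", "mean"),
    AvgFrequency=("Frequency", "mean"),
    AvgMonetary=("Monetary", "mean"),
)
print(profile.round(1))
```

A segment with low average recency and high frequency and monetary value would typically be read as a group of loyal, high-value customers, while high recency with low frequency suggests lapsed or one-time buyers.
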
- Clone the repository:
git clone https://github.com/HarshaBojanki3/Customer-Segmentation-Project
- Navigate to the project directory:
cd Customer-Segmentation-Project
- Install required libraries:
pip install pandas numpy matplotlib seaborn scikit-learn openpyxl
- Run the Jupyter notebook:
jupyter notebook Customer_Segmentation_Project.ipynb
- Customer_Segmentation_Project.ipynb: Jupyter Notebook containing the code for the project.
- README.md: Documentation for the project.
- This project demonstrates skills in data cleaning, exploratory data analysis, feature engineering, clustering, and visualization.
- The dataset was obtained from the UCI Machine Learning Repository.