The dataset was collected from Kaggle: https://www.kaggle.com/datasets/mehmettahiraslan/customer-shopping-dataset
The dataset contains shopping information from 10 different malls in Istanbul between 2021 and 2023. The data was gathered from customers of various ages and genders to provide a comprehensive view of shopping habits in Istanbul.
invoice_no: Invoice number. Nominal. A combination of the letter 'I' and a 6-digit integer uniquely assigned to each operation.
customer_id: Customer number. Nominal. A combination of the letter 'C' and a 6-digit integer uniquely assigned to each operation.
gender: String variable of the customer's gender.
age: Positive integer variable of the customer's age.
category: String variable of the category of the purchased product.
quantity: The quantity of each product (item) per transaction. Numeric.
price: Unit price. Numeric. Product price per unit in Turkish Liras (TL).
payment_method: String variable of the payment method (cash, credit card or debit card) used for the transaction.
invoice_date: Invoice date. The day when a transaction was generated.
shopping_mall: String variable of the name of the shopping mall where the transaction was made.
Understand the data using Python and check for duplicate and null values.
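A minimal sketch of the duplicate and null checks, assuming the data is loaded into a pandas DataFrame. The `sample` frame and its values are made up for illustration; only the column names come from the data dictionary.

```python
import pandas as pd

# Tiny made-up sample using column names from the data dictionary;
# the values are illustrative and not taken from the real dataset.
sample = pd.DataFrame(
    {
        "invoice_no": ["I000001", "I000002", "I000002", "I000003"],
        "customer_id": ["C000001", "C000002", "C000002", "C000003"],
        "age": [28, 35, 35, None],
        "price": [100.0, 250.5, 250.5, 75.0],
    }
)

# Number of rows that exactly repeat an earlier row.
duplicate_rows = int(sample.duplicated().sum())

# Number of missing values in each column.
null_counts = sample.isnull().sum()

# Invoice numbers should be unique, so check that column on its own too.
duplicate_invoices = int(sample["invoice_no"].duplicated().sum())

print("duplicate rows:", duplicate_rows)
print("duplicate invoice numbers:", duplicate_invoices)
print(null_counts)
```

On the real data, the same calls would be applied to the DataFrame returned by `pd.read_csv` for the downloaded file.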
Clean the data by standardizing the data format using Pandas.
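One possible way to standardize formats is sketched here. The specific choices (trimming and lower-casing text, parsing dates as day/month/year, casting quantity to integer) are assumptions for illustration and may differ from the cleaning actually applied in this project. The `raw` frame is made-up data.

```python
import pandas as pd

# Made-up rows with inconsistent formatting, for illustration only.
raw = pd.DataFrame(
    {
        "gender": [" Female", "male ", "FEMALE"],
        "payment_method": ["Credit Card", "cash", " Debit Card "],
        "invoice_date": ["05/08/2022", "12/12/2021", "16/05/2021"],
        "quantity": ["5", "3", "1"],
    }
)

clean = raw.copy()

# Trim surrounding whitespace and lower-case the text columns.
for column in ["gender", "payment_method"]:
    clean[column] = clean[column].str.strip().str.lower()

# Parse dates; the day/month/year order is an assumption about the file.
clean["invoice_date"] = pd.to_datetime(clean["invoice_date"], format="%d/%m/%Y")

# Store quantity as an integer instead of text.
clean["quantity"] = clean["quantity"].astype(int)

print(clean.dtypes)
print(clean)
```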
Analyze each parameter individually to gather information about it.
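A short sketch of what a per-parameter analysis could look like in pandas: frequency counts for a categorical column and summary statistics for a numeric one. The `sample` values are invented for illustration.

```python
import pandas as pd

# Made-up values; only the column names come from the data dictionary.
sample = pd.DataFrame(
    {
        "category": ["Clothing", "Shoes", "Clothing", "Books"],
        "age": [28, 35, 52, 41],
    }
)

# Categorical parameter: how often each value occurs.
category_counts = sample["category"].value_counts()

# Numeric parameter: count, mean, spread, and quartiles.
age_summary = sample["age"].describe()

print(category_counts)
print(age_summary)
```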
Analyze combinations of parameters to gather further information from the dataset.
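As an illustration of combining parameters, the sketch below derives a transaction total and aggregates it by mall and by gender. The `total` column (quantity times unit price) is an assumption based on the data dictionary rather than a column in the dataset, and the mall names and values are hypothetical.

```python
import pandas as pd

# Made-up transactions; "Mall A" and "Mall B" are hypothetical names.
sample = pd.DataFrame(
    {
        "shopping_mall": ["Mall A", "Mall A", "Mall B", "Mall B"],
        "gender": ["Female", "Male", "Female", "Female"],
        "quantity": [2, 1, 3, 1],
        "price": [100.0, 50.0, 20.0, 200.0],
    }
)

# Assumed derived column: amount spent per transaction in TL.
sample["total"] = sample["quantity"] * sample["price"]

# Total spend per mall.
total_by_mall = sample.groupby("shopping_mall")["total"].sum()

# Total spend per mall, split by gender; missing combinations become 0.
total_by_mall_gender = sample.pivot_table(
    index="shopping_mall",
    columns="gender",
    values="total",
    aggfunc="sum",
    fill_value=0,
)

print(total_by_mall)
print(total_by_mall_gender)
```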
Analyze the dataset further using machine learning.