This project analyzes Spotify song data using machine learning techniques to explore patterns in audio features and classify songs based on user preferences. The project is divided into two main components:
- K-Means Clustering: Unsupervised learning to group songs into clusters based on their audio features.
- Logistic Regression: Supervised classification to predict user preferences for songs.
- K-Means Clustering:
  - Grouped songs into meaningful clusters based on audio features such as tempo, loudness, and danceability.
  - Visualized clusters for better interpretability and analysis.
- Logistic Regression:
  - Predicted song preferences using a logistic regression model.
  - Evaluated model performance using accuracy metrics and confusion matrices.
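
The classification step can be illustrated with a minimal sketch. It is not the project's actual code: the data is synthetic, the `liked` label column and its generation rule are hypothetical, and the scaling step and 75/25 split are assumptions. Only the feature names (tempo, loudness, danceability) and the evaluation metrics (accuracy, confusion matrix) come from the description above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real song table (assumption: the actual
# dataset and its label column are not shown in this README).
rng = np.random.default_rng(0)
n_songs = 400
songs = pd.DataFrame({
    "tempo": rng.normal(120, 25, n_songs),
    "loudness": rng.normal(-8, 3, n_songs),
    "danceability": rng.uniform(0, 1, n_songs),
})
# Hypothetical label: 1 = liked, 0 = not liked, loosely tied to danceability.
noise = rng.normal(0, 0.2, n_songs)
songs["liked"] = (songs["danceability"] + noise > 0.5).astype(int)

features = ["tempo", "loudness", "danceability"]
X_train, X_test, y_train, y_test = train_test_split(
    songs[features],
    songs["liked"],
    test_size=0.25,
    random_state=0,
    stratify=songs["liked"],
)

# Scale the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate with accuracy and a confusion matrix
# (rows = true class, columns = predicted class).
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f"accuracy: {accuracy:.3f}")
print(cm)
```

With real data, `songs` would be loaded from the project's dataset and `liked` replaced by whichever column records the user's preference.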
This project uses the following Python libraries:
- Pandas
- NumPy
- Matplotlib
- Seaborn
- scikit-learn
- Regex
- Identified clusters of songs with similar characteristics.
- Built a logistic regression model to predict user preferences, which can inform playlist curation.
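
As a rough illustration of how song clusters like these might be obtained, here is a minimal K-Means sketch. It runs on synthetic data, and the cluster count of 4, the feature scaling, and the `cluster` column name are assumptions rather than the project's actual settings. Only the feature names are taken from the description above.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real song table (assumption).
rng = np.random.default_rng(0)
n_songs = 300
songs = pd.DataFrame({
    "tempo": rng.normal(120, 25, n_songs),
    "loudness": rng.normal(-8, 3, n_songs),
    "danceability": rng.uniform(0, 1, n_songs),
})

features = ["tempo", "loudness", "danceability"]

# K-Means is distance-based, so put the features on a common scale first.
X_scaled = StandardScaler().fit_transform(songs[features])

n_clusters = 4  # hypothetical choice, not taken from the project
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
songs["cluster"] = kmeans.fit_predict(X_scaled)

# Summarize each cluster by its mean audio features (in original units).
cluster_profile = songs.groupby("cluster")[features].mean()
print(cluster_profile.round(2))
```

The labelled table can then be plotted, for example with `seaborn.scatterplot(data=songs, x="tempo", y="danceability", hue="cluster")`, to inspect the clusters visually.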