This repository generates the data consumed by the Music Pearls web application, which provides an intuitive interface for exploring classical music works and their popularity.
The project fetches classical music data from Spotify, standardizes its metadata, and calculates the popularity of each work:
- Scrapes track information from Spotify for composers listed in composers.json
- Standardizes track names through regex pattern matching and AI analysis
- Groups tracks that belong to the same musical work (e.g., the movements of a symphony)
- Calculates popularity metrics for complete musical works based on their constituent tracks

Features:

- Spotify integration for fetching composer albums, tracks and track details
- Regex-based parsing for initial standardization of track names
- AI-powered analysis using OpenAI's GPT models to standardize track names
- Support for common classical music catalogs (BWV, K., Op., etc.)
- Recognition of standard musical forms (Symphony, Concerto, Sonata, etc.)
- Grouping of tracks into complete musical works, with popularity metrics calculated for the whole work rather than for individual movements or tracks
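
The regex parsing, grouping, and popularity steps listed above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the pattern, the function names `catalog_id` and `work_popularity`, the track dictionary layout, and the choice of a simple mean as the work-level metric are not taken from this project's code. It also ignores sub-numbers such as "Op. 27 No. 2" and leaves the AI-based standardization step out entirely.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical pattern covering three catalog systems (BWV, Op., K.).
# The project's real patterns may differ.
CATALOG_PATTERN = re.compile(
    r"\b(?P<catalog>BWV|Op\.|K\.)\s*(?P<number>\d+[a-z]?)",
    re.IGNORECASE,
)

# Canonical spelling for each catalog prefix, keyed by its lowercase form.
CANONICAL = {"bwv": "BWV", "op.": "Op.", "k.": "K."}


def catalog_id(track_name):
    """Return a normalized catalog identifier such as 'Op. 67', or None."""
    match = CATALOG_PATTERN.search(track_name)
    if match is None:
        return None
    catalog = CANONICAL[match.group("catalog").lower()]
    return f"{catalog} {match.group('number')}"


def work_popularity(tracks):
    """Group tracks by (composer, catalog id) and average their popularity.

    Each track is assumed to be a dict with 'composer', 'name' and
    'popularity' keys. Tracks without a recognizable catalog number are
    skipped here; they would need the AI-based pass or manual review.
    """
    grouped = defaultdict(list)
    for track in tracks:
        key = catalog_id(track["name"])
        if key is not None:
            # Opus numbers repeat across composers, so the composer is
            # part of the grouping key.
            grouped[(track["composer"], key)].append(track["popularity"])
    return {work: mean(scores) for work, scores in grouped.items()}
```

Called with the movements of one symphony, `work_popularity` returns a single entry keyed by the composer and the catalog identifier, so the work is scored as a whole instead of movement by movement. A mean is only one possible aggregate; a maximum or a weighted sum would fit the same structure.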

Setup:

- Clone the repository:

      git clone https://github.com/yourusername/classical-music-metadata-parser.git
      cd classical-music-metadata-parser
- Create and activate a virtual environment:

      python -m venv venv
      # On Windows
      .\venv\Scripts\activate
      # On macOS/Linux
      source venv/bin/activate
- Install dependencies:

      pip install -r requirements.txt
- Set up environment variables:

      # Copy the template file
      cp .env.template .env
      # Edit the .env file with your credentials:
      # - SPOTIFY_CLIENT_ID=your_spotify_client_id
      # - SPOTIFY_CLIENT_SECRET=your_spotify_client_secret
      # - OPENAI_API_KEY=your_openai_api_key
- Create required directories and files:

      # Create data directory
      mkdir data
      # Create composers.json with your desired composers
      # Example structure (Spotify IDs can be left blank; they will be populated by getComposerInfo.py):
      # {
      #   "composers": [
      #     {
      #       "name": "Ludwig van Beethoven",
      #       "spotifyId": ""
      #     }
      #   ]
      # }
      touch data/composers.json
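
Before running the pipeline it can help to confirm that the setup steps above took effect. The following Python sketch is one way to do that, and it is not part of the repository: the helper names `missing_env_vars` and `load_composers` are hypothetical. It checks the process environment only, so it assumes the variables from `.env` have already been exported or loaded by whatever mechanism the project uses. Note that a `composers.json` created with `touch` is empty and will fail JSON parsing until it is filled in.

```python
import json
import os
from pathlib import Path

# Credential names taken from the .env template described above.
REQUIRED_ENV_VARS = ("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET", "OPENAI_API_KEY")


def missing_env_vars(environ=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not environ.get(name)]


def load_composers(path=Path("data/composers.json")):
    """Load composers.json and check it has the documented structure.

    Raises json.JSONDecodeError for an empty or malformed file and
    ValueError when the expected keys are missing.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    composers = data.get("composers")
    if not isinstance(composers, list):
        raise ValueError('expected a top-level "composers" list')
    for entry in composers:
        if not entry.get("name"):
            raise ValueError("every composer needs a non-empty name")
        # Spotify IDs may be left blank; default to an empty string.
        entry.setdefault("spotifyId", "")
    return composers
```

A typical check would call `missing_env_vars()` first and stop if the returned list is non-empty, then call `load_composers()` to catch JSON mistakes such as a trailing comma before any API requests are made.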