gden173/chicago-marathon-2023

Chicago Marathon 2023

Description

The initial dataset was scraped from the official results of the 2023 Chicago Marathon, in which the late Kelvin Kiptum set the current men's marathon world record of 02:00:35 and Sifan Hassan, later the 2024 Paris Olympic marathon champion, ran what was then the second-fastest women's marathon time on record.

Data

Location

All data is located in the data/ directory. The data is stored as Parquet files, and the aggregated file is data/chicago-marathon-results-2023.parquet.

The data is also available on Kaggle.
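
As a starting point for working with the aggregated file, here is a minimal sketch that reads it into a pandas DataFrame. It assumes pandas and a Parquet engine such as pyarrow are installed and that the script runs from the repository root; `load_results` is a hypothetical helper name, not something provided by this repository.

```python
from pathlib import Path

import pandas as pd

# Path of the aggregated file as given above, relative to the repository root.
RESULTS_PATH = Path("data/chicago-marathon-results-2023.parquet")


def load_results(path=RESULTS_PATH):
    """Read the aggregated results file into a DataFrame.

    Hypothetical helper: pandas needs a Parquet engine (for example pyarrow)
    to be installed for read_parquet to work.
    """
    return pd.read_parquet(path)


if __name__ == "__main__":
    # Only attempt the read when the file is actually present.
    if RESULTS_PATH.exists():
        results = load_results()
        print(results.shape)
        print(results.columns.tolist())
    else:
        print(f"{RESULTS_PATH} not found; run this from the repository root")
```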

Format

The data has the following columns:

  • 'Name (CTZ)'
  • 'Age Group'
  • 'Bib Number'
  • 'City'
  • 'State'
  • 'Gender'
  • 'Short'
  • 'Split'
  • 'Time Of Day'
  • 'Time'
  • 'Diff'
  • 'min/km'
  • 'km/h'
  • 'min/mile'
  • 'miles/h'
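
The pace and speed columns are presumably derived from the elapsed time and the distance covered. The sketch below shows how those quantities relate to each other. It assumes that 'Time' is an `HH:MM:SS` string, which is not confirmed here, and it is not necessarily how the official results site computes them. All function names are hypothetical.

```python
from datetime import timedelta

MARATHON_KM = 42.195    # official marathon distance in kilometres
KM_PER_MILE = 1.609344  # international mile


def parse_time(text):
    """Parse an 'HH:MM:SS' string (assumed format of 'Time') into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def pace_min_per_km(elapsed, distance_km):
    """Minutes needed per kilometre, as in the 'min/km' column."""
    return elapsed.total_seconds() / 60 / distance_km


def speed_kmh(elapsed, distance_km):
    """Kilometres covered per hour, as in the 'km/h' column."""
    return distance_km / (elapsed.total_seconds() / 3600)


def pace_min_per_mile(elapsed, distance_km):
    """Minutes needed per mile, as in the 'min/mile' column."""
    return pace_min_per_km(elapsed, distance_km) * KM_PER_MILE


if __name__ == "__main__":
    # Arbitrary illustrative finish time, not a row taken from the dataset.
    finish = parse_time("03:30:00")
    print(round(pace_min_per_km(finish, MARATHON_KM), 2), "min/km")
    print(round(speed_kmh(finish, MARATHON_KM), 2), "km/h")
    print(round(pace_min_per_mile(finish, MARATHON_KM), 2), "min/mile")
```

For an intermediate split, pass the distance of that checkpoint instead of `MARATHON_KM`.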

Scraping Script

The script that scrapes the data is located in [scrape.py](scrape.py). It uses the requests and beautifulsoup4 libraries and reads the list of URLs to scrape from the chicago_marathon_records.csv file.
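
The general shape of such a scraper can be sketched as follows. This is not the code in scrape.py. The CSV column name `url`, the assumption that results are laid out in HTML table rows, and all function names are illustrative guesses.

```python
import csv

import requests
from bs4 import BeautifulSoup


def read_urls(csv_file, column="url"):
    """Return the URLs listed in an open CSV file.

    The column name 'url' is an assumption about chicago_marathon_records.csv.
    """
    return [row[column] for row in csv.DictReader(csv_file) if row.get(column)]


def parse_table_rows(html):
    """Extract the cell text of every table row found in a page.

    Assumes the results are rendered as HTML table rows; the real pages may
    be structured differently.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


def scrape(url, timeout=30):
    """Download one page and return its table rows."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return parse_table_rows(response.text)


if __name__ == "__main__":
    with open("chicago_marathon_records.csv", newline="") as handle:
        for url in read_urls(handle):
            print(url, len(scrape(url)), "rows")
```

Keeping the parsing step separate from the download makes it possible to test the parser against saved HTML without making network requests.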

Attribution

About

Data repository containing the checkpoint data for the 2023 Chicago Marathon.
