Augusta University uses the Modern Campus product for its catalog. In addition to being hard to navigate, this platform has poor accessibility: course descriptions, prerequisites, etc., become visible only after clicking on certain elements (each click triggers a JavaScript function).
This repository hosts two simple programs (one using the Node.js library Puppeteer, the other a bash script relying mainly on sed) that scrape the data for a particular degree program and present it as a CSV file.
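Course descriptions and prerequisites often contain commas and quotation marks, so fields have to be quoted when written as CSV. The following is a minimal sketch of RFC 4180-style quoting in Node.js. It is not the code used by `convert_to_csv.sh` (which relies on sed); the function names `csvField` and `toCsv`, the column names, and the sample row are all hypothetical and only serve as an illustration.

```javascript
// Hypothetical helper: quote one value for CSV. The value is wrapped in
// double quotes when it contains a comma, a quote, or a line break, and
// any embedded double quote is doubled.
function csvField(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: turn an array of rows (arrays of values) into CSV text.
function toCsv(rows) {
  return rows.map((row) => row.map(csvField).join(",")).join("\n");
}

// Made-up course data, for illustration only.
const rows = [
  ["code", "title", "prerequisites"],
  ["XXXX 1000", "Intro, with a comma", 'Grade of "C" or better'],
];
console.log(toCsv(rows));
```

Quoting only the fields that need it keeps the file readable in a text editor while still opening correctly in a spreadsheet program.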
Normally, the following steps are enough:
- Find the `poid` of your program. For example, the Bachelor of Science with a major in Computer Science is at https://catalog.augusta.edu/preview_program.php?catoid=44&poid=10211&hl=computer&returnto=search, which means that the `poid` I am looking for is `10211`.
- Open `scrape_catalog.sh`, and insert your `poid` in the `arr` array (at the top of the file), deleting all the other poids.
- Run the following commands:

      npm init -y
      npm install puppeteer
      chmod +x scrape_catalog.sh
      chmod +x convert_to_csv.sh
      ./scrape_catalog.sh

- Open the `outputs/xxxx.csv` file(s) (possibly with LibreOffice).
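
If you would rather not read the `poid` off the URL by eye, it can be extracted programmatically. The sketch below uses Node.js's built-in `URL` class; the helper name `extractPoid` is hypothetical and is not part of this repository.

```javascript
// Hypothetical helper: return the `poid` query parameter of a catalog URL.
// Throws if the parameter is missing or is not a plain number.
function extractPoid(programUrl) {
  const poid = new URL(programUrl).searchParams.get("poid");
  if (poid === null || !/^\d+$/.test(poid)) {
    throw new Error(`No numeric poid found in: ${programUrl}`);
  }
  return poid;
}

const exampleUrl =
  "https://catalog.augusta.edu/preview_program.php?catoid=44&poid=10211&hl=computer&returnto=search";
console.log(extractPoid(exampleUrl)); // prints 10211
```

The value is returned as a string, since it is only ever pasted into the `arr` array or into another URL.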