Drizly Web Crawler

A web crawler that extracts beer characteristics from the Drizly website.

Intro

Given a beer style, the crawler retrieves the characteristics of every beer listed under that style. It works through the listing page by page until the last page has been crawled.

Setup

# Create a Python virtual environment
python3.7 -m venv .venv

# Linux / macOS
source .venv/bin/activate

# Windows
.venv\Scripts\activate

pip install -r requirements.txt

Running the Crawler

To run the crawler, pass the category's endpoint (the path portion of the category URL) as an argument to the Python script. For example:

python .\drizly_crawler.py /beer/ale/ipa/c15

In this example, the crawler's seed is the page https://drizly.com/beer/ale/ipa/c15. Once every beer on a page has been crawled, the crawler moves to the next page and repeats the process until all pages for this beer style have been covered.
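
The seed-and-paginate flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in `drizly_crawler.py`: the helper names `build_seed_url` and `crawl_category`, the `fetch_beers` callback, and the `?page=N` query parameter are all assumptions made for the example, and the real site may paginate differently.

```python
from typing import Callable, Iterator, List
from urllib.parse import urljoin

BASE_URL = "https://drizly.com"


def build_seed_url(endpoint: str) -> str:
    """Join the site root with the category endpoint given on the command line."""
    return urljoin(BASE_URL, endpoint)


def crawl_category(
    endpoint: str,
    fetch_beers: Callable[[str], List[dict]],
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield every beer in a category, page by page, until a page comes back empty.

    `fetch_beers` is a hypothetical callback that downloads one listing page
    and returns the beers found on it as dictionaries.
    """
    seed = build_seed_url(endpoint)
    for page in range(1, max_pages + 1):
        # Assumed pagination scheme: "?page=N" appended to the seed URL.
        url = seed if page == 1 else f"{seed}?page={page}"
        beers = fetch_beers(url)
        if not beers:
            # An empty page means the previous page was the last one.
            break
        yield from beers
```

Passing the page fetcher in as a callback keeps the pagination loop independent of any particular HTTP or HTML-parsing library, so it can be exercised with a stub before pointing it at the live site. The `max_pages` cap is only a safety net against an endless loop.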