This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Wikimedia Dumps

Filter and restructure data from Wikimedia sources to prepare it for use in QwantMaps.

Running

Loading Wikidata dumps

You first need to download a complete Wikidata dump from Wikimedia in JSON format.

Then you can generate CSV data for site links and labels:

src/main.py load-wikidata --dump latest-all.json.bz2
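To make the extraction step concrete, here is a minimal sketch of how site links and labels can be pulled out of a Wikidata JSON dump. It is not the code in src/main.py: the function names, the CSV column order, and the `languages` parameter are assumptions made for illustration. It relies only on the published dump layout, where the file is one JSON array with a single entity per line.

```python
import bz2
import csv
import json


def iter_entities(lines):
    """Yield one entity dict per line of a Wikidata JSON dump.

    The dump is a single JSON array with one entity per line, so the
    opening "[", the closing "]" and trailing commas are skipped.
    """
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


def entity_rows(entity, languages):
    """Return (sitelink_rows, label_rows) for one entity.

    Only Wikipedia site links ("<lang>wiki") and labels in the
    requested languages are kept. The row layout is hypothetical.
    """
    qid = entity["id"]
    sitelinks = entity.get("sitelinks", {})
    labels = entity.get("labels", {})
    sitelink_rows = [
        (qid, lang, sitelinks[f"{lang}wiki"]["title"])
        for lang in languages
        if f"{lang}wiki" in sitelinks
    ]
    label_rows = [
        (qid, lang, labels[lang]["value"])
        for lang in languages
        if lang in labels
    ]
    return sitelink_rows, label_rows


def dump_to_csv(dump_path, sitelinks_out, labels_out, languages):
    """Stream a .json.bz2 dump and write two CSV outputs."""
    sitelinks_writer = csv.writer(sitelinks_out)
    labels_writer = csv.writer(labels_out)
    # bz2.open in text mode decompresses lazily, so the full dump
    # never has to fit in memory.
    with bz2.open(dump_path, "rt", encoding="utf-8") as dump:
        for entity in iter_entities(dump):
            sitelink_rows, label_rows = entity_rows(entity, languages)
            sitelinks_writer.writerows(sitelink_rows)
            labels_writer.writerows(label_rows)
```

Streaming line by line matters here because a complete dump is far too large to parse as a single JSON document.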

Loading stats dumps

You can download and extract data from Wikimedia statistics with a single command:

# Omit `--download` if you already downloaded raw dumps
src/main.py load-stats --download
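The README does not say which statistics files are loaded. Assuming they are Wikimedia's hourly pageview dumps, where each line reads `domain_code page_title count_views total_response_size`, the extraction could be sketched as below. The function names and the choice to merge desktop (`fr`) and mobile (`fr.m`) Wikipedia traffic are illustrative assumptions, not the behaviour of src/main.py.

```python
from collections import Counter


def parse_pageview_line(line):
    """Parse one pageview line into (domain_code, title, views).

    Returns None for lines that do not have the four expected
    space-separated fields or whose view count is not a number.
    """
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 4:
        return None
    domain, title, views, _size = parts
    if not views.isdigit():
        return None
    return domain, title, int(views)


def count_views(lines, languages):
    """Sum views per (language, title) for the requested languages.

    Assumption: the domain code "<lang>" is desktop Wikipedia and
    "<lang>.m" is mobile Wikipedia; other projects are ignored.
    """
    wanted = {}
    for lang in languages:
        wanted[lang] = lang
        wanted[f"{lang}.m"] = lang

    totals = Counter()
    for line in lines:
        parsed = parse_pageview_line(line)
        if parsed is None:
            continue
        domain, title, views = parsed
        lang = wanted.get(domain)
        if lang is not None:
            totals[(lang, title)] += views
    return totals
```

Since `count_views` accepts any iterable of lines, a gzip-compressed file opened with `gzip.open(path, "rt", encoding="utf-8")` can be passed to it directly.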

Configuration

src/config.py defines which languages to include in the dumps and which files to load for statistics.