Office Open XML (docx) file parser
You need Python 3.7 or later to run dcl-asozd-parser.
Used packages:
- beautifulsoup4
- lxml
- Clone the repository:
  git clone git@github.com:drnk/dcl-asozd-parser.git
- Create a virtual environment and activate it:
  cd dcl-asozd-parser
  python -m venv .venv
  # unix
  source .venv/bin/activate
  # windows
  .venv\Scripts\activate.bat
- Upgrade pip, then download and install the required libraries:
  python -m pip install -U pip
  pip install -r requirements.txt
pytest
To parse all files whose names end with "итоговая карточка" (Russian for "final card"), run:
python parse.py "in" --source-mask=".*,\s*итоговая карточка.docx"
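
To preview which file names a mask would select before running the parser, the pattern can be tried against a few names with Python's `re` module. This is a minimal sketch, not the parser's own code: it assumes the mask must match the whole file name (`re.fullmatch`), and the `matches_mask` helper and the sample file names are hypothetical.

```python
import re

# The mask from the command above. How parse.py applies it (match, search or
# fullmatch) is not shown in this README; fullmatch is an assumption.
SOURCE_MASK = r".*,\s*итоговая карточка.docx"


def matches_mask(filename: str, mask: str = SOURCE_MASK) -> bool:
    """Return True if the whole file name matches the mask (hypothetical helper)."""
    return re.fullmatch(mask, filename) is not None


# Hypothetical file names, for illustration only.
names = [
    "Законопроект 123, итоговая карточка.docx",
    "Законопроект 123, текст.docx",
    "notes.txt",
]

selected = [name for name in names if matches_mask(name)]
print(selected)
```

Note that the dot in `.docx` is unescaped, so in a regular expression it matches any single character; writing `\.docx` would restrict the match to a literal dot.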