This repository contains minimal code for generating summaries with models developed at AIC, as well as a simple REST API. We currently provide code for inference only (not training).
Install required modules:
pip install -r requirements.txt
Then run the server, providing a configuration file:
python api.py cfg/mbart_headline.json
To run on CUDA, use the --device command line parameter:
python api.py --device cuda cfg/mbart_headline.json
Get help:
python api.py --help
We provide multilingual models fine-tuned on a news corpus focusing on the Czech language. The Czech datasets include SumeCzech and a proprietary dataset kindly provided by the Czech News Agency. The models are of two kinds:
- models trained to generate a headline based on a concatenation of the article abstract and the main text,
- models trained to generate an abstract based on a concatenation of the headline and the main text.
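
The two input layouts above can be illustrated with a small helper. This is only a sketch under stated assumptions: the function name `build_model_input`, the task labels, and the single-space separator are hypothetical and are not taken from this repository; the actual preprocessing used by the models may differ.

```python
def build_model_input(task: str, main_text: str, headline: str = "",
                      abstract: str = "", sep: str = " ") -> str:
    """Concatenate article fields into a single input string.

    Hypothetical helper: the separator and task names are assumptions,
    not the repository's actual preprocessing.
    """
    if task == "headline":
        # Headline models read the abstract followed by the main text.
        parts = [abstract, main_text]
    elif task == "abstract":
        # Abstract models read the headline followed by the main text.
        parts = [headline, main_text]
    else:
        raise ValueError(f"unknown task: {task!r}")
    # Skip empty fields so no stray separators appear.
    return sep.join(p.strip() for p in parts if p and p.strip())


if __name__ == "__main__":
    print(build_model_input("headline", "Body text.", abstract="Lead paragraph."))
    print(build_model_input("abstract", "Body text.", headline="Title"))
```

Empty fields are skipped, so a call that supplies only the main text returns that text unchanged.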
More details on the models, data, and training are given in the Hugging Face database: