Part-of-speech and morphological tagger employing a simple case-based algorithm.
- Free software: MIT license
- Documentation: https://casetagger.readthedocs.io.
The case tagger is a polyglot part-of-speech and morphological gloss-tagger. The tag-set used is the Typecraft tag-set.
The tagger applies simple case-based learning to a large corpus, building a database of cases for each language.
When tagging a phrase, the tagger fetches every relevant case for each word and then merges those cases.
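The idea of fetching and merging cases can be illustrated with a small toy model. This is a minimal sketch, not the casetagger implementation: the functions `train` and `tag`, the two features (the word itself and the preceding word), the vote weights, the `UNK` fallback and the tag names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Build a toy case database: (feature, value) -> Counter of tags.

    Hypothetical structure; the real casetagger database may differ.
    """
    cases = defaultdict(Counter)
    for sentence in tagged_sentences:
        for i, (word, tag_label) in enumerate(sentence):
            prev_word = sentence[i - 1][0] if i > 0 else "<s>"
            cases[("word", word)][tag_label] += 1
            cases[("prev", prev_word)][tag_label] += 1
    return cases


def tag(words, cases, default="UNK"):
    """Tag each word by merging the votes of all cases that match it."""
    result = []
    for i, word in enumerate(words):
        prev_word = words[i - 1] if i > 0 else "<s>"
        merged = Counter()
        # Fetch every relevant case; the word itself counts double
        # (an arbitrary weighting chosen for this sketch).
        for key, weight in ((("word", word), 2), (("prev", prev_word), 1)):
            for tag_label, count in cases.get(key, {}).items():
                merged[tag_label] += weight * count
        result.append(merged.most_common(1)[0][0] if merged else default)
    return result


# Illustrative corpus with made-up tags.
corpus = [
    [("the", "DET"), ("dog", "N"), ("barks", "V")],
    [("the", "DET"), ("cat", "N"), ("sleeps", "V")],
]
cases = train(corpus)
print(tag(["the", "dog", "sleeps"], cases))
```

In this sketch an unseen word can still receive a tag from the cases that match its context, which is one reason to merge several cases rather than look up the word alone.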
After installation, the casetagger command is available.
It has three subcommands: tag, train and test.
Usage: casetagger test [OPTIONS] [FILES]...

Options:
  --language TEXT
  --raw-text
  --output-raw-text
  --print-test-details
  --help                Show this message and exit.

Usage: casetagger train [OPTIONS] [FILES]...

Options:
  --language TEXT
  --help           Show this message and exit.
Each command takes one or more files as arguments. Each file is expected to be a TC-XML file. All output is written to stdout.
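Because all output goes to stdout, the command can be driven from a script and its output captured. The following is a minimal sketch under stated assumptions: the helpers `build_command` and `run_casetagger` are hypothetical, as are the language value `nob` and the file name `corpus.xml`. Only the `--language` option shown in the usage text is used; whether the tag subcommand accepts it is not documented here.

```python
import subprocess

SUBCOMMANDS = ("tag", "train", "test")


def build_command(subcommand, files, language=None):
    """Assemble the argument list for a casetagger invocation."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand!r}")
    cmd = ["casetagger", subcommand]
    if language is not None:
        # --language is listed for train and test in the usage text.
        cmd += ["--language", language]
    cmd += list(files)
    return cmd


def run_casetagger(subcommand, files, language=None):
    """Run casetagger and return what it wrote to stdout.

    Requires casetagger to be installed and on the PATH.
    """
    completed = subprocess.run(
        build_command(subcommand, files, language),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Hypothetical language value and file name.
print(build_command("train", ["corpus.xml"], language="nob"))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.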
TODO
- TODO
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.