Case Tagger

Part-of-speech and morphological tagger employing a simple case-based algorithm.

Overview

The case tagger is a polyglot part-of-speech and morphological gloss-tagger. The tag-set used is the Typecraft tag-set.

The tagger applies simple case-based learning to a large corpus, building a database of cases for each language.

When tagging a phrase, the tagger fetches any relevant case for each word, and then 'merges' the cases.
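The fetch-and-merge idea can be illustrated with a minimal sketch. This is not casetagger's actual implementation: the case keys (word alone, previous word plus word), the weights, the function names, and the tag labels are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def train(examples):
    """Build a case database from (previous word, word, tag) triples.

    Each case is a context key mapped to counts of the tags seen with it.
    The two kinds of key used here are illustrative assumptions.
    """
    cases = defaultdict(Counter)
    for prev, word, tag in examples:
        cases[("word", word)][tag] += 1
        cases[("prev+word", prev, word)][tag] += 1
    return cases


def tag_word(cases, prev, word, default="UNKNOWN"):
    """Fetch every case relevant to the word and merge their tag counts."""
    merged = Counter()
    # The more specific case gets a larger weight (an arbitrary choice).
    relevant = ((("word", word), 1), (("prev+word", prev, word), 2))
    for key, weight in relevant:
        for tag, count in cases.get(key, {}).items():
            merged[tag] += weight * count
    return merged.most_common(1)[0][0] if merged else default


def tag_phrase(cases, words):
    """Tag each word in turn, using the previous word as context."""
    tags = []
    prev = "<s>"  # hypothetical start-of-phrase marker
    for word in words:
        tags.append(tag_word(cases, prev, word))
        prev = word
    return tags


# Toy training data with placeholder tags.
examples = [
    ("<s>", "the", "DET"),
    ("the", "run", "N"),
    ("to", "run", "V"),
    ("i", "run", "V"),
    ("<s>", "to", "PART"),
]
cases = train(examples)
print(tag_phrase(cases, ["the", "run"]))
print(tag_phrase(cases, ["to", "run"]))
```

In this sketch, "run" on its own is seen more often as a verb, but the more specific case for "the run" outweighs that when the cases are merged, so the context decides the tag.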

Installation

or

Usage

After installation, the casetagger command is available. It has three subcommands: tag, train and test.

Usage: casetagger test [OPTIONS] [FILES]...

Options:
  --language TEXT
  --raw-text
  --output-raw-text
  --print-test-details
  --help                Show this message and exit.

Usage: casetagger train [OPTIONS] [FILES]...

Options:
  --language TEXT
  --help           Show this message and exit.

Each command takes one or more files as arguments. Each file is expected to be a TC-XML file. All output is written to stdout.
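Because all output goes to stdout, the command is easy to drive from a script. The following is a rough sketch that only uses the subcommands and the --language option shown in the usage text above; the helper names, the file name corpus.xml, and the language value "nob" are hypothetical, and the values accepted by --language are not documented here.

```python
import subprocess


def build_command(subcommand, files, language=None):
    """Assemble an argument list for `casetagger train` or `casetagger test`.

    Only the --language option from the usage text is handled here.
    """
    if subcommand not in ("train", "test"):
        raise ValueError("this sketch only covers 'train' and 'test'")
    cmd = ["casetagger", subcommand]
    if language is not None:
        cmd += ["--language", language]
    cmd += list(files)
    return cmd


def run(cmd):
    """Run the command and return what it wrote to stdout."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file name and language value; nothing is executed here.
    print(" ".join(build_command("train", ["corpus.xml"], language="nob")))
```

Calling run(build_command(...)) requires casetagger to be installed and the listed TC-XML files to exist.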

Configuration

TODO

Features

  • TODO

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.