Releases: WorksApplications/SudachiTra
v0.1.9
Highlights
- Support transformers 4.34 and newer (#66)
v0.1.8
Highlights
- Add new `word_form_type`: `normalized_nouns`. (#48, #50)
  - Normalizes morphemes that do not have a conjugation form.
v0.1.7
Highlights
- Update sudachipy version in order to use PosMatcher
  - Requires `sudachipy>=0.6.2`
- Add preprocessing code #32 #34
  - Normalizers and filters for the pretraining corpus
- `NormalizedConjugation` #31 #35
  - New `word_form_type` that normalizes a morpheme while preserving the conjugation of the word
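
The `sudachipy>=0.6.2` requirement above can be checked at runtime before relying on PosMatcher. The following is a minimal sketch, not part of SudachiTra: the helper names `parse_version`, `meets_minimum`, and `installed_version` are hypothetical, and the comparison only looks at the leading numeric release components, so pre-release suffixes such as `rc1` are ignored.

```python
import re
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Return the leading numeric components, e.g. '0.6.2' -> (0, 6, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

For example, `installed_version("sudachipy")` could be passed to `meets_minimum(..., "0.6.2")` once it is known not to be `None`. A plain numeric comparison is used here so that `0.6.10` sorts after `0.6.2`, which a string comparison would get wrong.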
v0.1.6
update
- update sudachipy version #30
  - require SudachiPy implemented in Rust (`sudachipy>=0.6.0`).
- add `InputStringNormalizer` #27
  - add NFKC and lowercase normalization to tokenizers.
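
To illustrate what NFKC and lowercase normalization do to an input string, here is a minimal sketch using only the Python standard library. It is not the `InputStringNormalizer` implementation; the function name `normalize_input` and its parameters are hypothetical and chosen for illustration only.

```python
import unicodedata


def normalize_input(text: str, nfkc: bool = True, lowercase: bool = True) -> str:
    """Apply Unicode NFKC normalization and/or lowercasing to `text`."""
    if nfkc:
        # NFKC folds compatibility characters, e.g. full-width Latin letters
        # and half-width katakana, into their canonical equivalents.
        text = unicodedata.normalize("NFKC", text)
    if lowercase:
        text = text.lower()
    return text
```

With both steps enabled, full-width `ＡＢＣ` becomes `abc`, and half-width `ｶﾞ` becomes the single character `ガ`. Applying such normalization before tokenization means that visually equivalent inputs map to the same token sequence.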
obsolete feature
v0.1.5
- improve default configurations #21
- add word forms and tests
  - `surface_half_ascii`
  - `dictionary_half_ascii`
  - `dictionary_and_surface_half_ascii`
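
The `*_half_ascii` names suggest that full-width ASCII characters are mapped to their half-width counterparts. The sketch below shows that conversion under this assumption; it is not taken from SudachiTra, and the function name `to_half_ascii` is hypothetical.

```python
# Full-width ASCII variants occupy U+FF01..U+FF5E and sit at a fixed offset
# of 0xFEE0 from their half-width counterparts U+0021..U+007E.
FULLWIDTH_START = 0xFF01
FULLWIDTH_END = 0xFF5E
OFFSET = 0xFEE0


def to_half_ascii(text: str) -> str:
    """Map full-width ASCII characters to half-width; leave others unchanged."""
    converted = []
    for ch in text:
        code = ord(ch)
        if FULLWIDTH_START <= code <= FULLWIDTH_END:
            converted.append(chr(code - OFFSET))
        else:
            converted.append(ch)
    return "".join(converted)
```

Unlike full NFKC normalization, this touches only the full-width ASCII range, so kana and kanji in the token are preserved as written.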
v0.1.3
- Add a slow tokenizer for development
v0.1.1
- Bump bunkai from 1.3.0 to 1.4.0 (#10)
- Fix a bug related (#11)