
robert1ridley/chinese-segmentation-tool


Chinese Segmentation Tool

This project is a segmentation tool for Chinese text. It runs both Forward Maximum Matching (FMM) and Reverse Maximum Matching (RMM) and then applies a lowest-mean-cost disambiguation algorithm.
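The pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function names, the `max_len` limit, the toy frequency table, and the cost definition (negative log of an add-one-smoothed relative frequency, averaged over the words of a candidate) are all assumptions made for the example. In particular, it assumes that "lowest mean cost" means choosing the candidate segmentation with the smallest average per-word cost.

```python
import math


def forward_max_match(text, dictionary, max_len=4):
    """Scan left to right, taking the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def reverse_max_match(text, dictionary, max_len=4):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                words.append(piece)
                j -= size
                break
    words.reverse()  # words were collected from the end of the sentence
    return words


def mean_cost(words, freq):
    """Average per-word cost (assumed here: -log of smoothed relative frequency)."""
    denom = sum(freq.values()) + len(freq)
    costs = [-math.log((freq.get(w, 0) + 1) / denom) for w in words]
    return sum(costs) / len(costs)


def segment(text, freq, max_len=4):
    """Run FMM and RMM; if they disagree, keep the candidate with the lower mean cost."""
    if not text:
        return []
    fmm = forward_max_match(text, freq, max_len)
    rmm = reverse_max_match(text, freq, max_len)
    if fmm == rmm:
        return fmm
    return min((fmm, rmm), key=lambda ws: mean_cost(ws, freq))


# Hypothetical word frequencies, for illustration only.
example_freq = {"研究": 50, "研究生": 10, "生命": 40, "命": 5, "的": 200, "起源": 15}
print(" / ".join(segment("研究生命的起源", example_freq)))
```

With this toy table, FMM greedily takes 研究生 first while RMM yields 研究 / 生命 / 的 / 起源, so the two candidates differ and the mean-cost step decides between them.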

Requirements

  • Python version: 3.5.1

Start Developing

After cloning the repository, set up the environment and then start the program:

  • Setting up the environment:

    • cd chinese-segmentation-tool
    • Create a virtual environment: python3 -m venv virtual-environment
    • cd virtual-environment/bin
    • Enter source activate to start the virtual environment
    • cd ../.. (to return to the chinese-segmentation-tool folder)
    • Install the project dependencies: pip install -r requirements.txt
  • Start the program:

    • Ensure that you are inside chinese-segmentation-tool and that your virtual environment is running
    • Enter python __main__.py
    • Following the prompt in the terminal, enter a Chinese sentence. The program will then output the results.
    • To stop the program, enter 1.
    • Deactivate your virtual environment by entering deactivate
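
The interactive behaviour described in the steps above (prompt for a sentence, print the result, stop when 1 is entered) could look something like the sketch below. This is a hypothetical illustration, not the contents of the project's `__main__.py`: the function name `run_prompt_loop`, the prompt text, and the output format are assumptions, and the segmenter is passed in as a parameter so that any callable returning a list of words can be used.

```python
def run_prompt_loop(segment, read=input, write=print, stop_token="1"):
    """Repeatedly read a sentence, segment it, and print the words.

    `segment` is any callable mapping a string to a list of words.
    `read` and `write` default to the built-in input/print and can be
    swapped out, for example in tests.
    """
    while True:
        sentence = read("Enter a Chinese sentence (1 to quit): ").strip()
        if sentence == stop_token:
            break
        if sentence:  # ignore empty lines
            write(" / ".join(segment(sentence)))
```

Calling `run_prompt_loop(list)` would start the loop with a trivial stand-in segmenter that splits the sentence into single characters.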
