For this homework assignment, you will create a class that can clean and parse text into stemmed words. Use UTF-8
when writing your files. See the Javadoc comments in the template code for additional details.
The official name of this homework is TextFileStemmer. This should be the name you use for your Eclipse Java project and the name you use when running the homework test script.
See the Homework Guides for additional details on homework requirements and submission.
Below are some hints that may help with this homework assignment:
- You need to have the third-party Apache OpenNLP library set up as a user library in Eclipse for this assignment. See the Adding User Libraries to Eclipse guide for details.
- The methods for cleaning and parsing text are already provided in the TextParser class. You can use this class for your homework and project.
- You should use try-with-resources and buffered readers and writers. See the Files and Exceptions examples from lecture.
You are not required to use these hints in your solution. There may be multiple approaches to solving this homework.
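As an illustration of the try-with-resources hint, here is a minimal sketch of reading one UTF-8 file line by line and writing a transformed copy with buffered readers and writers. The class name BufferedCopySketch, the method transformFile, and the transform parameter are hypothetical names chosen for this example and are not part of the template code. The transform stands in for whatever cleaning, parsing, and stemming your solution performs, and the sketch does not use the TextParser class or OpenNLP.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical example class; not part of the homework template.
class BufferedCopySketch {

    /**
     * Reads the input file line by line as UTF-8, applies the transform to
     * each line, and writes each result on its own line to the output file.
     * The transform is a placeholder for real cleaning and stemming logic.
     */
    public static void transformFile(Path input, Path output,
            UnaryOperator<String> transform) throws IOException {
        // Both resources are closed automatically, even if an exception occurs.
        try (
            BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
            BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)
        ) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(transform.apply(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files keep the demo independent of any project layout.
        Path input = Files.createTempFile("input", ".txt");
        Path output = Files.createTempFile("output", ".txt");
        Files.write(input, Arrays.asList("Hello World", "Second LINE"), StandardCharsets.UTF_8);

        // Lowercasing is only a stand-in for stemming in this sketch.
        transformFile(input, output, String::toLowerCase);
        System.out.println(Files.readAllLines(output, StandardCharsets.UTF_8));
    }
}
```

Declaring the reader and writer in the same try header means neither needs an explicit close() call, and passing the charset explicitly avoids depending on the platform default encoding.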