Java-based Text Indexing Application

This Java program builds a text index. It reads a file that lists the names of the files to index, scans each listed file to extract its words, and assigns each file a document number. It then builds a hash map from each unique word to the document number(s) in which it appears. Stop words are removed and the remaining words are stemmed to improve the quality of the indexed terms. The output is a dictionary of words and the documents in which each word appears.
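The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the code in indexer.java: the class name `IndexSketch`, the methods `buildIndex` and `naiveStem`, and the sample documents and stop words are all hypothetical. Documents are passed in as strings rather than read from disk, and `naiveStem` is a deliberately crude stand-in for a real Porter stemmer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.UnaryOperator;

// Hypothetical sketch of an inverted index with stop word removal and stemming.
class IndexSketch {

    // Maps each stemmed, non-stop word to the sorted set of document numbers
    // (1-based, following the order of the input list) in which it appears.
    static Map<String, SortedSet<Integer>> buildIndex(
            List<String> documents, Set<String> stopWords, UnaryOperator<String> stemmer) {
        Map<String, SortedSet<Integer>> index = new HashMap<>();
        for (int i = 0; i < documents.size(); i++) {
            int docId = i + 1;
            String text = documents.get(i).toLowerCase(Locale.ROOT);
            // Split on any run of characters that are not ASCII letters or digits.
            for (String token : text.split("[^a-z0-9]+")) {
                if (token.isEmpty() || stopWords.contains(token)) {
                    continue; // skip empty fragments and stop words
                }
                String term = stemmer.apply(token);
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return index;
    }

    // Placeholder stemmer (assumption): strips one trailing "s" from words
    // longer than three characters. This is NOT the Porter algorithm.
    static String naiveStem(String word) {
        return (word.length() > 3 && word.endsWith("s"))
                ? word.substring(0, word.length() - 1)
                : word;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("The cats sat on the mat", "A cat chased the dogs");
        Set<String> stopWords = Set.of("the", "a", "on");
        Map<String, SortedSet<Integer>> index = buildIndex(docs, stopWords, IndexSketch::naiveStem);
        // Copy into a TreeMap only to print the terms in alphabetical order.
        new TreeMap<>(index).forEach((term, ids) -> System.out.println(term + " -> " + ids));
    }
}
```

With these sample inputs, "cats" and "cat" collapse to the single term "cat", which is recorded for documents 1 and 2, while "the", "a", and "on" never enter the index. Passing the stemmer as a parameter keeps the sketch independent of any particular stemmer implementation.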

PorterStemmer.java
README.md
Stemmer.class
Stemmer.java
doc1.txt
doc1.txt~
doc2.txt
doc2.txt~
doc_name.txt
indexer.class
indexer.java
stopwords

shubhamshubhankar/Indexer