Skip to content

Latest commit

 

History

History
23 lines (18 loc) · 835 Bytes

README.md

File metadata and controls

23 lines (18 loc) · 835 Bytes

Information Retrieval

This project implements a simple search engine, which consists of web crawling, creating the inverted index and serving the results through a query processor that ranks the documents based on TF-IDF cosine similarity. Multi-threading is supported in all three stages of the application.

Deployment

  • Web crawler: Python 3.9 and Anaconda
  • Indexer: Java 8 and Gradle v6.7
  • Query processor: NodeJS v15.8
  • Database: MongoDB v4.4

Alternatively, you can use Docker and Docker Compose to run each and every component:

$ docker-compose build
$ docker-compose up

Screenshots

homepage

Results Page

Authors