This repository contains an implementation of a recommendation system. The application is trained on the Netflix Prize dataset.
However, other datasets can be plugged in as well.
To run this project, you need the following:
- Hadoop Distributed File System (HDFS) ~ 3.1.2
- Spark ~ 3.1.2
- PySpark

The repository contains four Python scripts: two for preprocessing and tokenization, one for the alternating least squares (ALS) implementation, and one for FP-Growth.
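
The preprocessing and ALS steps mentioned above can be sketched as follows. This is a minimal illustration, not the code in this repository: the function names `parse_ratings` and `train_als`, the column names, and the hyperparameter values are all assumptions. It also assumes the input uses the Netflix Prize layout, where a `MovieID:` header line is followed by `CustomerID,Rating,Date` lines.

```python
from typing import Iterable, Iterator, Tuple


def parse_ratings(lines: Iterable[str]) -> Iterator[Tuple[int, int, float]]:
    """Yield (user_id, movie_id, rating) triples from Netflix Prize style lines.

    Hypothetical helper: assumes "MovieID:" headers followed by
    "CustomerID,Rating,Date" rows.
    """
    movie_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.endswith(":"):
            # A header line such as "1:" starts a new movie block.
            movie_id = int(line[:-1])
            continue
        if movie_id is None:
            raise ValueError("rating line seen before any movie header")
        user_id, rating, _date = line.split(",")
        yield int(user_id), movie_id, float(rating)


def train_als(spark, triples, rank=10):
    """Fit a PySpark ALS model on (user, item, rating) triples.

    Hypothetical helper: `spark` is an existing SparkSession, and the
    hyperparameters below are placeholder values, not tuned settings.
    """
    # Imported here so the parser above can be used without PySpark installed.
    from pyspark.ml.recommendation import ALS

    df = spark.createDataFrame(list(triples), ["user", "item", "rating"])
    als = ALS(
        userCol="user",
        itemCol="item",
        ratingCol="rating",
        rank=rank,
        maxIter=10,
        regParam=0.1,
        coldStartStrategy="drop",  # drop NaN predictions for unseen users/items
    )
    return als.fit(df)
```

With a running `SparkSession` named `spark`, a call such as `train_als(spark, parse_ratings(open(path)))` would return a fitted model, where `path` is a placeholder for a local ratings file. Reading the file directly from HDFS through Spark would be a better fit for the full dataset.
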
Please don't hesitate to make a pull request!