rkapsalis/Thesis

⚖ Research and development of a system for bias identification in recommender systems

📕 Abstract

Recently, researchers have increased their scrutiny of ethical issues in artificial intelligence (AI), especially in the field of Machine Learning. However, most previous studies in the area of Ethical Machine Learning have focused only on classification and regression tasks, while only a few have investigated ethical issues in recommender systems. The aim of this Diploma Thesis is to contribute to the understanding of the biases that appear in recommender systems.

To this end, a web app was developed to help users understand how biases are introduced in recommender systems and to estimate the extent of bias in these systems. The app consists of four main pages. The first page visualizes datasets to help users find possible biases. On the second page, the user can build a recommender system by choosing from a wide collection of algorithms and hyperparameter tuning options in a user-friendly way. The third page evaluates a recommender system with respect to popularity bias, fairness, diversity, novelty and coverage. The evaluation consists of a) bias monitoring through different types of plots for a single dataset or a dataset comparison, b) cut-off analysis and c) hyperparameter analysis. Finally, the fourth page performs popularity bias mitigation using one of the four available algorithms: FAR, PFAR, FA*IR and Calibrated recommendations.

Within the broader field of ethical issues, this thesis pays special attention to popularity bias, diversity, novelty and item coverage. An extensive experimental study was conducted, using the aforementioned web app, to gain a better understanding of the sources of bias and to analyze the effect of different bias mitigation algorithms. Four datasets were used: one real dataset provided by a major electronics retailer, and three datasets collected from the internet. The first part of the study examines how the hyperparameter tuning of each algorithm and the characteristics of each dataset affect bias and accuracy, and compares the four datasets. The second part consists of bias mitigation using three re-ranking algorithms (FAR, PFAR and Calibrated recommendations) and an in-processing algorithm.

This study found that data characteristics, and especially the sparsity of the user-item matrix, can strongly affect the bias that is introduced. Another significant finding is that the post-processing mitigation algorithms examined can improve the bias-accuracy tradeoff, but also have several limitations. In conclusion, developers of recommender systems need to be aware of the sources of bias and of the accuracy-bias tradeoff. This work contributes in this direction and lays the groundwork for future research into bias in recommender systems.

💻 Streamlit app

Visualize data

Screenshot: useful information about the Movielens1M dataset

On this page, the user can get a qualitative understanding of the data through summary information and statistics for the dataset, and through four main types of plots:

Most rated items

Top users

Long tail

Average number of ratings

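The quantities behind these plots can be approximated with a few lines of plain Python. The following is a minimal sketch, not the app's own code: the function name `dataset_stats`, the `(user, item)` input format, and the 20% cut-off used to define the popular "short head" are all assumptions made for illustration.

```python
from collections import Counter

def dataset_stats(ratings, head_share=0.2):
    """ratings: iterable of (user_id, item_id) pairs, one per rating (assumed format)."""
    pairs = set(ratings)  # ignore duplicate ratings of the same pair
    users = {user for user, _ in pairs}
    item_counts = Counter(item for _, item in pairs)
    n_users, n_items = len(users), len(item_counts)

    # Sparsity: share of the user-item matrix that holds no rating.
    sparsity = 1.0 - len(pairs) / (n_users * n_items)

    # Rank items by popularity; the top `head_share` of items is the short head
    # (the 20% default is an illustrative assumption).
    ranked = [count for _, count in item_counts.most_common()]
    head_size = max(1, round(head_share * n_items))

    return {
        "sparsity": sparsity,
        "avg_ratings_per_item": len(pairs) / n_items,
        "head_rating_share": sum(ranked[:head_size]) / len(pairs),
    }

toy = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u1", "b"), ("u2", "c")]
stats = dataset_stats(toy)
```

A high `head_rating_share` means that a small set of items collects most of the ratings, which is the pattern the long-tail plot makes visible.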

Build recommendation system

The user can also build a recommendation system, choosing from a variety of algorithms and evaluation metrics provided by the Elliot framework.

The user can keep the default hyperparameter values of each algorithm or, if more experienced, set custom values.

Bias identification

After building a recommender system, the user can analyze the generated recommendations using the evaluation metrics of their choice. There are 24 metrics available, covering accuracy, popularity bias, coverage, diversity and novelty, all provided by the Elliot framework. The user can either analyze the results of a single dataset or compare the results of different datasets.
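To give a feel for what such metrics measure, here is a minimal sketch of two of them: item coverage and average recommendation popularity. These are simplified, hand-written versions for illustration only and are not Elliot's implementations; the function names and the input formats are assumptions.

```python
from collections import Counter

def item_coverage(recs, catalog):
    """Fraction of catalog items that appear in at least one recommendation list."""
    recommended = {item for items in recs.values() for item in items}
    return len(recommended & set(catalog)) / len(catalog)

def average_recommendation_popularity(recs, train_pairs):
    """Mean training popularity of recommended items, averaged over users."""
    popularity = Counter(item for _, item in train_pairs)
    per_user = [
        sum(popularity[item] for item in items) / len(items)
        for items in recs.values()
        if items  # skip users with an empty list
    ]
    return sum(per_user) / len(per_user)

# Toy data (hypothetical): training interactions and top-2 lists per user.
train = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u1", "b"), ("u2", "c")]
recs = {"u1": ["a", "c"], "u2": ["a", "b"], "u3": ["a", "b"]}

coverage = item_coverage(recs, catalog=["a", "b", "c", "d"])
arp = average_recommendation_popularity(recs, train)
```

Low coverage combined with a high average popularity suggests that the recommender concentrates on a few popular items.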

Best results

Hyperparameter analysis

Cutoff analysis

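A cutoff analysis recomputes a metric while truncating every recommendation list to its top-k items, for several values of k. The sketch below does this for item coverage on toy data. It is an illustrative assumption of how such a curve can be computed, not the app's code, and `coverage_at_k` is a hypothetical helper.

```python
def coverage_at_k(recs, catalog_size, k):
    """Item coverage when every list is truncated to its top-k items."""
    seen = {item for items in recs.values() for item in items[:k]}
    return len(seen) / catalog_size

# Toy ranked lists (hypothetical), best item first.
recs = {
    "u1": ["a", "b", "c", "d"],
    "u2": ["a", "c", "b", "e"],
    "u3": ["a", "b", "d", "f"],
}

# One metric value per cutoff, ready to be plotted against k.
curve = {k: coverage_at_k(recs, catalog_size=8, k=k) for k in (1, 2, 4)}
```

Here coverage grows with k because less popular items only appear further down the lists.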

Dataset comparison

Bias mitigation

The bias mitigation techniques used belong to the category of post-processing techniques, and more specifically to re-ranking algorithms. These techniques take as input a recommendation list for every user in the dataset, produced by an algorithm of the user's choice. Three bias mitigation algorithms are available in the app: FAR, PFAR and FA*IR. These algorithms are provided by Librec-auto.

When the bias mitigation process has completed, the app shows plots comparing the mitigated results with the initial results.
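The general idea of post-processing re-ranking can be sketched as a greedy loop that trades relevance against a bonus for long-tail items. The code below is a generic illustration under stated assumptions: it is not FAR, PFAR or FA*IR, it does not use the Librec-auto API, and the function `rerank`, the weight `lam` and the halving bonus are hypothetical choices.

```python
def rerank(candidates, long_tail, k, lam=0.3):
    """Greedy re-ranking sketch.

    candidates: {item: relevance score in [0, 1]} for one user (assumed format).
    long_tail:  set of items regarded as unpopular.
    lam:        weight of the long-tail bonus against relevance.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        tail_so_far = sum(1 for item in selected if item in long_tail)

        def gain(item):
            # The long-tail bonus halves with every long-tail item already picked.
            bonus = 0.5 ** tail_so_far if item in long_tail else 0.0
            return (1 - lam) * remaining[item] + lam * bonus

        best = max(remaining, key=gain)
        selected.append(best)
        del remaining[best]
    return selected

scores = {"a": 0.9, "b": 0.8, "c": 0.7, "x": 0.6, "y": 0.5}
baseline = sorted(scores, key=scores.get, reverse=True)[:3]
reranked = rerank(scores, long_tail={"x", "y"}, k=3, lam=0.3)
```

With these toy scores the baseline top-3 contains no long-tail item, while the re-ranked list promotes one at the cost of dropping a slightly more relevant popular item. This is the kind of accuracy-bias tradeoff the comparison plots are meant to expose.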

Metrics explanation

The app also provides detailed yet simple explanations, in non-technical terms, for all the evaluation metrics it contains.

Upload data

📜 License

Creative Commons Zero v1.0 Universal

About

Master's thesis for the Computer Engineering and Informatics Department of the University of Patras
