This repository is inspired by and built on the CARLA library. CARLA is a Python library for benchmarking counterfactual explanation and recourse models. It comes out of the box with commonly used datasets and various machine learning models, and it is designed with extensibility in mind: easily include your own counterfactual methods, new machine learning models, or other datasets. Find extensive documentation here! Our arXiv paper can be found here.
What is algorithmic recourse? As machine learning (ML) models are increasingly deployed in high-stakes applications, there has been growing interest in providing recourse to individuals adversely impacted by model predictions (e.g., below we depict the canonical recourse example of an applicant whose loan has been denied). This library provides a starting point for researchers and practitioners alike who wish to understand the inner workings of various counterfactual explanation and recourse methods, and the assumptions that went into their design.
- Getting Started (notebook): Source
- Causal Recourse (notebook): Source
- Plotting (notebook): Source
- Benchmarking (notebook): Source
- Adding your own Data: Source
- Adding your own ML-Model: Source
- Adding your own Recourse Method: Source 1 Source 2
Name | Source |
---|---|
Adult | Source |
COMPAS | Source |
Give Me Some Credit (Credit) | Source |
German Credit | Source |
Mortgage | |
TwoMoon | |
Model | Description | Tensorflow | Pytorch | Sklearn | XGBoost |
---|---|---|---|---|---|
MLP (ANN) | Multi-Layer Perceptron (Artificial Neural Network) with 2 hidden layers and ReLU activation function. | X | X | | |
LR | Linear Model with no hidden layer and no activation function. | X | X | | |
RandomForest | Tree Ensemble Model. | | | X | X |
Which ML framework a counterfactual method currently works with depends on its underlying implementation. We plan to make all recourse methods available for all ML frameworks. The latest state can be found here:
Recourse Method | Paper | Tensorflow | Pytorch | SKlearn | XGBoost |
---|---|---|---|---|---|
CCHVAE | Source | | X | | |
Contrastive Explanations Method (CEM) | Source | X | | | |
Counterfactual Latent Uncertainty Explanations (CLUE) | Source | | X | | |
CRUDS | Source | | X | | |
Diverse Counterfactual Explanations (DiCE) | Source | X | X | | |
Feasible and Actionable Counterfactual Explanations (FACE) | Source | X | X | | |
FeatureTweak | Source | | | X | X |
FOCUS | Source | | | X | X |
Growing Spheres (GS) | Source | X | X | | |
MACE | Source | | | X | |
REVISE | Source | | X | | |
Wachter | Source | | X | | |
- python3.7
- pip
Using python directly or within activated virtual environment:
pip install -U pip setuptools wheel
pip install -r requirements-dev.txt
pip install -e .
flowchart TD
A[Recourse Methods] --> B[_Deprecated]
A --> C[Data]
A --> D[Evaluation]
A --> E[Live Site]
A --> F[Models]
A --> G[Recourse Methods]
C --> H[Catalog]
D --> I[Catalog]
D --> J[benchmark.py]
F --> K[Catalog]
G --> L[Catalog]
H --> M[online_catalog.py]
I --> N[Benchmark metrics, e.g. distance, time, etc.]
K --> O[catalog.py]
L --> P[Recourse folders, e.g. dice, cchvae, etc.]
E --> Q[Server.py]
A --> R[run_experiment.py]
This folder contains deprecated material from the CARLA library that is no longer considered useful for the current development efforts of this repository.
This folder houses all datasets and their cached versions. It also contains the data catalog class, which includes methods for loading datasets and other relevant functionalities.
This folder contains the implementation of all evaluation and benchmark metrics used to compare recourse methods in the repository. This includes metrics such as distance, redundancy, success rate, time, violations, and y-nearest-neighbors (yNN).
This folder contains the implementation of the frontend UI, which displays the results stored in `results.csv` from executing `./experiments/run_experiment.py`.
This folder contains all implemented models/classifiers in the repository. It also contains the model catalog class, which includes methods for loading models/classifiers and other relevant functionalities.
This folder contains all the implemented recourse methods in the repository. Each recourse method has its own subfolder within the catalog directory (`methods/catalog`) and is implemented against the `RecourseMethod` API class interface.
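For orientation, here is a minimal, purely illustrative sketch of what a new recourse method could look like against this interface. The import path, base-class constructor signature, and class name are assumptions rather than the repository's actual API:

```python
# Illustrative sketch only -- import path and base-class signature are assumed.
import pandas as pd

from methods.api import RecourseMethod  # hypothetical import path


class MyRecourseMethod(RecourseMethod):
    """Toy recourse method that returns the factuals unchanged."""

    def __init__(self, mlmodel, hyperparams=None):
        super().__init__(mlmodel)  # assumed: the base class stores the black-box model
        self.hyperparams = hyperparams or {}

    def get_counterfactuals(self, factuals: pd.DataFrame) -> pd.DataFrame:
        # A real method would search for minimal feature changes that flip the
        # model's prediction; this placeholder simply returns copies of the inputs.
        return factuals.copy()
```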
from data.catalog import DataCatalog
from evaluation import Benchmark
import evaluation.catalog as evaluation_catalog
from models.catalog import ModelCatalog
from random import seed
from methods import GrowingSpheres
RANDOM_SEED = 54321
seed(RANDOM_SEED) # set the random seed so that the random permutations can be reproduced again
# load a catalog dataset
data_name = "adult"
dataset = DataCatalog(data_name, "mlp", 0.8)
# load artificial neural network from catalog
model = ModelCatalog(dataset, "mlp", "tensorflow")
# get factuals from the data to generate counterfactual examples
factuals = (dataset._df_train).sample(n=10, random_state=RANDOM_SEED)
# load a recourse model and pass black box model
gs = GrowingSpheres(model)
# generate counterfactual examples
counterfactuals = gs.get_counterfactuals(factuals)
# Generate Benchmark for recourse method, model and data
benchmark = Benchmark(model, gs, factuals)
evaluation_measures = [
evaluation_catalog.YNN(benchmark.mlmodel, {"y": 5, "cf_label": 1}),
evaluation_catalog.Distance(benchmark.mlmodel),
evaluation_catalog.SuccessRate(),
evaluation_catalog.Redundancy(benchmark.mlmodel, {"cf_label": 1}),
evaluation_catalog.ConstraintViolation(benchmark.mlmodel),
evaluation_catalog.AvgTime({"time": benchmark.timer}),
]
df_benchmark = benchmark.run_benchmark(evaluation_measures)
print(df_benchmark)
Using python directly or within activated virtual environment:
python .\quick_start.py
This interface displays the results from running the recourse benchmark library with a range of datasets, models and recourse methods.
cd .\live_site
pip install -r .\requirements.txt
python .\server.py
Read more here to learn about amending the live site tool.
Using python directly or within activated virtual environment:
pip install -r requirements-dev.txt
python -m pytest .\tools\sanity_test.py
Before running the command below, clear out all former computations from the `results.csv` file. Keep the header (first line) of the CSV file and delete only the computation results.
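If helpful, a small sketch (assuming `results.csv` sits at the repository root) that truncates the file to just its header:

```python
# Truncate results.csv to its header row; the path is an assumption.
with open("results.csv") as f:
    header = f.readline()

with open("results.csv", "w") as f:
    f.write(header)
```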
python -m experiments.run_experiment
black .\
Contributions of any kind are very much welcome! Take a look at the To-Do issues section to see what we are currently working on. If you have an idea for a new feature or a bug you want to fix, please look at the subsections below for our branching and commit policies, and make a PR with your suggestions.
- Branch off of `main` for all feature work and bug fixes, and create a "feature branch". Prefix the feature branch name with your name. The branch name should be in snake case, short, and descriptive. E.g. `abu/readme_update`
- Commits should be atomic (guideline: the commit is self-contained; a reviewer could make sense of it even if they viewed the commit diff in isolation).
- Commit messages and PR names should be descriptive and written in the imperative mood. Commit messages should also indicate whether they are a feature or a fix, e.g. "feat: create user REST endpoints" or "fix: remove typo in readme".
- PRs can contain multiple commits; they do not need to be squashed together before merging as long as each commit is atomic.
- Modules and Packages: Use lowercase letters with underscores to separate words. Example: minimum_observables.py, growing_spheres.py.
- Classes: Use CamelCase for class names. Example: MinimumObservables, GrowingSpheres.
- Functions and Methods: Use lowercase with underscores to separate words. Example: load_data(), get_counterfactuals().
- Variables: Use lowercase with underscores. Be descriptive but concise. Example: training_data, learning_rate.
- Constants: Use uppercase letters with underscores to separate words. Example: MAX_EPOCHS, DEFAULT_BATCH_SIZE.
- Global Variables: Use a prefix like g_ or global_ to indicate that it is a global variable. Example: g_model_dir, global_logger. (A short example combining these conventions is shown below.)
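Putting these conventions together, a module might look like the following (purely illustrative names):

```python
# minimum_observables.py -- illustrative example of the naming conventions above.

MAX_EPOCHS = 100          # constant: uppercase with underscores
DEFAULT_BATCH_SIZE = 32   # constant

g_model_dir = "./models"  # global variable: g_ prefix


class MinimumObservables:                    # class: CamelCase
    def __init__(self, learning_rate=0.01):  # variables: lowercase with underscores
        self.learning_rate = learning_rate

    def get_counterfactuals(self, training_data):  # method: lowercase with underscores
        return training_data
```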
- We advise future contributors to consider using PyTorch for their recourse implementations whenever possible. This recommendation stems from our past experience, which has shown that PyTorch benefits from larger community support and offers easier refactoring, in contrast to TensorFlow, which tends to be more susceptible to version changes.
- It is essential that implemented algorithms closely match the research paper they are derived from. Therefore, every implemented algorithm must be accompanied by a `reproduce.py` test file (in the corresponding folder in `methods/catalog`). This file should contain unit tests that replicate the experiments presented in the corresponding research paper, ensuring that the results obtained are consistent with those reported in the paper, within an acceptable margin of error. A rough sketch of such a file is shown below.
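As an illustration only, a `reproduce.py` file could contain pytest-style checks of the following shape. The dataset, method, sample size, and reported figure are placeholders rather than values from any particular paper, and the success-rate computation assumes failed counterfactuals come back as NaN rows:

```python
# reproduce.py -- sketch of a reproduction test; all numbers are placeholders.
import pytest

from data.catalog import DataCatalog
from models.catalog import ModelCatalog
from methods import GrowingSpheres  # swap in the method being reproduced


def test_success_rate_matches_paper():
    dataset = DataCatalog("adult", "mlp", 0.8)
    model = ModelCatalog(dataset, "mlp", "tensorflow")
    method = GrowingSpheres(model)

    factuals = dataset._df_train.sample(n=100, random_state=0)
    counterfactuals = method.get_counterfactuals(factuals)

    # Assumes rows without a valid counterfactual are returned as NaN.
    success_rate = counterfactuals.dropna().shape[0] / factuals.shape[0]

    # Placeholder: compare against the value reported in the paper.
    assert success_rate == pytest.approx(0.95, abs=0.05)
```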
- Expand the existing repository of available recourse methods to include new recourse methods.
- Minimum Observables - https://arxiv.org/abs/1907.04135
- ClaPROAR - https://arxiv.org/abs/2308.08187
- PROBE - https://arxiv.org/abs/2203.06768
- Extend the existing frontend design to incorporate new interactive features (Adjourned).
- Revamp the entire library to a newer Python version.
- Refactor existing methods to utilize a singular backend type.
- Extend the repository to be installable through pip.