Feedback Forensics is a tool to investigate pairwise feedback data used for AI training and evaluation: when used for training, what is the data teaching our models? When used for evaluation, towards what kind of models is the feedback leading us? Is this feedback asking for more lists or for more ethically considerate responses? Feedback Forensics enables answering these kinds of questions, building on the Inverse Constitutional AI (ICAI) pipeline to automatically detect and measure the implicit objectives of annotations. Feedback Forensics is an open-source Gradio app that can be used both online and locally.
"Investigate your pairwise feedback data" 🕵🏻♂️💬
pip install feedback-forensics
To start the app locally, run the following command in your terminal:
feedback-forensics -d data/output/example
This will start the Gradio interface on localhost port 7860 (e.g. http://localhost:7860).
Note
The online results are currently not available when running locally.
To investigate your own dataset, you need to run your own Inverse Constitutional AI (ICAI) experiment. Install the ICAI package as described here, including setting up the relevant API secrets. For comparability, we initially recommend using the ICAI standard principles rather than generating new ones. These standard principles are used to create the online interface results (shown as the implicit objectives). With the package installed, run:
icai-exp data_path="data/input/example.csv" s0_added_standard_principles_to_test="[v2]" annotator.skip=true s0_skip_principle_generation=true
Replace example.csv with your own dataset, ensuring it complies with the ICAI standard data format (as described here, i.e. containing columns text_a, text_b, and preferred_text). The last two arguments (annotator.skip and s0_skip_principle_generation) reduce experiment cost by skipping parts not necessary for Feedback Forensics visualisation. Set s0_skip_principle_generation=false to additionally generate new principles beyond the standard set.
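For illustration, a minimal input file in this format might look as follows. The column names come from the ICAI format described above; the example rows, and the convention that preferred_text names the preferred column ("text_a" or "text_b"), are assumptions that should be checked against the ICAI documentation.

```bash
# Hypothetical example dataset in the ICAI standard format.
# The rows and the "text_a"/"text_b" values in preferred_text are assumptions.
cat > data/input/example.csv << 'EOF'
text_a,text_b,preferred_text
"Here is a concise bullet-point summary of the article...","Here is a long narrative summary of the article...",text_a
"I cannot help with that request.","Sure, here is one way to approach that...",text_b
EOF
```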
Once the experiment has completed, run the following command (also shown at the end of the ICAI experiment's terminal output):
feedback-forensics -d /path/to/icai_results/
This command will again open the Feedback Forensics app on localhost port 7860, now including the local results for your own dataset.
Feedback Forensics relies on AI annotators (LLM-as-a-Judge) to detect implicit objectives in feedback data. Though such annotators have been shown to correlate with human judgements on many tasks, they also have well-known limitations: they are often susceptible to small input changes and can exhibit various biases (as can human annotators). As such, Feedback Forensics results should be taken as an indication for further investigation rather than a definitive judgement of the data. In general, results based on more samples are less susceptible to noise introduced by AI annotators, and may thus be considered more reliable.
If you want to contribute to Feedback Forensics, there are two options to set up the development environment:
- Clone this repository
- Install the package with development dependencies:
pip install -e ".[dev]"
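As a sketch, assuming the repository URL below (the URL is not stated in this section and is an assumption), the setup could look like:

```bash
# Sketch of the development setup described above.
# The repository URL is an assumption; replace it with the actual repo if it differs.
git clone https://github.com/rdnfn/feedback-forensics.git
cd feedback-forensics
pip install -e ".[dev]"
```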
For a consistent development environment, this repository includes a VS Code dev container configuration:
- Install the Remote - Containers extension
- Open the repository in VS Code
- Click "Reopen in Container" when prompted
To run the tests for the package, run:
pytest ./src
First create a PR to the staging branch; from there, the work will then be merged into the main branch. A merge (and push) to the staging branch will allow you to view the staged online version of the Feedback Forensics app at https://rdnfn-ff-dev.hf.space.
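As a rough sketch of that flow (the feature branch name below is purely illustrative):

```bash
# Example contribution flow; the branch name "my-feature" is illustrative.
git checkout -b my-feature
# ...commit your changes...
git push origin my-feature
# Then open a PR on GitHub targeting the staging branch.
```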
Ensure that the current branch is up-to-date with main, and then bump the version (using patch, minor, or major):
bump-my-version bump patch
Then, on the GitHub website, create a new release named after the new version (e.g. "v0.1.2"). As part of this release, create a new tag with the updated version in the GitHub interface. Publishing the release will trigger a GitHub Action that builds and uploads the PyPI package.