Feedback Forensics is a tool to investigate pairwise feedback data used for AI training and evaluation: when used for training, what is the data teaching our models? When used for evaluation, towards what kind of models is the feedback leading us? Is this feedback asking for more lists or for more ethically considerate responses? Feedback Forensics enables answering these kinds of questions, building on the Inverse Constitutional AI (ICAI) pipeline to automatically detect and measure the implicit objectives of annotations. Feedback Forensics is an open-source Gradio app that can be used both online and locally.
"Investigate your pairwise feedback data" 🕵🏻♂️💬
pip install feedback-forensics
To start the app locally, run the following command in your terminal:
feedback-forensics -d data/output/example
This will start the Gradio interface on localhost port 7860 (e.g. http://localhost:7860).
Note
The online results are currently not available when running locally.
To investigate your own dataset, you need to run your own Inverse Constitutional AI (ICAI) experiment. Install the ICAI package as described here, including setting up the relevant API secrets. For comparability, we initially recommend using the ICAI standard principles rather than generating new ones. These standard principles are used to create the online interface results (shown as the implicit objectives). With the package installed, run:
icai-exp data_path="data/input/example.csv" s0_added_standard_principles_to_test="[v2]" annotator.skip=true s0_skip_principle_generation=true
Replace example.csv with your own dataset, ensuring it complies with the ICAI standard data format (as described here, i.e. containing columns text_a, text_b, and preferred_text). The last two arguments (annotator.skip and s0_skip_principle_generation) reduce experiment cost by skipping parts not necessary for Feedback Forensics visualisation. Set s0_skip_principle_generation=false to additionally generate new principles beyond the standard set.
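For illustration, a minimal input file in this format might look as follows. The column names come from the ICAI format described above; the example rows, and the convention that preferred_text names the preferred column ("text_a" or "text_b"), are assumptions that should be checked against the ICAI documentation.

```bash
# Hypothetical example dataset in the ICAI standard format.
# The rows and the "text_a"/"text_b" values in preferred_text are assumptions.
cat > data/input/example.csv << 'EOF'
text_a,text_b,preferred_text
"Here is a concise bullet-point summary of the article...","Here is a long narrative summary of the article...",text_a
"I cannot help with that request.","Sure, here is one way to approach that...",text_b
EOF
```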
Once the experiment has completed, run the following command (also shown at the end of the ICAI experiment's terminal output):
feedback-forensics -d /path/to/icai_results/
This command will again open the Feedback Forensics app on localhost port 7860, now including the local results for your own dataset.
Feedback Forensics relies on AI annotators (LLM-as-a-Judge) to detect implicit objectives in feedback data. Though such annotators have been shown to correlate with human judgements on many tasks, they also have well-known limitations: they are often susceptible to small input changes and can exhibit various biases (as can human annotators). As such, Feedback Forensics results should be taken as an indication for further investigation rather than a definitive judgement of the data. In general, results based on more samples are less susceptible to noise introduced by AI annotators, and may thus be considered more reliable.
If you want to contribute to Feedback Forensics, there are two options to set up the development environment:
- Clone this repository
- Install the package with development dependencies:
pip install -e ".[dev]"
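As a sketch, assuming the repository URL below (the URL is not stated in this section and is an assumption), the setup could look like:

```bash
# Sketch of the development setup described above.
# The repository URL is an assumption; replace it with the actual repo if it differs.
git clone https://github.com/rdnfn/feedback-forensics.git
cd feedback-forensics
pip install -e ".[dev]"
```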
For a consistent development environment, this repository includes a VS Code dev container configuration:
- Install the Remote - Containers extension
- Open the repository in VS Code
- Click "Reopen in Container" when prompted
To run the tests for the package, run:
pytest ./src
First create a PR to the staging branch; from there, the work will then be merged into the main branch. A merge (and push) to the staging branch will allow you to view the staged online version of the Feedback Forensics app at https://rdnfn-ff-dev.hf.space.
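As a rough sketch of that flow (the feature branch name below is purely illustrative):

```bash
# Example contribution flow; the branch name "my-feature" is illustrative.
git checkout -b my-feature
# ...commit your changes...
git push origin my-feature
# Then open a PR on GitHub targeting the staging branch.
```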
Ensure that the current branch is up-to-date with main, and then bump the version (using patch, minor, or major):
bump-my-version bump patch
Then, on the GitHub website, create a new release named after the new version (e.g. "v0.1.2"). As part of this release, create a new tag with the updated version in the GitHub interface. Publishing the release will trigger a GitHub Action that builds and uploads the PyPI package.