Abstraction Alignment is a methodology to measure the alignment between model decisions and formal human knowledge. This repo contains code to recreate the experiments in Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships.
To explore the case studies in a live interactive interface instead, check out the abstraction alignment interface.
Abstraction alignment is a new method for understanding human-AI alignment. By comparing AI decisions against existing formalizations of human knowledge, it examines how AI models learn concepts and relate them to form abstractions. Working alongside experts in computer vision, LLMs, and medicine, we find that abstraction alignment enhances AI transparency by revealing if and when models generalize in human-like ways, and improves domain understanding by identifying ways to refine human knowledge.
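Conceptually, abstraction alignment propagates a model's output probabilities up a human-defined concept hierarchy and examines how that probability mass concentrates at higher levels of abstraction. The snippet below is a minimal sketch of that idea on a toy hierarchy; the names and structure are illustrative assumptions, not the repo's API (see `graph.py` and `metrics.py` for the actual implementation).

```python
# Minimal sketch: propagate model probabilities up an abstraction hierarchy.
# Names are illustrative; see graph.py / metrics.py for the real implementation.
from collections import defaultdict

# Toy abstraction graph: each concept maps to its parent concept.
parents = {
    "maple": "tree", "oak": "tree",       # trees
    "rose": "flower", "tulip": "flower",  # flowers
    "tree": "plant", "flower": "plant",   # plants
}

def propagate(leaf_probs):
    """Sum each leaf's probability into all of its ancestors."""
    node_probs = defaultdict(float, leaf_probs)
    for leaf, prob in leaf_probs.items():
        node = leaf
        while node in parents:
            node = parents[node]
            node_probs[node] += prob
    return dict(node_probs)

# A model that confuses maple/oak still aligns with the "tree" abstraction.
leaf_probs = {"maple": 0.45, "oak": 0.45, "rose": 0.05, "tulip": 0.05}
print(propagate(leaf_probs))
# -> tree: 0.9, flower: 0.1, plant: 1.0 (plus the original leaf probabilities)
```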
Explore abstraction alignment on our three case studies:
- 📷 Interpreting Image Model Behavior: Notebook | Interface
- 🤖 Benchmarking Language Model Specificity: Notebook | Interface
- 🏥 Analyzing Medical Dataset Encodings: Notebook | Interface
# Notebooks to explore abstraction alignment
abstraction_alignment_cifar.ipynb # Interpreting Image Model Behavior case study (Sec. 5.1)
abstraction_alignment_llm.ipynb # Benchmarking Language Model Specificity case study (Sec. 5.2)
abstraction_alignment_mimic.ipynb # Analyzing Medical Dataset Encodings case study (Sec. 5.3)
abstraction_alignment_toy_example.ipynb # Additional toy example of abstraction alignment
# Supporting data files to compute abstraction alignment
abstraction_graph_cifar.py
abstraction_graph_mimic.py
abstraction_graph_toy_example.py
graph.py
metrics.py
util/cifar/ # Utility files for the CIFAR example
util/llm/ # Utility files and data for the LLM example
util/toy_example/ # Utility files for the toy example
# Code to extract interface data files
extract_data_cifar.py
extract_data_mimic.py
extract_data_llm.py
interface/data/cifar/ # Stores the CIFAR model data files
interface/data/llm/ # Stores the LLM model data files
interface/data/mimic/ # Stores the MIMIC-III data files
The CIFAR-100 data will automatically download during analysis in the notebooks.
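For context, the automatic download typically happens through torchvision's built-in CIFAR-100 loader. The snippet below is only an illustration of that behavior and assumes torchvision is installed; the notebooks may configure the dataset differently.

```python
# Illustration only: torchvision downloads CIFAR-100 on first use when
# download=True, which is the behavior the notebooks rely on.
from torchvision import datasets, transforms

cifar_test = datasets.CIFAR100(
    root="./data",                   # where the archive is stored
    train=False,                     # e.g., load the test split
    download=True,                   # fetch the data if not already present
    transform=transforms.ToTensor(),
)
print(len(cifar_test), "test images;", len(cifar_test.classes), "classes")
```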
- Follow the instructions to download the S-TEST dataset and put it in a folder called `./util/llm/S-TEST`.
- Run `python ./util/llm/S-TEST/scripts/run_experiments.py` to compute the model's output on the data.
- Update the paths in `./extract_data_wordnet.py` to match your file structure. (This case study uses WordNet as the human abstraction; see the sketch after these steps.)
- Request access to MIMIC-III via PhysioNet.
- Download the MIMIC-III dataset and update the paths in `./extract_data_mimic.py` to point to it (a hedged sketch of reading the diagnosis table follows this list).
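The medical case study treats the ICD-9 code hierarchy as the human abstraction over MIMIC-III diagnoses. As a hedged sketch (assuming a local copy of MIMIC-III's `DIAGNOSES_ICD.csv` and pandas; `extract_data_mimic.py` is the authoritative pipeline), rolling diagnosis codes up one level of the hierarchy looks roughly like this:

```python
# Hedged sketch: load MIMIC-III diagnoses and roll ICD-9 codes up to their
# three-digit category, one level of the abstraction hierarchy.
# extract_data_mimic.py is the authoritative pipeline; paths are placeholders.
import pandas as pd

MIMIC_DIR = "/path/to/mimic-iii"  # placeholder: update to your local copy

diagnoses = pd.read_csv(f"{MIMIC_DIR}/DIAGNOSES_ICD.csv", dtype={"ICD9_CODE": str})

# Most ICD-9 codes use their first three characters as the category
# (e.g. 4280 "Congestive heart failure" falls under 428 "Heart failure").
diagnoses["ICD9_CATEGORY"] = diagnoses["ICD9_CODE"].str[:3]
print(diagnoses.groupby("ICD9_CATEGORY").size().sort_values(ascending=False).head())
```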
The `extract_data_*.py` scripts create the data files needed to run the abstraction alignment interface. Run them if you'd like to run the interface locally, or use them as a reference for setting up the data files to run the interface with your own data.
Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships. Angie Boggust, Hyemin Bang, Hendrik Strobelt, and Arvind Satyanarayan. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI). 2025.
@inproceedings{boggust2025abstraction,
title = {{Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships}},
author = {Angie Boggust and Hyemin Bang and Hendrik Strobelt and Arvind Satyanarayan},
booktitle = {ACM Human Factors in Computing Systems (CHI)},
year = {2025},
doi = {10.1145/3706598.3713406},
url = {http://vis.csail.mit.edu/pubs/abstraction-alignment}
}