Detecting Edit Failures in LLMs: An Improved Specificity Benchmark

This repository contains the code for the paper Detecting Edit Failures in LLMs: An Improved Specificity Benchmark (ACL Findings 2023).

It extends previous work on model editing by Meng et al. [1] by introducing a new benchmark, called CounterFact+, for measuring the specificity of model edits.
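The intuition behind a specificity benchmark is that a successful edit should change the target fact while leaving unrelated, neighbouring facts untouched. The following is a minimal illustrative sketch (not the repository's implementation; the model name, the prompt, and the use of a second unmodified model as a stand-in for an edited one are assumptions) of how bleed-over onto a neighbourhood prompt can be quantified as a divergence between the unedited and edited models' next-token predictions.

import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2-xl")
base = AutoModelForCausalLM.from_pretrained("gpt2-xl")
edited = AutoModelForCausalLM.from_pretrained("gpt2-xl")  # stand-in: replace with a model after applying an edit

def next_token_logprobs(model, prompt):
    # Log-probabilities over the vocabulary for the token following `prompt`.
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return F.log_softmax(logits, dim=-1)

def neighborhood_divergence(prompt):
    # KL divergence between the unedited and edited next-token distributions.
    # Large values on prompts about unrelated facts indicate poor specificity.
    p = next_token_logprobs(base, prompt)    # unedited model
    q = next_token_logprobs(edited, prompt)  # edited model
    return F.kl_div(q, p, log_target=True, reduction="sum").item()

# A neighbourhood prompt shares surface context with the edited fact
# but should be answered the same way before and after the edit.
print(neighborhood_divergence("The Eiffel Tower is located in the city of"))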

Attribution

The repository is a fork of MEMIT, which implements the model editing algorithms MEMIT (Mass Editing Memory in a Transformer) and ROME (Rank-One Model Editing). Our fork extends this code with additional evaluation scripts implementing the CounterFact+ benchmark. For installation instructions see the original repository.

Installation

We recommend conda for managing Python, CUDA, and PyTorch; pip is for everything else. To get started, simply install conda and run:

CONDA_HOME=$CONDA_HOME ./scripts/setup_conda.sh

$CONDA_HOME should be the path to your conda installation, e.g., ~/miniconda3.
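For example, assuming a Miniconda installation in the default location:

CONDA_HOME=~/miniconda3 ./scripts/setup_conda.sh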

Running Experiments

See INSTRUCTIONS.md for instructions on how to run the experiments and evaluations.

How to Cite

If you find our paper useful, please consider citing it as:

@inproceedings{jason2023detecting,
title         = {Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark},
author        = {Hoelscher-Obermaier, Jason and Persson, Julia and Kran, Esben and Konstas, Ioannis and Barez, Fazl},
booktitle     = {Findings of ACL},
year          = {2023},
organization  = {Association for Computational Linguistics}
}
