Dataset/update climate fever #1873
Looks amazing!! Can you maybe share the results on the old ClimateFEVER for one of those models? Overall, do you think this is a net improvement over ClimateFEVER? If so, it may be worth incorporating it in some benchmarks (or in future benchmarks). cc @KennethEnevoldsen
Wonderful addition! A few minor changes.
We sadly can't update the actual benchmark, since this would break backward compatibility and require us to rerun all models on the leaderboard.
However, future versions of the benchmark will likely use this updated version.
@@ -72,3 +72,39 @@ class ClimateFEVERHardNegatives(AbsTaskRetrieval):
primaryClass={cs.CL}
}""",
)


class ClimateFEVERv2(AbsTaskRetrieval):
You will need to add superseded_by to ClimateFEVER.
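A minimal sketch of what this request amounts to, using stand-in classes rather than the real mteb ones. It assumes superseded_by is a class-level attribute holding the name of the replacement task; TaskBase, REGISTRY, and resolve_latest are hypothetical names used only for illustration.

```python
from typing import Optional


class TaskBase:
    """Hypothetical stand-in for the task base class (not the mteb API)."""

    name: str = ""
    # Assumption: an old task names its replacement here; None means current.
    superseded_by: Optional[str] = None


class ClimateFEVER(TaskBase):
    name = "ClimateFEVER"
    superseded_by = "ClimateFEVERv2"  # points at the updated task


class ClimateFEVERv2(TaskBase):
    name = "ClimateFEVERv2"


# Hypothetical lookup table from task name to task class.
REGISTRY = {cls.name: cls for cls in (ClimateFEVER, ClimateFEVERv2)}


def resolve_latest(task_name: str) -> str:
    """Follow superseded_by links until reaching a task with no successor."""
    seen = set()
    while REGISTRY[task_name].superseded_by is not None:
        if task_name in seen:
            raise ValueError(f"cycle in superseded_by chain at {task_name}")
        seen.add(task_name)
        task_name = REGISTRY[task_name].superseded_by
    return task_name
```

With this in place the old task stays runnable for existing leaderboard results, while anything that asks for the latest version can follow the link to the new one.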
If we want to name tasks consistently, we should probably call this:
- class ClimateFEVERv2(AbsTaskRetrieval):
+ class ClimateFEVERRetrievalv2(AbsTaskRetrieval):
The same goes for the name.
description="CLIMATE-FEVER is a dataset adopting the FEVER methodology that consists of 1,535 real-world claims regarding climate-change. ",
reference="https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html",
dataset={
"path": "Mina76/climate-fever",
Would love to move this over to the mteb org to ensure that it doesn't get taken down (I have sent you an invite to the org).
Not to say that you would do that, but it has happened before (often when people are just cleaning up their datasets).
domains=["Academic"],
task_subtypes=["Question answering"],
license="cc-by-sa-4.0",
annotations_creators="human-annotated",
If the metadata is not filled out in the old one, could you move this up there as well?
main_score="ndcg_at_10",
date=("2020-12-11", "2020-12-11"),
domains=["Academic"],
task_subtypes=["Question answering"],
Isn't it Claim Verification?
eval_langs=["eng-Latn"],
main_score="ndcg_at_10",
date=("2020-12-11", "2020-12-11"),
domains=["Academic"],
- domains=["Academic"],
+ domains=["Academic", "Written"],
What is the source data of ClimateFEVER? Research articles? (It would be great to update the description to make this clearer.)
eval_splits=["test"],
eval_langs=["eng-Latn"],
main_score="ndcg_at_10",
date=("2020-12-11", "2020-12-11"),
The date should refer to when the source data was written. E.g. articles from the period 2014-2018.
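As a rough illustration of the date field discussed here, the sketch below parses a (start, end) pair of ISO dates and checks the ordering. The helper parse_date_range is hypothetical and not part of mteb, and the 2014 to 2018 range only restates the reviewer's example; it is not the actual period of the ClimateFEVER source data.

```python
from datetime import date


def parse_date_range(date_range):
    """Parse a (start, end) pair of ISO dates and check that start <= end."""
    start, end = (date.fromisoformat(d) for d in date_range)
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    return start, end


# Value currently in the PR: start and end are the same single day.
current = parse_date_range(("2020-12-11", "2020-12-11"))

# Illustrative only: the reviewer's "2014-2018" example written as a range.
# These are NOT the verified ClimateFEVER source dates.
example = parse_date_range(("2014-01-01", "2018-12-31"))

print(current[0] == current[1])           # whether the range is a single day
print(example[0].year, example[1].year)   # first and last year of the range
```

A single-day range usually signals a release date rather than the period in which the underlying text was written, which is the distinction the comment draws.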
@Samoed shouldn't the test fail due to missing descriptive stats?
I've added these tests only for
This PR has been created related to the following issue, which updates the ClimateFEVER dataset:
Closes #1498 (comment)
I tried to use the same metadata as the original ClimateFEVER class, but when running the tests I got an error saying that some metadata fields need to be filled in. These fields need to be reviewed to make sure they are correct.
I ran the tests locally, and nothing broke in the code I added. However, 7 tests in other parts of the codebase were already failing before I made my changes, so these failures seem unrelated to this PR.
Also, the following are the results for these models:
paraphrase-multilingual-MiniLM-L12-v2
intfloat/multilingual-e5-small
Checklist
Run tests: make test
Run the formatter: make lint
Adding datasets checklist
Reason for dataset addition: the reason for updating this dataset is explained here: #1498 (comment)
Models run with the mteb -m {model_name} -t {task_name} command:
sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
intfloat/multilingual-e5-small
Subsampling: self.stratified_subsampling() under dataset_transform()
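For convenience, here is a small sketch that expands the command template from the checklist for the two models listed. It only builds the command strings and does not run them. The task name passed in is the class name from this PR and is an assumption, since the registered task name may differ; build_commands is a hypothetical helper.

```python
import shlex

# Model names copied from the checklist above.
MODELS = [
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "intfloat/multilingual-e5-small",
]


def build_commands(task_name, models=MODELS):
    """Return one shell-quoted `mteb -m MODEL -t TASK` string per model."""
    return [
        shlex.join(["mteb", "-m", model, "-t", task_name]) for model in models
    ]


if __name__ == "__main__":
    # Assumption: the task is registered under the class name used in the PR.
    for cmd in build_commands("ClimateFEVERv2"):
        print(cmd)
```

Each printed line can be pasted into a shell, or the argument list can be handed to subprocess.run once the task name has been confirmed.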