
Added benchmark object #878

Merged: 8 commits from add_benchmark_handling into result-normalization on Jun 5, 2024

Conversation

KennethEnevoldsen (Contributor)

Assuming we want a selection menu for benchmarks in the future leaderboard, I have now added a benchmark object. Otherwise, it generally shouldn't change much.

The idea is that you would select a benchmark; its description would then be shown along with some references, potentially with a dropdown for the citation.

  • Added a benchmark object (it should work like a list; see the sketch below)
  • Removed a duplicate task
  • Added SEB (we should probably add a few more benchmarks here as well)
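
For illustration, here is a minimal sketch of what such a benchmark object could look like; the class name, fields, and methods here are assumptions based on the description above, not the merged implementation:

```python
from dataclasses import dataclass

# Hypothetical sketch of the benchmark object; field names are assumptions.
@dataclass
class Benchmark:
    name: str
    tasks: list[str]              # the tasks making up the benchmark
    description: str | None = None
    reference: str | None = None  # e.g. a paper or leaderboard URL
    citation: str | None = None   # BibTeX for the citation dropdown

    def __iter__(self):
        # Iterating over the benchmark yields its tasks, so it can be
        # passed anywhere a plain list of tasks is expected.
        return iter(self.tasks)

    def __len__(self):
        return len(self.tasks)

    def __getitem__(self, index):
        return self.tasks[index]
```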


Muennighoff (Contributor) left a comment


Nice. I think we also need to change the README, which imports MTEB_MAIN_EN, i.e. https://github.com/embeddings-benchmark/mteb?tab=readme-ov-file#dataset-selection
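
For context, the README's dataset-selection example uses the constant roughly like this (a sketch only; the exact snippet, import path, and model name in the README may differ):

```python
from mteb import MTEB, MTEB_MAIN_EN
from sentence_transformers import SentenceTransformer

# Sketch of the documented pattern that must keep working after this change.
model = SentenceTransformer("all-MiniLM-L6-v2")
evaluation = MTEB(tasks=MTEB_MAIN_EN)
evaluation.run(model, output_folder="results")
```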

KennethEnevoldsen (Contributor, Author)

> Nice. I think we also need to change the README, which imports MTEB_MAIN_EN, i.e. https://github.com/embeddings-benchmark/mteb?tab=readme-ov-file#dataset-selection

Added a test, and it runs just fine (since the object has an __iter__ method). I also changed the type hint to Iterable (weaker than Sequence) to reflect that iteration is the only requirement.
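
Roughly what the test exercises (an illustrative sketch, not the actual test; the Benchmark definition is the hypothetical one from above, repeated so the example is self-contained):

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Benchmark:
    name: str
    tasks: list[str]

    def __iter__(self):
        # Delegating iteration to the task list makes the object list-like.
        return iter(self.tasks)

# The relaxed type hint: any Iterable is accepted, a weaker requirement
# than Sequence, since iteration is all the selection code needs.
def select_tasks(tasks: Iterable[str]) -> list[str]:
    return list(tasks)

benchmark = Benchmark(name="some-benchmark", tasks=["TaskA", "TaskB"])
assert select_tasks(benchmark) == select_tasks(benchmark.tasks)
```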

KennethEnevoldsen merged commit 46170d0 into result-normalization on Jun 5, 2024
6 checks passed
KennethEnevoldsen deleted the add_benchmark_handling branch on June 5, 2024 at 08:03
KennethEnevoldsen added a commit that referenced this pull request Jun 5, 2024
* Ensure results are consistently stored in the same way

- (due to a failing test) updated missing dataset references
- (to test with more than one model) added the e5 base and large models
- updated mteb.get_model to include metadata in the model object
- ensured that the model name is always included when saving (with a default when it is not available)
- use ModelMeta for model_meta.json

* format

* minor test fixes

* docs: Minor updates to the repro. workflow docs

* fixed failing test

* format

* Apply suggestions from code review

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

* docs: update PR template

* fix: Added benchmark object (#878)

* removed duplicate task

* Added benchmark object

* removed import for duplicate task

* fix dataset references

* added seb

* Added test for running benchmarks

* changed tasks to be an iterable

* format

* Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

---------

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>