fix: Add bm25s as a model for retrieval #1082

Merged: 13 commits into main from add-bm25s, Jul 16, 2024

isaac-chung (Collaborator) commented Jul 13, 2024

Addresses #990.

With Stopwords=None and the Snowball stemmer, I can reproduce the NFCorpus score from Table 3 of the paper: 32.0 NDCG@10.
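
For readers unfamiliar with the metric, here is a minimal sketch of how NDCG@10 can be computed for a single query. It assumes linear gain and a log2 rank discount; the function names and the toy relevance judgments are hypothetical and are not taken from this PR or from the evaluation code it uses. A reported figure such as 32.0 would presumably be the mean of this per-query value over all test queries, scaled by 100.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k relevance labels (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """NDCG@k for one query.

    ranked_doc_ids: document ids in the order the retriever returned them.
    qrels: dict mapping document id to graded relevance (missing ids count as 0).
    """
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    ideal = sorted(qrels.values(), reverse=True)  # best possible ordering
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents judged for this query
    return dcg_at_k(gains, k) / ideal_dcg


# Hypothetical toy judgments for one query.
qrels = {"d1": 2, "d2": 1, "d3": 0}
perfect = ndcg_at_k(["d1", "d2", "d3"], qrels)  # ideal ordering
swapped = ndcg_at_k(["d3", "d2", "d1"], qrels)  # most relevant document ranked last
```

A ranking that matches the ideal ordering scores 1.0, and pushing relevant documents further down the list lowers the score.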

Checklist

  • Move results to the results repo
  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

Adding a model checklist

  • I have filled out the ModelMeta object to the extent possible
  • I have ensured that my model can be loaded using
    • mteb.get_model(model_name, revision_id) and
    • mteb.get_model_meta(model_name, revision_id)
  • I have tested the implementation works on a representative set of tasks.

Co-authored by: @malteos github@i.mieo.de

isaac-chung requested a review from Muennighoff on July 13, 2024 16:48
isaac-chung (Collaborator, Author) commented

@Muennighoff running the rest of the English benchmark now. Will push when I have all results.

isaac-chung changed the title from "[WIP] add bm25s as a model for retrieval" to "[WIP] Add bm25s as a model for retrieval" on Jul 13, 2024
isaac-chung marked this pull request as ready for review on July 14, 2024 08:46
isaac-chung changed the title from "[WIP] Add bm25s as a model for retrieval" to "fix: Add bm25s as a model for retrieval" on Jul 14, 2024
isaac-chung marked this pull request as draft on July 14, 2024 08:50
isaac-chung (Collaborator, Author) commented

Before merging, I'd like to add the co-author line to the squashed commit to highlight the foundational work done by @malteos 🚀

Review thread on mteb/models/bm25.py (outdated, resolved)
isaac-chung marked this pull request as ready for review on July 15, 2024 06:27
malteos (Contributor) commented Jul 15, 2024

Thanks for making the PR!

isaac-chung merged commit 5269f2c into main on Jul 16, 2024
8 checks passed
isaac-chung deleted the add-bm25s branch on July 16, 2024 09:49
3 participants