Add CRAG #1151

Open
Muennighoff opened this issue Aug 11, 2024 · 2 comments
Labels
good first issue Good for newcomers new dataset Issues related to adding a new task or dataset

Comments

Muennighoff (Contributor) commented Aug 11, 2024

Part of CRAG evaluates only embedding/retrieval models, i.e., without the generative part. It would be great to integrate that!

(or CodeRAG)

@KennethEnevoldsen KennethEnevoldsen added new dataset Issues related to adding a new task or dataset good first issue Good for newcomers labels Aug 12, 2024
isaac-chung (Collaborator) commented

Tagging the authors @zorazrw @AkariAsai @yiqingxyq as well 🤗
The only missing item is to:

submit results of a model on this benchmark to the results repository

Originally posted by @Samoed in #1595 (comment)

Samoed (Collaborator) commented Feb 2, 2025

I think we should re-upload the data: the authors' stackoverflow-posts dataset has 9 JSONL files (6 GB), but in the parquet branch these files are only ~270 MB each.
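
A smaller Parquet file does not by itself mean rows were lost, since Parquet is compressed and columnar. One way to check before re-uploading is to compare record counts per file. The following is a minimal sketch under stated assumptions: `count_jsonl_records` and `find_row_mismatches` are hypothetical helper names, the JSONL files are assumed to sit in one local directory, and the Parquet row counts are assumed to be gathered separately (for example with pyarrow's `ParquetFile(path).metadata.num_rows`).

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Count non-empty lines in a JSONL file, checking that each parses as JSON."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            json.loads(line)  # raises ValueError on a malformed record
            count += 1
    return count


def find_row_mismatches(jsonl_dir, parquet_row_counts):
    """Return {file stem: (jsonl_rows, parquet_rows)} for files whose counts differ.

    parquet_row_counts is a hypothetical mapping from file stem to the row
    count of the converted Parquet file, collected separately by the caller.
    A stem missing from the mapping is reported with parquet_rows = None.
    """
    mismatches = {}
    for path in sorted(Path(jsonl_dir).glob("*.jsonl")):
        jsonl_rows = count_jsonl_records(path)
        parquet_rows = parquet_row_counts.get(path.stem)
        if parquet_rows != jsonl_rows:
            mismatches[path.stem] = (jsonl_rows, parquet_rows)
    return mismatches
```

An empty result would suggest the size gap comes from compression rather than missing rows; any entry in the result points at a file worth re-converting.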
