Add model spice #1884
Conversation
Implementation looks fine, though some of the metadata on the training data and code is missing. Can you please fill these out?
Generally, it seems the model doesn't have much documentation on how it was trained. Is that intentional, or is there an upcoming paper?
Hi @KennethEnevoldsen - the training data is intentionally private. The training recipe is private for now but will be released later with an upcoming paper, with some details on the general mix of data.
That is perfectly fine; you simply need to mark it. I will also need confirmation that your model is zero-shot on MTEB (i.e., it hasn't been trained on the training/dev/test data).
I am a bit confused about the zero-shot comment: are models not supposed to use the training data from MTEB datasets?
Oh, right, we changed it to a link instead of a boolean; that is my bad. None is just fine.
It was never clarified in the original paper, and you are welcome to do so, but if you do, we will need to mark the model as non-zero-shot (such models will be filtered out by default in the updated version of the leaderboard, which will be released soon). As shown by voyage-3-exp (which is not recommended for use), models trained on the train set score notably better on MTEB without the model actually generalizing better to unseen tasks. So, if we care about performance on an unseen task, then we care about zero-shot generalization. This is to ensure fairness of comparison on the leaderboard (and that model performance on MTEB can be converted into meaningful model selection). Feel free to ask if anything is unclear.
I see, thanks for the clarification! The data distribution does include some sources that might appear similar to the MTEB data and might contain some samples from the training set, but these exact datasets were not used. I think for now you can mark the model as non-zero-shot.
We have discussed this case with the team. Based on your previous submission of identical scores to voyage-3-exp for your own model, we will not accept your submission unless you provide a detailed description of the training procedure and data.
Adding a model checklist:
- The model can be loaded with `mteb.get_model(model_name, revision)` and `mteb.get_model_meta(model_name, revision)`
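The checklist item above can be sketched as a quick local check. This is a hedged example, assuming `mteb` is installed; `"intfloat/multilingual-e5-small"` is used as a stand-in for the submitted model name, since any model registered in mteb resolves the same way:

```python
# Sketch: confirm a model resolves via the mteb registry before submitting.
# The model name below is a stand-in; substitute the name of your PR's model.
import importlib.util

if importlib.util.find_spec("mteb") is not None:
    import mteb

    # get_model_meta returns the ModelMeta object (name, revision, and the
    # training-data/code metadata the maintainers ask contributors to fill out)
    meta = mteb.get_model_meta("intfloat/multilingual-e5-small")
    print(meta.name)

    # get_model would additionally load the model weights for encoding;
    # it is commented out here to keep the check lightweight:
    # model = mteb.get_model("intfloat/multilingual-e5-small")
else:
    print("mteb not installed")
```

If `get_model_meta` raises a `KeyError`-style lookup failure, the model is not yet registered and the `ModelMeta` entry still needs to be added in the PR.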