Don't use GGUF but convert T5 directly from Hugging Face #967
Conversation
Force-pushed from 7559cc5 to 8af8997.
Force-pushed from 8af8997 to e39df9b.
Force-pushed from e39df9b to b0d3637.
```python
gguf_to_optional_config_names_map = {
    "t5.decoder_start_token_id": ["decoder_start_token_id"],

def from_hugging_face_config(
```
Not blocking on this, but can these keys be discovered automatically?
I shortened it.
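For context, a minimal sketch of how a map with the shape shown above might be consumed (internal GGUF-style key → list of candidate Hugging Face config attribute names). The required-key entry and the helper body are assumptions for illustration, not this PR's implementation:

```python
from transformers import T5Config

# Minimal sketch, assuming the map shape shown in the diff: an internal
# GGUF-style key maps to a list of candidate Hugging Face config attribute
# names. Optional keys are simply skipped when the config lacks them.
gguf_to_config_names_map = {
    "t5.vocab_size": ["vocab_size"],  # assumed required entry, for illustration
}
gguf_to_optional_config_names_map = {
    "t5.decoder_start_token_id": ["decoder_start_token_id"],
}

def from_hugging_face_config(config: T5Config) -> dict:
    props = {}
    for gguf_key, hf_names in gguf_to_config_names_map.items():
        # Take the first candidate attribute present on the config.
        props[gguf_key] = next(
            getattr(config, name) for name in hf_names if hasattr(config, name)
        )
    for gguf_key, hf_names in gguf_to_optional_config_names_map.items():
        for name in hf_names:
            if hasattr(config, name):
                props[gguf_key] = getattr(config, name)
                break
    return props
```

With library defaults, `from_hugging_face_config(T5Config())` should resolve both keys (e.g. `decoder_start_token_id` is 0 on a default `T5Config`) without requiring them to be spelled out per model.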
```diff
@@ -498,19 +493,19 @@ def testCompareAgainstTransformers(
     theta = Theta(
         {
-            "attn_q.weight": DefaultPrimitiveTensor(
+            "q.weight": DefaultPrimitiveTensor(
```
What is the motivation for all the renames? Is it just to match the corresponding names from Hugging Face's layout instead of GGUF's?
Yes.
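For illustration, a hypothetical excerpt of that correspondence. Only `attn_q.weight` → `q.weight` appears in the diff above; the k/v entries are assumed analogs of the same pattern:

```python
# GGUF-style leaf names on the left, Hugging Face T5 names on the right.
# Hypothetical excerpt; only the first row is taken from the diff above.
GGUF_TO_HF_TENSOR_NAMES = {
    "attn_q.weight": "q.weight",
    "attn_k.weight": "k.weight",
    "attn_v.weight": "v.weight",
}

def rename_theta_keys(tensors: dict) -> dict:
    """Rewrite GGUF-style leaf names to their Hugging Face counterparts."""
    return {
        GGUF_TO_HF_TENSOR_NAMES.get(name, name): t for name, t in tensors.items()
    }
```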
We don't want the stack to depend on the conversion tool from the llama.cpp repo. Also, the conversion to GGUF does not convert all tensors to bf16 but leaves some in f32; we would like to control that ourselves when needed. Note that this change makes any previously generated IRPA files obsolete.
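On the bf16/f32 point, a minimal sketch of the kind of control a direct Hugging Face import allows. The model name and the cast-all-f32-to-bf16 policy are illustrative assumptions, not this PR's code:

```python
import torch
from transformers import T5EncoderModel

# Importing directly from Hugging Face lets the importer decide, per tensor,
# what to cast, instead of inheriting llama.cpp's GGUF conversion choices.
model = T5EncoderModel.from_pretrained("google/t5-v1_1-small")
state_dict = {
    name: t.to(torch.bfloat16) if t.dtype == torch.float32 else t
    for name, t in model.state_dict().items()
}
```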
Force-pushed from e0286ac to 26b0ad7.