As far as I can see, the model doesn't use GGML weights here, is that correct? (I haven't checked yet.)
If so, what script is used to produce the 4-bit quantization here?
For instance, if I try Vicuna (which is LLaMA plus some delta weights), I can generate the full Vicuna weights, but then I will need to quantize them. Which quantization method does this library use?
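For context, here is a minimal sketch of what symmetric 4-bit group quantization does in general. It is only a toy illustration of the idea behind 4-bit schemes such as GPTQ, not the actual algorithm or script used by this library; the function names and group size are made up for the example.

```python
import numpy as np

def quantize_4bit(weights, group_size=64):
    """Toy symmetric 4-bit quantization with one scale per group.

    Each group of `group_size` weights shares a float scale chosen so the
    largest magnitude in the group maps to +/-7; values are stored as
    4-bit codes in the range 0..15 (with 8 representing zero).
    """
    w = weights.reshape(-1, group_size)
    # One scale per group, chosen so the largest magnitude maps to +/-7.
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    codes = np.clip(np.round(w / scales) + 8, 0, 15).astype(np.uint8)
    return codes, scales

def dequantize_4bit(codes, scales, shape):
    """Reconstruct approximate float weights from codes and scales."""
    return ((codes.astype(np.float32) - 8) * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 128)).astype(np.float32)
codes, scales = quantize_4bit(w)
w_hat = dequantize_4bit(codes, scales, w.shape)
print(np.abs(w - w_hat).max())  # error is bounded by half a quantization step
```

Real GPTQ additionally reorders and error-corrects columns using second-order information, so its reconstruction error is much lower than this naive rounding, but the storage format (4-bit codes plus per-group scales) is the same idea.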
Okay, I see it should be able to use GPTQ weights (safetensors format). I am facing other issues while trying to generate text, but that's off-topic for this issue.
So my problem should be solvable (at least if the Vicuna code itself is not different).