Questions about codebook dim #5
Comments
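Hello! FSQ probably won't work well for that. You can take a look at the paper https://arxiv.org/pdf/2309.15505 and this blog post https://spaces.ac.cn/archives/9826
Thank you for the reply and the suggestions. I may not have been clear. What I meant was: have you tried training an X-Codec with plain VQ (rather than FSQ) and a higher codebook dimension (e.g. 512 or 1024), using a single codebook or a small number of codebooks (e.g. RVQ-3)? I am trying to do this, and I wonder whether the high codebook dimension is what makes it hard to train.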
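At the beginning I tried the VQ from BigCodec and found that VQ training was very unstable. Also, when the VQ codebook size goes up, codebook utilization does not necessarily go up with it, so the results do not necessarily improve.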
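For anyone wanting to check codebook utilization on their own run, it can be measured from the code indices the quantizer emits. This is a minimal sketch, assuming the indices have been collected into a flat list of integers; the function name `codebook_usage` is hypothetical and is not part of X-Codec or BigCodec.

```python
import math
from collections import Counter


def codebook_usage(indices, codebook_size):
    """Return (utilization, perplexity) for a list of VQ code indices.

    utilization: fraction of codebook entries that were selected at least once.
    perplexity:  2 ** entropy of the empirical code distribution; it equals
                 codebook_size only when every code is used equally often.
    """
    counts = Counter(indices)
    total = len(indices)
    utilization = len(counts) / codebook_size
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return utilization, 2 ** entropy


# Toy example (made-up indices): only 2 of 4 codes are ever selected.
utilization, perplexity = codebook_usage([0, 0, 1, 1], codebook_size=4)
print(utilization, perplexity)
```

A perplexity far below the codebook size suggests that enlarging the codebook is mostly adding entries that are never used.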
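A higher feature dimension is unlikely to improve performance if the codebook stays the same, because VQ acts as a bottleneck and only a limited amount of information can pass through it.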
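The bottleneck argument can be made concrete by counting bits: with hard VQ, each frame is reduced to one index per quantizer, so it carries at most log2(codebook size) bits per quantizer, whatever the vector dimension. This is a rough sketch with a hypothetical helper name and illustrative numbers, not actual X-Codec settings.

```python
import math


def vq_capacity_bits(codebook_size, codebook_dim, num_quantizers=1):
    """Upper bound on bits per frame that pass through a VQ / RVQ bottleneck.

    codebook_dim is accepted only to make the point explicit: it does not
    enter the bound, because the decoder sees nothing but the indices.
    """
    return num_quantizers * math.log2(codebook_size)


# Illustrative settings (assumed): same codebook size, different dimensions.
low_dim = vq_capacity_bits(codebook_size=1024, codebook_dim=8)
high_dim = vq_capacity_bits(codebook_size=1024, codebook_dim=1024)
rvq3 = vq_capacity_bits(codebook_size=1024, codebook_dim=1024, num_quantizers=3)
print(low_dim, high_dim, rvq3)  # 10.0 10.0 30.0
```

Raising the dimension leaves the bound unchanged; only a larger codebook that is actually utilized, or additional quantizer layers, raises it.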
Thank you for sharing your experience. I learned a lot.
Thanks for sharing!
Hello!
Have you tried using a higher codebook dimension? (Currently it is FSQ with codebook dim=8.)
For example, have you tried a single-layer VQ with codebook dim=512 or 1024?