The paper and the training code don't seem to match #39

Open
EveningLin opened this issue Nov 4, 2024 · 13 comments

Comments

@EveningLin

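(1) In the paper, the first-stage inputs are the reference frame, the audio, and the target frame.
image

But the current code still looks like hallo1's: https://github1s.com/fudan-generative-vision/hallo2/blob/HEAD/hallo/datasets/mask_image.py#L132-L139

(2) Perhaps this is because the first-stage training code was not updated, but I don't understand where the net-3000.pth weight file expected by second-stage training comes from. Also, if the audio module is taken from the first stage, it has clearly not been trained.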

@xh-liu-tech

The textual prompt control mentioned in the paper doesn't seem to be implemented in the current public version either.

Is there any plan to release the complete implementation including this feature?

@cuijh26
Contributor

cuijh26 commented Nov 4, 2024

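Hello, and sorry for the confusion.

  1. That was an error in our paper, sorry. The first stage does not take audio as input; we will correct this.
  2. net-3000.pth refers to the first-stage checkpoint. The second stage trains the audio attention and temporal attention.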
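
To make the stage split described above concrete, here is a minimal sketch of selecting which parameters a second stage would leave trainable by matching on parameter names. The parameter names and glob patterns below are hypothetical and are not taken from the hallo2 code. With a real PyTorch model, the names would come from `model.named_parameters()` and the result would be applied with `requires_grad_` on each parameter.

```python
from fnmatch import fnmatchcase


def split_trainable(param_names, trainable_patterns):
    """Partition parameter names into (trainable, frozen) lists.

    A name is trainable if it matches at least one glob pattern.
    """
    trainable, frozen = [], []
    for name in param_names:
        if any(fnmatchcase(name, pattern) for pattern in trainable_patterns):
            trainable.append(name)
        else:
            frozen.append(name)
    return trainable, frozen


# Hypothetical parameter names, for illustration only.
names = [
    "reference_unet.conv_in.weight",
    "denoising_unet.down_blocks.0.audio_modules.0.attn.to_q.weight",
    "denoising_unet.down_blocks.0.motion_modules.0.attn.to_q.weight",
    "audioproj.proj1.weight",
]

# Assumed patterns for "audio attention" and "temporal attention" layers.
patterns = ["*audio_modules*", "*motion_modules*"]

trainable, frozen = split_trainable(names, patterns)
for name in trainable:
    print("train :", name)
for name in frozen:
    print("freeze:", name)
```

Printing the two lists before training starts is a cheap way to confirm that the modules a stage is supposed to train are actually the ones receiving gradients.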


@cuijh26
Contributor

cuijh26 commented Nov 4, 2024

The textual prompt control mentioned in the paper doesn't seem to be implemented in the current public version either.

Is there any plan to release the complete implementation including this feature?

Thanks for your attention. Please stay tuned for updates.

@EveningLin
Author

(1) But if audio is not trained in the first stage, why is audioproj.requires_grad_(False) set?
https://github1s.com/fudan-generative-vision/hallo2/blob/HEAD/scripts/train_stage2_long.py#L562
(2) The net-3000.pth save format doesn't match either.
This is how the first stage saves: checkpoint-{global_step}
image
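
One way to check the naming mismatch described above is to scan a training output directory for both patterns mentioned in this thread, `checkpoint-{global_step}` and `net-{step}.pth`. The sketch below assumes the first pattern names directories and the second names files; the function name and the directory layout are hypothetical, not part of the hallo2 scripts.

```python
import re
from pathlib import Path

# The two naming patterns discussed in this thread.
CHECKPOINT_DIR = re.compile(r"^checkpoint-(\d+)$")
NET_FILE = re.compile(r"^net-(\d+)\.pth$")


def list_checkpoints(root):
    """Recursively collect checkpoint directories and net-*.pth files.

    Returns a dict with two mappings from training step to path.
    """
    found = {"checkpoint_dirs": {}, "net_files": {}}
    for path in Path(root).rglob("*"):
        match = CHECKPOINT_DIR.match(path.name)
        if match and path.is_dir():
            found["checkpoint_dirs"][int(match.group(1))] = path
            continue
        match = NET_FILE.match(path.name)
        if match and path.is_file():
            found["net_files"][int(match.group(1))] = path
    return found


# Usage (the path is a placeholder for your own output directory):
# result = list_checkpoints("path/to/stage1/output")
# print(sorted(result["checkpoint_dirs"]), sorted(result["net_files"]))
```

If only `checkpoint-*` directories show up, the `net-*.pth` file that the second-stage script expects has to be produced by some other step that is not visible in the first-stage save code quoted here.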

@EveningLin
Author

image

@EveningLin
Author

image
The second-stage config even has audio_modules commented out. Was the wrong code uploaded? Hahaha

@EveningLin
Author

@cuijh26

@cuijh26
Contributor

cuijh26 commented Nov 4, 2024

Sorry, it's possible the wrong code was indeed uploaded. Give me a moment to check.

@EveningLin
Author

@cuijh26
So what's the verdict? Was the wrong code uploaded? Haha

@EveningLin
Author

@cuijh26
Is this my mistake, or is it a problem with the code? Could you give me an answer?

@hnsywangxin

Waiting for a reply.

@hnsywangxin

@cuijh26 Whether it is or isn't, please give us a reply. From a quick look at the code, it seems to differ quite a lot from the paper.

@yuchenli-sony

I think the wrong training code was uploaded.


5 participants