OOM after Postprocessing on video datasets #488

Closed
xumingze0308 opened this issue Jan 6, 2025 · 1 comment
Comments

@xumingze0308
Contributor

I evaluated Oryx-34B on VideoMME using 8 H100 GPUs. The "Model Responding" stage works well, but I got an OOM error (NCCL WARN Cuda failure 2 'out of memory') after Postprocessing completed.

I tried #4, but it seems to use only 1 GPU. Are there any suggestions for this problem? I suspect that metric calculation or result gathering causes the OOM, and I am curious whether this can be optimized.
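
To check the gathering hypothesis, here is a minimal sketch that compares the serialized size of per-sample results before and after dropping bulky fields. It is not the evaluation harness's actual code: the helpers `slim_result` and `payload_bytes` and the field names `doc_id`, `pred`, and `frames` are hypothetical, chosen only for illustration.

```python
import pickle


def slim_result(result, keep=("doc_id", "pred")):
    """Keep only the small fields that metric calculation needs.

    The field names are illustrative assumptions, not a real schema.
    """
    return {key: result[key] for key in keep if key in result}


def payload_bytes(results):
    """Approximate size of what a gather would have to serialize."""
    return len(pickle.dumps(results))


if __name__ == "__main__":
    # Stand-in for decoded video data still attached to each result.
    results = [
        {"doc_id": i, "pred": "A", "frames": bytes(1_000_000)}
        for i in range(8)
    ]
    slimmed = [slim_result(r) for r in results]
    print("before:", payload_bytes(results), "bytes")
    print("after: ", payload_bytes(slimmed), "bytes")
```

If the "before" size is large on every rank, gathering only the slimmed results (or gathering on CPU) may be enough to avoid the out-of-memory failure, assuming gathering is in fact the cause.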

Thank you!

@xumingze0308
Contributor Author

This issue can be solved by this PR.
