
rewards_funcs set to eval mode #2685

Open
shirinyamani opened this issue Jan 29, 2025 · 1 comment
Labels
🏋 GRPO (Related to GRPO) · ❓ question (Seeking clarification or more information) · 🏋 Reward (Related to Reward modelling)

Comments


shirinyamani commented Jan 29, 2025

Shouldn't the reward_funcs here be set to eval mode to disable gradients? That is, call self.reward_funcs[i].eval() for each reward_funcs[i].

@shirinyamani shirinyamani changed the title "🏋 GRPO" "❓ question" 🏋 GRPO --❓ question Jan 29, 2025
@github-actions github-actions bot added 🏋 GRPO Related to GRPO 🏋 Reward Related to Reward modelling ❓ question Seeking clarification or more information labels Jan 29, 2025
@shirinyamani shirinyamani changed the title 🏋 GRPO --❓ question rewards_funcs set to eval mode Jan 29, 2025
@qgallouedec (Member)

Reward models are called in inference mode.

with torch.inference_mode():

I think it's the same, but I'm not 100% sure.
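For anyone comparing the two: in plain PyTorch, `eval()` and `torch.inference_mode()` do different things. `eval()` switches layers such as dropout and batch norm to their inference behavior but leaves autograd on, while `inference_mode()` turns off gradient tracking and leaves the module's training flag untouched. Here is a minimal sketch of that difference. The tiny `reward_model` is a hypothetical stand-in, not TRL's actual reward model, and the sketch says nothing about what mode the trainer puts the model in elsewhere.

```python
import torch
from torch import nn

# Hypothetical stand-in for a reward model (not TRL code). The dropout
# layer makes the train/eval difference observable.
reward_model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5), nn.Linear(4, 1))
x = torch.randn(2, 4)

# eval() changes layer behavior only; autograd still records the forward pass.
reward_model.eval()
out_eval = reward_model(x)
eval_tracks_grad = out_eval.requires_grad

# inference_mode() disables autograd, but the module stays in training mode,
# so dropout would still be active here.
reward_model.train()
with torch.inference_mode():
    out_inf = reward_model(x)
inference_tracks_grad = out_inf.requires_grad
still_training = reward_model.training

# Using both gives a forward pass with no gradients and no dropout noise.
reward_model.eval()
with torch.inference_mode():
    first = reward_model(x)
    second = reward_model(x)
deterministic = torch.equal(first, second)

print(eval_tracks_grad, inference_tracks_grad, still_training, deterministic)
```

So if a reward model contains dropout or batch norm and is still in training mode, `inference_mode()` alone will not make its scores deterministic. Whether that applies here depends on how the reward model was loaded and prepared.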
