I think there is a small bug. I was trying to find out what the difference is between the `whiten_rewards` and `normalize_rewards` parameters in the `RLOOConfig` object, and after inspecting the code of the `RLOOTrainer` class I found that it is not used. Hence, I think it should probably be removed.
Thank you for your help and the codebase! It is super helpful.
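For anyone who wants to repeat the check, here is a minimal sketch of how a config field can be tested for references in a trainer's source. It is only a plain text search, not how the library itself resolves its arguments. The helper `find_unreferenced_fields`, the toy source string, and the field names `option_a` / `option_b` are all hypothetical and exist only for illustration.

```python
import re


def find_unreferenced_fields(field_names, source_text):
    """Return the names that never appear as an attribute access (".name") in source_text."""
    unreferenced = []
    for name in field_names:
        # Match ".name" followed by a word boundary, so ".option_a" does not match ".option_ab".
        pattern = r"\." + re.escape(name) + r"\b"
        if re.search(pattern, source_text) is None:
            unreferenced.append(name)
    return unreferenced


# Hypothetical stand-in for a trainer's source code (not taken from any real library).
TOY_TRAINER_SOURCE = '''
class ToyTrainer:
    def train(self, rewards):
        if self.args.option_a:
            rewards = rewards - rewards.mean()
        return rewards
'''

print(find_unreferenced_fields(["option_a", "option_b"], TOY_TRAINER_SOURCE))
```

For the real classes, the source text could presumably be obtained with `inspect.getsource(RLOOTrainer)` and the field names passed in by hand. Note that a text search like this misses fields read through `getattr` or used only in a parent class, so a hit-free result is a hint rather than proof.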
System Info
I can see this in the codebase.
Checklist
I have checked that my issue isn't already filed (see open issues)
I have included my system information
Any code provided is minimal, complete, and reproducible (more on MREs)
Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
Any traceback provided is complete