Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix format reward #238

Merged
merged 4 commits into from
Feb 8, 2025
Merged

fix format reward #238

merged 4 commits into from
Feb 8, 2025

Conversation

JamesHujy
Copy link
Contributor

See #237

@JamesHujy JamesHujy closed this Feb 8, 2025
@JamesHujy JamesHujy reopened this Feb 8, 2025
@JamesHujy JamesHujy changed the title fix format reward #237 fix format reward Feb 8, 2025
Copy link
Member

@lewtun lewtun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch @JamesHujy and thank you for the fix! Would you mind adding your example from #237 as a regression test in /tests? Then we should be good to merge!

@kashif
Copy link
Collaborator

kashif commented Feb 8, 2025

@JamesHujy @lewtun so the current regex still fails for the input example from #237 let me fix it

@kashif
Copy link
Collaborator

kashif commented Feb 8, 2025

we should add a \s* between the </think> and <answer> to handle white space or newlines and also perhaps add a re.MULTILINE in addition to the re.DOTALL for multiline inputs

Copy link
Member

@lewtun lewtun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes @kashif !

@lewtun lewtun merged commit d12886d into huggingface:main Feb 8, 2025
1 check passed
@qgallouedec qgallouedec mentioned this pull request Feb 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants