Issues: huggingface/trl

Issues list

GRPO VLLM does not work with Lora 🏋 GRPO Related to GRPO ⚡ PEFT Related to PEFT
#2698 opened Jan 30, 2025 by gagan3012
5 tasks done
I cannot launch PPOTrainning script with accelerate launch ⚡accelerate Related to accelerate ⚡ PEFT Related to PEFT 🏋 PPO Related to PPO
#2696 opened Jan 30, 2025 by daehuikim
5 tasks done
OOM 8xH100 using latest GRPO code with vLLM 🐛 bug Something isn't working 🚀 deepspeed Related to deepspeed 🏋 GRPO Related to GRPO
#2688 opened Jan 30, 2025 by abacaj
5 tasks done
empty Cache after logps_per_token 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2686 opened Jan 29, 2025 by shirinyamani
rewards_funcs set to eval mode 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information 🏋 Reward Related to Reward modelling
#2685 opened Jan 29, 2025 by shirinyamani
Support iterative GRPO ✨ enhancement New feature or request 🏋 GRPO Related to GRPO ⚡ PEFT Related to PEFT
#2684 opened Jan 29, 2025 by howardzhou
About the Implementation of GRPO 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information
#2681 opened Jan 29, 2025 by macheng6
Ability to provide a static completion for GRPO ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2680 opened Jan 29, 2025 by Palmik
TypeError: type list doesn't define __round__ method - why I am getting this error 🐛 bug Something isn't working ⏳ needs more info Additional information or clarification is required to proceed 🏋 Reward Related to Reward modelling
#2674 opened Jan 28, 2025 by Tarak200
"None of the inputs have requires_grad=True" with online DPO and GRPO 🐛 bug Something isn't working 🏋 GRPO Related to GRPO 🏋 Online DPO Related to Online DPO
#2671 opened Jan 28, 2025 by benjamin-marie
5 tasks done
whiten_rewards parameter in RLOO config is not used. 🐛 bug Something isn't working 🏋 RLOO Related to RLOO
#2665 opened Jan 26, 2025 by velezbeltran
5 tasks done
Issues with DPOTrainer and Qwen2-VL processor 🐛 bug Something isn't working 🏋 DPO Related to DPO
#2660 opened Jan 25, 2025 by baichuanzhou
Resuming from checkpoint doesn't seem to work 🐛 bug Something isn't working ⏳ needs more info Additional information or clarification is required to proceed 🏋 PPO Related to PPO 🏋 RLOO Related to RLOO
#2657 opened Jan 25, 2025 by Superskyyy
5 tasks done
Judge API feedback: Structured inputs ✨ enhancement New feature or request
#2655 opened Jan 25, 2025 by DreamGenX
Update Documentation about SFT Config 📚 documentation Improvements or additions to documentation 👶 good first issue Good for newcomers 🏋 SFT Related to SFT
#2649 opened Jan 24, 2025 by ParagEkbote
5 tasks done
How to stop SFTTrainer from auto tokenizing my messages ? ❓ question Seeking clarification or more information 🏋 SFT Related to SFT
#2642 opened Jan 24, 2025 by MohamedAliRashad
RLOO
#2634 opened Jan 23, 2025 by qgallouedec
Reward
#2633 opened Jan 23, 2025 by qgallouedec
PPO
#2629 opened Jan 23, 2025 by qgallouedec
ORPO
#2628 opened Jan 23, 2025 by qgallouedec
GKD
#2624 opened Jan 23, 2025 by qgallouedec
[Tracking issue] Wrong loss scaling when accumulating gradient 🐛 bug Something isn't working 🏋 DPO Related to DPO 🏋 DPPO Related to DDPO 🏋 GKD Related to GKD 🏋 GRPO Related to GRPO 🏋 Iterative SFT Related to Iterative SFT 🏋 KTO Related to KTO 🏋 Online DPO Related to Online DPO 🏋 ORPO Related to ORPO 🏋 PPO Related to PPO 🏋 PRM Related to PRM 🏋 Reward Related to Reward modelling 🏋 RLOO Related to RLOO 🏋 SFT Related to SFT 🏋 XPO Related to XPO
#2617 opened Jan 23, 2025 by qgallouedec
13 of 18 tasks