generated from fastai/nbdev_template
Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495
opened Dec 17, 2024 by
qgallouedec
Open
4
[Tracking issue] Wrong loss scaling when accumulating gradient
#2617
opened Jan 23, 2025 by
qgallouedec
Open
Issues list
GRPO VLLM does not work with Lora
🏋 GRPO
Related to GRPO
⚡ PEFT
Related to PEFT
#2698
opened Jan 30, 2025 by
gagan3012
5 tasks done
I cannot launch PPOTrainning script with accelerate launch
⚡accelerate
Related to accelerate
⚡ PEFT
Related to PEFT
🏋 PPO
Related to PPO
#2696
opened Jan 30, 2025 by
daehuikim
5 tasks done
OOM 8xH100 using latest GRPO code with vLLM
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 GRPO
Related to GRPO
#2688
opened Jan 30, 2025 by
abacaj
5 tasks done
empty Cache after logps_per_token
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2686
opened Jan 29, 2025 by
shirinyamani
rewards_funcs set to eval mode
🏋 GRPO
Related to GRPO
❓ question
Seeking clarification or more information
🏋 Reward
Related to Reward modelling
#2685
opened Jan 29, 2025 by
shirinyamani
Support iterative GRPO
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
⚡ PEFT
Related to PEFT
#2684
opened Jan 29, 2025 by
howardzhou
About the Implementation of GRPO
🏋 GRPO
Related to GRPO
❓ question
Seeking clarification or more information
#2681
opened Jan 29, 2025 by
macheng6
Ability to provide a static completion for GRPO
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2680
opened Jan 29, 2025 by
Palmik
logging issue: generation_config in rloo_trainer.py's generate_completions() is not reflective of actual model generations
🐛 bug
Something isn't working
🏋 RLOO
Related to RLOO
#2678
opened Jan 28, 2025 by
swkarlekar
5 tasks done
TypeError: type list doesn't define __round__ method - why I am getting this error
🐛 bug
Something isn't working
⏳ needs more info
Additional information or clarification is required to proceed
🏋 Reward
Related to Reward modelling
#2674
opened Jan 28, 2025 by
Tarak200
"None of the inputs have requires_grad=True" with online DPO and GRPO
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
🏋 Online DPO
Related to Online DPO
#2671
opened Jan 28, 2025 by
benjamin-marie
5 tasks done
whiten_rewards parameter in RLOO config is not used.
🐛 bug
#2665
opened Jan 26, 2025 by
velezbeltran
5 tasks done
Issues with DPOTrainer and Qwen2-VL processor
🐛 bug
Something isn't working
🏋 DPO
Related to DPO
#2660
opened Jan 25, 2025 by
baichuanzhou
Resuming from checkpoint doesn't seem to work
🐛 bug
Something isn't working
⏳ needs more info
Additional information or clarification is required to proceed
🏋 PPO
Related to PPO
🏋 RLOO
Related to RLOO
#2657
opened Jan 25, 2025 by
Superskyyy
5 tasks done
Judge API feedback: Structured inputs
✨ enhancement
New feature or request
#2655
opened Jan 25, 2025 by
DreamGenX
Update Documentation about SFT Config
📚 documentation
Improvements or additions to documentation
👶 good first issue
Good for newcomers
🏋 SFT
Related to SFT
#2649
opened Jan 24, 2025 by
ParagEkbote
5 tasks done
How to stop SFTTrainer from auto tokenizing my messages ?
❓ question
Seeking clarification or more information
🏋 SFT
Related to SFT
#2642
opened Jan 24, 2025 by
MohamedAliRashad
../aten/src/ATen/native/cuda/Indexing.cu:1284: indexSelectLargeIndex: block: [193,0,0], thread: [66,0,0] Assertion srcIndex < srcSelectDimSize failed.
🐛 bug
Something isn't working
🙋 help from community wanted
Open invitation for community members to contribute
🏋 PPO
Related to PPO
#2641
opened Jan 23, 2025 by
JohnConnor123
5 tasks done
[Tracking issue] Wrong loss scaling when accumulating gradient
🐛 bug
Something isn't working
🏋 DPO
Related to DPO
🏋 DDPO
Related to DDPO
🏋 GKD
Related to GKD
🏋 GRPO
Related to GRPO
🏋 Iterative SFT
Related to Iterative SFT
🏋 KTO
Related to KTO
🏋 Online DPO
Related to Online DPO
🏋 ORPO
Related to ORPO
🏋 PPO
Related to PPO
🏋 PRM
Related to PRM
🏋 Reward
Related to Reward modelling
🏋 RLOO
Related to RLOO
🏋 SFT
Related to SFT
🏋 XPO
Related to XPO
#2617
opened Jan 23, 2025 by
qgallouedec
13 of 18 tasks
PPOTrainer.__init__() missing 2 required positional arguments: 'reward_model' and 'train_dataset'
🐛 bug
Something isn't working
🏋 PPO
Related to PPO
#2614
opened Jan 23, 2025 by
ansonctyu
5 tasks done