
Update Documentation about SFT Config #2649

Open
5 tasks done
ParagEkbote opened this issue Jan 24, 2025 · 4 comments
Labels
📚 documentation Improvements or additions to documentation 👶 good first issue Good for newcomers 🏋 SFT Related to SFT

Comments

@ParagEkbote

Reproduction

Since the max_seq_length parameter was deprecated in #2306, the relevant documentation pertaining to it has not been updated, as seen here.

Additionally, the SFTConfig class has kept the deprecated parameter.

https://github.com/huggingface/trl/blob/8e65825d4cb22cf304bcd245adef978c473efcb1/trl/trainer/sft_config.py#L80C2-L81C22

Could you please let me know what improvements are needed? I would be happy to contribute the documentation update.

cc: @qgallouedec

For more info, please see this issue.

System Info

Platform: Linux-6.5.0-1025-azure-x86_64-with-glibc2.31

  • Python version: 3.12.1
  • PyTorch version: 2.5.1+cpu
  • CUDA device(s): not available
  • Transformers version: 4.48.1
  • Accelerate version: 1.3.0
  • Accelerate config: not found
  • Datasets version: 3.2.0
  • HF Hub version: 0.27.1
  • TRL version: 0.13.0
  • bitsandbytes version: not installed
  • DeepSpeed version: not installed
  • Diffusers version: not installed
  • Liger-Kernel version: 0.5.2
  • LLM-Blender version: not installed
  • OpenAI version: not installed
  • PEFT version: not installed

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
  • Any traceback provided is complete
@github-actions github-actions bot added 🏋 SFT Related to SFT 📚 documentation Improvements or additions to documentation 👶 good first issue Good for newcomers labels Jan 24, 2025
@ParagEkbote
Author

Gentle ping

cc: @qgallouedec

@qgallouedec
Member

I think there's some confusion here. This parameter has not been deprecated; it has been transferred from SFTTrainer to SFTConfig. The only instance in the doc that is out of date that I can find is here:

SFTTrainer always truncates by default the sequences to the max_seq_length argument of the SFTTrainer.

If you're willing to make a PR to change it, I'd be happy to merge it.

@ParagEkbote
Author

To be clear, and please correct me if I'm wrong: the max_seq_length param has been transferred to SFTConfig from SFTTrainer. It is also kept in SFTTrainer so as not to break backward compatibility?

About the changes in the doc: what new default has been set compared to the previous version? Can you please let me know and I will be glad to contribute :)
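The backward-compatibility arrangement being asked about here follows a common pattern: the parameter lives on the config, while the trainer still accepts it and emits a deprecation warning. A minimal sketch of that pattern, using hypothetical `Config`/`Trainer` names rather than the actual TRL classes:

```python
import warnings
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical stand-in for a trainer config such as SFTConfig."""
    max_seq_length: int = 1024  # the parameter now lives on the config


class Trainer:
    """Hypothetical stand-in for a trainer such as SFTTrainer."""

    def __init__(self, args=None, max_seq_length=None):
        self.args = args or Config()
        if max_seq_length is not None:
            # Keep the old trainer-level argument working, but steer
            # callers toward setting it on the config instead.
            warnings.warn(
                "Passing `max_seq_length` to the trainer is deprecated; "
                "set it on the config instead.",
                DeprecationWarning,
            )
            self.args.max_seq_length = max_seq_length


trainer = Trainer(max_seq_length=512)
print(trainer.args.max_seq_length)  # 512
```

The actual TRL implementation may differ in its exact warning text and defaults; this only illustrates why the argument can appear in both places without being "deprecated" on the config side.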

cc: @qgallouedec

@qgallouedec
Copy link
Member

You can still specify a max len, but now it's with SFTConfig.

The doc should be

the max_seq_length argument of the SFTConfig
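Putting the two quoted sentences together, the suggested doc fix amounts to a one-word change (sketch; the surrounding wording in the doc may differ):

```diff
- SFTTrainer always truncates by default the sequences to the max_seq_length argument of the SFTTrainer.
+ SFTTrainer always truncates by default the sequences to the max_seq_length argument of the SFTConfig.
```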
