
Update Documentation about SFT Config #2649

Open
5 tasks done
ParagEkbote opened this issue Jan 24, 2025 · 4 comments
Labels
📚 documentation Improvements or additions to documentation 👶 good first issue Good for newcomers 🏋 SFT Related to SFT

Comments

@ParagEkbote

Reproduction

Since the max_seq_length parameter was deprecated in #2306, the relevant documentation pertaining to it has not been updated, as seen here.

Additionally, the SFTConfig class has kept the deprecated parameter.

https://github.com/huggingface/trl/blob/8e65825d4cb22cf304bcd245adef978c473efcb1/trl/trainer/sft_config.py#L80C2-L81C22

Could you please let me know what improvements are needed? I would be happy to contribute the documentation update.

cc: @qgallouedec

For more info, please see this issue.

System Info

Platform: Linux-6.5.0-1025-azure-x86_64-with-glibc2.31

  • Python version: 3.12.1
  • PyTorch version: 2.5.1+cpu
  • CUDA device(s): not available
  • Transformers version: 4.48.1
  • Accelerate version: 1.3.0
  • Accelerate config: not found
  • Datasets version: 3.2.0
  • HF Hub version: 0.27.1
  • TRL version: 0.13.0
  • bitsandbytes version: not installed
  • DeepSpeed version: not installed
  • Diffusers version: not installed
  • Liger-Kernel version: 0.5.2
  • LLM-Blender version: not installed
  • OpenAI version: not installed
  • PEFT version: not installed

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
  • Any traceback provided is complete
@github-actions github-actions bot added 🏋 SFT Related to SFT 📚 documentation Improvements or additions to documentation 👶 good first issue Good for newcomers labels Jan 24, 2025
@ParagEkbote
Author

Gentle ping

cc: @qgallouedec

@qgallouedec
Member

I think there's some confusion here. This parameter has not been deprecated; it has been transferred from SFTTrainer to SFTConfig. The only instance in the doc that is out of date that I can find is here:

SFTTrainer always truncates by default the sequences to the max_seq_length argument of the SFTTrainer.

If you're willing to make a PR to change it, I'd be happy to merge it.

@ParagEkbote
Author

To be clear, and please correct me if I'm wrong: the max_seq_length param has been transferred to SFTConfig from SFTTrainer. It is also kept in SFTTrainer so as not to break backward compatibility?

About the changes in the doc: what new default has been set compared to the previous version? Can you please let me know and I will be glad to contribute :)
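The backward-compatibility arrangement being asked about here follows a common pattern: the parameter lives on the config, while the trainer still accepts it and emits a deprecation warning. A minimal sketch of that pattern, using hypothetical `Config`/`Trainer` names rather than the actual TRL classes:

```python
import warnings
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical stand-in for a trainer config such as SFTConfig."""
    max_seq_length: int = 1024  # the parameter now lives on the config


class Trainer:
    """Hypothetical stand-in for a trainer such as SFTTrainer."""

    def __init__(self, args=None, max_seq_length=None):
        self.args = args or Config()
        if max_seq_length is not None:
            # Keep the old trainer-level argument working, but steer
            # callers toward setting it on the config instead.
            warnings.warn(
                "Passing `max_seq_length` to the trainer is deprecated; "
                "set it on the config instead.",
                DeprecationWarning,
            )
            self.args.max_seq_length = max_seq_length


trainer = Trainer(max_seq_length=512)
print(trainer.args.max_seq_length)  # 512
```

The actual TRL implementation may differ in its exact warning text and defaults; this only illustrates why the argument can appear in both places without being "deprecated" on the config side.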

cc: @qgallouedec

@qgallouedec
Copy link
Member

You can still specify a max len, but now it's with SFTConfig.

The doc should be

the max_seq_length argument of the SFTConfig
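Putting the two quoted sentences together, the suggested doc fix amounts to a one-word change (sketch; the surrounding wording in the doc may differ):

```diff
- SFTTrainer always truncates by default the sequences to the max_seq_length argument of the SFTTrainer.
+ SFTTrainer always truncates by default the sequences to the max_seq_length argument of the SFTConfig.
```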
