Don't use GGUF but convert T5 directly from Hugging Face #967
Conversation
Force-pushed from 7559cc5 to 8af8997.
Force-pushed from 8af8997 to e39df9b.
Force-pushed from e39df9b to b0d3637.
```python
gguf_to_optional_config_names_map = {
    "t5.decoder_start_token_id": ["decoder_start_token_id"],

def from_hugging_face_config(
```
Not blocking on this, but can these keys be discovered automatically?
I shortened it.
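For context, a minimal sketch of how a map with the shape shown above might be consumed (internal GGUF-style key → list of candidate Hugging Face config attribute names). The required-key entry and the helper body are assumptions for illustration, not this PR's implementation:

```python
from transformers import T5Config

# Minimal sketch, assuming the map shape shown in the diff: an internal
# GGUF-style key maps to a list of candidate Hugging Face config attribute
# names. Optional keys are simply skipped when the config lacks them.
gguf_to_config_names_map = {
    "t5.vocab_size": ["vocab_size"],  # assumed required entry, for illustration
}
gguf_to_optional_config_names_map = {
    "t5.decoder_start_token_id": ["decoder_start_token_id"],
}

def from_hugging_face_config(config: T5Config) -> dict:
    props = {}
    for gguf_key, hf_names in gguf_to_config_names_map.items():
        # Take the first candidate attribute present on the config.
        props[gguf_key] = next(
            getattr(config, name) for name in hf_names if hasattr(config, name)
        )
    for gguf_key, hf_names in gguf_to_optional_config_names_map.items():
        for name in hf_names:
            if hasattr(config, name):
                props[gguf_key] = getattr(config, name)
                break
    return props
```

With library defaults, `from_hugging_face_config(T5Config())` should resolve both keys (e.g. `decoder_start_token_id` is 0 on a default `T5Config`) without requiring them to be spelled out per model.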
```diff
@@ -498,19 +493,19 @@ def testCompareAgainstTransformers(
     theta = Theta(
         {
-            "attn_q.weight": DefaultPrimitiveTensor(
+            "q.weight": DefaultPrimitiveTensor(
```
What is the motivation for all the renames? Is it just to match the corresponding names from Hugging Face's layout instead of GGUF's?
Yes.
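For illustration, a hypothetical excerpt of that correspondence. Only `attn_q.weight` → `q.weight` appears in the diff above; the k/v entries are assumed analogs of the same pattern:

```python
# GGUF-style leaf names on the left, Hugging Face T5 names on the right.
# Hypothetical excerpt; only the first row is taken from the diff above.
GGUF_TO_HF_TENSOR_NAMES = {
    "attn_q.weight": "q.weight",
    "attn_k.weight": "k.weight",
    "attn_v.weight": "v.weight",
}

def rename_theta_keys(tensors: dict) -> dict:
    """Rewrite GGUF-style leaf names to their Hugging Face counterparts."""
    return {
        GGUF_TO_HF_TENSOR_NAMES.get(name, name): t for name, t in tensors.items()
    }
```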
We don't want the stack to depend on the conversion tool from the llama.cpp repo. Also, the conversion to GGUF does not convert all tensors to bf16 but leaves some in f32; we would like to control that ourselves when needed. Note that this change makes any previously generated IRPA files obsolete.
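On the bf16/f32 point, a minimal sketch of the kind of control a direct Hugging Face import allows. The model name and the cast-all-f32-to-bf16 policy are illustrative assumptions, not this PR's code:

```python
import torch
from transformers import T5EncoderModel

# Importing directly from Hugging Face lets the importer decide, per tensor,
# what to cast, instead of inheriting llama.cpp's GGUF conversion choices.
model = T5EncoderModel.from_pretrained("google/t5-v1_1-small")
state_dict = {
    name: t.to(torch.bfloat16) if t.dtype == torch.float32 else t
    for name, t in model.state_dict().items()
}
```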
Force-pushed from e0286ac to 26b0ad7.