Added support for SFTTrainer checkpoint models and adapter models containing some non-LoRA weights #9778
Conversation
…taining some non-LoRA weights
The previous code raised an unexpected-name error and called sys.exit(1) (lines 350-351, current version) if even a single weight in the lora_model was not a lora_A, lora_B, or base layer weight. This edit collects the names of all LoRA weights in the model before the for loop at line 341 (current version), and from line 350 (edited version) the subsequent operations are performed only on the LoRA and base layer weights, ignoring any non-LoRA weights in the lora_model. This allows the script to extract the LoRA weights and convert LoRA to GGUF for adapters containing one or more non-LoRA weights.
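A minimal sketch of the filtering approach described above, not the exact patch. It assumes `lora_model` is a plain dict mapping tensor names to tensors loaded from the adapter's safetensors file, and that LoRA tensor names contain `.lora_A.` / `.lora_B.` markers, as in PEFT adapters:

```python
# Sketch only, under the assumptions stated above.
# First pass: collect the names of all LoRA tensors.
lora_names = {
    name for name in lora_model
    if ".lora_A." in name or ".lora_B." in name
}

for name, tensor in lora_model.items():
    if name not in lora_names and ".base_layer." not in name:
        # Non-LoRA weight (e.g. an extra tensor saved by SFTTrainer):
        # skip it instead of erroring out with sys.exit(1).
        continue
    # ... existing conversion of lora_A / lora_B / base layer tensors ...
```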
…taining one or more non-LoRA weights
My initial commit was more of a brute-force fix. The edits suggested by @FirstTimeEZ reduce the complexity.
My initial commit was literally the brute-force solution I implemented when I encountered the "Unexpected name '{name}': Not a lora_A or lora_B tensor" error while trying to convert an SFTTrainer-checkpoint LoRA adapter model to GGUF.
The edits proposed by @FirstTimeEZ reduce the complexity of the code, making it more performant.
Exactly, and the for loop checks every weight in the lora_model and then exits if any name is not a lora_A, lora_B, or base layer weight.
It makes no sense to remove this error message.
We show the error because we never want to produce a half-working LoRA GGUF. In your case, there are non-LoRA tensors in the file. This may hint that your safetensors file contains unknown data that is not handled by llama.cpp.
It's better to know exactly what the non-LoRA tensors are in your case, rather than ignore them.
Please at least post a full output log here.
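A hypothetical sketch of what the reviewer is asking for: report exactly which tensors are not handled before exiting, instead of silently skipping them. The names and structure here are assumptions for illustration, not the actual convert_lora_to_gguf.py code:

```python
import logging
import sys

logger = logging.getLogger("lora-to-gguf")

# Collect every tensor name that is neither a LoRA tensor nor a base
# layer weight, so the user can see exactly what was not handled.
unexpected = [
    name for name in lora_model
    if ".lora_A." not in name
    and ".lora_B." not in name
    and ".base_layer." not in name
]
if unexpected:
    logger.error("Unexpected non-LoRA tensors in the adapter:")
    for name in unexpected:
        logger.error("  %s", name)
    sys.exit(1)
```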
```
INFO:lora-to-gguf:Loading base model: 392a143b624368100f77a3eafaa4a2468ba50a72
```
Above is the logged output.
@Victoran0 I'm checking with the TRL team at Hugging Face to find out why. In the meantime, could you share the adapter or the Python code you used to fine-tune the model?
The adapter: https://huggingface.co/Victorano/llama-3.2-3B-it-Procurtech-Assistant
Ok, so the problem is that you used …. What I suggest is to remove the ….
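The inline code spans in the comment above were lost, so the exact option discussed is unknown. As a purely hypothetical illustration of how a PEFT adapter can end up with non-LoRA tensors, one common cause is `modules_to_save` in `LoraConfig`, which stores full copies of the listed modules alongside the lora_A/lora_B matrices:

```python
# Hypothetical example only; not confirmed to be the option from the thread.
from peft import LoraConfig

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    # Each module listed here is saved as a full (non-LoRA) weight in the
    # adapter file, which is what trips the converter's name check.
    modules_to_save=["lm_head", "embed_tokens"],
)
```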
Okay, I will fix that, retrain, and let you know the result.