
Added support for SFTTrainer checkpoint models and adapter models containing some non-LoRA weights #9778


Closed
wants to merge 2 commits

Conversation


@Victoran0 Victoran0 commented Oct 8, 2024

The previous code logs an "Unexpected name" error and calls sys.exit(1) (lines 350-351, current version) if even a single weight in the lora_model is not a lora_A, lora_B, or base-layer weight.
This edit collects the names of all LoRA weights in the model before the for loop at line 341 (current version). Then, at line 350 (edited version), the subsequent operations are performed only on the LoRA and base-layer weights, ignoring any non-LoRA weights in the lora_model.
Hopefully, this helps by allowing the script to extract LoRA weights and convert LoRA to GGUF for adapter models containing one or more non-LoRA weights.
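
A minimal sketch of the idea, assuming `lora_model` is a dict mapping tensor names to tensors as loaded from the adapter's safetensors file; the helper name and exact suffix checks are illustrative, not the actual diff:

```python
# Sketch only: collect LoRA weight names in a first pass, then keep LoRA and
# base-layer tensors and skip everything else (e.g. an lm_head.weight saved
# by an SFTTrainer checkpoint). filter_lora_tensors is a hypothetical helper.
from typing import Any


def filter_lora_tensors(lora_model: dict[str, Any]) -> dict[str, Any]:
    lora_names = {
        name for name in lora_model
        if name.endswith("lora_A.weight") or name.endswith("lora_B.weight")
    }
    return {
        name: tensor
        for name, tensor in lora_model.items()
        if name in lora_names or ".base_layer.weight" in name
    }
```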

…taining some non-LoRA weights

@github-actions github-actions bot added the python python script changes label Oct 8, 2024
FirstTimeEZ

This comment was marked as resolved.

…taining one or more non-LoRA weights

My initial commit was more of a brute-force approach.
The edits suggested by @FirstTimeEZ reduce the complexity.
Author

@Victoran0 Victoran0 left a comment


My initial commit was literally the brute-force solution I implemented when I encountered the "Unexpected name '{name}': Not a lora_A or lora_B tensor" error while trying to convert an SFTTrainer checkpoint LoRA adapter model to GGUF.
The edits proposed by @FirstTimeEZ reduce the complexity of the code, making it more performant.

@Victoran0 Victoran0 changed the title Added support for SFTTrainer checkpoint models and adapter models containing one or more non-LoRA weights Added support for SFTTrainer checkpoint models and adapter models containing some non-LoRA weights Oct 8, 2024
FirstTimeEZ

This comment was marked as outdated.

@Victoran0
Author

Victoran0 commented Oct 9, 2024

Exactly, and the for loop checks every weight in the lora_model and then calls sys.exit(1) if a weight is not a LoRA weight or a base-layer weight.
The original will only work if all LoRA weights have been extracted from the lora_model or if every layer in the base_model was fine-tuned.
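
For reference, the check in question looks roughly like this (a paraphrase, not the exact upstream code; `lora_model` stands in for the loaded adapter tensors):

```python
import logging
import sys

logger = logging.getLogger("lora-to-gguf")


def check_tensor_names(lora_model: dict) -> None:
    # Paraphrase of the existing behaviour: a single tensor that is neither a
    # LoRA weight nor a base-layer weight aborts the whole conversion.
    for name in lora_model:
        if ".base_layer.weight" in name:
            continue
        if not (name.endswith("lora_A.weight") or name.endswith("lora_B.weight")):
            logger.error(f"Unexpected name '{name}': Not a lora_A or lora_B tensor")
            sys.exit(1)
```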

Collaborator

@ngxson ngxson left a comment


It makes no sense to remove this error message.

We show the error because we never want to produce a half-working LoRA GGUF. In your case, there are non-LoRA tensors in the file. This may hint that your safetensors file contains unknown data that is not handled by llama.cpp.

It's better to know exactly what the non-LoRA tensors are in your case, rather than ignoring them.

Please at least post the full output log here.

@Victoran0
Author

INFO:lora-to-gguf:Loading base model: 392a143b624368100f77a3eafaa4a2468ba50a72
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
ERROR:lora-to-gguf:Unexpected name 'base_model.model.lm_head.weight': Not a lora_A or lora_B tensor

Above is the logged error output.
My LoRA adapter is a fine-tuned meta-llama-3.2-3B-instruct.
The target modules in my LoraConfig are ['up_proj', 'k_proj', 'v_proj', 'down_proj', 'q_proj', 'o_proj', 'gate_proj'].
As you can see from the log above, the lm_head weight triggered the sys.exit(1) call, because lm_head is not among the target modules that were fine-tuned.
The script is currently limited to LoRA adapters in which every saved layer was fine-tuned.
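
For context, the adapter was created with a PEFT LoraConfig along these lines (a sketch; r, lora_alpha, and lora_dropout are placeholders, only target_modules matches the list above):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,               # placeholder value
    lora_alpha=32,      # placeholder value
    lora_dropout=0.05,  # placeholder value
    target_modules=["up_proj", "k_proj", "v_proj", "down_proj",
                    "q_proj", "o_proj", "gate_proj"],
    task_type="CAUSAL_LM",
)
# lm_head is not listed in target_modules, so it is not a LoRA target, yet it
# still shows up in the saved checkpoint (see the later comments for why).
```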

@ngxson
Collaborator

ngxson commented Oct 18, 2024

@Victoran0 I'm checking with the TRL team at Hugging Face to find out why. In the meantime, could you share the adapter or the Python code that you used to fine-tune the model?

@Victoran0
Author

@ngxson
Collaborator

ngxson commented Oct 18, 2024

Ok, so the problem is that you used setup_chat_format, which adds more tokens to the tokenizer. That's why it changes the embeddings and lm_head, and this change is currently not supported by llama.cpp.

What I suggest is to remove setup_chat_format, because your base model Llama-3.2-3B-Instruct already has a chat template, so there is no need to set one up.
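
A minimal sketch of the suggested change, assuming a typical trl/transformers fine-tuning script (only the model name comes from this thread; everything else is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
# from trl import setup_chat_format   # no longer needed

model_id = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Removed: model, tokenizer = setup_chat_format(model, tokenizer)
# The instruct base model already ships with a chat template, and adding new
# special tokens here resizes the embeddings and lm_head, which is what puts
# the extra non-LoRA tensors into the saved adapter.
```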

@Victoran0
Author

Okay, I will fix that, retrain, and let you know the result.

@Victoran0
Author

@ngxson, removing setup_chat_format resolved the error. The LoRA adapter's size is now less than 100 MB, excellent!
The fix for #9948 is an optimal solution.
I guess I can close this now.

@Victoran0 Victoran0 closed this Oct 19, 2024
@Victoran0 Victoran0 deleted the patch-2 branch October 19, 2024 18:48