finetune: rename feed-forward tensors (w1/w2/w3) #4839


Merged: 2 commits merged into ggml-org:master on Feb 13, 2024

Conversation

@danbev (Collaborator) commented Jan 9, 2024

This commit renames the feed-forward tensors w1, w2 and w3 to ffn_gate, ffn_down and ffn_up respectively.

The motivation for this change is to make the purpose of the tensors easier to understand. This also appears to be in line with the names used in the llama_layer struct in llama.cpp.
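For illustration only, here is a simplified sketch (not actual llama.cpp code) of a SwiGLU-style feed-forward block, which is the role these tensors play in the LLaMA architecture and why "gate", "down" and "up" describe them better than w1/w2/w3. The struct, function names and toy matrix types below are hypothetical, chosen just for this example:

```cpp
// Sketch of a SwiGLU-style feed-forward block (illustrative, not llama.cpp):
//   w1 -> ffn_gate : gating projection      [n_ff][n_embd]
//   w2 -> ffn_down : down-projection        [n_embd][n_ff]
//   w3 -> ffn_up   : up-projection          [n_ff][n_embd]
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using matrix = std::vector<std::vector<float>>; // row-major [rows][cols]

struct ffn_weights {
    matrix ffn_gate; // was w1
    matrix ffn_down; // was w2
    matrix ffn_up;   // was w3
};

static float silu(float x) { return x / (1.0f + std::exp(-x)); }

// y = W * x for a row-major matrix W
static std::vector<float> matvec(const matrix &w, const std::vector<float> &x) {
    std::vector<float> y(w.size(), 0.0f);
    for (std::size_t i = 0; i < w.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            y[i] += w[i][j] * x[j];
    return y;
}

// out = ffn_down( silu(ffn_gate(x)) * ffn_up(x) )
static std::vector<float> ffn_forward(const ffn_weights &w,
                                      const std::vector<float> &x) {
    std::vector<float> gate = matvec(w.ffn_gate, x);
    std::vector<float> up   = matvec(w.ffn_up, x);
    for (std::size_t i = 0; i < gate.size(); ++i)
        gate[i] = silu(gate[i]) * up[i]; // element-wise gating
    return matvec(w.ffn_down, gate);
}
```

With the old names, `ffn_down(silu(ffn_gate(x)) * ffn_up(x))` reads as the much more opaque `w2(silu(w1(x)) * w3(x))`, which is the readability gain this PR is after.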

@ggerganov ggerganov requested a review from xaedes January 9, 2024 14:24
@xaedes (Collaborator) left a comment


Looks good to me.
For consistency, can you do the same for examples/train-text-from-scratch.cpp inside this PR?

@danbev (Collaborator, Author) commented Jan 10, 2024

> For consistency, can you do the same for examples/train-text-from-scratch.cpp inside this PR?

Absolutely, I'll take a look at that example as well 👍

@danbev (Collaborator, Author) commented Jan 10, 2024

The CI failure does not look related to this PR as far as I can tell.
Would someone with the correct permissions be able to re-run the job in question?

@ggerganov (Member) left a comment


Yeah, sometimes this job fails - restarted it

@ggerganov ggerganov requested a review from xaedes January 11, 2024 21:24
@danbev (Collaborator, Author) commented Jan 15, 2024

@xaedes Would you be able to take a look at the changes to train-text-from-scratch.cpp? Thanks

This commit renames the feed-forward tensors w1, w2 and w3 to ffn_gate,
ffn_down and ffn_up respectively.

The motivation for this change is to make it easier to understand the
purpose of the tensors. This also seems to be in line with the names
used in the llama_layer struct in llama.cpp.

Signed-off-by: Daniel Bevenius <[email protected]>
This commit renames the feed-forward tensors w1, w2 and w3 to ffn_gate,
ffn_down and ffn_up respectively.

The motivation for this change is to make it easier to understand the
purpose of the tensors. This also seems to be in line with the names
used in the llama_layer struct in llama.cpp.

Signed-off-by: Daniel Bevenius <[email protected]>
@danbev danbev force-pushed the finetune-ff-tensor-names branch from 77cfcb4 to 35670e7 Compare February 13, 2024 11:27
@ggerganov ggerganov merged commit 2639789 into ggml-org:master Feb 13, 2024
@danbev danbev deleted the finetune-ff-tensor-names branch February 16, 2024 10:47
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* finetune: rename feed-forward tensors (w1/w2/w3)

This commit renames the feed-forward tensors w1, w2 and w3 to ffn_gate,
ffn_down and ffn_up respectively.

The motivation for this change is to make it easier to understand the
purpose of the tensors. This also seems to be in line with the names
used in the llama_layer struct in llama.cpp.

Signed-off-by: Daniel Bevenius <[email protected]>

* train-text-from-scratch: rename ff tensors

This commit renames the feed-forward tensors w1, w2 and w3 to ffn_gate,
ffn_down and ffn_up respectively.

The motivation for this change is to make it easier to understand the
purpose of the tensors. This also seems to be in line with the names
used in the llama_layer struct in llama.cpp.

Signed-off-by: Daniel Bevenius <[email protected]>

---------

Signed-off-by: Daniel Bevenius <[email protected]>
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
3 participants