Add general name to train #6752


Merged
merged 3 commits into from
Apr 19, 2024

Conversation

teleprint-me (Contributor) commented Apr 18, 2024

This commit adds the model name to a GGML-trained model when using train-text-from-scratch.

19:41:23 | /mnt/valerie/forked/ggerganov/llama.cpp
 git:(add-general-name-to-train | θ) λ python gguf-py/scripts/gguf-dump.py models/valerie/v0.1/ggml-valerie-v0.1-256x32-f32-LATEST.gguf --no-tensors
* Loading: models/valerie/v0.1/ggml-valerie-v0.1-256x32-f32-LATEST.gguf
* File is LITTLE endian, script is running on a LITTLE endian host.

* Dumping 24 key/value pair(s)
      1: UINT32     |        1 | GGUF.version = 3
      2: UINT64     |        1 | GGUF.tensor_count = 147
      3: UINT64     |        1 | GGUF.kv_count = 21
      4: STRING     |        1 | general.architecture = 'llama'
      5: STRING     |        1 | general.name = 'llama'  # Adds the model's name
      6: UINT32     |        1 | general.file_type = 0
      7: UINT32     |        1 | llama.context_length = 256
      8: UINT32     |        1 | llama.embedding_length = 256
      9: UINT32     |        1 | llama.feed_forward_length = 768
     10: UINT32     |        1 | llama.attention.head_count = 8
     11: UINT32     |        1 | llama.block_count = 16
     12: UINT32     |        1 | llama.rope.dimension_count = 32
     13: FLOAT32    |        1 | llama.attention.layer_norm_rms_epsilon = 9.999999747378752e-06
     14: FLOAT32    |        1 | llama.rope.freq_base = 10000.0
     15: FLOAT32    |        1 | llama.rope.scale_linear = 1.0
     16: STRING     |        1 | tokenizer.ggml.model = 'llama'
     17: [FLOAT32]  |    32000 | tokenizer.ggml.scores
     18: [INT32]    |    32000 | tokenizer.ggml.token_type
     19: [STRING]   |    32000 | tokenizer.ggml.tokens
     20: UINT32     |        1 | tokenizer.ggml.bos_token_id = 1
     21: UINT32     |        1 | tokenizer.ggml.eos_token_id = 2
     22: UINT32     |        1 | tokenizer.ggml.unknown_token_id = 0
     23: UINT32     |        1 | tokenizer.ggml.seperator_token_id = 4294967295
     24: UINT32     |        1 | tokenizer.ggml.padding_token_id = 4294967295

This commit simply uses the model's architecture as a base, keeping the changes minimal and simple until I have time to come up with a more customizable approach.
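For context on where `general.name` lives in the file, here is a minimal, stdlib-only sketch of the GGUF v3 key/value layout that `gguf-dump.py` is reporting above: magic, version, tensor count, KV count, then length-prefixed keys and typed values. This is an illustrative reader/writer covering only string-valued pairs (GGUF defines many more value types and a tensor section), not the actual llama.cpp or gguf-py implementation.

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_TYPE_STRING = 8  # GGUF value-type id for UTF-8 strings

def _pack_str(s: str) -> bytes:
    # GGUF strings are a little-endian uint64 length followed by the bytes
    data = s.encode("utf-8")
    return struct.pack("<Q", len(data)) + data

def write_minimal_gguf(kv: dict) -> bytes:
    """Serialize a tensor-free GGUF v3 blob holding only string KV pairs."""
    out = GGUF_MAGIC
    out += struct.pack("<I", 3)        # GGUF.version = 3 (as in the dump above)
    out += struct.pack("<Q", 0)        # GGUF.tensor_count (none in this sketch)
    out += struct.pack("<Q", len(kv))  # GGUF.kv_count
    for key, value in kv.items():
        out += _pack_str(key)
        out += struct.pack("<I", GGUF_TYPE_STRING)
        out += _pack_str(value)
    return out

def read_kv(blob: bytes) -> dict:
    """Parse the string KV pairs back out of a blob produced above."""
    assert blob[:4] == GGUF_MAGIC
    kv_count, = struct.unpack_from("<Q", blob, 16)  # after magic+version+tensor_count
    offset, kv = 24, {}
    for _ in range(kv_count):
        klen, = struct.unpack_from("<Q", blob, offset); offset += 8
        key = blob[offset:offset + klen].decode("utf-8"); offset += klen
        vtype, = struct.unpack_from("<I", blob, offset); offset += 4
        assert vtype == GGUF_TYPE_STRING  # sketch handles strings only
        vlen, = struct.unpack_from("<Q", blob, offset); offset += 8
        kv[key] = blob[offset:offset + vlen].decode("utf-8"); offset += vlen
    return kv

blob = write_minimal_gguf({"general.architecture": "llama",
                           "general.name": "llama"})
print(read_kv(blob)["general.name"])  # -> llama
```

The fallback behavior described above amounts to writing the architecture string into `general.name` when no explicit name is supplied.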

@ggerganov ggerganov merged commit 8b1b1f4 into ggml-org:master Apr 19, 2024
okuvshynov pushed a commit to okuvshynov/llama.cpp that referenced this pull request Apr 22, 2024
* llama : make general.name optional

* train: Add 'general.name' to model metadata

Signed-off-by: teleprint-me <[email protected]>

---------

Signed-off-by: teleprint-me <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
@teleprint-me teleprint-me deleted the add-general-name-to-train branch May 9, 2024 00:06