Update Llama README.md for Stories110M tokenizer (#5960)
Summary:
The tokenizer downloaded via `wget "https://raw.githubusercontent.com/karpathy/llama2.c/master/tokenizer.model"` is a TikToken tokenizer, so there is no need to generate a `tokenizer.bin`; the `tokenizer.model` file can be used as is.
Pull Request resolved: #5960
Reviewed By: tarun292
Differential Revision: D64014160
Pulled By: dvorjackz
fbshipit-source-id: 16474a73ed77192f58a5bb9e07426ba58216351e
(cherry picked from commit 12cb9ca)
Due to the larger vocabulary size of Llama 3, we recommend quantizing the embeddings with `--embedding-quantize 4,32` as shown above to further reduce the model size.
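As a rough sketch of where that flag fits (the export entry point and the checkpoint/params paths below are assumptions based on the ExecuTorch Llama example layout, not part of this commit; only `--embedding-quantize 4,32` comes from the README text above):

```
# Hedged sketch: module path and file paths are assumptions;
# --embedding-quantize 4,32 requests 4-bit embedding quantization
# with a group size of 32, as recommended above for Llama 3.
python -m examples.models.llama2.export_llama \
    --checkpoint /path/to/consolidated.00.pth \
    --params /path/to/params.json \
    --embedding-quantize 4,32
```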