
Commit fbee9be

Update Llama README.md for Stories110M tokenizer
1 parent f005dd5 commit fbee9be


1 file changed: +2 −8 lines changed


examples/models/llama2/README.md

Lines changed: 2 additions & 8 deletions
@@ -205,11 +205,6 @@ If you want to deploy and run a smaller model for educational purposes. From `ex
 ```
 python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -X -kv
 ```
-4. Create tokenizer.bin.
-
-```
-python -m extension.llm.tokenizer.tokenizer -t <tokenizer.model> -o tokenizer.bin
-```
 
 ### Option D: Download and export Llama 2 7B model
 
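
With the conversion step dropped, the Stories110M flow (Option C) now ends at the export command and, per the runner note updated in the last hunk, the runner takes `tokenizer.model` directly. A minimal sketch of the resulting flow, assuming `stories110M.pt`, `params.json`, and `tokenizer.model` were already downloaded in the earlier Option C steps and that the runner was built under `cmake-out/` as described later in the README; the `<model pte file>` placeholder and the prompt are illustrative:

```
# Export Stories110M to a .pte file (XNNPACK delegate, KV cache enabled).
python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -X -kv

# Run the exported model; stories models no longer need a tokenizer.bin
# conversion, so tokenizer.model is passed to the runner as-is.
cmake-out/examples/models/llama2/llama_main --model_path=<model pte file> --tokenizer_path=tokenizer.model --prompt="Once upon a time"
```
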
@@ -224,7 +219,6 @@ You can export and run the original Llama 2 7B model.
 python -m examples.models.llama2.export_llama --checkpoint <checkpoint.pth> --params <params.json> -kv --use_sdpa_with_kv_cache -X -qmode 8da4w --group_size 128 -d fp32
 ```
 4. Create tokenizer.bin.
-
 ```
 python -m extension.llm.tokenizer.tokenizer -t <tokenizer.model> -o tokenizer.bin
 ```
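
For the Llama 2 7B path (Option D), the `tokenizer.bin` step is kept; only a stray blank line is removed. A sketch of the retained sequence, with `<checkpoint.pth>`, `<params.json>`, and `<tokenizer.model>` standing in for the downloaded Llama 2 artifacts as elsewhere in the README:

```
# Export Llama 2 7B: KV cache, SDPA with KV cache, XNNPACK delegate,
# 8da4w quantization with group size 128, fp32 dtype.
python -m examples.models.llama2.export_llama --checkpoint <checkpoint.pth> --params <params.json> -kv --use_sdpa_with_kv_cache -X -qmode 8da4w --group_size 128 -d fp32

# Convert tokenizer.model into the tokenizer.bin the runner expects for Llama 2.
python -m extension.llm.tokenizer.tokenizer -t <tokenizer.model> -o tokenizer.bin
```
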
@@ -286,7 +280,7 @@ tokenizer.path=<path_to_checkpoint_folder>/tokenizer.model
 
 Using the same arguments from above
 ```
-python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model> -d fp32 --max_seq_len <max sequence length> --limit <number of samples>
+python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model/bin> -d fp32 --max_seq_len <max sequence length> --limit <number of samples>
 ```
 
 The Uncyclotext results generated above used: `{max_seq_len: 2048, limit: 1000}`
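
The eval command now documents that `-t` accepts either tokenizer format. A concrete instance using the `{max_seq_len: 2048, limit: 1000}` settings quoted above; the checkpoint, params, and tokenizer paths remain placeholders:

```
# Perplexity eval; -t takes either tokenizer.model or a converted tokenizer.bin.
python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model/bin> -d fp32 --max_seq_len 2048 --limit 1000
```
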
@@ -332,7 +326,7 @@ Note for Mac users: There's a known linking issue with Xcode 15.1. Refer to the
 cmake-out/examples/models/llama2/llama_main --model_path=<model pte file> --tokenizer_path=<tokenizer.model> --prompt=<prompt>
 ```
 
-For Llama2 and stories models, pass the converted `tokenizer.bin` file instead of `tokenizer.model`.
+For Llama2 models, pass the converted `tokenizer.bin` file instead of `tokenizer.model`.
 
 To build for CoreML backend and validate on Mac, replace `-DEXECUTORCH_BUILD_XNNPACK=ON` with `-DEXECUTORCH_BUILD_COREML=ON`
 
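The runner note now scopes the `tokenizer.bin` requirement to Llama 2 checkpoints. A sketch of the invocation this implies, assuming the runner binary path shown in the hunk above and treating the model file and prompt as placeholders:

```
# Llama 2 models: pass the converted tokenizer.bin instead of tokenizer.model.
cmake-out/examples/models/llama2/llama_main --model_path=<model pte file> --tokenizer_path=tokenizer.bin --prompt=<prompt>
```

Stories models, as in the Option C sketch earlier, keep `tokenizer.model` as-is.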