
Commit df56c5d

lucylq authored and facebook-github-bot committed
Llama stories example
Summary: Add llama stories example to readme.

Differential Revision: D53827636
1 parent 1689ed8 commit df56c5d

File tree

1 file changed: +32 −2 lines


examples/models/llama2/README.md

Lines changed: 32 additions & 2 deletions
# Instructions:
### Setup
1. Follow the [tutorial](https://pytorch.org/executorch/stable/getting-started-setup) to set up ExecuTorch.
2. `cd examples/third-party/llama`
3. `pip install -e .`
4. Go back to the `executorch` root and run `bash examples/models/llama2/install_requirements.sh`.

### Export llama2 models
1. From the `executorch` root, run `python3 -m examples.models.llama2.export_llama`. The exported program, `llama2.pte`, will be saved in the current directory using the dummy checkpoint.
2. Llama2 pretrained parameters can be downloaded [here](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and exported with `python3 -m examples.models.llama2.export_llama --checkpoint <checkpoint.pth> --params <params.json>`.

### Export and run stories110M model
1. Download `stories110M.pt` and `tokenizer.model` from GitHub.
```
wget "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt"
wget "https://raw.githubusercontent.com/karpathy/llama2.c/master/tokenizer.model"
```
2. Create a params file.
```
echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json
```
3. Export the model. Export options are listed [here](https://github.com/pytorch/executorch/blob/main/examples/models/llama2/export_llama_lib.py#L161).
```
python3 -m examples.models.llama2.export_llama -c stories110M.pt -p params.json
```
4. Create `tokenizer.bin`.
```
buck2 run examples/models/llama2/tokenizer:tokenizer_py -- -t tokenizer.model -o tokenizer.bin
```
5. Run the model. Run options are listed [here](https://github.com/pytorch/executorch/blob/main/examples/models/llama2/main.cpp#L13).
```
buck2 run examples/models/llama2:main -- --model_path=llama2.pte --tokenizer_path=tokenizer.bin --prompt="Once"
```
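Conceptually, the runner invoked in step 5 loops: feed the tokens generated so far to the model, take the highest-scoring next token, append it, and stop at end-of-sequence or a token budget. Below is a toy greedy-decoding sketch of that loop in pure Python with a stand-in model; the real runner is the C++ `main.cpp` linked in step 5, and the function and vocabulary here are invented for illustration.

```python
# Toy greedy-decode loop, illustrating what a llama runner does conceptually.
def greedy_decode(prompt_tokens, step_fn, eos_id, max_new_tokens=8):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = step_fn(tokens)  # model forward pass: scores over the vocab
        next_tok = max(range(len(logits)), key=logits.__getitem__)  # argmax
        tokens.append(next_tok)
        if next_tok == eos_id:  # stop once the model emits end-of-sequence
            break
    return tokens

# Stand-in "model" over a 5-token vocab: always prefers (last token + 1) % 5.
def fake_step(tokens):
    return [1.0 if i == (tokens[-1] + 1) % 5 else 0.0 for i in range(5)]

print(greedy_decode([0], fake_step, eos_id=4))  # [0, 1, 2, 3, 4]
```

A real runner differs mainly in that the forward pass uses a KV cache (feeding only the newest token each step) and may sample with a temperature instead of taking the argmax.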

See the test script [here](https://github.com/pytorch/executorch/blob/main/.ci/scripts/test_llama.sh).
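The params file written in step 2 is plain JSON, so it can also be generated and sanity-checked from Python before exporting. A minimal stdlib-only sketch (the field values are copied from the `echo` command above; the divisibility check is a generic transformer sanity check, not something the export script requires):

```python
import json

# Hyperparameters for stories110M, copied from the echo command in step 2.
params = {
    "dim": 768,
    "multiple_of": 32,
    "n_heads": 12,
    "n_layers": 12,
    "norm_eps": 1e-05,
    "vocab_size": 32000,
}

with open("params.json", "w") as f:
    json.dump(params, f)

# Basic consistency check: the embedding dim must split evenly across
# attention heads (here, 768 / 12 = 64 per head).
with open("params.json") as f:
    loaded = json.load(f)
assert loaded["dim"] % loaded["n_heads"] == 0
```

Writing the file this way is equivalent to the `echo` one-liner, but fails early with a clear error if the JSON is malformed or the head split does not work out.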
