Instructions, as suggested by @Orion. (Consider creating a version with the text interspersed as a Google Colab notebook and linking it here at the top.)

```
python3 -m pip install --user virtualenv
python3 -m virtualenv .llama-fast
source .llama-fast/bin/activate
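# Optional sanity check (an addition, not in the original steps): confirm the
# virtualenv's Python is the one on PATH before installing anything into it.
which python3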
git clone https://github.com/pytorch/torchat.git
cd torchat
git submodule sync
git submodule update --init
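# Optional check (an addition): list submodule status; a leading "-" means a
# submodule was not initialized by the previous step.
git submodule status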

# If you need the PyTorch nightlies:
pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
# Otherwise:
# pip3 install torch
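# Optional verification (an addition): confirm torch imports and print its
# version, to make sure the intended build (nightly or stable) was installed.
python3 -c "import torch; print(torch.__version__)"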

pip install sentencepiece huggingface_hub
# Eventually this should be (once Dave publishes the PyPI packages):
# pip install sentencepiece huggingface_hub executorch
# I had some issues with the pytorch submodule not downloading from ExecuTorch - not sure why.

# To download the Llama 2 models, go to https://huggingface.co/meta-llama/Llama-2-7b and follow the steps there to obtain access.

# Once approved, log in with:
huggingface-cli login
# You will be asked for a token from https://huggingface.co/settings/tokens
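# Optional check (an addition): confirm the token was accepted; `whoami`
# prints your Hugging Face username once you are logged in.
huggingface-cli whoami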

# Set the model name and paths for stories15M as an example to test things on desktop and mobile.
MODEL_NAME=stories15M
MODEL_PATH=checkpoints/${MODEL_NAME}/stories15M.pt
MODEL_DIR=~/llama-fast-exports
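# If scripts/prepare.sh does not fetch stories15M for you, one possible way to
# get it (an assumption on my part - the checkpoint is hosted in the
# karpathy/tinyllamas repo on Hugging Face; adjust the URL if the repo
# documents another source):
# mkdir -p checkpoints/${MODEL_NAME}
# wget -O ${MODEL_PATH} https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.pt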

# Could we make this stories15M instead?
export MODEL_DOWNLOAD=meta-llama/Llama-2-7b-chat-hf
./scripts/prepare.sh $MODEL_DOWNLOAD
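# Quick sanity check (an addition) before generating: make sure the checkpoint
# landed where MODEL_PATH expects it.
test -f ${MODEL_PATH} && echo "checkpoint found" || echo "checkpoint missing at ${MODEL_PATH}"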
python generate.py --compile --checkpoint-path ${MODEL_PATH} --prompt "Hello, my name is" --device cpu  # pick one of: cuda, cpu, mps

```