@@ -56,32 +56,32 @@ source .venv/bin/activate

[shell default]: ./install_requirements.sh

- Installations can be tested by
+ Installations can be tested by running

```bash
# ensure everything installed correctly
python3 torchchat.py --help
```

### Download Weights
- Most models use HuggingFace as the distribution channel, so you will need to create a HuggingFace account.
+ Most models use Hugging Face as the distribution channel, so you will need to create a Hugging Face account.

[prefix default]: HF_TOKEN="${SECRET_HF_TOKEN_PERIODIC}"
- Create a HuggingFace user access token [as documented here](https://huggingface.co/docs/hub/en/security-tokens) with the `write` role.
- Log into huggingface:
+ Create a Hugging Face user access token [as documented here](https://huggingface.co/docs/hub/en/security-tokens) with the `write` role.
+ Log into Hugging Face:
```
huggingface-cli login
```
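If you need a non-interactive login (for example in CI, which the `HF_TOKEN` prefix above hints at), the token can also be passed directly on the command line; a minimal sketch, assuming `HF_TOKEN` already holds a valid token:

```bash
# log in without the interactive prompt; assumes HF_TOKEN is exported in the environment
huggingface-cli login --token "$HF_TOKEN"
```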

Once this is done, torchchat will be able to download model artifacts from
- HuggingFace.
+ Hugging Face.

```
python3 torchchat.py download llama3
```

- *NOTE: This command may prompt you to request access to llama3 via
- HuggingFace, if you do not already have access. Simply follow the
+ *NOTE: This command may prompt you to request access to Llama 3 via
+ Hugging Face, if you do not already have access. Simply follow the
prompts and re-run the command when access is granted.*

View available models with:
@@ -99,9 +99,10 @@ Finally, you can also remove downloaded models with the remove command:


## Running via PyTorch / Python
- [Follow the installation steps if you haven't](#installation)
+ [Follow the installation steps](#installation) if you haven't already.

### Chat
+ This mode lets you chat with an LLM interactively.
[skip default]: begin
```bash
# Llama 3 8B Instruct
@@ -112,6 +113,7 @@ python3 torchchat.py chat llama3
For more information run `python3 torchchat.py chat --help`

### Generate
+ This mode generates text based on an input prompt.
```bash
python3 torchchat.py generate llama3 --prompt "write me a story about a boy and his bear"
```
@@ -120,7 +122,7 @@ For more information run `python3 torchchat.py generate --help`


### Browser
-
+ This mode provides access to the model via a web interface served on localhost in your browser.
[skip default]: begin
```
python3 torchchat.py browser llama3
@@ -143,7 +145,7 @@ conversation.
## Desktop/Server Execution

### AOTI (AOT Inductor)
- AOT compiles models before execution for faster inference
+ AOT compiles models before execution for faster inference (read more about AOTI [here](https://pytorch.org/blog/pytorch2-2/)).

The following example exports and executes the Llama3 8B Instruct
model. The first command performs the actual export, the second
@@ -179,7 +181,7 @@ cmake-out/aoti_run exportedModels/llama3.so -z `python3 torchchat.py where llama

## Mobile Execution

- ExecuTorch enables you to optimize your model for execution on a
+ [ExecuTorch](https://github.com/pytorch/executorch) enables you to optimize your model for execution on a
mobile or embedded device, but can also be used on desktop for
testing.

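As a rough sketch of that flow (the flag names below are assumptions modeled on the desktop export example above; check `python3 torchchat.py export --help` for the exact spelling in your checkout):

```bash
# export the model to an ExecuTorch .pte program (--output-pte-path is an assumed flag name)
python3 torchchat.py export llama3 --output-pte-path llama3.pte

# sanity-check the exported program on desktop before deploying to a device
python3 torchchat.py generate llama3 --pte-path llama3.pte --prompt "Hello, my name is"
```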