* executable README
* fix title of CI workflow
* markup commands in markdown
* extend the markup-markdown language
* Automatically identify cuda from nvidia-smi in install-requirements (#606)
* Automatically identify cuda from nvidia-smi in install-requirements
* Update README.md
---------
Co-authored-by: Michael Gschwind <[email protected]>
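The CUDA auto-detection change can be sketched as a small probe that parses the `CUDA Version:` field from the `nvidia-smi` banner. This is a hypothetical helper for illustration, not the actual install-requirements script; the function name and regex are assumptions:

```python
import re
import subprocess

def detect_cuda_version():
    """Best-effort CUDA version probe via nvidia-smi; returns e.g. '12.2' or None."""
    try:
        out = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=True
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None  # no NVIDIA driver / nvidia-smi not on PATH
    # The banner contains a field like "CUDA Version: 12.2"
    match = re.search(r"CUDA Version:\s*([\d.]+)", out)
    return match.group(1) if match else None
```

On machines without an NVIDIA driver the helper simply returns `None`, so an installer can fall back to a CPU-only requirements set.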
* Unbreak zero-temperature sampling (#599)
Fixes #581.
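Zero-temperature sampling breaks if logits are divided by the temperature before softmax; the usual fix is to special-case temperature 0 as greedy argmax. A minimal, dependency-free sketch of that behavior (not torchchat's actual sampler):

```python
import math
import random

def sample(logits, temperature=1.0, rng=random):
    """Pick a token index from raw logits.

    temperature == 0 would mean dividing by zero, so fall back to
    deterministic greedy decoding (argmax) instead.
    """
    if temperature <= 0:
        return max(range(len(logits)), key=logits.__getitem__)
    scaled = [l / temperature for l in logits]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

With `temperature=0` the call is deterministic, e.g. `sample([0.1, 5.0, 0.2], temperature=0)` always returns index 1.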
* Improve process README
* [retake] Add sentencepiece tokenizer (#626)
* Add sentencepiece tokenizer
* Add white space
* Handle white space:
* Handle control ids
* More cleanup
* Lint
* Use unique_ptr
* Use a larger runner
* Debug
* Debug
* Cleanup
* Update install_utils.sh to use python3 instead of python (#636)
As titled. On some devices `python` and `python3` point to different environments, so it is good to unify on `python3`.
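The same interpreter-resolution idea, sketched in Python rather than the shell used by install_utils.sh (a hypothetical helper; the function name is an assumption):

```python
import shutil

def resolve_python():
    """Prefer python3; a bare `python` may point at a different environment."""
    for name in ("python3", "python"):
        path = shutil.which(name)
        if path:
            return path
    raise RuntimeError("no Python interpreter found on PATH")
```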
* Fix quantization doc to specify dtype limitation on a8w4dq (#629)
Co-authored-by: Kimish Patel <[email protected]>
* add desktop.json (#622)
* add desktop.json
* add fast
* remove embedding
* improvements
* update readme from doc branch
* tab/spc
* fix errors in updown language
* fix errors in updown language, and [skip]: begin/end
* fix errors in updown language, and [skip]: begin/end
* a storied run
* stories run on readme instructions does not need HF token
* increase timeout
* check for hang in hf_login
* executable README improvements
* typo
* typo
---------
Co-authored-by: Ian Barber <[email protected]>
Co-authored-by: Scott Wolchok <[email protected]>
Co-authored-by: Mengwei Liu <[email protected]>
Co-authored-by: Kimish Patel <[email protected]>
Co-authored-by: Scott Roy <[email protected]>
README.md: 18 additions & 8 deletions
````diff
@@ -80,15 +80,18 @@ HuggingFace.
 python3 torchchat.py download llama3
 ```
 
-*NOTE: This command may prompt you to request access to llama3 via HuggingFace, if you do not already have access. Simply follow the prompts and re-run the command when access is granted.*
+*NOTE: This command may prompt you to request access to llama3 via
+HuggingFace, if you do not already have access. Simply follow the
+prompts and re-run the command when access is granted.*
 
 View available models with:
 ```
 python3 torchchat.py list
 ```
 
+You can also remove downloaded models with the remove command:
+`python3 torchchat.py remove llama3`
 
-You can also remove downloaded models with the remove command: `python3 torchchat.py remove llama3`
 
 
 ## Running via PyTorch / Python
@@ -111,15 +114,15 @@ python3 torchchat.py generate llama3 --prompt "write me a story about a boy and
 
 For more information run `python3 torchchat.py generate --help`
 
-[end default]:
 
 ### Browser
 
-[shell default]: if false; then
+[skip default]: begin
 ```
 python3 torchchat.py browser llama3
 ```
-[shell default]: fi
+[skip default]: end
+
 
 *Running on http://127.0.0.1:5000* should be printed out on the
 terminal. Click the link or go to
@@ -139,9 +142,15 @@ conversation.
 AOT compiles models before execution for faster inference
 
 The following example exports and executes the Llama3 8B Instruct
+<<<<<<< HEAD
+model. The first command performs the actual export, the second
+command loads the exported model into the Python interface to enable
+users to test the exported model.
+=======
 model. (The first command performs the actual export, the second
 command loads the exported model into the Python interface to enable
````