```
{"response":" I'm a software developer with a passion for building innovative and user-friendly applications. I have experience in developing web and mobile applications using various technologies such as Java, Python, and JavaScript. I'm always looking for new challenges and opportunities to learn and grow as a developer.\n\nIn my free time, I enjoy reading books on computer science and programming, as well as experimenting with new technologies and techniques. I'm also interested in machine learning and artificial intelligence, and I'm always looking for ways to apply these concepts to real-world problems.\n\nI'm excited to be a part of the developer community and to have the opportunity to share my knowledge and experience with others. I'm always happy to help with any questions or problems you may have, and I'm looking forward to learning from you as well.\n\nThank you for visiting my profile! I hope you find my information helpful and interesting. If you have any questions or would like to discuss any topics, please feel free to reach out to me. I"}
```
</details>

### Browser

This mode serves an interactive chat with your model through a [Streamlit](https://streamlit.io/) app on localhost. Streamlit should already be installed by the `install_requirements.sh` script.

Running the command below automatically opens a tab in your browser.
```
streamlit run torchchat.py -- browser <model_name> <model_args>
```

```
python3 torchchat.py generate llama3 --device cpu --pte-path llama3.pte --prompt "Hello my name is"
```
> [!NOTE]
> We use `--quantize config/data/mobile.json` to quantize the
> llama3 model to reduce model size and improve performance for
> on-device use cases.
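
A quantization config like `config/data/mobile.json` is a small JSON file mapping quantization schemes to their settings. The snippet below is only an illustrative sketch of the shape such a file takes — the scheme names and values here are assumptions, and the authoritative contents live in the file itself:

```json
{
  "embedding": {"bitwidth": 4, "groupsize": 32},
  "linear:a8w4dq": {"groupsize": 256}
}
```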
For more details on quantization and what settings to use for your use case, visit our [Quantization documentation](docs/quantization.md) or run `python3 torchchat.py export`.

[end default]: end
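
To build intuition for what group-wise quantization does, here is a minimal pure-Python sketch — an illustration of the general technique, not torchchat's actual implementation: each group of weights shares one float scale, and each weight is stored as a small signed integer.

```python
def quantize_group(weights, bits=4):
    """Map a group of floats to signed ints sharing one scale."""
    qmax = 2 ** (bits - 1) - 1                    # 7 for 4-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0
    return scale, [round(w / scale) for w in weights]

def dequantize_group(scale, quants):
    """Recover approximate floats from the shared scale and ints."""
    return [scale * q for q in quants]

group = [0.12, -0.48, 0.31, 0.05]
scale, quants = quantize_group(group)
restored = dequantize_group(scale, quants)
# Every restored weight is within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(group, restored))
```

Storing 4-bit integers plus one scale per group is what shrinks the model; the price is the small, bounded rounding error shown above.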
The following assumes you've completed the steps for [Setting up ExecuTorch](#set-up-executorch).
<details>
<summary>Deploying with Xcode</summary>
#### Requirements

- Xcode 15.0 or later
- A development provisioning profile with the [`increased-memory-limit`](https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_developer_kernel_increased-memory-limit) entitlement.

#### Steps

1. Open the Xcode project:
<img src="https://pytorch.org/executorch/main/_static/img/llama_ios_app.png" width="600" alt="iOS app running a LlaMA model">
</a>

</details>
### Deploy and run on Android

The following assumes you've completed the steps for [Setting up ExecuTorch](#set-up-executorch). In torchchat, we show 2 approaches for Android deployment:

<details>
<summary>Approach 1 (Recommended): Android Studio</summary>

If you have Android Studio set up, and you have Java 17 and Android SDK 34 configured, you can follow this step.
Currently the tokenizer is built at compile time, so you need to rebuild the app whenever you want to use a different tokenizer for a different model.

> [!NOTE]
> The script to build the AAR can be found [here](https://github.com/pytorch/executorch/blob/main/build/build_android_library.sh). If you need to tweak the tokenizer or runtime (for example, to use your own tokenizer or runtime library), you can modify the ExecuTorch code and use that script to build the AAR library.
<img src="https://pytorch.org/executorch/main/_static/img/android_llama_app.png" width="600" alt="Android app running a LlaMA model">

</details>

<details>
<summary>Approach 2: E2E Script</summary>
Alternatively, you can run `scripts/android_example.sh` which sets up Java, Android SDK Manager, Android SDK, Android emulator (if no physical device is found), builds the app, and launches it for you. It can be used if you don't have a GUI.

```
export USE_TIKTOKEN=ON # Set this only for tiktoken tokenizer
sh scripts/android_example.sh
```

</details>

## Eval

Uses the `lm_eval` library to evaluate model accuracy on a variety of tasks. Defaults to wikitext and can be manually controlled using the tasks and limit args.
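
As a quick illustration of the metric the wikitext task reports: perplexity is the exponential of the average negative log-likelihood per token. A toy stdlib sketch of the definition (illustrative only, not `lm_eval` code):

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood per token."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model assigning probability 0.25 to every token is exactly as
# uncertain as a uniform 4-way guess, so its perplexity is 4.
assert abs(perplexity([math.log(0.25)] * 8) - 4.0) < 1e-9
```

Lower perplexity on wikitext means the model assigns higher probability to the held-out text; the limit arg caps how many samples are scored.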
See [Evaluation](docs/evaluation.md)
For more information run `python3 torchchat.py eval --help`