
Commit 73244a9

kirklandsign authored and facebook-github-bot committed

Add LLM subpages to navi (#5475)

Summary: Pull Request resolved: #5475. Reviewed By: guangy10. Differential Revision: D62992589. Pulled By: kirklandsign. fbshipit-source-id: ddfa3aa4c034326cfc776dacc909335fe43b3071

1 parent 8ef6c79; commit 73244a9

18 files changed: +19 −155 lines

docs/source/index.rst (3 additions, 0 deletions)

@@ -117,6 +117,9 @@ Topics in this section will help you get started with ExecuTorch.
    :hidden:

    llm/getting-started
+   llm/llama-demo-android
+   llm/build-run-llama3-qualcomm-ai-engine-direct-backend
+   llm/llama-demo-ios

 .. toctree::
    :glob:

docs/source/llm/llama-demo-android.md (1 addition, 140 deletions)

Removed content:

# ExecuTorch Llama Android Demo App

We’re excited to share that the newly revamped Android demo app is live, with many updates that provide a more intuitive and smoother chat experience. The primary goal of this app is to showcase how easily ExecuTorch can be integrated into an Android app and how to exercise the many features ExecuTorch and Llama models offer.

This app serves as a valuable resource to inspire your creativity and provide foundational code that you can customize and adapt for your particular use case.

Dive in and start exploring the demo app today! We look forward to your feedback and are excited to see your innovative ideas.

## Key Concepts

From this demo app, you will learn key concepts such as:
* How to prepare Llama models, build the ExecuTorch library, and run model inference across delegates
* How to expose the ExecuTorch library via a JNI layer
* Current ExecuTorch app-facing capabilities

The goal is for you to see the type of support ExecuTorch provides and feel comfortable leveraging it for your own use cases.

## Supported Models

The models this app supports (varying by delegate) are:
* Llama 3.1 8B
* Llama 3 8B
* Llama 2 7B
* LLaVA-1.5 vision model (XNNPACK only)

## Building the APK

First, note that ExecuTorch currently provides support across three delegates. Once you identify the delegate of your choice, follow the README link for complete end-to-end instructions, from environment setup to exporting the models, building the ExecuTorch libraries, and running the app on device:

| Delegate | Resource |
| ------------- | ------------- |
| XNNPACK (CPU-based library) | [link](docs/delegates/xnnpack_README.md) |
| QNN (Qualcomm AI Accelerators) | [link](docs/delegates/qualcomm_README.md) |
| MediaTek (MediaTek AI Accelerators) | [link](docs/delegates/mediatek_README.md) |
## How to Use the App

This section covers the main steps to use the app, along with code snippets of the ExecuTorch API.

For loading the app, development, and running on device, we recommend Android Studio:
1. Open Android Studio and select "Open an existing Android Studio project" to open examples/demo-apps/android/LlamaDemo.
2. Run the app (^R). This builds and launches the app on the phone.

### Opening the App

Below are the UI features of the app.

Select the settings widget to get started with picking a model, its parameters, and any prompts.
<p align="center">
<img src="../_static/img/opening_the_app_details.png" width=800>
</p>

### Select Models and Parameters

Once you've selected the model, tokenizer, and model type, click "Load Model" to have the app load the model and return to the main Chat activity.
<p align="center">
<img src="../_static/img/settings_menu.png" width=300>
</p>

Optional parameters:
* Temperature: Defaults to 0; you can adjust the temperature for the model. The model reloads upon any adjustment.
* System Prompt: Without any formatting, you can enter a system prompt, for example "you are a travel assistant" or "give me a response in a few sentences".
* User Prompt: For more advanced users, you can manually edit the prompt by modifying `{{user prompt}}`, and you can modify the special tokens as well. Once changed, return to the main Chat activity to send.
> [!TIP]
> Helpful ExecuTorch API in app

```java
// Upon returning to the Main Chat Activity
mModule = new LlamaModule(
        ModelUtils.getModelCategory(mCurrentSettingsFields.getModelType()),
        modelPath,
        tokenizerPath,
        temperature);
int loadResult = mModule.load();
```

* `modelCategory`: Indicates whether it’s a text-only or vision model
* `modelPath`: Path to the .pte file
* `tokenizerPath`: Path to the tokenizer .bin file
* `temperature`: Model parameter to adjust the randomness of the model’s output

The file's single added line replaces all of the above (and the sections that follow) with a MyST include of the demo app README:

```{include} ../../../examples/demo-apps/android/LlamaDemo/README.md
```
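As an editorial aside, the `modelCategory` argument above distinguishes text-only from vision models. A minimal standalone sketch of that kind of mapping follows; the `ModelType` enum and `getModelCategory` helper here are simplified stand-ins, not the real app classes or the actual ExecuTorch API.

```java
// Hypothetical stand-ins for the demo app's ModelType/ModelUtils mapping;
// illustrative only, not the actual ExecuTorch classes.
public class ModelCategoryDemo {
    enum ModelType { LLAMA_2, LLAMA_3, LLAMA_3_1, LLAVA_1_5 }

    static final int TEXT_MODEL = 1;
    static final int VISION_MODEL = 2;

    // Vision models (LLaVA) take a different runner path than text-only Llama.
    static int getModelCategory(ModelType type) {
        return type == ModelType.LLAVA_1_5 ? VISION_MODEL : TEXT_MODEL;
    }

    public static void main(String[] args) {
        System.out.println("llama3 -> " + getModelCategory(ModelType.LLAMA_3));
        System.out.println("llava -> " + getModelCategory(ModelType.LLAVA_1_5));
    }
}
```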
### User Prompt

Once the model is successfully loaded, enter any prompt and click the send (i.e. generate) button to send it to the model.
<p align="center">
<img src="../_static/img/load_complete_and_start_prompt.png" width=300>
</p>

You can ask follow-up questions as well.
<p align="center">
<img src="../_static/img/chat.png" width=300>
</p>

> [!TIP]
> Helpful ExecuTorch API in app
```java
mModule.generate(prompt, sequence_length, MainActivity.this);
```
* `prompt`: User-formatted prompt
* `sequence_length`: Number of tokens to generate in response to a prompt
* `MainActivity.this`: Indicates that the callback functions (onResult(), onStats()) are present in this class
*LLaVA-1.5: XNNPACK delegate only*

For LLaVA-1.5, select the exported LLaVA .pte and tokenizer file in the Settings menu and load the model. You can then send an image from your gallery, or take a live picture, along with a text prompt to the model.

<p align="center">
<img src="../_static/img/llava_example.png" width=300>
</p>
### Output Generated

To show completion of the follow-up question, here is the complete detailed response from the model.
<p align="center">
<img src="../_static/img/chat_response.png" width=300>
</p>

> [!TIP]
> Helpful ExecuTorch API in app

Ensure the following functions are present in the callback class you provided to `mModule.generate()`. In this example, it is `MainActivity.this`.
```java
@Override
public void onResult(String result) {
    // ...result contains a token from the response
    // ...onResult will continue to be invoked until the response is complete
}

@Override
public void onStats(float tps) {
    // ...tps (tokens per second) stats are provided by the framework
}
```
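As an aside, the onResult/onStats contract above can be sketched as a standalone program. `LlamaCallback` and `FakeModule` below are illustrative stand-ins for the real ExecuTorch classes, showing only the streaming shape of the API: tokens arrive one at a time via onResult, then onStats reports throughput.

```java
// Standalone sketch of the streaming-callback pattern behind generate().
// LlamaCallback and FakeModule are hypothetical stand-ins, not the real
// ExecuTorch API; the real module runs inference instead of a canned list.
public class CallbackDemo {
    interface LlamaCallback {
        void onResult(String token);
        void onStats(float tps);
    }

    static class FakeModule {
        void generate(String prompt, int seqLen, LlamaCallback cb) {
            String[] tokens = {"Hello", ", ", "world", "!"};
            // Stream tokens one at a time, as the app's onResult receives them.
            for (int i = 0; i < Math.min(seqLen, tokens.length); i++) {
                cb.onResult(tokens[i]);
            }
            cb.onStats(42.0f); // tokens-per-second stat, reported once at the end
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        new FakeModule().generate("hi", 8, new LlamaCallback() {
            @Override public void onResult(String t) { out.append(t); }
            @Override public void onStats(float tps) { System.out.println("tps=" + tps); }
        });
        System.out.println(out);
    }
}
```

The app accumulates tokens the same way, appending each onResult payload to the visible chat response.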
## Reporting Issues

If you encounter any bugs or issues following this tutorial, please file a bug/issue on [GitHub](https://github.com/pytorch/executorch/issues/new).

examples/demo-apps/android/LlamaDemo/README.md (9 additions, 9 deletions)
@@ -28,9 +28,9 @@ First it’s important to note that currently ExecuTorch provides support across

 | Delegate | Resource |
 | ------------- | ------------- |
-| XNNPACK (CPU-based library) | [link](docs/delegates/xnnpack_README.md) |
-| QNN (Qualcomm AI Accelerators) | [link](docs/delegates/qualcomm_README.md) |
-| MediaTek (MediaTek AI Accelerators) | [link](docs/delegates/mediatek_README.md) |
+| XNNPACK (CPU-based library) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md) |
+| QNN (Qualcomm AI Accelerators) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/qualcomm_README.md) |
+| MediaTek (MediaTek AI Accelerators) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/mediatek_README.md) |

 ## How to Use the App
@@ -46,7 +46,7 @@ Below are the UI features for the app.

 Select the settings widget to get started with picking a model, its parameters and any prompts.
 <p align="center">
-<img src="docs/screenshots/opening_the_app_details.png" width=800>
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/opening_the_app_details.png" width=800>
 </p>
@@ -55,7 +55,7 @@ Select the settings widget to get started with picking a model, its parameters a

 Once you've selected the model, tokenizer, and model type you are ready to click on "Load Model" to have the app load the model and go back to the main Chat activity.
 <p align="center">
-<img src="docs/screenshots/settings_menu.png" width=300>
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/settings_menu.png" width=300>
 </p>
@@ -87,12 +87,12 @@ int loadResult = mModule.load();
 ### User Prompt
 Once model is successfully loaded then enter any prompt and click the send (i.e. generate) button to send it to the model.
 <p align="center">
-<img src="docs/screenshots/load_complete_and_start_prompt.png" width=300>
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/load_complete_and_start_prompt.png" width=300>
 </p>

 You can provide it more follow-up questions as well.
 <p align="center">
-<img src="docs/screenshots/chat.png" width=300>
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/chat.png" width=300>
 </p>

 > [!TIP]
@@ -109,14 +109,14 @@ mModule.generate(prompt,sequence_length, MainActivity.this);
 For LLaVA-1.5 implementation, select the exported LLaVA .pte and tokenizer file in the Settings menu and load the model. After this you can send an image from your gallery or take a live picture along with a text prompt to the model.

 <p align="center">
-<img src="docs/screenshots/llava_example.png" width=300>
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/llava_example.png" width=300>
 </p>

 ### Output Generated
 To show completion of the follow-up question, here is the complete detailed response from the model.
 <p align="center">
-<img src="docs/screenshots/chat_response.png" width=300>
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/chat_response.png" width=300>
 </p>

 > [!TIP]
7 binary files (screenshots) changed — not shown.

examples/demo-apps/apple_ios/LLaMA/README.md (6 additions, 6 deletions)
@@ -27,8 +27,8 @@ First it’s important to note that currently ExecuTorch provides support across

 | Delegate | Resource |
 | ------------------------------ | --------------------------------- |
-| XNNPACK (CPU-based library) | [link](docs/delegates/xnnpack_README.md)|
-| MPS (Metal Performance Shader) | [link](docs/delegates/mps_README.md) |
+| XNNPACK (CPU-based library) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/docs/delegates/xnnpack_README.md)|
+| MPS (Metal Performance Shader) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/docs/delegates/mps_README.md) |

 ## How to Use the App
 This section will provide the main steps to use the app, along with a code snippet of the ExecuTorch API.
@@ -58,11 +58,11 @@ For more details integrating and Running ExecuTorch on Apple Platforms, checkout
 * Ensure that the ExecuTorch package dependencies are installed correctly, then select which ExecuTorch framework should link against which target.

 <p align="center">
-<img src="docs/screenshots/ios_demo_app_swift_pm.png" alt="iOS LLaMA App Swift PM" width="600">
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/ios_demo_app_swift_pm.png" alt="iOS LLaMA App Swift PM" width="600">
 </p>

 <p align="center">
-<img src="docs/screenshots/ios_demo_app_choosing_package.png" alt="iOS LLaMA App Choosing package" width="600">
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/ios_demo_app_choosing_package.png" alt="iOS LLaMA App Choosing package" width="600">
 </p>

 * Run the app. This builds and launches the app on the phone.
@@ -82,13 +82,13 @@ For more details integrating and Running ExecuTorch on Apple Platforms, checkout
 If the app successfully run on your device, you should see something like below:

 <p align="center">
-<img src="./docs/screenshots/ios_demo_app.jpg" alt="iOS LLaMA App" width="300">
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/ios_demo_app.jpg" alt="iOS LLaMA App" width="300">
 </p>

 For Llava 1.5 models, you can select and image (via image/camera selector button) before typing prompt and send button.

 <p align="center">
-<img src="./docs/screenshots/ios_demo_app_llava.jpg" alt="iOS LLaMA App" width="300">
+<img src="https://github.com/pytorch/executorch/blob/main/docs/source/_static/img/ios_demo_app_llava.jpg" alt="iOS LLaMA App" width="300">
 </p>

 ## Reporting Issues

0 commit comments