
Commit c611bfd

Riandy authored and facebook-github-bot committed
Update iOS Llama demo app readme docs (#5359)
Summary: Pull Request resolved: #5359

- Revamp and standardize the readme docs structure for better clarity.
- Add separate delegate instructions for iOS.
- Add screenshots to improve readme clarity.

Differential Revision: D62660106
1 parent bfce743 commit c611bfd

File tree: 8 files changed, +256 -29 lines changed

Lines changed: 51 additions & 29 deletions (demo app README, examples/demo-apps/apple_ios/LLaMA)
@@ -1,52 +1,74 @@
-# Building ExecuTorch LLaMA iOS Demo App
+# ExecuTorch Llama iOS Demo App

-This app demonstrates the use of the LLaMA chat app demonstrating local inference use case with ExecuTorch.
+We’re excited to share that the newly revamped iOS demo app is live and includes many new updates to provide a more intuitive and smoother user experience with a chat use case! The primary goal of this app is to showcase how easily ExecuTorch can be integrated into an iOS demo app and how to exercise the many features ExecuTorch and Llama models have to offer.

-## Prerequisites
-* [Xcode 15](https://developer.apple.com/xcode)
-* [iOS 17 SDK](https://developer.apple.com/ios)
-* Set up your ExecuTorch repo and environment if you haven’t done so by following the [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup) to set up the repo and dev environment:
+This app serves as a valuable resource to inspire your creativity and provide foundational code that you can customize and adapt for your particular use case.

-```bash
-git clone -b release/0.2 https://github.com/pytorch/executorch.git
-cd executorch
-git submodule update --init
+Please dive in and start exploring our demo app today! We look forward to any feedback and are excited to see your innovative ideas.

-python3 -m venv .venv && source .venv/bin/activate
+## Key Concepts
+From this demo app, you will learn many key concepts such as:
+* How to prepare Llama models, build the ExecuTorch library, and perform model inference across delegates
+* How to expose the ExecuTorch library via Swift Package Manager
+* The current ExecuTorch app-facing capabilities

-./install_requirements.sh
-```
+The goal is for you to see the type of support ExecuTorch provides and feel comfortable with leveraging it for your use cases.
+
+## Supported Models

-## Exporting models
-Please refer to the [ExecuTorch Llama2 docs](https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md) to export the model.
+As a whole, the models this app supports are (support varies by delegate):
+* Llama 3.1 8B
+* Llama 3 8B
+* Llama 2 7B
+* Llava 1.5 (XNNPACK only)

-## Run the App
+## Building the application
+First, it’s important to note that ExecuTorch currently provides support across several delegates. Once you identify the delegate of your choice, follow the README link for complete end-to-end instructions: setting up the environment, exporting the models, and building the ExecuTorch libraries and the app to run on device:

-1. Open the [project](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/LLaMA.xcodeproj) in Xcode.
-2. Run the app (cmd+R).
-3. In app UI pick a model and tokenizer to use, type a prompt and tap the arrow buton
+| Delegate                       | Resource                                 |
+| ------------------------------ | ---------------------------------------- |
+| XNNPACK (CPU-based library)    | [link](docs/delegates/xnnpack_README.md) |
+| MPS (Metal Performance Shader) | [link](docs/delegates/mps_README.md)     |
+
+## How to Use the App
+This section provides the main steps to use the app, along with a code snippet of the ExecuTorch API.

 ```{note}
 ExecuTorch runtime is distributed as a Swift package providing some .xcframework as prebuilt binary targets.
-Xcode will dowload and cache the package on the first run, which will take some time.
+Xcode will download and cache the package on the first run, which will take some time.
 ```

+* Open Xcode and select "Open an existing project" to open `examples/demo-apps/apple_ios/LLaMA`.
+* Ensure that the ExecuTorch package dependencies are installed correctly.
+* Run the app. This builds and launches the app on the phone.
+* In the app UI, pick a model and tokenizer to use, type a prompt, and tap the arrow button.
+
 ## Copy the model to Simulator

-1. Drag&drop the model and tokenizer files onto the Simulator window and save them somewhere inside the iLLaMA folder.
-2. Pick the files in the app dialog, type a prompt and click the arrow-up button.
+* Drag and drop the model and tokenizer files onto the Simulator window and save them somewhere inside the iLLaMA folder.
+* Pick the files in the app dialog, type a prompt, and click the arrow-up button.

 ## Copy the model to Device

-1. Wire-connect the device and open the contents in Finder.
-2. Navigate to the Files tab and drag&drop the model and tokenizer files onto the iLLaMA folder.
-3. Wait until the files are copied.
+* Wire-connect the device and open its contents in Finder.
+* Navigate to the Files tab and drag and drop the model and tokenizer files onto the iLLaMA folder.
+* Wait until the files are copied.
+
+If the app runs successfully on your device, you should see something like the following:
+
+<p align="center">
+<img src="./docs/screenshots/ios_demo_app.jpg" alt="iOS LLaMA App" width="300">
+</p>
+
+For Llava 1.5 models, you can select an image (via the image/camera selector button) before typing the prompt and tapping the send button.

-Click the image below to see it in action!
+Note:
+* To try out the Llava 1.5 model on iOS, you will need to pull [PR 5167](https://github.com/pytorch/executorch/pull/5167) and apply the changes on top of your current code. This step will no longer be needed once the PR is merged.

-<a href="https://pytorch.org/executorch/main/_static/img/llama_ios_app.mp4">
-<img src="https://pytorch.org/executorch/main/_static/img/llama_ios_app.png" width="600" alt="iOS app running a LlaMA model">
-</a>
+<p align="center">
+<img src="./docs/screenshots/ios_demo_app_llava.jpg" alt="iOS LLaMA App" width="300">
+</p>

 ## Reporting Issues
 If you encounter any bugs or issues following this tutorial, please file an issue on [GitHub](https://github.com/pytorch/executorch/issues/new).
Lines changed: 98 additions & 0 deletions (new file: docs/delegates/mps_README.md)
@@ -0,0 +1,98 @@
# Building Llama iOS Demo for MPS Backend

This tutorial covers the end-to-end workflow for building an iOS demo app using the MPS backend on device.
More specifically, it covers:
1. Export and quantization of Llama models against the MPS backend.
2. Building and linking the libraries required for on-device inference on the iOS platform using MPS.
3. Building the iOS demo app itself.

## Prerequisites
* [Xcode 15](https://developer.apple.com/xcode)
* [iOS 18 SDK](https://developer.apple.com/ios)
* Set up your ExecuTorch repo and environment if you haven’t already by following [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup) to set up the repo and dev environment.

## Setup ExecuTorch
In this section, we will set up the ExecuTorch repo with Conda environment management. Make sure you have Conda available on your system (or follow the instructions to install it [here](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)). The commands below were run on Linux (CentOS).

Create a Conda environment:

```
conda create -n et_mps python=3.10.0
conda activate et_mps
```

Check out the ExecuTorch repo and sync submodules:

```
git clone https://github.com/pytorch/executorch.git
cd executorch
git submodule sync
git submodule update --init
```

Install dependencies:

```
./install_requirements.sh
```

## Prepare Models
In this demo app, we support text-only inference with Llama 3.1, Llama 3, and Llama 2 models.

Install the required packages:

```
executorch/examples/models/llama2/install_requirements.sh
```

Export the model:
```
python -m examples.models.llama2.export_llama --checkpoint "${MODEL_DIR}/consolidated.00.pth" --params "${MODEL_DIR}/params.json" -kv --use_sdpa_with_kv_cache --mps -d fp32 --disable_dynamic_shape -qmode 8da4w -G 32
```
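
The export command above assumes `MODEL_DIR` points at the directory holding the downloaded Llama checkpoint and params files. A minimal sketch of that setup (the path is illustrative, not part of the original instructions):

```bash
# Hypothetical location of the downloaded Llama weights; adjust to your setup.
export MODEL_DIR="$HOME/llama/llama-2-7b"

# The export command expects these two files to exist.
ls "${MODEL_DIR}/consolidated.00.pth" "${MODEL_DIR}/params.json"
```

The export step writes a `.pte` model file (named via `--output_name`, or a default name otherwise), which is what you later copy to the Simulator or device together with the tokenizer.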

## Pushing Model and Tokenizer

### Copy the model to Simulator
* Drag and drop the model and tokenizer files onto the Simulator window and save them somewhere inside the iLLaMA folder.
* Pick the files in the app dialog, type a prompt, and click the arrow-up button.

### Copy the model to Device
* Wire-connect the device and open its contents in Finder.
* Navigate to the Files tab and drag and drop the model and tokenizer files onto the iLLaMA folder.
* Wait until the files are copied.

## Configure the Xcode Project

### Install CMake
Download and open the macOS .dmg installer from https://cmake.org/download and move the CMake app to the /Applications folder.
Install the CMake command line tools:

```
sudo /Applications/CMake.app/Contents/bin/cmake-gui --install
```
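
After `--install`, the CMake command line tools should be on your PATH (by default via symlinks in /usr/local/bin). A quick sanity check:

```bash
# Confirm the CMake CLI is available and report its version.
which cmake
cmake --version
```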

### Swift Package Manager
The prebuilt ExecuTorch runtime, backend, and kernels are available as a Swift PM package.

### Xcode
Open the project in Xcode. In Xcode, go to `File > Add Package Dependencies`. Paste the URL of the ExecuTorch repo into the search bar and select it. Make sure to change the branch name to the desired ExecuTorch version, e.g., “0.3.0”, or just use the “latest” branch name for the latest stable build.

<p align="center">
<img src="../screenshots/ios_demo_app_swift_pm.png" alt="iOS LLaMA App Swift PM" width="600">
</p>

Then select which ExecuTorch framework should link against which target.

<p align="center">
<img src="../screenshots/ios_demo_app_choosing_package.png" alt="iOS LLaMA App Choosing package" width="600">
</p>

Click “Run” to build the app and run it on your iPhone. If the app runs successfully on your device, you should see something like the following:

<p align="center">
<img src="../screenshots/ios_demo_app_mps.jpg" alt="iOS LLaMA App mps" width="300">
</p>

## Reporting Issues
If you encounter any bugs or issues following this tutorial, please file an issue on [GitHub](https://github.com/pytorch/executorch/issues/new).
Lines changed: 107 additions & 0 deletions (new file: docs/delegates/xnnpack_README.md)
@@ -0,0 +1,107 @@
# Building Llama iOS Demo for XNNPACK Backend

This tutorial covers the end-to-end workflow for building an iOS demo app using the XNNPACK backend on device.
More specifically, it covers:
1. Export and quantization of Llama models against the XNNPACK backend.
2. Building and linking the libraries required for on-device inference on the iOS platform using XNNPACK.
3. Building the iOS demo app itself.

## Prerequisites
* [Xcode 15](https://developer.apple.com/xcode)
* [iOS 17 SDK](https://developer.apple.com/ios)
* Set up your ExecuTorch repo and environment if you haven’t already by following [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup) to set up the repo and dev environment.

## Setup ExecuTorch
In this section, we will set up the ExecuTorch repo with Conda environment management. Make sure you have Conda available on your system (or follow the instructions to install it [here](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)). The commands below were run on Linux (CentOS).

Create a Conda environment:

```
conda create -n et_xnnpack python=3.10.0
conda activate et_xnnpack
```

Check out the ExecuTorch repo and sync submodules:

```
git clone https://github.com/pytorch/executorch.git
cd executorch
git submodule sync
git submodule update --init
```

Install dependencies:

```
./install_requirements.sh
```

## Prepare Models
In this demo app, we support text-only inference with up-to-date Llama models.

Install the required packages:

```
executorch/examples/models/llama2/install_requirements.sh
```

Export the model:
```
python -m examples.models.llama2.export_llama --checkpoint <consolidated.00.pth> -p <params.json> -kv --use_sdpa_with_kv_cache -X -qmode 8da4w --group_size 128 -d fp32 --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}' --embedding-quantize 4,32 --output_name="llama3_kv_sdpa_xnn_qe_4_32.pte"
```
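
For reference, here is a filled-in version of the same export command with illustrative paths; the checkpoint and params locations depend on where you downloaded the Llama 3 weights:

```bash
# Hypothetical path to the downloaded Llama 3 8B weights; adjust to your setup.
MODEL_DIR="$HOME/llama/Meta-Llama-3-8B"

python -m examples.models.llama2.export_llama \
  --checkpoint "${MODEL_DIR}/consolidated.00.pth" \
  -p "${MODEL_DIR}/params.json" \
  -kv --use_sdpa_with_kv_cache -X -qmode 8da4w --group_size 128 -d fp32 \
  --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}' \
  --embedding-quantize 4,32 \
  --output_name="llama3_kv_sdpa_xnn_qe_4_32.pte"
```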

## Pushing Model and Tokenizer

### Copy the model to Simulator
* Drag and drop the model and tokenizer files onto the Simulator window and save them somewhere inside the iLLaMA folder.
* Pick the files in the app dialog, type a prompt, and click the arrow-up button.

### Copy the model to Device
* Wire-connect the device and open its contents in Finder.
* Navigate to the Files tab and drag and drop the model and tokenizer files onto the iLLaMA folder.
* Wait until the files are copied.

## Configure the Xcode Project

### Install CMake
Download and open the macOS .dmg installer from https://cmake.org/download and move the CMake app to the /Applications folder.
Install the CMake command line tools:

```
sudo /Applications/CMake.app/Contents/bin/cmake-gui --install
```

### Swift Package Manager
The prebuilt ExecuTorch runtime, backend, and kernels are available as a Swift PM package.

### Xcode
Open the project in Xcode. In Xcode, go to `File > Add Package Dependencies`. Paste the URL of the ExecuTorch repo into the search bar and select it. Make sure to change the branch name to the desired ExecuTorch version, e.g., “0.3.0”, or just use the “latest” branch name for the latest stable build.

<p align="center">
<img src="../screenshots/ios_demo_app_swift_pm.png" alt="iOS LLaMA App Swift PM" width="600">
</p>

Then select which ExecuTorch framework should link against which target.

<p align="center">
<img src="../screenshots/ios_demo_app_choosing_package.png" alt="iOS LLaMA App Choosing package" width="600">
</p>

Click “Run” to build the app and run it on your iPhone. If the app runs successfully on your device, you should see something like the following:

<p align="center">
<img src="../screenshots/ios_demo_app.jpg" alt="iOS LLaMA App" width="300">
</p>

For Llava 1.5 models, you can select an image (via the image/camera selector button) before typing the prompt and tapping the send button.

Note:
* To try out the Llava 1.5 model on iOS, you will need to pull [PR 5167](https://github.com/pytorch/executorch/pull/5167) and apply the changes on top of your current code (see the sketch below for one way to fetch the PR locally). This step will no longer be needed once the PR is merged.
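
One way to pull the PR changes locally, assuming `origin` points at the pytorch/executorch repository (the local branch name is illustrative):

```bash
# Fetch the PR head into a local branch, then merge it onto your current checkout.
git fetch origin pull/5167/head:llava-pr-5167
git merge llava-pr-5167
```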

<p align="center">
<img src="../screenshots/ios_demo_app_llava.jpg" alt="iOS LLaMA App" width="300">
</p>

## Reporting Issues
If you encounter any bugs or issues following this tutorial, please file an issue on [GitHub](https://github.com/pytorch/executorch/issues/new).
