Commit bbf0f9b

winskuo-quic authored and YIWENX14 committed
Qualcomm AI Engine Direct - Update qaihub documentation (#7930)
Improve qaihub documentation
1 parent 9fee127 commit bbf0f9b

3 files changed: +16 -7 lines changed

examples/qualcomm/README.md

Lines changed: 5 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -10,9 +10,9 @@ We have seperated the example scripts into the following subfolders, please refe
1010
For example, [llama2](./oss_scripts/llama2/qnn_llama_runner.cpp) contains not only the python scripts to prepare the model but also a customized runner for executing the model.
1111

1212
3. qaihub_scripts: QAIHub stands for [Qualcomm AI Hub](https://aihub.qualcomm.com/). On QAIHub, users can find pre-compiled context binaries, a format used by QNN to save its models. This provides users with a new option for model deployment. Different from oss_scripts & scripts, which the example scripts are converting a model from nn.Module to ExecuTorch .pte files, qaihub_scripts provides example scripts for converting pre-compiled context binaries to ExecuTorch .pte files. Additionaly, users can find customized example runners specific to the QAIHub models for execution. For example [qaihub_llama2_7b](./qaihub_scripts/llama2/qaihub_llama2_7b.py) is a script converting context binaries to ExecuTorch .pte files, and [qaihub_llama2_7b_runner](./qaihub_scripts/llama2/qaihub_llama2_7b_runner.cpp) is a customized example runner to execute llama2 .pte files. Please be aware that context-binaries downloaded from QAIHub are tied to a specific QNN SDK version.
13-
Before executing the scripts and runner, please ensure that you are using the QNN SDK version that is matching the context binary. Tutorial below will also cover how to check the QNN Version for a context binary.
13+
Before executing the scripts and runner, please ensure that you are using the QNN SDK version that is matching the context binary. Please refer to [Check context binary version](#check-context-binary-version) for tutorial on how to check the QNN Version for a context binary.
1414

15-
4. scripts: This folder contains scripts to build models provided by executorch.
15+
4. scripts: This folder contains scripts to build models provided by Executorch.
1616

1717

1818

@@ -62,12 +62,13 @@ python deeplab_v3.py -s <device_serial> -m "SM8550" -b path/to/build-android/ --
6262
```
6363

6464
#### Check context binary version
65+
This is typically useful when users want to run any models under `qaihub_scripts`. When users retrieve context binaries from Qualcomm AI Hub, we need to ensure the QNN SDK used to run the `qaihub_scripts` is the same version as the QNN SDK that Qualcomm AI Hub used to compile the context binaries. To do so, please run the following script to retrieve the JSON file that contains the metadata about the context binary:
6566
```bash
6667
cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang
6768
./qnn-context-binary-utility --context_binary ${PATH_TO_CONTEXT_BINARY} --json_file ${OUTPUT_JSON_NAME}
6869
```
69-
After retreiving the json file, search in the json file for the field "buildId" and ensure it matches the ${QNN_SDK_ROOT} you are using for the environment variable.
70-
If you run into the following error, that means the ${QNN_SDK_ROOT} that you are using is older than the context binary QNN SDK version. In this case, please download a newer QNN SDK version.
70+
After retrieving the json file, search in the json file for the field "buildId" and ensure it matches the `${QNN_SDK_ROOT}` you are using for the environment variable.
71+
If you run into the following error, that means the ${QNN_SDK_ROOT} that you are using is older than the context binary's QNN SDK version. In this case, please download a newer QNN SDK version.
7172
```
7273
Error: Failed to get context binary info.
7374
```
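
The "search the json file for `buildId`" step above can also be scripted. The following is a minimal sketch, not part of the QNN SDK: `find_key` and `read_build_ids` are hypothetical helper names, and because the layout of the generated JSON is not described here, the sketch assumes nothing about it and simply walks the whole structure looking for the key.

```python
import json
from typing import Any, Iterator, List


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in a nested JSON structure."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # Keep descending: the key may also appear deeper in the tree.
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def read_build_ids(json_path: str) -> List[Any]:
    """Load the JSON written by qnn-context-binary-utility and collect all "buildId" values."""
    with open(json_path, encoding="utf-8") as f:
        return list(find_key(json.load(f), "buildId"))
```

Calling `read_build_ids` with the path passed as `${OUTPUT_JSON_NAME}` returns the `buildId` values found in the file, which can then be compared by eye against the QNN SDK version that `${QNN_SDK_ROOT}` points to. The sketch deliberately does no automatic comparison, since the exact format of the `buildId` string is not specified here.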

examples/qualcomm/qaihub_scripts/llama/README.md

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,10 @@ Note that the pre-compiled context binaries could not be futher fine-tuned for o
2424
python -m examples.models.llama.tokenizer.tokenizer -t tokenizer.model -o tokenizer.bin
2525
```
2626

27-
#### Step3: Run default examples
27+
#### Step3: Verify context binary's version
28+
Please refer to [Check context binary version](../../README.md#check-context-binary-version) for more info on why and how to verify the context binary's version
29+
30+
#### Step4: Run default examples
2831
```bash
2932
# AIHUB_CONTEXT_BINARIES: ${PATH_TO_AIHUB_WORKSPACE}/build/llama_v2_7b_chat_quantized
3033
python examples/qualcomm/qaihub_scripts/llama/llama2/qaihub_llama2_7b.py -b build-android -s ${SERIAL_NUM} -m ${SOC_MODEL} --context_binaries ${AIHUB_CONTEXT_BINARIES} --tokenizer_bin tokenizer.bin --prompt "What is Python?"
@@ -44,8 +47,10 @@ Note that the pre-compiled context binaries could not be futher fine-tuned for o
4447
2. Follow instructions in https://huggingface.co/qualcomm/Llama-v3-8B-Chat to export context binaries (will take some time to finish)
4548
3. For Llama 3 tokenizer, please refer to https://github.com/meta-llama/llama-models/blob/main/README.md for further instructions on how to download tokenizer.model.
4649

50+
#### Step3: Verify context binary's version
51+
Please refer to [Check context binary version](../../README.md#check-context-binary-version) for more info on why and how to verify the context binary's version
4752

48-
#### Step3: Run default examples
53+
#### Step4: Run default examples
4954
```bash
5055
# AIHUB_CONTEXT_BINARIES: ${PATH_TO_AIHUB_WORKSPACE}/build/llama_v3_8b_chat_quantized
5156
python examples/qualcomm/qaihub_scripts/llama/llama3/qaihub_llama3_8b.py -b build-android -s ${SERIAL_NUM} -m ${SOC_MODEL} --context_binaries ${AIHUB_CONTEXT_BINARIES} --tokenizer_model tokenizer.model --prompt "What is baseball?"

examples/qualcomm/qaihub_scripts/stable_diffusion/README.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,10 @@ We have verified the code with `diffusers`==0.29.0 and `piq`==0.8.0. Please foll
2626
sh examples/qualcomm/qaihub_scripts/stable_diffusion/install_requirements.sh
2727
```
2828

29-
#### Step4: Run default example
29+
#### Step4: Verify context binary's version
30+
Please refer to [Check context binary version](../../README.md#check-context-binary-version) for more info on why and how to verify the context binary's version
31+
32+
#### Step5: Run default example
3033
In this example, we execute the script for 20 time steps with the `prompt` 'a photo of an astronaut riding a horse on mars':
3134
```bash
3235
python examples/qualcomm/qaihub_scripts/stable_diffusion/qaihub_stable_diffusion.py -b build-android -m ${SOC_MODEL} --s ${SERIAL_NUM} --text_encoder_bin ${PATH_TO_TEXT_ENCODER_CONTEXT_BINARY} --unet_bin ${PATH_TO_UNET_CONTEXT_BINARY} --vae_bin ${PATH_TO_VAE_CONTEXT_BINARY} --vocab_json ${PATH_TO_VOCAB_JSON_FILE} --num_time_steps 20 --prompt "a photo of an astronaut riding a horse on mars"
