examples/models/llava/README.md
model) for general-purpose visual and language understanding, achieving
impressive chat capabilities that mimic the spirit of cutting-edge multimodal
models and set a high bar for accuracy on Science QA.

## Instructions to run Llava on Android/iOS

First you need to generate a .PTE file for the model, along with an input image
and other artifacts. Then you need either a C++ runner or an Android/iOS
application to test things out on device.

### Host machine requirements

The biggest requirement is a host machine with at least 32GiB of memory, preferably 64GiB.

The model weights are 15GiB, and the export stage (`export_llava`) uses roughly another 10GiB of memory, so you need at least 25GiB of free memory to run the export script.
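
If you are unsure how much memory your machine has, a quick check like the sketch below can help. This is an optional convenience, and it assumes the third-party `psutil` package, which is not part of this example's requirements:

```python
# Optional host-memory sanity check. Assumes `psutil` is installed
# (pip install psutil); it is not a dependency of this example.
import psutil

total_gib = psutil.virtual_memory().total / (1024**3)
print(f"Total host memory: {total_gib:.1f} GiB")
if total_gib < 32:
    print("Warning: exporting Llava may exhaust memory on this machine.")
```
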
### Generate ExecuTorch .PTE and other artifacts
Run the following command to generate `llava.pte`, `tokenizer.bin` and an image
tensor (serialized in TorchScript) `image.pt`.
> **Warning**: The C++ runner `llava_main` binary cannot process raw image inputs such as JPEG, PNG, or BMP files directly. You must convert these images to a `.pt` file format using the `examples/models/llava/image_util.py` script before using them with `llava_main`.
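
For reference, the `.pt` file is just the image tensor serialized as a TorchScript module so the C++ runner can load it. The sketch below is an illustrative approximation, not the actual implementation of `image_util.py`: the `ImageContainer` class name and the `my_photo.jpg` path are made up, and the real script may apply model-specific preprocessing, so prefer the provided script in practice.

```python
# Illustrative sketch: serialize an image tensor as a TorchScript module.
# `ImageContainer` and "my_photo.jpg" are placeholder names; use
# examples/models/llava/image_util.py for the real conversion.
import torch
from PIL import Image
from torchvision.transforms.functional import pil_to_tensor


class ImageContainer(torch.nn.Module):
    """Holds an image tensor so it can be saved and loaded as TorchScript."""

    def __init__(self, image: torch.Tensor):
        super().__init__()
        self.register_buffer("image", image)

    def forward(self) -> torch.Tensor:
        return self.image


image = pil_to_tensor(Image.open("my_photo.jpg").convert("RGB"))  # uint8, [3, H, W]
torch.jit.script(ImageContainer(image)).save("image.pt")
```
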
Prerequisite: run `install_executorch.sh` to install ExecuTorch and run
`examples/models/llava/install_requirements.sh` to install dependencies.