
Commit ccbb277

cmp-nct and ggerganov authored

llava : update README.md (#5489)

* Update README.md
* Update README.md
* Update examples/llava/README.md

Co-authored-by: Georgi Gerganov <[email protected]>

1 parent 8084d55

File tree: 1 file changed, +42 −4 lines changed

examples/llava/README.md

@@ -1,10 +1,12 @@
# LLaVA

Currently this implementation supports [llava-v1.5](https://huggingface.co/liuhaotian/llava-v1.5-7b) variants,
as well as [llava-v1.6](https://huggingface.co/collections/liuhaotian/llava-16-65b9e40155f60fd046a5ccf2) variants.

The pre-converted [7b](https://huggingface.co/mys/ggml_llava-v1.5-7b)
and [13b](https://huggingface.co/mys/ggml_llava-v1.5-13b)
models are available.
For llava-1.6, a variety of prepared gguf models are available as well, from [7b to 34b](https://huggingface.co/cmp-nct/llava-1.6-gguf).

After the API is confirmed, more models will be supported / uploaded.

@@ -18,6 +20,7 @@ After building, run: `./llava-cli` to see the usage. For example:
```

**note**: A lower temperature like 0.1 is recommended for better quality; add `--temp 0.1` to the command to do so.
**note**: For GPU offloading, make sure to use the `-ngl` flag, just as usual.
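For illustration, a run with GPU offload might look like the sketch below. The gguf file names and the image path are placeholders, and the number of offloaded layers (`-ngl 35`) depends on the model size and your VRAM:

```sh
# Hypothetical invocation; model, projector, and image paths are placeholders.
# -ngl offloads model layers to the GPU, exactly as with the other examples.
./llava-cli -m ../models/llava/ggml-model-q5_k.gguf \
    --mmproj ../models/llava/mmproj-model-f16.gguf \
    --image ./an-image.jpg --temp 0.1 -ngl 35
```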

## LLaVA 1.5

@@ -55,11 +58,46 @@ python ./convert.py ../llava-v1.5-7b

Now both the LLaMA part and the image encoder are in the `llava-v1.5-7b` directory.

## LLaVA 1.6 gguf conversion

1) Back up your pth/safetensor model files, as llava-surgery modifies them.
2) Use `python llava-surgery-v2.py -C -m /path/to/hf-model`, which also supports llava-1.5 variants, pytorch as well as safetensor models:
   - you will find a llava.projector and a llava.clip file in your model directory
3) Copy the llava.clip file into a subdirectory (like vit), rename it to pytorch_model.bin, and add a fitting vit configuration to the directory (for example this [config.json](https://huggingface.co/cmp-nct/llava-1.6-gguf/blob/main/config.json)).
4) Create the visual gguf model: `python ./examples/llava/convert-image-encoder-to-gguf.py -m ../path/to/vit --llava-projector ../path/to/llava.projector --output-dir ../path/to/output --clip_model_is_vision`
   - This is similar to llava-1.5; the difference is that we tell the encoder that we are working with the pure vision model part of CLIP.
5) Everything else is as usual: run convert.py on the hf model and quantize as needed. An end-to-end sketch of these steps follows below.

**note** llava-1.6 needs more context than llava-1.5; at least 3000 is needed (just run it at `-c 4096`).
**note** llava-1.6 greatly benefits from batched prompt processing (the defaults work).
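
Put together, the whole conversion might look like the sketch below. All paths are placeholders for your own setup; the vit config is the one linked in step 3, and the commands are assumed to run from the llama.cpp root:

```sh
# 1) back up the original weights; llava-surgery-v2.py modifies the model files
cp -r /path/to/hf-model /path/to/hf-model-backup

# 2) split out the projector and the vision tower
#    (writes llava.projector and llava.clip into the model directory)
python ./examples/llava/llava-surgery-v2.py -C -m /path/to/hf-model

# 3) stage the vision tower in a vit subdirectory with a fitting config
mkdir -p /path/to/hf-model/vit
cp /path/to/hf-model/llava.clip /path/to/hf-model/vit/pytorch_model.bin
cp /path/to/downloaded/config.json /path/to/hf-model/vit/config.json

# 4) convert the image encoder to gguf
python ./examples/llava/convert-image-encoder-to-gguf.py -m /path/to/hf-model/vit \
    --llava-projector /path/to/hf-model/llava.projector \
    --output-dir /path/to/output --clip_model_is_vision

# 5) convert the language model as usual, then quantize as needed
python ./convert.py /path/to/hf-model
```

The encoder conversion should leave an mmproj gguf in the output directory; pass it to llava-cli via `--mmproj` alongside the converted language model.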

## llava-cli templating and llava-1.6 prompting

llava-1.5 models all use the same vicuna prompt; here you can just add your image question like `-p "Provide a full description."`.
For llava-1.5 models which are not vicuna (mistral and Yi) you need to adapt the system prompt as well as the user prompt; for this purpose llava-cli has a basic templating system:

**For Mistral and using the llava-cli binary:**
Add this: `-p "<image>\nUSER:\nProvide a full description.\nASSISTANT:\n"`
The mistral template for llava-1.6 seems to be no system prompt and a USER/ASSISTANT role.
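
A complete hypothetical invocation for a llava-1.6 mistral model could then look like this (gguf file names are placeholders):

```sh
# Hypothetical run; file names are placeholders, -c 4096 per the context note above.
./llava-cli -m llava-v1.6-mistral-7b.q5_k.gguf \
    --mmproj mmproj-model-f16.gguf \
    --image ./an-image.jpg -c 4096 --temp 0.1 \
    -p "<image>\nUSER:\nProvide a full description.\nASSISTANT:\n"
```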

**For the 34B this should work:**
Add this: `-e -p "<|im_start|>system\nAnswer the questions.<|im_end|><|im_start|>user\n<image>\nProvide a full description.<|im_end|><|im_start|>assistant\n"`
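
And the corresponding sketch for the 34B (again with placeholder file names). The prompt is quoted so the shell does not interpret the `<` and `|` characters, and `-e` makes llava-cli process the `\n` escapes:

```sh
# Hypothetical run; -e enables escape processing for the \n sequences.
./llava-cli -m llava-v1.6-34b.q5_k.gguf \
    --mmproj mmproj-model-f16.gguf \
    --image ./an-image.jpg -c 4096 --temp 0.1 \
    -e -p "<|im_start|>system\nAnswer the questions.<|im_end|><|im_start|>user\n<image>\nProvide a full description.<|im_end|><|im_start|>assistant\n"
```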

## How to know if you are running in llava-1.5 or llava-1.6 mode

When running llava-cli you will see visual information right before the prompt is processed:

**Llava-1.5:**
`encode_image_with_clip: image embedding created: 576 tokens`

**Llava-1.6 (anything above 576):**
`encode_image_with_clip: image embedding created: 2880 tokens`

Alternatively, just pay attention to how many "tokens" have been used for your prompt; it will also be 1000+ tokens for llava-1.6.

## TODO
