Now both the LLaMA part and the image encoder are in the `llava-v1.5-7b` directory.
## LLaVA 1.6 gguf conversion
1) Back up your pth/safetensor model files, as llava-surgery modifies them
2) Use `python llava-surgery-v2.py -C -m /path/to/hf-model`, which also supports llava-1.5 variants, both pytorch and safetensor models:
- you will find a llava.projector and a llava.clip file in your model directory
3) Copy the llava.clip file into a subdirectory (like vit), rename it to pytorch_model.bin and add a fitting vit configuration to the directory (https://huggingface.co/cmp-nct/llava-1.6-gguf/blob/main/config.json)
4) Create the visual gguf model: `python ./examples/llava/convert-image-encoder-to-gguf.py -m vit --llava-projector vit/llava.projector --output-dir vit --clip-model-is-vision`
- This is similar to llava-1.5; the difference is that we tell the encoder that we are working with the pure vision model part of CLIP
5) Everything else as usual: run convert.py on the hf model and quantize as needed (see the sketch after this list for the full sequence of commands)
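
A rough end-to-end sketch of steps 1-5, assuming the model lives in `/path/to/llava-v1.6-model`; the `vit` subdirectory, output filenames and the quantization type are placeholders, not fixed by the steps above:

```sh
# 1) back up the original weights first, llava-surgery-v2.py modifies them
cp -r /path/to/llava-v1.6-model /path/to/llava-v1.6-model.backup

# 2) split out the projector and the CLIP part
python ./examples/llava/llava-surgery-v2.py -C -m /path/to/llava-v1.6-model

# 3) stage the vision tower: llava.clip becomes pytorch_model.bin next to a fitting config.json
mkdir -p /path/to/llava-v1.6-model/vit
cp /path/to/llava-v1.6-model/llava.clip /path/to/llava-v1.6-model/vit/pytorch_model.bin
# copy the config.json linked in step 3 into /path/to/llava-v1.6-model/vit/

# 4) build the image encoder gguf from the pure vision model part of CLIP
python ./examples/llava/convert-image-encoder-to-gguf.py \
    -m /path/to/llava-v1.6-model/vit \
    --llava-projector /path/to/llava-v1.6-model/llava.projector \
    --output-dir /path/to/llava-v1.6-model/vit \
    --clip-model-is-vision

# 5) convert the language model to gguf and quantize as needed
python ./convert.py /path/to/llava-v1.6-model --outtype f16
./quantize /path/to/llava-v1.6-model/ggml-model-f16.gguf /path/to/llava-v1.6-model/ggml-model-Q4_K_M.gguf Q4_K_M
```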
**note** llava-1.6 needs more context than llava-1.5; at least 3000 is needed (just run it at `-c 4096`)
**note** llava-1.6 greatly benefits from batched prompt processing (defaults work)
## llava-cli templating and llava-1.6 prompting
llava-1.5 models all use the same vicuna prompt; here you can just add your image question like `-p "Provide a full description."`
For llava-1.6 models which are not vicuna based (mistral and Yi) you need to adapt the system prompt as well as the user prompt; for this purpose llava-cli has a basic templating system:
**For Mistral and using llava-cli binary:**
Add this: `-p "<image>\nUSER:\nProvide a full description.\nASSISTANT:\n"`
The mistral template for llava-1.6 seems to be no system prompt and a USER/ASSISTANT role
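
Putting it together, a full llava-cli call for a llava-1.6 mistral model could look like the sketch below (model, mmproj and image filenames are placeholders; `-e` is added here so the `\n` escapes in the template are interpreted, and `-c 4096` follows the context note above):

```sh
./llava-cli -m llava-v1.6-mistral-7b.Q5_K_M.gguf \
    --mmproj mmproj-llava-v1.6-mistral-7b-f16.gguf \
    --image some-image.jpg -c 4096 \
    -e -p "<image>\nUSER:\nProvide a full description.\nASSISTANT:\n"
```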
**For the 34B this should work:**
Add this: `-e -p <|im_start|>system\nAnswer the questions.<|im_end|><|im_start|>user\n<image>\nProvide a full description.<|im_end|><|im_start|>assistant\n`
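
For example, with placeholder filenames (the prompt is quoted here so the shell does not interpret the `<` and `|` characters):

```sh
./llava-cli -m llava-v1.6-34b.Q5_K_M.gguf \
    --mmproj mmproj-llava-v1.6-34b-f16.gguf \
    --image some-image.jpg -c 4096 \
    -e -p "<|im_start|>system\nAnswer the questions.<|im_end|><|im_start|>user\n<image>\nProvide a full description.<|im_end|><|im_start|>assistant\n"
```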
## How to know if you are running in llava-1.5 or llava-1.6 mode
When running llava-cli you will see visual information right before the prompt is processed: