Replies: 1 comment
-
You should be able to use all available quantization strategies on a 4090.
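For a rough sense of what quantization saves, here is a back-of-the-envelope sketch of the memory needed for the weights alone of a 7B-parameter model at different precisions. The helper name `weight_memory_gib` is made up for illustration, and the estimate ignores the KV cache, activations, and any per-layer overhead a quantization scheme adds, so actual usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Estimate memory for model weights only, in GiB.

    Hypothetical helper for illustration; it ignores KV cache,
    activations, and quantization metadata.
    """
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 7e9  # a "7B" model, taken as exactly 7 billion parameters

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

At fp16 the weights alone come to about 13 GiB, so VRAM usage well above that points to memory taken by something other than the weights. Which component that is depends on the server configuration, which this estimate does not cover.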
-
As the title says: the docs list support for A100, A10G, and T4. I'd like to know whether TGI can run normally on a single RTX 4090. The Docker container starts successfully, but VRAM consumption is extremely high (a 7B model takes 23 GB of VRAM). I don't know whether bitsandbytes quantization can run on this hardware. If there is a better solution, please let me know. Thanks.