-
Hi,
Please note that I don't know what parameters I should use to get good performance.
-
What you did looks correct; I think it is just that your GPU is very old, slow, and has very little VRAM. You can try offloading a few more layers to it.
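As a rough sketch of what that could look like (the model path here is hypothetical, and the exact flags depend on your llama.cpp version and how it was built):

./main -m ./models/your-model-q4_0.bin --n-gpu-layers 10 -p "Hello"

Each offloaded layer shifts more work to the GPU but also consumes more VRAM, so on a 4 GB card increase the value gradually.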
-
...128-bit, DDR5, 80 GB/s, 4 GB - I don't know what you expect from such a card... my RAM is faster ;)
system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | <- your CPU even lacks AVX
llama_model_load_internal: total VRAM used: 550 MB <- you only used 550 MB of VRAM; you can try --n-gpu-layers 10 or even 20
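One way to check whether a higher layer count still fits (model path hypothetical; the exact log wording can differ between llama.cpp versions) is to rerun and watch the VRAM line in the load log:

./main -m ./models/your-model-q4_0.bin --n-gpu-layers 20 -p "test" 2>&1 | grep -i vram

If the reported total VRAM used stays comfortably below 4 GB, there is room to offload more layers; if loading fails, step the value back down.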
-
I can also add that I have tested the same model under Ubuntu 22.04 (via WSL on Windows 10 Pro) with CPU-only support and compared it to a CPU-only Release build done in Visual Studio 2022... performance is 10 times worse on native Windows than with the Linux build via WSL. So it seems that testing LLAMA2 under Windows does not make much sense.
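To make that comparison apples-to-apples, one option (assuming both builds print the usual llama.cpp timing summary at the end of a run) is to execute the identical command in WSL and in the native Windows build, for example:

./main -m ./models/your-model-q4_0.bin -t 4 -n 128 -p "Benchmark prompt"

and then compare the llama_print_timings lines (prompt eval time and eval time, in ms per token). Keeping -t, -n, and the prompt identical rules out thread-count and prompt-length differences between the two runs.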
-
It is slower under Windows because, for some reason, your Windows build is not using AVX2 CPU instructions.
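If that is the case, a rebuild with AVX2 explicitly enabled might help; a minimal sketch, assuming a llama.cpp checkout where the CMake options are named LLAMA_AVX2 and LLAMA_FMA (other versions may use different option names):

cmake -B build -DLLAMA_AVX2=ON -DLLAMA_FMA=ON
cmake --build build --config Release

After rebuilding, the system_info line printed at startup should report AVX2 = 1 instead of 0; note that AVX2 only helps if the CPU actually supports it, which the system_info output quoted earlier suggests may not be the case here.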