
llama2 + CUDA at Windows - how to run main.exe to use GPU resources? #2377

Answered by mirek190
Tarmenale asked this question in Q&A

...128-bit, DDR5, 80 GB/s, 4 GB - I don't know what you expect from such a card ... my RAM is faster ;)

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | <- your CPU even lacks AVX

llama_model_load_internal: total VRAM used: 550 MB <- you used only 550 MB of VRAM; you can try --n-gpu-layers 10 or even 20
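
For reference, an invocation along these lines offloads 20 layers to the GPU (the model path and prompt below are only placeholders; the --n-gpu-layers and -t values are the ones suggested above, so adjust them to your own setup):

main.exe -m models\llama-2-7b.q4_0.bin -p "Hello" -t 4 --n-gpu-layers 20

Watch the "total VRAM used" line it prints and raise --n-gpu-layers step by step while you stay under the card's 4 GB.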
