Replies: 3 comments 3 replies
-
How do the high-quality non-k-quants feel? Like Q5_1, Q8_0? Maybe you just like LLaMAs that are a bit buzzed.
-
Hmm, in my subjective experience Q4_0 performed best among the old quant methods anyway, even though it has higher perplexity than Q5 and Q8; no idea why that is. I haven't made many comparisons with the new method yet, but subjectively I have the impression that Q4_K_M is about the same as, or even marginally better than, Q4_0.
-
For the first time I am switching from Q5_1 to Q5_K_M, I think it's called. Does anyone agree or disagree that Q5_K_M should be better than Q5_1? I care mostly about accuracy/perplexity. I heard that Q5_K_M should give better output than the old Q5_1, and that it should also be faster at generation. Now that people have had more time to test this subjectively, any opinions on the new quant methods?
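In case anyone wants to measure this instead of going by feel, here is a minimal sketch of how I'd compare the two formats with llama.cpp's own tools, assuming a checkout with the `quantize` and `perplexity` binaries built, an f16 base model on disk, and the usual wikitext-2 test file; all paths below are placeholders:

```python
import subprocess

# Hypothetical paths -- adjust to your own model and llama.cpp build.
F16_MODEL = "models/13B/ggml-model-f16.bin"
TEST_FILE = "wikitext-2-raw/wiki.test.raw"

for quant in ("Q5_1", "Q5_K_M"):
    out = f"models/13B/ggml-model-{quant.lower()}.bin"
    # Quantize the f16 base model to the target format.
    subprocess.run(["./quantize", F16_MODEL, out, quant], check=True)
    # Run the perplexity tool on the quantized file; lower is better.
    subprocess.run(["./perplexity", "-m", out, "-f", TEST_FILE], check=True)
```

Generation speed I'd just compare by timing a short generation run with each file.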
-
So I can see from the perplexity scores that the k-quants generally seem to be more size-efficient than the earlier formats. But I want to see if anyone else agrees: outputs from Q4_K_M and Q4_K_S actually feel worse than Q4_0 subjectively, despite their better perplexity scores. I don't know if I'm imagining it, but responses from Q4_0 just... feel better? Just want to hear some anecdotes on this.
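For context, since the whole question here is perplexity vs. feel: perplexity is just the exponential of the average negative log-likelihood the model assigns to a held-out test text, so a lower score means slightly better next-token prediction on that text, not necessarily nicer-reading completions. A toy sketch of the metric (not llama.cpp's implementation):

```python
import math

def perplexity(token_logprobs):
    # exp of the average negative log-likelihood over the evaluated tokens
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Toy example: a model that assigns every token a probability of 0.25
print(perplexity([math.log(0.25)] * 8))  # 4.0
```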