
Additional KL-divergence statistics #5081


Merged: ikawrakow merged 3 commits from ik/kl-divergence-2 into master on Jan 23, 2024

Conversation

ikawrakow (Contributor) commented:

I have added:

  • Probability for quantized model predicting the same top token as the base model
  • KL-divergence median
  • KL-divergence min/max
  • KL-divergence 1%, 5%, 95%, 99%

I'll keep it as a draft PR for a bit.
Please let me know what additional statistics or information you would like to see.

Here is an example of what you currently get:

chunk        PPL          ln(PPL(Q)/PPL(base))          KL-Divergence           Same top
   1        3.9734      -0.00651 ±    0.01162       0.01145 ±    0.00153    0.96078 ± 0.01218
   2        4.5052       0.00239 ±    0.00794       0.01388 ±    0.00164    0.95686 ± 0.00901
   3        5.3057       0.00175 ±    0.00625       0.01315 ±    0.00118    0.95163 ± 0.00776
   4        6.0247       0.00473 ±    0.00524       0.01236 ±    0.00092    0.95196 ± 0.00670
...
 638        5.7483       0.00903 ±    0.00039       0.01062 ±    0.00006    0.95254 ± 0.00053
 639        5.7520       0.00904 ±    0.00039       0.01062 ±    0.00006    0.95252 ± 0.00053
 640        5.7581       0.00905 ±    0.00039       0.01062 ±    0.00006    0.95254 ± 0.00053
 641        5.7486       0.00904 ±    0.00039       0.01062 ±    0.00006    0.95259 ± 0.00053
 642        5.7427       0.00901 ±    0.00039       0.01062 ±    0.00006    0.95261 ± 0.00053

===== KL-divergence statistics
Average:   0.010623 ±   0.000065
Median :   0.005517
Minimum:  -0.000008
Maximum:   2.462638
KLD_01 :   0.000007
KLD_99 :   0.088446
KLD_05 :   0.000046
KLD_95 :   0.034547

llama_print_timings:        load time =     699.00 ms
llama_print_timings:      sample time =       0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
llama_print_timings: prompt eval time =   62910.38 ms / 328704 tokens (    0.19 ms per token,  5224.96 tokens per second)
llama_print_timings:        eval time =       0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
llama_print_timings:       total time =   66734.41 ms / 328705 tokens
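
For readers who want to see how these quantities are obtained, here is a minimal sketch, assuming the base and quantized logits for one token position are available as plain vectors. It is an illustration only, not the actual perplexity.cpp implementation; the softmax_probs helper and the function names are made up for this example.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over raw logits (hypothetical helper, not from perplexity.cpp).
static std::vector<double> softmax_probs(const std::vector<float> & logits) {
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> probs(logits.size());
    double sum = 0.0;
    for (size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp((double)logits[i] - (double)max_logit);
        sum += probs[i];
    }
    for (auto & p : probs) p /= sum;
    return probs;
}

// Per-token KL divergence D_KL(P_base || P_quant), plus a flag telling
// whether both models put the highest probability on the same token.
static double token_kl_divergence(const std::vector<float> & base_logits,
                                  const std::vector<float> & quant_logits,
                                  bool & same_top) {
    const auto p = softmax_probs(base_logits);   // base (unquantized) model
    const auto q = softmax_probs(quant_logits);  // quantized model
    double kld = 0.0;
    for (size_t i = 0; i < p.size(); ++i) {
        if (p[i] > 0.0) kld += p[i] * std::log(p[i] / q[i]);
    }
    same_top = (std::max_element(p.begin(), p.end()) - p.begin()) ==
               (std::max_element(q.begin(), q.end()) - q.begin());
    return kld;
}

Averaging the per-token KL divergence over all evaluated tokens gives the Average line above; sorting the per-token values gives the median, minimum/maximum, and percentile lines.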


std::sort(kld_values.begin(), kld_values.end());

printf("===== KL-divergence statistics\n");
A collaborator commented:
Suggested change
printf("===== KL-divergence statistics\n");
printf("===== KL-divergence statistics =====\n");

Was this intentional? I usually see prints like this.

Comment on lines 1763 to 1768
const int n_1percent = nearest_int(0.01f*kld_values.size());
printf("KLD_01 : %10.6f\n", kld_values[n_1percent]);
printf("KLD_99 : %10.6f\n", kld_values[kld_values.size()-1-n_1percent]);
const int n_5percent = nearest_int(0.05f*kld_values.size());
printf("KLD_05 : %10.6f\n", kld_values[n_5percent]);
printf("KLD_95 : %10.6f\n", kld_values[kld_values.size()-1-n_5percent]);
A collaborator commented:
This is not the correct way to calculate percentiles (due to integer rounding errors), but if the number of tokens is >> 100 it should be fine.
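
For reference, a common alternative is to interpolate linearly between adjacent ranks of the sorted values. The sketch below assumes kld_values is already sorted ascending, as in the snippet above; it is an illustration, not part of the PR.

#include <algorithm>
#include <cstddef>
#include <vector>

// Percentile with linear interpolation between adjacent ranks of an
// ascending-sorted vector. q is the quantile in [0, 1].
static float percentile(const std::vector<float> & sorted, float q) {
    if (sorted.empty()) return 0.0f;
    const float  pos  = q * (sorted.size() - 1);
    const size_t lo   = (size_t) pos;
    const size_t hi   = std::min(lo + 1, sorted.size() - 1);
    const float  frac = pos - (float) lo;
    return sorted[lo] * (1.0f - frac) + sorted[hi] * frac;
}

// Example usage with the sorted kld_values from above:
// printf("KLD_01 : %10.6f\n", percentile(kld_values, 0.01f));
// printf("KLD_99 : %10.6f\n", percentile(kld_values, 0.99f));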

@ikawrakow ikawrakow marked this pull request as ready for review January 23, 2024 13:16
@ikawrakow ikawrakow merged commit 44879ee into master Jan 23, 2024
@ikawrakow ikawrakow deleted the ik/kl-divergence-2 branch January 23, 2024 13:17
kalomaze (Contributor) commented on Jan 23, 2024:

I am unable to get it to print anything. It simply exits without evaluating.

perplexity.exe -m "C:\Users\Kalo\Downloads\Toppy7b\ggml-model-Q8_0.gguf" --kl-divergence-base "C:\Users\Kalo\Downloads\calibration\8k_kldiv_toppyfp16_v2.dat" --kl-divergence


Regenerating it with or without GPU layers did not resolve the issue.

Removing --kl-divergence causes:

the data file you provided tokenizes to only 1 tokens

This issue may be Windows specific.

EDIT: On WSL I get a segfault.

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Feb 3, 2024
* perplexity: add top-token probability

* perplexity: add additional KL-divergence statistics

* perplexity: a better organized KL-divergence statistics output

---------

Co-authored-by: Iwan Kawrakow <[email protected]>
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* perplexity: add top-token probability

* perplexity: add additional KL-divergence statistics

* perplexity: a better organized KL-divergence statistics output

---------

Co-authored-by: Iwan Kawrakow <[email protected]>