What are "prompt eval time" and "eval time"? #2260

Answered by abc-nix
Xiang-cd asked this question in Q&A

Edit: found the thread: #1323 (comment)

I could not find the earlier discussion that explains each line printed by llama_print_timings, but my understanding is:

  • load time: time it takes to load the model.
  • sample time: time spent sampling, that is, choosing the next token from the probabilities the model outputs (top-k, top-p, temperature and so on). It is not the time spent tokenizing the prompt.
  • prompt eval time: time it takes to process the tokenized prompt. Without this step the model would have no context from which to predict the next token.
  • eval time: time spent evaluating the model to generate the response tokens, one token at a time. It excludes all prompt processing and only covers the period in which tokens are being generated.
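
To compare runs it helps to turn each timing into a per-token figure, since prompt eval time is spread over the prompt tokens and eval time over the generated tokens. Below is a minimal Python sketch of that arithmetic. The `Phase` class and all the numbers are hypothetical, made up for illustration; they are not taken from llama.cpp or from any real run.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    """One timing phase: total duration and how many tokens it covered.

    Hypothetical helper for illustration only, not part of llama.cpp.
    """
    total_ms: float
    count: int

    def ms_per_token(self) -> float:
        # Average time spent on each token in this phase.
        return self.total_ms / self.count

    def tokens_per_second(self) -> float:
        # Throughput: tokens handled per second in this phase.
        return 1000.0 * self.count / self.total_ms


# Made-up numbers: a 100-token prompt processed in 2 s,
# followed by 50 generated tokens in 5 s.
prompt_eval = Phase(total_ms=2000.0, count=100)
generation = Phase(total_ms=5000.0, count=50)

for name, phase in [("prompt eval", prompt_eval), ("eval", generation)]:
    print(f"{name}: {phase.ms_per_token():.2f} ms/token, "
          f"{phase.tokens_per_second():.2f} tokens/s")
```

With these made-up inputs the prompt is processed much faster per token than the response is generated, which is the usual pattern: the prompt can be evaluated in batches, while generation produces one token at a time.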

I recomme…

SlyEcho Jul 21, 2023
Collaborator

Answer selected by DannyDaemonic

This discussion was converted from issue #2237 on July 18, 2023 15:39.