Commit 2347e45

llama : do a warm-up eval at start for better timings (#1824)
1 parent 74d4cfa commit 2347e45

File tree

1 file changed: +7 −0 lines changed


examples/main/main.cpp

Lines changed: 7 additions & 0 deletions
@@ -331,6 +331,13 @@ int main(int argc, char ** argv) {
 
     std::vector<llama_token> embd;
 
+    // do one empty run to warm up the model
+    {
+        const std::vector<llama_token> tmp = { llama_token_bos(), };
+        llama_eval(ctx, tmp.data(), tmp.size(), 0, params.n_threads);
+        llama_reset_timings(ctx);
+    }
+
     while ((n_remain != 0 && !is_antiprompt) || params.interactive) {
         // predict
         if (embd.size() > 0) {

0 commit comments
