Commit 9d03d08

common : add --no-warmup option for main/llama-cli (#8712)
This commit adds a --no-warmup option for llama-cli.

The motivation for this is that it can be convenient to skip the warmup llama_decode call when debugging.

Signed-off-by: Daniel Bevenius <[email protected]>
1 parent bfb4c74 commit 9d03d08

1 file changed, 5 additions and 0 deletions

common/common.cpp

Lines changed: 5 additions & 0 deletions
@@ -1324,6 +1324,10 @@ bool gpt_params_find_arg(int argc, char ** argv, const std::string & arg, gpt_pa
         else { invalid_param = true; }
         return true;
     }
+    if (arg == "--no-warmup") {
+        params.warmup = false;
+        return true;
+    }
 #ifndef LOG_DISABLE_LOGS
     // Parse args for logging parameters
     if (log_param_single_parse(argv[i])) {
@@ -1446,6 +1450,7 @@ void gpt_params_print_usage(int /*argc*/, char ** argv, const gpt_params & param
     options.push_back({ "main infill", " --in-prefix-bos", "prefix BOS to user inputs, preceding the `--in-prefix` string" });
     options.push_back({ "main infill", " --in-prefix STRING", "string to prefix user inputs with (default: empty)" });
     options.push_back({ "main infill", " --in-suffix STRING", "string to suffix after user inputs with (default: empty)" });
+    options.push_back({ "main", " --no-warmup", "skip warming up the model with an empty run" });
     options.push_back({ "server infill",
                         " --spm-infill", "use Suffix/Prefix/Middle pattern for infill (instead of Prefix/Suffix/Middle) as some models prefer this. (default: %s)", params.spm_infill ? "enabled" : "disabled" });
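For illustration only, the flag-handling pattern in the first hunk can be sketched in isolation. This is not the llama.cpp code: `demo_params`, `demo_find_arg`, `demo_parse`, and `demo_startup_decode_calls` are hypothetical names, and the warmup step is reduced to a counter rather than a real `llama_decode` call. The sketch assumes, as the diff suggests, that warmup is enabled unless the flag is passed.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the relevant part of gpt_params.
struct demo_params {
    bool warmup = true; // assumed default: warmup runs unless --no-warmup is given
};

// Hypothetical single-argument matcher, mirroring the shape of the added branch.
// Returns true if the argument was recognized and consumed.
bool demo_find_arg(const std::string & arg, demo_params & params) {
    if (arg == "--no-warmup") {
        params.warmup = false;
        return true;
    }
    return false;
}

// Runs the matcher over all arguments and returns the ones it did not recognize.
std::vector<std::string> demo_parse(const std::vector<std::string> & args, demo_params & params) {
    std::vector<std::string> unknown;
    for (const auto & arg : args) {
        if (!demo_find_arg(arg, params)) {
            unknown.push_back(arg);
        }
    }
    return unknown;
}

// Hypothetical startup step: counts the decode calls that would be issued
// before real work begins. The increment is a placeholder for the warmup call.
int demo_startup_decode_calls(const demo_params & params) {
    int calls = 0;
    if (params.warmup) {
        calls++;
    }
    return calls;
}
```

With default-constructed `demo_params`, the startup step counts one warmup call; after `demo_parse` sees `--no-warmup`, it counts none. Keeping the flag as a plain boolean on the params struct means the code that performs the warmup needs only a single `if`, which is presumably why the commit touches only the argument parser and the usage text.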
