Commit 3e32104

Always set RNG seed when restoring cached prompt in main example.
Add a note in the main example README about how restoring a prompt doesn't imply restoring the exact session state.
1 parent 918cec5 commit 3e32104

2 files changed, 2 insertions(+), 10 deletions(-)

examples/main/README.md

Lines changed: 1 addition & 1 deletion
@@ -272,7 +272,7 @@ These options help improve the performance and memory usage of the LLaMA models.
 
 ### Prompt Caching
 
-- `--prompt-cache FNAME`: Specify a file to cache the model state after the initial prompt. This can significantly speed up the startup time when you're using longer prompts. The file is created during the first run and is reused and updated in subsequent runs.
+- `--prompt-cache FNAME`: Specify a file to cache the model state after the initial prompt. This can significantly speed up the startup time when you're using longer prompts. The file is created during the first run and is reused and updated in subsequent runs. **Note**: Restoring a cached prompt does not imply restoring the exact state of the session at the point it was saved. So even when specifying a specific seed, you are not guaranteed to get the same sequence of tokens as the original generation.
 
 ### Quantization
 

examples/main/main.cpp

Lines changed: 1 addition & 9 deletions
@@ -85,9 +85,6 @@ int main(int argc, char ** argv) {
 
     fprintf(stderr, "%s: build = %d (%s)\n", __func__, BUILD_NUMBER, BUILD_COMMIT);
 
-    // Save the initial seed parameter before overwriting it so it's possible to determine whether
-    // the user supplied a seed or not. This is useful when loading saved sessions.
-    int32_t initial_seed = params.seed;
     if (params.seed < 0) {
         params.seed = time(NULL);
     }
@@ -156,12 +153,7 @@ int main(int argc, char ** argv) {
                 return 1;
             }
             session_tokens.resize(n_token_count_out);
-            if (initial_seed != -1) {
-                fprintf(stderr, "%s: seed argument overrides session file RNG state, will now use seed: %d\n", __func__, params.seed);
-                llama_set_rng_seed(ctx, params.seed);
-            } else {
-                fprintf(stderr, "%s: using RNG state from loaded session file rather than seed\n", __func__);
-            }
+            llama_set_rng_seed(ctx, params.seed);
 
             fprintf(stderr, "%s: loaded a session with prompt size of %d tokens\n", __func__, (int) session_tokens.size());
         } else {
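
As a rough illustration of the control flow after this change, the sketch below re-creates the seed handling outside of llama.cpp. `Params`, `Context`, `resolve_seed`, `restore_session`, and `load_session_and_seed` are hypothetical names invented for this example, and `std::mt19937` is only a stand-in for whatever generator the library uses. Only the `params.seed < 0` fallback and the unconditional reseed after loading a session come from the diff.

```cpp
#include <cstdint>
#include <ctime>
#include <random>

// Hypothetical stand-in for the example's parameters; only `seed` mirrors the diff.
struct Params {
    int32_t seed = -1; // a negative value means "no seed supplied"
};

// Hypothetical stand-in for a context that owns an RNG (not the llama.cpp type).
struct Context {
    std::mt19937 rng;
};

// Mirrors `if (params.seed < 0) { params.seed = time(NULL); }` from the diff.
inline void resolve_seed(Params & params) {
    if (params.seed < 0) {
        params.seed = static_cast<int32_t>(std::time(nullptr));
    }
}

// Hypothetical session restore: overwrites the context RNG with the saved one.
inline void restore_session(Context & ctx, const std::mt19937 & saved_rng) {
    ctx.rng = saved_rng;
}

// Behavior after the change: once the session is loaded, reseed unconditionally,
// whether or not the user supplied a seed explicitly.
inline void load_session_and_seed(Context & ctx, const Params & params,
                                  const std::mt19937 & saved_rng) {
    restore_session(ctx, saved_rng);
    ctx.rng.seed(static_cast<std::mt19937::result_type>(params.seed));
}
```

In this sketch, two contexts restored from different saved RNG states but given the same seed start sampling from the same generator state, and the state stored with the session is discarded. That is in line with the README note: a restored prompt does not continue from the exact state at which the session was saved.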
