use std::chrono::system_clock::now() for random seed #6953

dwrensha
Contributor

Currently, the default random seed is set by time(NULL), which returns an integer number of seconds. If we call llama_sampling_init() many times within a single second, then all of the sampling contexts get the same random seed. Such behavior can come as an unpleasant surprise. I encountered it today while attempting to generate many samples via ollama, which makes use of an adapted version of llama.cpp/examples/server/server.cpp.
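To make the collision concrete, here is a minimal sketch of what happens when two generators receive the same seed. It is not taken from llama.cpp: `draw_samples` and `same_second_contexts_collide` are hypothetical helpers, and `std::mt19937` stands in for whatever generator a sampling context actually uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: draw `n` values from a generator seeded with `seed`.
// std::mt19937 is an assumption made for illustration only.
std::vector<std::uint32_t> draw_samples(std::uint32_t seed, std::size_t n) {
    std::mt19937 rng(seed);
    std::vector<std::uint32_t> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<std::uint32_t>(rng()));
    }
    return out;
}

// Two contexts created within the same second see the same time(NULL) value,
// so both are seeded identically and produce identical streams.
bool same_second_contexts_collide(std::uint32_t seconds_since_epoch) {
    return draw_samples(seconds_since_epoch, 8) ==
           draw_samples(seconds_since_epoch, 8);
}
```

Passing any fixed value to `same_second_contexts_collide` returns `true`, which is the situation every sampling context created within one second ends up in.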

We can avoid the problem by using a higher-precision clock to set the random seed. On my (Linux) system, std::chrono::system_clock measures its ticks in nanoseconds. On Windows/Visual Studio, a tick is 100 nanoseconds. static_cast<uint32_t> then truncates the higher-order bits.
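A rough sketch of the two seeding strategies side by side, for comparison. The function names are hypothetical and this is not the literal diff from this PR; it only illustrates truncating a `std::chrono::system_clock` tick count to 32 bits as described above.

```cpp
#include <chrono>
#include <cstdint>
#include <ctime>

// Hypothetical helper: keep only the low 32 bits of a tick count.
std::uint32_t low_32_bits(std::int64_t ticks) {
    return static_cast<std::uint32_t>(ticks);
}

// Old behavior: whole seconds, so calls within the same second share a seed.
std::uint32_t seed_from_time() {
    return static_cast<std::uint32_t>(std::time(nullptr));
}

// Clock-based behavior: the tick count changes far more often than once per
// second (nanosecond ticks on the author's Linux system, 100 ns on Windows).
std::uint32_t seed_from_system_clock() {
    const auto ticks =
        std::chrono::system_clock::now().time_since_epoch().count();
    return low_32_bits(static_cast<std::int64_t>(ticks));
}
```

Because only the low 32 bits survive, the seed value wraps around: with nanosecond ticks, 2^32 ticks is about 4.29 seconds, and with 100-nanosecond ticks it is about 429.5 seconds. Seeds can therefore repeat across runs, but two calls made a short time apart are very unlikely to coincide.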

An alternative to this clock-based approach would be to use std::random_device, as is already done in utils.hpp. I opted against that approach in this PR, however, because, as noted here, on some platforms std::random_device is deterministic.
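For reference, the `std::random_device` alternative would look roughly like the sketch below. The function name is hypothetical and the code is not copied from utils.hpp; it only shows the standard-library call involved.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: ask the implementation's random_device for one value.
// On platforms where std::random_device is deterministic, every process may
// receive the same sequence, which is the concern raised above.
std::uint32_t seed_from_random_device() {
    std::random_device rd;
    return static_cast<std::uint32_t>(rd());
}
```

Whether this is better than the clock-based seed depends on the platform: where `std::random_device` is backed by a real entropy source it avoids collisions entirely, but the standard does not guarantee that.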

@dwrensha
Contributor Author

Closing in favor of #6962, which seems nicer to me. Please feel free to reopen and accept this one instead, if you prefer. I consider this one to be a more conservative fix, but slightly less nice.

@dwrensha dwrensha closed this Apr 28, 2024
@dwrensha dwrensha deleted the random-seed-chrono branch April 29, 2024 13:46