Commit dd665cc

parallel : increase the variability of the prompt lengths (#13927)
ggml-ci
1 parent df0c0c7 commit dd665cc

2 files changed: 6 additions, 3 deletions


examples/parallel/README.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ Simplified simulation of serving incoming requests in parallel
 
 ## Example
 
-Generate 128 client requests (`-ns 128`), simulating 8 concurrent clients (`-np 8`). The system prompt is shared (`-pps`), meaning that it is computed once at the start. The client requests consist of 10 junk questions (`-j 10`) followed by the actual question.
+Generate 128 client requests (`-ns 128`), simulating 8 concurrent clients (`-np 8`). The system prompt is shared (`-pps`), meaning that it is computed once at the start. The client requests consist of up to 10 junk questions (`--junk 10`) followed by the actual question.
 
 ```bash
 llama-parallel -m model.gguf -np 8 -ns 128 --top-k 1 -pps --junk 10 -c 16384

examples/parallel/parallel.cpp

Lines changed: 5 additions & 2 deletions
@@ -315,7 +315,10 @@ int main(int argc, char ** argv) {
 } else {
     client.prompt += k_system;
 }
-for (int i = 0; i < n_junk; ++i) {
+
+const int n_junk_cur = rand() % n_junk;
+
+for (int i = 0; i < n_junk_cur; ++i) {
     const int r = rand() % k_questions.size();
     client.prompt += "User:\n" + k_questions[r] + "\nAssistant:\n " + k_answers[r] + "\n";
 }
@@ -340,7 +343,7 @@ int main(int argc, char ** argv) {
 client.n_decoded = 0;
 client.i_batch = batch.n_tokens - 1;
 
-LOG_INF("\033[31mClient %3d, seq %4d, started decoding ...\033[0m\n", client.id, client.seq_id);
+LOG_INF("\033[31mClient %3d, seq %4d, junk = %4d, started decoding ...\033[0m\n", client.id, client.seq_id, n_junk_cur);
 
 g_seq_id += 1;
 
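The per-request randomization introduced in `parallel.cpp` can be sketched in isolation as follows. This is only an illustrative sketch, not code from the commit: the helper names `pick_junk_count` and `build_junk_prompt` are hypothetical, and the guard for `n_junk <= 0` is an assumption added here because `rand() % n_junk` is undefined when `n_junk` is zero. Note that `rand() % n_junk` yields a value in the range 0 to `n_junk - 1`, so with `--junk 10` a request would carry between 0 and 9 junk questions.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not in parallel.cpp): chooses how many junk
// question/answer pairs to prepend to a single client request.
// Mirrors `rand() % n_junk` from the diff, so the result is in [0, n_junk - 1].
// The n_junk <= 0 guard is an assumption of this sketch, added to avoid
// a modulo by zero.
static int pick_junk_count(int n_junk) {
    return n_junk > 0 ? std::rand() % n_junk : 0;
}

// Hypothetical helper: appends n_junk_cur randomly chosen question/answer
// pairs using the same "User:/Assistant:" layout as the loop in the diff.
static std::string build_junk_prompt(int n_junk_cur,
                                     const std::vector<std::string> & questions,
                                     const std::vector<std::string> & answers) {
    std::string prompt;
    // Nothing to pick from, or mismatched tables: return an empty prompt.
    if (questions.empty() || questions.size() != answers.size()) {
        return prompt;
    }
    for (int i = 0; i < n_junk_cur; ++i) {
        const std::size_t r = std::rand() % questions.size();
        prompt += "User:\n" + questions[r] + "\nAssistant:\n " + answers[r] + "\n";
    }
    return prompt;
}
```

Drawing the count once per request, rather than using a fixed `n_junk`, is what makes the prompt lengths vary between clients; logging the drawn value (as the new `LOG_INF` line does with `n_junk_cur`) makes that variation visible in the output.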
