Commit 10f19c1

Author: eiery

llama : have n_batch default to 512 (#1091)

* set default n_batch to 512 when using BLAS
* spacing
* alternate implementation of setting different n_batch for BLAS
* set n_batch to 512 for all cases

1 parent: 7e312f1

File tree

1 file changed (+1, −1 lines)


examples/common.h (1 addition, 1 deletion)

```diff
@@ -20,7 +20,7 @@ struct gpt_params {
     int32_t repeat_last_n = 64;  // last n tokens to penalize
     int32_t n_parts       = -1;  // amount of model parts (-1 = determine from model dimensions)
     int32_t n_ctx         = 512; // context size
-    int32_t n_batch       = 8;   // batch size for prompt processing
+    int32_t n_batch       = 512; // batch size for prompt processing (must be >=32 to use BLAS)
     int32_t n_keep        = 0;   // number of tokens to keep from initial prompt

     // sampling parameters
```
