
Commit 4c401e5

mscheong01 and compilade committed
add exaone pre-tokenizer in llama-vocab.cpp
Co-Authored-By: compilade <[email protected]>
1 parent 98ad475 commit 4c401e5


1 file changed, +1 -0 lines changed


src/llama-vocab.cpp

Lines changed: 1 addition & 0 deletions
@@ -388,6 +388,7 @@ struct llm_tokenizer_bpe {
             case LLAMA_VOCAB_PRE_TYPE_COMMAND_R:
             case LLAMA_VOCAB_PRE_TYPE_SMOLLM:
             case LLAMA_VOCAB_PRE_TYPE_CODESHELL:
+            case LLAMA_VOCAB_PRE_TYPE_EXAONE:
                 regex_exprs = {
                     "\\p{N}",
                     "'s|'t|'re|'ve|'m|'ll|'d| ?\\p{L}+| ?\\p{N}+| ?[^\\s\\p{L}\\p{N}]+|\\s+(?!\\S)",

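The added line works through case fall-through: the new EXAONE label has no body of its own, so it shares the regex list already used by COMMAND_R, SMOLLM and CODESHELL. A minimal standalone sketch of that pattern follows. The enum `pre_type` and the function `regex_exprs_for` are hypothetical names for illustration, not part of llama-vocab.cpp, and only the two patterns visible in the hunk are reproduced (the real list may continue past the hunk).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the LLAMA_VOCAB_PRE_TYPE_* values seen in the diff.
enum class pre_type { COMMAND_R, SMOLLM, CODESHELL, EXAONE, OTHER };

// Returns the split patterns for a pre-tokenizer type (illustrative only).
// Adjacent case labels with no statements between them share one body,
// so adding a label is enough to reuse an existing pattern list.
std::vector<std::string> regex_exprs_for(pre_type t) {
    switch (t) {
        case pre_type::COMMAND_R:
        case pre_type::SMOLLM:
        case pre_type::CODESHELL:
        case pre_type::EXAONE: // the label this commit adds
            return {
                "\\p{N}",
                "'s|'t|'re|'ve|'m|'ll|'d| ?\\p{L}+| ?\\p{N}+| ?[^\\s\\p{L}\\p{N}]+|\\s+(?!\\S)",
            };
        default:
            // Assumption: the real default branch lies outside the hunk shown.
            return {};
    }
}
```

With this shape, `regex_exprs_for(pre_type::EXAONE)` and `regex_exprs_for(pre_type::CODESHELL)` return identical lists, which is the behavior the one-line change relies on.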