Commit bcc8420

ChatON: Add template for DeepSeek
While looking at the tokenized vector, I noticed that the EOS used by llama.cpp's existing chat_apply_template differs from the one in the tokenizer_config.json of the DeepSeek LLM, so I have added two entries:
* "deepseek-alt", which matches llama.cpp's chat_apply_template, and
* "deepseek", which matches tokenizer_config.json.
This affects the assistant suffix and reverse-prompt entries.

Because of this, I need to revisit the other entries I added previously at a later time. However, since the default logic should pick the EOS from the model file, I assume an out-of-sync reverse-prompt may matter only to a limited extent.
1 parent a477efa commit bcc8420

2 files changed: +43 -0 lines changed

common/chaton.hpp

Lines changed: 3 additions & 0 deletions
@@ -49,6 +49,9 @@
  * in-prefix, in-suffix and antiprompt of main.
  * These always adds any role specific prefix and suffix around the passed message.
  *
+ * Sample chaton_meta.json includes template info for
+ * * llama2, llama3, gemma, chatml, zephyr, deepseek
+ *
  */
 
 #include <string>

examples/chaton_meta.json

Lines changed: 40 additions & 0 deletions
@@ -99,6 +99,46 @@
         },
         "reverse-prompt": "<eos>",
         "systemuser-1st-user-has-prefix": true
+    },
+    "deepseek-alt": {
+        "global": {
+            "begin": "",
+            "end": ""
+        },
+        "system": {
+            "prefix": "",
+            "suffix": "\n"
+        },
+        "user": {
+            "prefix": "### Instruction:\n",
+            "suffix": "\n"
+        },
+        "assistant": {
+            "prefix": "### Response:\n",
+            "suffix": "\n<|EOT|>\n"
+        },
+        "reverse-prompt": "<|EOT|>",
+        "systemuser-1st-user-has-prefix": true
+    },
+    "deepseek": {
+        "global": {
+            "begin": "",
+            "end": ""
+        },
+        "system": {
+            "prefix": "",
+            "suffix": "\n\n"
+        },
+        "user": {
+            "prefix": "User: ",
+            "suffix": "\n\n"
+        },
+        "assistant": {
+            "prefix": "Assistant: ",
+            "suffix": " <|end▁of▁sentence|>\n"
+        },
+        "reverse-prompt": "<|end▁of▁sentence|>",
+        "systemuser-1st-user-has-prefix": true
     }
 }

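To make the effect of the new "deepseek" entry concrete, here is a minimal C++ sketch of how its per-role prefix and suffix might wrap a list of messages, and how the reverse-prompt could be checked against generated text. The names `RoleAffix`, `ChatTemplate`, `deepseek_template`, `apply_template` and `hits_reverse_prompt` are hypothetical and are not the chaton.hpp API. Only the string values are taken from this commit. The sketch also ignores the "systemuser-1st-user-has-prefix" flag.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mirror of one chaton_meta.json entry (not the real chaton.hpp types).
struct RoleAffix {
    std::string prefix;
    std::string suffix;
};

struct ChatTemplate {
    std::string global_begin;
    std::string global_end;
    RoleAffix system;
    RoleAffix user;
    RoleAffix assistant;
    std::string reverse_prompt;
};

// String values copied from the "deepseek" entry added in this commit.
inline ChatTemplate deepseek_template() {
    return ChatTemplate{
        "",
        "",
        {"", "\n\n"},
        {"User: ", "\n\n"},
        {"Assistant: ", " <|end▁of▁sentence|>\n"},
        "<|end▁of▁sentence|>",
    };
}

// Wrap each (role, content) message in that role's prefix and suffix,
// then surround the whole chat with the global begin and end strings.
inline std::string apply_template(
        const ChatTemplate &t,
        const std::vector<std::pair<std::string, std::string>> &messages) {
    std::string out = t.global_begin;
    for (const auto &msg : messages) {
        const RoleAffix *affix = nullptr;
        if (msg.first == "system") affix = &t.system;
        else if (msg.first == "user") affix = &t.user;
        else if (msg.first == "assistant") affix = &t.assistant;
        assert(affix != nullptr && "unknown role");
        out += affix->prefix + msg.second + affix->suffix;
    }
    out += t.global_end;
    return out;
}

// True when the generated text ends with the template's reverse-prompt.
inline bool hits_reverse_prompt(const ChatTemplate &t, const std::string &generated) {
    const std::string &rp = t.reverse_prompt;
    return generated.size() >= rp.size() &&
           generated.compare(generated.size() - rp.size(), rp.size(), rp) == 0;
}
```

With this sketch, a system message "Be brief." followed by a user message "Hi" is rendered as "Be brief.\n\nUser: Hi\n\n". Swapping in the "deepseek-alt" values would change the assistant suffix to "\n<|EOT|>\n" and the reverse-prompt to "<|EOT|>", which is the mismatch the commit message describes.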