|
9 | 9 | *
|
10 | 10 | * 1. Use a json file to configure the needed tags for each of the supported chat-handshake-template-standard
|
11 | 11 | * a. system -> prefix & suffix,
|
12 |
| - * b. user -> prefix & suffix, assistant -> prefix |
13 |
| - * * [main] these override the in-prefix and in-suffix |
| 12 | + * b. user -> begin, prefix & suffix; assistant -> prefix |
| 13 | + * * [main] these override the in-prefix (begin+prefix) and in-suffix |
14 | 14 | * c. reverse-prompt
|
15 | 15 | * * [main] this adds to any reverese-prompt specified using cmdline
|
16 | 16 | * d. global -> begin & end
|
17 |
| - * d. systemuser-1st-user-has-prefix |
18 |
| - * * if a combination of system and user messages/prompts is passed, |
| 17 | + * e. systemuser-1st-user-has-begin and systemuser-1st-user-has-prefix |
| 18 | + * * [chaton-tmpl-apply] if a combination of system and user messages/prompts is passed, |
19 | 19 | * then for the 1st user message following the 1st system message,
|
20 |
| - * include user prefix only if this flag is set. [chaton-tmpl-apply] |
21 |
| - * * [later] one or two models which I looked at seem to require not just BoS, but also the user-role-prefix-tag |
22 |
| - * to also be controlled wrt this case. So not differentiating between BoS and any user-role-prefix-tag. |
23 |
| - * However if bos and user-role-prefix-tag need to be decoupled, where only bos needs this treatment, |
24 |
| - * then maybe add begin and end keys (to specify the BoS) in addition to prefix and suffix keys (to specify user-role-prefix-tag), to role blocks in the json. |
25 |
| - * and inturn control only begin and not prefix, wrt whether to add or not. |
| 20 | + * include user begin and prefix only if corresponding flags is set. |
| 21 | + * * begin should normally relate to BoS while prefix should relate to Role Identifier tag. |
| 22 | + * If there is no need for seperate handling of BoS and RoleIdTag, then one could even |
| 23 | + * set both BoS and RoleIdTag to one of these entries itself. |
| 24 | + * |
26 | 25 | * 2. [main] currently the user specified system prompt (-p + -f) is tagged using system role tags,
|
27 | 26 | * and inturn this tagged message is tokenized with parse_special flag.
|
28 | 27 | * So any special token related tags in the user specified system prompt will get parsed as special.
|
| 28 | + * |
29 | 29 | * 3. chaton-tmpl-apply uses the json file, which was loaded, to decide on how to generate the tagged messages for tokenisation.
|
30 | 30 | * a. input: [ { role, message }, { role, message}, ....]
|
31 | 31 | * b. output: currently a single string is returned which contains the tagged message(s).
|
|
0 commit comments