-
Notifications
You must be signed in to change notification settings - Fork 12.2k
Continue on empty line #529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Do not insert a "newline" token if user inputs empty line. This let's user to continue the output after she has been asked by reverse prompt for more data. Otherwise an empty-line input would insert a "newline" token which would break the flow of the conversation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good change, however it needs some changes to work with the instruct mode.
As currently you wrap the suffix "### Response:" inside the buffer.length check but not the prefix "### Instruction:", breaking the model.
The prefix needs to be wrapped too. Something like this:
Line 437:
if (params.instruct) {
printf("\n> ");
}
Line 468:
// Add tokens to buffer only if the line is non-empty.
// This let's the user make the chat continue if it was stopped
// on a reverse prompt.
if (buffer.length() > 1) {
if (params.instruct) {
n_consumed = embd_inp.size();
embd_inp.insert(embd_inp.end(), inp_pfx.begin(), inp_pfx.end());
}
auto line_inp = ::llama_tokenize(ctx, buffer, false);
embd_inp.insert(embd_inp.end(), line_inp.begin(), line_inp.end());
if (params.instruct) {
embd_inp.insert(embd_inp.end(), inp_sfx.begin(), inp_sfx.end());
}
n_remain -= line_inp.size();
}
Also replace the tabs on L480 with spaces
edit
or actually should it be:
Line 437:
if (params.instruct) {
n_consumed = embd_inp.size();
printf("\n> ");
}
Line 468:
// Add tokens to buffer only if the line is non-empty.
// This let's the user make the chat continue if it was stopped
// on a reverse prompt.
if (buffer.length() > 1) {
if (params.instruct) {
embd_inp.insert(embd_inp.end(), inp_pfx.begin(), inp_pfx.end());
}
auto line_inp = ::llama_tokenize(ctx, buffer, false);
embd_inp.insert(embd_inp.end(), line_inp.begin(), line_inp.end());
if (params.instruct) {
embd_inp.insert(embd_inp.end(), inp_sfx.begin(), inp_sfx.end());
}
n_remain -= line_inp.size();
}
I'm not 100% certain of the n_consumed part which place is the right for it. Should test for both.
ping @blackhole89 @ggerganov Requesting backup 😄
I wasn't aware of instruct mode. I'll look into it and update the pull request. |
Apparently @ggerganov is looking to move instruct mode out of main.cpp and as its own thing according to #508. May be worth waiting for progress on that or any code on this pertaining to instruct mode may be for naught. |
Very well. Let's close it for now then. |
Hmmm the change is so small and the work is already done so why not just merge it? The only thing need to know anymore is the whether the option 1 or option 2 above for the placement of n_consumed is right, and thats why I pinged @blackhole89 since he originally made the instruct option and probably knows the answer on spot. The work would transfer over to the instruct implementation then when its moved. |
Ah, sorry. Didn't mean nudge this down. Above is right. The change is still good as far as I can tell. |
I'll reopen this so that whichever happens first (this gets merged or instruct mode moved to another file), that in any case it doesnt get forgotten. |
Do not insert a "newline" token if user inputs empty line. This let's
user to continue the output after she has been asked by reverse prompt
for more data. Otherwise an empty-line input would insert a "newline"
token which would break the flow of the conversation.