Add Yandex instruct model template support #12621
I encountered an issue related to the difference between the Jinja and C++ templates. For some reason, the system prompt "You are a helpful assistant" in Jinja was included in the first user message. I suspect this is due to some peculiarity of Jinja templating. As a workaround, I hardcoded it in the tests.
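For illustration, this kind of merging usually comes from the template itself rather than from llama.cpp. A minimal Jinja sketch of the pattern (hypothetical, not the model's actual template):

```jinja
{#- Hypothetical sketch: many chat templates fold the system prompt into the first user turn. -#}
{%- if messages[0]['role'] == 'system' -%}
    {%- set sys = messages[0]['content'] -%}
    {%- set messages = messages[1:] -%}
{%- else -%}
    {%- set sys = '' -%}
{%- endif -%}
{%- for m in messages -%}
    {%- if m['role'] == 'user' -%}
        {#- On the first user turn, the system text is prepended to the user content. -#}
        {{- 'User: ' + (sys + ' ' if loop.first and sys else '') + m['content'] + '\n\n' -}}
    {%- else -%}
        {{- 'Assistant: ' + m['content'] + '\n\n' -}}
    {%- endif -%}
{%- endfor -%}
```

With a template like this, "You are a helpful assistant" ends up inside the first `User:` line, which matches the behavior described above.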
@compilade @ngxson Could you please take a look when you have time?
This template doesn't seem to use an EOT (end of turn) token. Have you tried with …?
Yes, it works fine. We used the EOS token as EOT during alignment.
We have a prefix space in the first token of the reply, which is a consequence of converting our tokenizer to the HF infra.
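As a rough illustration of what this means for the template (an assumption based on SentencePiece-style tokenizers, not confirmed in this thread):

```
... Assistant:       <- template ends with no trailing space
▁Yes ▁it ▁works ...  <- the reply's first token carries a leading "▁" (space)
```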
Seems good to me.
Btw, if you're from the Yandex team, please do not use `\n\n` as the EOT next time you make an instruction-tuned model. This prevents the model from generating long code or markdown content, which makes it completely useless for real-life use cases.
It uses the …
Thanks a lot for reviewing!
Then I think the current template is wrong: it must be …
Anyway, for the future I would recommend using a template like ChatML or Llama 3, as they are very easy to work with, and you don't have to deal with trailing-space issues.
Our model was trained to respond with only one turn after … I agree that it would be much better and more convenient to use special tokens for this. We realized this as we began the open-source process. We will definitely switch to special tokens in future model updates :)
No, it is not the case, given the way … works. From what you said, I imagine the expected behavior is as follows. For the first turn, we will have: …
When it's done generating: …
Then for the next turn, we have: …
But in reality, this is what you will get in …
Currently, only … So please, just use an existing chat template next time. If you don't know how chat templates work and try to invent a new one like this, you risk degrading both performance and quality a lot!
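To make the failure mode concrete (an illustrative sketch with generic role names, since the original snippets were not preserved in this thread): with `\n\n` as the end-of-turn marker, any reply that needs a blank line, such as multi-paragraph text, markdown, or code, is cut off at the first one.

```
User: Write two paragraphs about cats.

Assistant: Cats are independent animals. They groom themselves and sleep most of the day.
           <- the model emits "\n\n" to start the second paragraph,
              but "\n\n" is also the EOT marker, so generation stops here
```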
I will approve & merge this for now, but please note that quality and performance will be degraded compared to other models (as explained above). Let's hope that next time we can reuse one of the existing templates, so your model will "just work" out of the box with the best performance. For example, with ChatML, you can simply do:
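(The original snippet is not preserved in this extract; for reference, the standard ChatML layout is:)

```
<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hi there!<|im_end|>
```

Here `<|im_start|>` and `<|im_end|>` are dedicated special tokens, so turn boundaries never collide with ordinary text such as blank lines.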
Ok, thanks a lot!
I have read the contributing guidelines
Self-reported review complexity: …
Hello!
We at Yandex are planning to release our 8B instruct model as open source. The pre-trained model can be found here: YandexGPT-5-Lite-8B-pretrain.
I created this pull request to add support for our custom chat template in llama.cpp. Could you please take a look and review my changes? @ggerganov
I ran `test-chat-template` locally; it works as expected. I believe my changes should not affect other parts of the project. Thank you!
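For anyone reproducing this, a rough recipe (assuming a default CMake build; the target name follows the `tests/` convention in the repo):

```sh
cmake -B build
cmake --build build --target test-chat-template
./build/bin/test-chat-template
```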