Skip to content

Reverse Chat Template Fallback Order #963

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

sayanshaw24
Copy link
Contributor

@sayanshaw24 sayanshaw24 commented Jun 3, 2025

This change now ensures we try to parse the chat template with Minja first, and then falls back on the native implementation.

@sayanshaw24 sayanshaw24 requested a review from hanbitmyths June 3, 2025 23:19
@sayanshaw24 sayanshaw24 marked this pull request as ready for review June 3, 2025 23:19
@sayanshaw24 sayanshaw24 requested a review from a team as a code owner June 3, 2025 23:19
Copy link
Collaborator

@hanbitmyths hanbitmyths left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a clean-up for some specialized handling in case that minja works, so we keep codes clean?

@sayanshaw24
Copy link
Contributor Author

Do we need a clean-up for some specialized handling in case that minja works, so we keep codes clean?

You mean like this: https://github.com/microsoft/onnxruntime-extensions/pull/963/files#diff-660d41852d9740562f27b085a10a08b04b635625f45367300c30206c907b3f5aR816, or something more?

If Minja works, we get a chat template root from minja::Parser::parse that we can then call render on. If not, we attempt the native impl and finally return a kOrtxErrorInvalidArgument if that fails as well.

This PR simply switches fallback order plus any respective changes needed and ensures the status is returned correctly, it shouldn't make the code messier.

@hanbitmyths
Copy link
Collaborator

Do we need a clean-up for some specialized handling in case that minja works, so we keep codes clean?

You mean like this: https://github.com/microsoft/onnxruntime-extensions/pull/963/files#diff-660d41852d9740562f27b085a10a08b04b635625f45367300c30206c907b3f5aR816, or something more?

If Minja works, we get a chat template root from minja::Parser::parse that we can then call render on. If not, we attempt the native impl and finally return a kOrtxErrorInvalidArgument if that fails as well.

This PR simply switches fallback order plus any respective changes needed and ensures the status is returned correctly, it shouldn't make the code messier.

What I meant is to remove a model specific code in the file like TokenizerImpl::Phi3_5ChatTemplate if minja works fine with Phi3.5 chat template. We don't need to do it in this PR, but I'd like to know this makes sense and we can clean up unnecessary codes.

@sayanshaw24
Copy link
Contributor Author

Do we need a clean-up for some specialized handling in case that minja works, so we keep codes clean?

You mean like this: https://github.com/microsoft/onnxruntime-extensions/pull/963/files#diff-660d41852d9740562f27b085a10a08b04b635625f45367300c30206c907b3f5aR816, or something more?
If Minja works, we get a chat template root from minja::Parser::parse that we can then call render on. If not, we attempt the native impl and finally return a kOrtxErrorInvalidArgument if that fails as well.
This PR simply switches fallback order plus any respective changes needed and ensures the status is returned correctly, it shouldn't make the code messier.

What I meant is to remove a model specific code in the file like TokenizerImpl::Phi3_5ChatTemplate if minja works fine with Phi3.5 chat template. We don't need to do it in this PR, but I'd like to know this makes sense and we can clean up unnecessary codes.

Oh I see, sorry did not understand what you meant before - yes absolutely we should, but I want to implement multimodal support and run some tests before we do, in case we need some of the TokenizerImpl functions to add on to model-agnostic multimodal support. Like you said, can do that in a separate PR once multimodal work is sorted.

text = root->render(context);
output = text;
} catch (const std::runtime_error& e) {
if (model_to_template_map.count(activated_str)) {
Copy link
Collaborator

@hanbitmyths hanbitmyths Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any test cases to catch exception due to minja failure?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants