Bug fix in bpe tokenizer #4149
guangy10 commented Jul 3, 2024
- Record bos/eos in the binary format
- Updated tests
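As an illustration of what recording bos/eos in a binary tokenizer file can look like, here is a minimal sketch. The layout is an assumption made for this example and is not taken from the PR: a header of four little-endian int32 values (`vocab_size`, `bos_id`, `eos_id`, `max_token_length`) followed by one `(score, length, bytes)` record per token. The names `write_tokenizer` and `read_tokenizer` are hypothetical helpers, not ExecuTorch APIs.

```python
import io
import struct

# Hypothetical layout (an assumption for illustration, not the PR's format):
# header = vocab_size, bos_id, eos_id, max_token_length as little-endian int32.
HEADER = struct.Struct("<iiii")
# Each token record starts with a float32 score and an int32 byte length.
ENTRY = struct.Struct("<fi")


def write_tokenizer(stream, tokens, scores, bos_id, eos_id):
    """Serialize tokens and scores, storing bos/eos ids in the header."""
    max_len = max((len(t) for t in tokens), default=0)
    stream.write(HEADER.pack(len(tokens), bos_id, eos_id, max_len))
    for token, score in zip(tokens, scores):
        stream.write(ENTRY.pack(score, len(token)))
        stream.write(token)


def read_tokenizer(stream):
    """Read the file back, recovering bos/eos ids from the header."""
    vocab_size, bos_id, eos_id, max_len = HEADER.unpack(stream.read(HEADER.size))
    tokens, scores = [], []
    for _ in range(vocab_size):
        score, length = ENTRY.unpack(stream.read(ENTRY.size))
        tokens.append(stream.read(length))
        scores.append(score)
    return {
        "bos_id": bos_id,
        "eos_id": eos_id,
        "max_token_length": max_len,
        "tokens": tokens,
        "scores": scores,
    }
```

Storing the ids in the file means a reader does not have to hard-code them; a round trip through `io.BytesIO` is enough to check that the values written are the values read back.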
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4149
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 1022e69 with merge base e4eeadc.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
7d4a02c to 05b3788
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
05b3788 to 7260263
It looks like .gitignore was preventing the new tokenizer.bin from being checked in. Fixed now.
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
7260263 to 42775b5
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
42775b5 to f6c9350
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
f6c9350 to 1022e69
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.