update quant docs #425
Conversation
README.md (Outdated)

@@ -63,7 +63,6 @@ python3 torchchat.py download llama3
 * in Chat mode
 * in Generate mode
 * [Exporting for mobile via ExecuTorch](#export-executorch)
-* in Chat mode
What does that mean? This is a function of the token stream received from the driver. @JacobSzwejbka has been overhauling the chat mode; please work through the application scenario with @JacobSzwejbka to ensure it works. If it does not, it's a bug. Ditto for AOTI-generated models.
Hmm, are we supporting chat mode for exported ExecuTorch models as well? I think @JacobSzwejbka's changes are for eager mode (in generate.py).
> What does that mean? This is a function of the token stream received from the driver. @JacobSzwejbka has been overhauling the chat mode; please work through the application scenario with @JacobSzwejbka to ensure it works. If it does not, it's a bug. Ditto for AOTI-generated models.

There is a 0% chance I will have time to make chat mode work for anything but eager by the deadline. If that feature is truly required, someone else needs to be working on it.
I need to dig into this. Please remove this change and I'll approve to land the rest.
Sorry, missed this comment. Updated.
Thank you!
update quant docs: JSON properties must be in double quotes, and groupsize 7 doesn't work (the groupsize must divide the weight dimension, e.g. 4096, evenly).
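
For illustration, an invocation that satisfies both constraints might look like the sketch below (this assumes torchchat's --quantize flag with the linear:int4 scheme; the scheme names actually available depend on the target backend). Single quotes around the config keep the shell from stripping the double quotes that the JSON parser requires, and 256 divides 4096 evenly, whereas 7 does not:

python3 torchchat.py generate llama3 --quantize '{"linear:int4": {"groupsize": 256}}'

By contrast, a config such as {"linear:int4": {"groupsize": 7}} would fail on a model with hidden dimension 4096, since 4096 % 7 != 0.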