Update README.md #479

Merged
1 commit merged into main on Apr 25, 2024

Conversation

mikekgfb (Contributor)

readme update
@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Apr 25, 2024
@mikekgfb (Contributor, Author)

README-only update. Please ignore pending tests.

@mikekgfb merged commit a024fb2 into main on Apr 25, 2024
@mikekgfb deleted the mikekgfb-patch-16 branch on Apr 25, 2024 at 07:51
larryliu0820 added a commit that referenced this pull request Jul 5, 2024
malfet pushed 6 commits that referenced this pull request Jul 17, 2024
larryliu0820 added 2 commits that referenced this pull request Jul 17, 2024
* Update quantize.py to use torchao Quantizers

Summary:

Remove the duplicated code for Int4WeightOnlyQuantizer and
Int8DynActInt4WeightQuantizer and use the torchao API instead.

Test Plan:

```
python torchchat.py generate llama2 --quantize '{"linear:int4": {"groupsize": 256}, "precision": {"dtype":"float16"}, "executor":{"accelerator":"cpu"}}' --prompt "Once upon a time," --max-new-tokens 256
python torchchat.py generate llama2 --quantize '{"linear:a8w4dq": {"groupsize": 256}, "precision": {"dtype":"float16"}, "executor":{"accelerator":"cpu"}}' --prompt "Once upon a time," --max-new-tokens 256
```
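For illustration, the `--quantize` flag in the test plan above takes a JSON config whose top-level keys name quantization steps. The sketch below shows one plausible way such a config could be parsed and dispatched; the handler functions and the `apply_quantization` helper are hypothetical and do not reflect torchchat's actual implementation.

```python
import json

# Hypothetical handlers -- stand-ins for real quantizer invocations.
def handle_linear_int4(model, groupsize):
    # In torchchat this step would delegate to a torchao int4
    # weight-only quantizer; here we only record the parameters.
    print(f"int4 weight-only quantization, groupsize={groupsize}")
    return model

def handle_precision(model, dtype):
    print(f"casting model to {dtype}")
    return model

# Dispatch table keyed by the JSON config's top-level names.
HANDLERS = {
    "linear:int4": lambda m, opts: handle_linear_int4(m, opts["groupsize"]),
    "precision": lambda m, opts: handle_precision(m, opts["dtype"]),
}

def apply_quantization(model, quantize_json):
    """Apply each recognized quantization step from a --quantize JSON string."""
    config = json.loads(quantize_json)
    for key, opts in config.items():
        handler = HANDLERS.get(key)
        if handler is None:
            continue  # keys like "executor" are consumed elsewhere
        model = handler(model, opts)
    return model

model = apply_quantization(
    object(),
    '{"linear:int4": {"groupsize": 256}, "precision": {"dtype": "float16"}}',
)
```

Unrecognized keys are skipped rather than rejected here, mirroring how a CLI might route options such as `"executor"` to a different subsystem.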

* Fix import

* Install torchao from gh

* Explain import

* Fix dependencies

* Test ao PR #479

* Update torchao hash

* Update torchao pin

* Fix scheduler bf16/fp16 mix error

* Incorporate torchao changes

* update hash

* Fix GPU CI job

* More fix

* Fix executorch CI job

* Use quant api for int4 weight only quantization

* Fix

* Fix again

* Fix 3

* Fix 4

* Try something

* debug

* Only migrate 8a4w

---------

Co-authored-by: Jack Zhang <[email protected]>