Support 16 channel TAEs (taesd3 and taef1) #527
Conversation
Thank you for your contribution.
TAE is truly amazing. By using fp8 e4m3 for the weights, each TAE compresses down to about 2.5mb, so I'm able to pack the TAEs for sd1.5, sdxl, sd3 and flux all together in under 10mb total, and they're usable enough to replace the usual VAEs. taesd_flux_f8e4m3.zip < 2mb!!! It does get a little wonky though. I did a little bit of testing and noticed that the values in the weight tensors of every TAE so far lie within the -5.0 to 5.0 range. Perhaps we could consider adding compatibility for an fp8 e3m4 format, which would provide better precision within this range compared to e4m3? Thoughts?
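The precision claim can be checked directly: for a normalized fp8 value, the spacing between adjacent representable numbers depends only on the value's exponent and the mantissa width. A minimal pure-Python sketch (the helper name is made up for illustration):

```python
import math

def ulp_fp8(x, exp_bits, man_bits):
    """Spacing between adjacent representable values near |x| for a
    normalized fp8 format with the given exponent/mantissa widths."""
    e = math.floor(math.log2(abs(x)))  # unbiased exponent of x
    return 2.0 ** e / (1 << man_bits)

print(ulp_fp8(5.0, 4, 3))  # e4m3: 0.5
print(ulp_fp8(5.0, 3, 4))  # e3m4: 0.25, i.e. twice the precision near 5.0
```

Trading one exponent bit for one mantissa bit halves the quantization step everywhere inside the -5.0 to 5.0 range, at the cost of a much smaller maximum representable value.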
I can try to give it a shot, but wouldn't q8_0 be better anyway?
I did consider that. Unfortunately, that is not possible as far as I know, because the majority of the weight tensors have shapes whose innermost dimension is too small for q8_0's quantization blocks.
Ah, right, that would be a problem |
I can't find any fp8 e3m4 standard... Should I just make up something like "e3m4 fn", using the same kind of formatting as fp8 e4m3 fn? Edit: I just tried, and it matches the table here https://paperswithcode.com/paper/efficient-post-training-quantization-with-fp8/review/#arxiv-table-container
@LostRuins I'm pretty sure this should work: #559 Edit: also, depending on how the weight values are distributed, we might as well make up something like "fp8 e2m5", which would have a range between -7.75 and 7.75 (but a huge minimum positive subnormal of 0.03125) |
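A generic decoder makes the quoted ranges easy to verify. The sketch below decodes any 8-bit "fn"-style format (no infinities; following the e4m3fn convention, only the pattern with all exponent and mantissa bits set is NaN). The function is hypothetical, not from any library:

```python
def decode_fp8_fn(byte, exp_bits, man_bits):
    """Decode an 8-bit 'fn' float (e.g. e4m3fn, e3m4fn, e2m5fn):
    no infinities; exponent and mantissa all ones means NaN."""
    assert exp_bits + man_bits == 7
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    if exp == (1 << exp_bits) - 1 and man == (1 << man_bits) - 1:
        return float("nan")
    if exp == 0:  # subnormal: no implicit leading 1
        return sign * man / (1 << man_bits) * 2.0 ** (1 - bias)
    return sign * (1.0 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

# Largest finite e3m4fn value: 2^(7-3) * (1 + 14/16) = 30.0
print(decode_fp8_fn(0b0_111_1110, 3, 4))  # 30.0
# Largest finite e2m5fn value, matching the 7.75 figure above:
print(decode_fp8_fn(0b0_11_11110, 2, 5))  # 7.75
# Smallest positive e2m5fn subnormal, matching the 0.03125 figure:
print(decode_fp8_fn(0b0_00_00001, 2, 5))  # 0.03125
```

By the same arithmetic, an e3m4fn format would cover roughly -30 to 30 with a smallest positive subnormal of 2^-2 / 16 = 0.015625, comfortably containing the observed -5.0 to 5.0 weight range.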
Hm, previously I converted it using a PyTorch script, since e4m3 is supported by torch natively. However, e3m4 is not. So that might be a little more inconvenient, and perhaps not such a good idea. I was not aware that the format was not supported by torch natively. Ideally, it should be a valid, standards-adhering safetensors file.
Another format that would be better than standard e4m3(fn) is e4m3fnuz (from onnx). At least it's a bit more documented than e3m4, but still not compatible with safetensors as far as I know. |