
Support 16 channel TAEs (taesd3 and taef1) #527

Merged
merged 1 commit into leejet:master on Dec 28, 2024

Conversation

@stduhpf (Contributor) commented Dec 21, 2024

No description provided.

@stduhpf changed the title from Support 16 channel TAEs (taesd and taef1) to Support 16 channel TAEs (taesd3 and taef1) on Dec 21, 2024
@leejet merged commit d50473d into leejet:master on Dec 28, 2024
9 checks passed
@leejet (Owner) commented Dec 28, 2024

Thank you for your contribution.

stduhpf added a commit to stduhpf/stable-diffusion.cpp that referenced this pull request Dec 28, 2024
@stduhpf deleted the tae16 branch on January 1, 2025
@LostRuins (Contributor) commented

TAE is truly amazing. By storing the weights as f8 e4m3, each TAE compresses down to about 2.5 MB, so I'm able to pack the TAEs for sd1.5, sdxl, sd3 and flux together in under 10 MB total, and they're usable enough to replace the usual VAEs.


taesd_flux_f8e4m3.zip (under 2 MB!)

It does get a little wonky though. From a bit of testing, all of the values in the TAE weight tensors I've looked at so far lie within the -5.0 to 5.0 range. Perhaps we could add compatibility for the fp8 e3m4 format, since it would provide better precision within this range than e4m3? Thoughts?
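Not from the thread, but to illustrate the precision argument: for a normalized fp8 value near x, the gap to the next representable value is roughly 2^(floor(log2 |x|) − mantissa bits), so near ±5.0 an e3m4 encoding steps in increments half as large as e4m3. A quick Python sketch:

```python
import math

# rough sketch: spacing between adjacent representable values for a
# normalized fp8 number near x, given its number of mantissa bits
def fp8_step(x, mantissa_bits):
    e = math.floor(math.log2(abs(x)))
    return 2.0 ** (e - mantissa_bits)

print(fp8_step(5.0, 3))  # e4m3: 0.5
print(fp8_step(5.0, 4))  # e3m4: 0.25, i.e. twice the precision in the +/-5 range
```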

@stduhpf (Contributor, Author) commented Jan 8, 2025

> Perhaps we could add compatibility for the fp8 e3m4 format, since it would provide better precision within this range than e4m3? Thoughts?

I can give it a shot, but wouldn't q8_0 be better anyway?

@LostRuins (Contributor) commented

> Perhaps we could add compatibility for the fp8 e3m4 format, since it would provide better precision within this range than e4m3? Thoughts?

> I can give it a shot, but wouldn't q8_0 be better anyway?

I did consider that. Unfortunately that is not possible as far as I know, because the majority of the weight tensors have the shape [3, 3, 64, 64], while q8_0 (and in fact all GGUF block quants) requires a minimum block size of 32.
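For context, a rough sketch of that constraint, assuming ggml's convention that block quantization packs values along the innermost dimension (listed first in a ggml-style shape), which must be a multiple of the block size (32 for q8_0):

```python
QK8_0 = 32  # q8_0 block size in ggml

def q8_0_compatible(ne):
    # ne: ggml-style shape, innermost dimension first
    return ne[0] % QK8_0 == 0

print(q8_0_compatible((3, 3, 64, 64)))  # False: 3x3 conv kernels don't fill a block
print(q8_0_compatible((64, 64)))        # True: a 64-wide row packs into two blocks
```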

@stduhpf (Contributor, Author) commented Jan 8, 2025

Ah, right, that would be a problem.

@stduhpf (Contributor, Author) commented Jan 8, 2025

I can't find any fp8 e3m4 standard... Should I just make up something like "e3m4 fn", using the same kind of format as fp8 e4m3 fn?

Edit: I just tried it, and it matches the table here: https://paperswithcode.com/paper/efficient-post-training-quantization-with-fp8/review/#arxiv-table-container
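A minimal decoding sketch for such a made-up "e3m4 fn" layout, assuming an exponent bias of 3 and e4m3fn-style special values (no infinities, only the all-ones pattern reserved for NaN); this is only an illustration of the encoding, not the #559 implementation:

```python
def decode_e3m4fn(b: int) -> float:
    """Decode one hypothetical 'e3m4 fn' byte: 1 sign, 3 exponent (bias 3),
    4 mantissa bits; no infinities, only 0x7F / 0xFF decode as NaN."""
    sign = -1.0 if b & 0x80 else 1.0
    exp = (b >> 4) & 0x07
    mant = b & 0x0F
    if exp == 0x07 and mant == 0x0F:
        return float("nan")
    if exp == 0:  # subnormal: 2^(1 - bias) * mant / 16
        return sign * (mant / 16.0) * 2.0 ** -2
    return sign * (1.0 + mant / 16.0) * 2.0 ** (exp - 3)

print(decode_e3m4fn(0x7E))  # 30.0      (largest finite magnitude)
print(decode_e3m4fn(0x01))  # 0.015625  (smallest positive subnormal)
```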

@stduhpf (Contributor, Author) commented Jan 8, 2025

@LostRuins I'm pretty sure this should work: #559
That said, I have no idea how to even convert the models to e3m4 to test it.

Edit: also, depending on how the weight values are distributed, we might as well make up something like "fp8 e2m5", which would have a range between -7.75 and 7.75 (but a rather large smallest positive subnormal of 0.03125).
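For reference, those e2m5 numbers follow from an exponent bias of 1 and the same fn-style convention of reserving the top mantissa code of the top exponent for NaN; a quick check (assumptions mine):

```python
bias = 1
# largest finite value: top exponent field (3), mantissa one code below the NaN pattern
max_val = (1 + 30 / 32) * 2 ** (3 - bias)   # 7.75
# smallest positive subnormal: exponent field 0, mantissa 1
min_subnormal = (1 / 32) * 2 ** (1 - bias)  # 0.03125
print(max_val, min_subnormal)
```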

@stduhpf mentioned this pull request on Jan 9, 2025
@LostRuins (Contributor) commented

Hm, previously I converted it using a PyTorch script, since torch.float8_e4m3fn is a supported type: https://pytorch.org/docs/stable/tensors.html

However, e3m4 is not. So that might be a little more inconvenient, and perhaps not such a good idea; I was not aware that the format isn't natively supported by torch. Ideally, it should be a valid, standards-adhering safetensors file.
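For reference, the kind of conversion described is only a few lines in PyTorch; a rough sketch with placeholder file names (assumes a PyTorch build with float8 support and the safetensors package):

```python
import torch
from safetensors.torch import load_file, save_file

# placeholder file names, not taken from the PR
state = load_file("taesd.safetensors")
state_f8 = {name: t.to(torch.float8_e4m3fn) for name, t in state.items()}
save_file(state_f8, "taesd_f8e4m3.safetensors")
```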

@stduhpf (Contributor, Author) commented Jan 9, 2025

Another format that would be better than standard e4m3(fn) is e4m3fnuz (from ONNX). At least it's a bit better documented than e3m4, but it's still not compatible with safetensors as far as I know.
