feat: support force downcast after FastRMSNorm multiply for Gemma #1658

drbh · 2024-03-20T17:55:07Z

This PR adds `force_downcast_after` to `FastRMSNorm.forward`, which is used in the Gemma model. References huggingface/transformers#29402 and huggingface/transformers#29729.

Setting `force_downcast_after=True` performs the `hidden_states * weight` multiplication in f32 and then downcasts the result to half precision. This differs slightly from the current implementation, which first casts `hidden_states` to half precision and then multiplies.
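To make the difference between the two orderings concrete, here is a minimal sketch that emulates f32 and half rounding with the standard-library `struct` module, so it runs without PyTorch. The helpers `to_f32` and `to_f16` and the input values are assumptions chosen for illustration; they are not taken from the PR or from Gemma weights.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f16(x: float) -> float:
    """Round a Python float (f64) to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Illustrative values only: an already-normalized activation held in f32
# (it needs more mantissa bits than half provides) and a half weight.
hidden = to_f32(1 + 19 / 8192)  # 1.0023193359375
weight = to_f16(3.0)

# Current behaviour: cast the activation to half, then multiply in half.
cast_first = to_f16(to_f16(hidden) * weight)

# force_downcast_after behaviour: multiply in f32, then downcast once.
downcast_after = to_f16(to_f32(hidden * weight))

# The f64 product is exact for these inputs, so it serves as the reference.
exact = hidden * weight

print(cast_first)      # 3.005859375
print(downcast_after)  # 3.0078125
print(abs(cast_first - exact), abs(downcast_after - exact))
```

For these inputs the two results differ by one half-precision step, and the downcast-after result is the closer of the two to the exact product. Other inputs can produce identical results, for example when the weight is a power of two.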

Narsil · 2024-03-20T18:45:48Z

server/text_generation_server/utils/layers.py

@@ -687,7 +687,7 @@ def load(cls, prefix, weights, eps=1e-6):
            weight = weights.get_tensor(f"{prefix}.weight")
            return cls(weight, eps)

-        def forward(self, hidden_states, residual=None):
+        def forward(self, hidden_states, residual=None, force_downcast_after=False):


I'd personally use a separate method, `forward_downcast_after`.

Also, this only triggers when `hidden_size > 8192`, meaning it won't trigger for Gemma.
Having a dedicated method seems simpler.

drbh · 2024-03-21

That makes a lot of sense, thanks for the comments. I've removed the branching logic from `FastRMSNorm` and added `forward_downcast_after` to the Gemma modeling code.

Narsil

Even better !
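The design agreed on in this thread (no flag on the shared layer, a dedicated method on the model side) can be sketched roughly as follows. This is a pure-Python toy over lists, not the merged code: `ToyRMSNorm`, `ToyGemmaRMSNorm`, `_normalize`, and the rounding helpers are all hypothetical names introduced for illustration.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f16(x: float) -> float:
    """Round a Python float (f64) to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


class ToyRMSNorm:
    """Hypothetical stand-in for a shared RMSNorm layer (not TGI's FastRMSNorm)."""

    def __init__(self, weight, eps=1e-6):
        self.weight = [to_f16(w) for w in weight]  # weights stored in half
        self.eps = eps

    def _normalize(self, hidden_states):
        # Scale by 1 / sqrt(mean(x^2) + eps), keeping the result in f32.
        variance = sum(h * h for h in hidden_states) / len(hidden_states)
        scale = 1.0 / math.sqrt(variance + self.eps)
        return [to_f32(h * scale) for h in hidden_states]

    def forward(self, hidden_states):
        # Shared path, left untouched: cast to half first, then multiply.
        normed = self._normalize(hidden_states)
        return [to_f16(w * to_f16(h)) for w, h in zip(self.weight, normed)]


class ToyGemmaRMSNorm(ToyRMSNorm):
    """Hypothetical model-specific layer that owns the alternative path."""

    def forward_downcast_after(self, hidden_states):
        # Multiply in f32, then downcast the product to half once.
        normed = self._normalize(hidden_states)
        return [to_f16(to_f32(w * h)) for w, h in zip(self.weight, normed)]


norm = ToyGemmaRMSNorm(weight=[1.0, 0.5, 1.5, 2.0])
x = [0.5, -1.25, 2.0, 0.75]
print(norm.forward(x))
print(norm.forward_downcast_after(x))
```

Keeping the alternative path in a separate method means callers of the shared `forward` see no signature change, and the model that needs the different rounding order opts in explicitly.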


feat: support force downcast after FastRMSNorm multiply

b307fce

Narsil reviewed Mar 20, 2024


drbh added 2 commits March 21, 2024 03:28

feat: prefer gemma specific rms

5b076df

fix: simplify syntax

704d4dd

Narsil approved these changes Mar 21, 2024


Narsil merged commit 6f15ac6 into main Mar 21, 2024

Narsil deleted the fix-gemma-bugs branch March 21, 2024 09:25
