Commit 03e6575

Using head_dim instead of override scalar value for the 9b model.
1 parent a88abd5 commit 03e6575


gemma/config.py

Lines changed: 0 additions & 1 deletion

@@ -117,7 +117,6 @@ def get_config_for_9b() -> GemmaConfig:
         head_dim=256,
         attn_types=[AttentionType.LOCAL_SLIDING, AttentionType.GLOBAL] * 21,
         sliding_window_size=4096,
-        query_pre_attn_scalar=224,  # hidden_size / num_attention_heads
     )
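The practical effect of dropping the override: the query pre-attention scalar falls back to `head_dim`, so attention logits for the 9B model are scaled by `1/sqrt(256)` instead of `1/sqrt(224)`. A minimal sketch of that fallback logic follows; `query_scale` is an illustrative helper, not the actual gemma library API, and the fallback-to-`head_dim` behavior is an assumption based on the commit message.

```python
import math
from typing import Optional

def query_scale(head_dim: int, query_pre_attn_scalar: Optional[int] = None) -> float:
    # Hypothetical sketch: queries are multiplied by scalar**-0.5 before
    # the attention dot product. With no override, the scalar is head_dim,
    # which is what this commit makes the 9B config fall back to.
    scalar = query_pre_attn_scalar if query_pre_attn_scalar is not None else head_dim
    return scalar ** -0.5

# Before the commit: explicit override of 224 (hidden_size / num_attention_heads).
before = query_scale(head_dim=256, query_pre_attn_scalar=224)  # 1 / sqrt(224)
# After the commit: the scale is derived from head_dim=256.
after = query_scale(head_dim=256)  # 1 / sqrt(256) == 0.0625

print(before, after)
```

A small numerical nudge (about a 6% difference in logit scale), but it keeps the 9B config consistent with deriving the scale from `head_dim` rather than carrying a hand-computed constant.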