Update on "Transform model to be able to use Attention Sink"

helunwencser · helunwencser · commit 8451aba35043 · 2024-12-02T11:51:08.000-08:00
This PR adds the functions needed to transform the model so that it can use Attention Sink. Differential Revision: [D65571289](https://our.internmc.facebook.com/intern/diff/D65571289/) [ghstack-poisoned]
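
For readers unfamiliar with this kind of source transformation, here is a minimal sketch of the recursive walk that the hunk below touches. It is not the PR's implementation: `Module`, `Attention`, `AttentionWithSink`, and `replace_attention` are hypothetical stand-ins written in plain Python so the example runs without PyTorch, and the real `_replace_attention` may patch the attention module in place rather than swap it. Only the `_modules` traversal and the `sink_size` / `window_size` parameters come from the diff.

```python
class Module:
    """Hypothetical stand-in for torch.nn.Module; only `_modules` is modelled."""

    def __init__(self, **children):
        self._modules = dict(children)

    def children(self):
        return self._modules.values()


class Attention(Module):
    """Placeholder for the attention layer that should be transformed."""


class AttentionWithSink(Module):
    """Placeholder for the transformed layer (assumption, not the PR's class)."""

    def __init__(self, sink_size, window_size):
        super().__init__()
        self.sink_size = sink_size
        self.window_size = window_size


def replace_attention(module, sink_size, window_size):
    # Snapshot the items so entries can be reassigned while walking the tree.
    for name, child in list(module._modules.items()):
        # Recurse only into children that have sub-modules of their own.
        if len(list(child.children())) > 0:
            replace_attention(child, sink_size, window_size)
        # Swap every attention layer found at this level.
        if isinstance(child, Attention):
            module._modules[name] = AttentionWithSink(sink_size, window_size)
    return module


# Usage: a toy model with one attention layer nested inside a block.
model = Module(layer0=Module(attn=Attention(), ffn=Module()), head=Module())
replace_attention(model, sink_size=4, window_size=2044)
```

After the call, the nested `attn` entry is an `AttentionWithSink`, while the `ffn` and `head` entries are left untouched.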
diff --git a/examples/models/llama/source_transformation/attention_sink.py b/examples/models/llama/source_transformation/attention_sink.py
@@ -266,7 +266,7 @@ def _replace_attention(
     for _, child_module in module._modules.items():
         if len(list(child_module.children())) > 0:  # pyre-ignore [16]
             _replace_attention(
-                module=child_module,
+                module=child_module, # pyre-ignore [6]
                 rope_with_attention_sink=rope_with_attention_sink,
                 sink_size=sink_size,
                 window_size=window_size,