Call eval() in quantize_pt2

mcremon-meta · facebook-github-bot · commit cc3f25017ba1 · 2025-01-04T06:57:10.000-08:00
Summary: Call `model.eval()` at the top of `quantize_pt2` so that every model passing through it is in eval mode. In a subsequent diff, all calls will go through `quantize_pt2`, including fp32 cases, which will use a no-op quantizer; that will allow further cleanup of the flow.
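
For context, here is a minimal sketch of what `eval()` does to a `torch.nn.Module`. The toy model is hypothetical and is not taken from the Cadence backend; it only illustrates why calling `eval()` up front matters for layers such as dropout that behave differently during training.

```python
import torch
from torch import nn

# Toy model for illustration only (an assumption, not part of compiler.py).
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

# Freshly constructed modules start in training mode.
was_training = model.training

# eval() clears the training flag on the module and every submodule,
# and returns the module itself.
returned = model.eval()
all_eval = all(not m.training for m in model.modules())

# In eval mode Dropout acts as the identity, so repeated forward passes agree.
x = torch.ones(2, 4)
with torch.no_grad():
    deterministic = torch.equal(model(x), model(x))
```

Because `eval()` is idempotent, calling it inside `quantize_pt2` should be harmless for callers that already switched their model to eval mode.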

Differential Revision: D67561642
diff --git a/backends/cadence/aot/compiler.py b/backends/cadence/aot/compiler.py
@@ -131,7 +131,10 @@ def quantize_pt2(
     Prepare, convert and fuse the model using the given quantizer.
     Returns a GraphModule with the quantized model.
     """
-    # Quantizer
+    # Put the model in inference mode by calling model.eval()
+    model.eval()
+
+    # Default to CadenceQuantizer if no quantizer is supplied
     if not quantizer:
         quantizer = CadenceQuantizer()