Remove memory-format workaround for Arm backend #3981
Conversation
```python
    dim_order = HWCM_Order
else:
    dim_order = tuple(range(len(data.shape)))
node.meta["tosa_dim_order"] = dim_order
```
Let me think, there are at least three ways of doing this.

(1) The way you are doing it, i.e. in preprocess(): insert a dim_order tuple, track it manually, and transform things as we convert to TOSA.
👍 Flexibility
👍 Mostly transparent to the user, except the compile_spec
👎 You are expecting the input to call_delegate() to be in NHWC format
👎 Doesn't use PyTorch infra
👎 node.meta["tosa_dim_order"] != node.meta["val"].dim_order(), which may cause confusion

(2) Global switch, i.e. gm = export(model.to(memory_format=torch.channels_last), torch.randn(1, 2, 3, 4).to(memory_format=torch.channels_last)). This way gm should have everything in NHWC dim_order.
👍 Clean impl, uses PyTorch infra
👎 Anything that falls out of the delegate must also run in NHWC format
👎 Assumes input conversion to NHWC happens outside of the model. Might be OK if you are getting it in NHWC already.

(3) Runtime conversion: after export, we insert to_dim_order(NHWC) ops at the beginning/end (also on the weights/placeholders etc.) as we want, and consume these nodes as we see fit in the delegate.
👍 A mix of (1) and (2); lets you select where the input conversion to NHWC happens.
👍 Flexible in terms of which node runs in which format: if you leave the node in, it takes NCHW and converts it at runtime on the CPU; if you partition and delete the node, the input to the delegate and to the model (I think, I have to check; this is similar to deleting the first q node) will be NHWC.
👍 Still uses PyTorch infra, i.e. no need to manually maintain tosa_dim_order, except for cases like DW conv (but I see that as a special constraint from the backend, so...)
👎 Slightly harder to implement, particularly if you are sprinkling multiple nodes in the graph. But if you run this pass before the partitioner, it might not be too difficult.
👎 Potentially involves running a pass outside, i.e. modifying the graph before to_backend(). As long as this is user visible, it might be OK.
👎 Exposes some footguns when multiple delegates are involved, but that shouldn't be the case here.
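For illustration, a minimal sketch of option (2) on a made-up toy module (the module, shapes, and the dim_order check are assumptions for demonstration, not code from this PR):

```python
import torch

class TinyConv(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(2, 4, kernel_size=3)

    def forward(self, x):
        return self.conv(x)

# Put both the weights and the example input in channels_last before export.
model = TinyConv().to(memory_format=torch.channels_last)
example = torch.randn(1, 2, 8, 8).to(memory_format=torch.channels_last)
gm = torch.export.export(model, (example,)).module()

# Intermediate values should now carry NHWC dim_order, i.e. for 4-D
# activations node.meta["val"].dim_order() should be (0, 2, 3, 1).
```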
Thank you Digant for your comments, much appreciated!

(1) This would assume that the input is in NHWC format if permute_memory_to_nhwc==True, yes. The next step after this PR could be to add a pass, or extend this pass, to insert to_dim_order ops around all our delegates such that the graph becomes: input (default format) -> ... -> to_dim_order(nhwc) -> delegate -> to_dim_order(default format) -> ... -> output (default format). This would require that we modify the graph outside (just before and after) our delegate after partitioning. What's the proper way of doing this?

I agree with the last two bullets. Those are the drawbacks of the current solution.
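To make the question concrete, here is a hedged sketch of that kind of graph surgery on an FX graph. `to_dim_order` is a stand-in for whatever conversion op would actually be used, and the delegate-target check is a simplification; treat all names as assumptions:

```python
import torch

def wrap_delegate_with_dim_order(gm: torch.fx.GraphModule, to_dim_order) -> None:
    for node in list(gm.graph.nodes):
        if node.op == "call_function" and "call_delegate" in str(node.target):
            # Convert tensor inputs to NHWC just before the delegate. Real code
            # would skip non-tensor args such as the lowered-module reference.
            with gm.graph.inserting_before(node):
                node.args = tuple(
                    gm.graph.call_function(
                        to_dim_order, (arg,), {"dim_order": (0, 2, 3, 1)}
                    )
                    if isinstance(arg, torch.fx.Node)
                    else arg
                    for arg in node.args
                )
            # Convert the delegate's output back to the default order.
            with gm.graph.inserting_after(node):
                back = gm.graph.call_function(
                    to_dim_order, (node,), {"dim_order": (0, 1, 2, 3)}
                )
            node.replace_all_uses_with(back)
            back.args = (node,)  # replace_all_uses_with also rewrote back's own input
    gm.graph.lint()
    gm.recompile()
```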
(2) model.to(memory_format=torch.channels_last) and node.meta["val"].to(memory_format=torch.channels_last) give dim_order=(0, 1, 2, 3) for certain shapes such as (3, 1, 3, 3). Otherwise, I think node.meta["val"].to(memory_format=torch.channels_last) could have been used in (1) instead of annotating node.meta["tosa_dim_order"].
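A quick demonstration of that corner case (my reading of the behavior: with a singleton channel dim, a contiguous tensor already passes the channels_last contiguity check, so .to() leaves the strides unchanged; worth re-verifying on your PyTorch version):

```python
import torch

t = torch.randn(3, 1, 3, 3).to(memory_format=torch.channels_last)
print(t.dim_order())  # (0, 1, 2, 3), not the (0, 2, 3, 1) one might expect
```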
(3) I don't think I fully understand this solution. Do we insert to_dim_order ops at the inputs, outputs and placeholders of the full graph, or just of our delegate sub-graph? Will the dim_order be propagated forward when inserting to_dim_order ops?
I will take a look tomorrow.
```python
elif node.op == "placeholder":
    # node is an input, weight or bias node
    consumer_node = list(node.users)[0]
    if self.is_weight_node_for_dw_conv(consumer_node):
```
nit: use consistent naming for depthwise convolution helper functions?
Updated in latest rebase.
```python
if self.is_weight_node_for_dw_conv(node):
    dim_order = HWCM_Order
else:
    dim_order = tuple(range(node_data.dim()))
```
I assume this is mainly for the dim_order-agnostic ops?
I consider it to be for all ops, as we assume that the input is always transposed. It corresponds to this block: https://github.com/pytorch/executorch/blob/main/backends/arm/arm_backend.py#L261C1-L261C70.
```diff
@@ -120,7 +120,7 @@ def _test_linear_tosa_MI_pipeline(
     ArmTester(
         module,
         example_inputs=test_data,
-        compile_spec=common.get_tosa_compile_spec(),
+        compile_spec=common.get_tosa_compile_spec(permute_memory_to_nhwc=False),
```
I guess rank==4 ==> NHWC breaks this?
Yes, (1, 1, 1, C) works though. As a step towards solution (2), we plan on removing this flag in the near future.
```python
    data=None,
    placeholderFilename=inputs[0].name + ".npy",
)
tosa_graph.addInputTensor(tensor)
```
IIUC this means we will expect an NHWC input tensor. Do you want to add a runtime assert for this? Otherwise it might just consume wrong-format data, which would be tricky to debug.
The functionality is the same as before, but it sounds like a good thing to do. We can do it in an upcoming PR, or when we implement (2), if that's okay?
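A minimal sketch of the kind of check being suggested, at the Python level (the real assert would presumably live in the runtime; the helper name is hypothetical):

```python
import torch

def assert_nhwc(t: torch.Tensor) -> None:
    # Fail loudly instead of silently consuming wrong-format data.
    if t.dim() == 4 and tuple(t.dim_order()) != (0, 2, 3, 1):
        raise RuntimeError(f"Expected NHWC input, got dim_order={tuple(t.dim_order())}")
```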
```diff
@@ -16,7 +16,7 @@
     build_rescale_from_int32,
     build_rescale_to_int32,
 )
-from executorch.backends.arm.tosa_utils import broadcast_shapes, getNodeArgs
+from executorch.backends.arm.tosa_utils import broadcast_shapes, getNodeArgs, tosa_shape
```
Are you planning to add more ops? For instance, the Permute op's argument needs to be updated based on the format of the input it gets when you update the previous node's output format.
Yes, there are a few open PRs with new ops. The ambition is that we can just update shapes, such as permute's argument, with tosa_shape(shape, dim_order).
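For context, a plausible sketch of what a tosa_shape(shape, dim_order) helper does; the real implementation lives in executorch.backends.arm.tosa_utils and may differ:

```python
def tosa_shape(shape, dim_order):
    # Reorder a shape according to the given dim order,
    # e.g. NCHW -> NHWC for activations.
    return tuple(shape[dim] for dim in dim_order)

assert tosa_shape((1, 8, 16, 16), (0, 2, 3, 1)) == (1, 16, 16, 8)
```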
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
```python
def call(self, graph_module: torch.fx.GraphModule):
    NHWC_Order = (0, 2, 3, 1)
    HWCM_Order = (2, 3, 0, 1)
```
I don't think I follow this dim order. Can you add some comments? Did you mean HWNC?
It's a special case for depthwise convolution, where the weights have shape (H, W, C, M) and hence a different dim order (https://www.mlplatform.org/tosa/tosa_spec.html#_depthwise_conv2d).
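To illustrate (the shapes and the final reshape are my reading of the TOSA spec, not code from this PR): a PyTorch depthwise Conv2d weight has shape (C * M, 1, kH, kW), and the (2, 3, 0, 1) order moves the kernel dims to the front:

```python
import torch

w = torch.randn(6, 1, 3, 3)         # depthwise weight: C=3 channels, M=2 multiplier
print(w.permute(2, 3, 0, 1).shape)  # torch.Size([3, 3, 6, 1])
# A subsequent reshape to (kH, kW, C, M) = (3, 3, 3, 2) would match
# TOSA's depthwise_conv2d weight layout (H, W, C, M).
```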
So it's not clear to me if this is really making use of the dim order utils? Is this in preparation for that?

Yes, this is the intermediate step. See the conversation on Slack: https://pytorch.slack.com/archives/C01FV3A914N/p1719838104994989?thread_ts=1718355881.095489&cid=C01FV3A914N
The dim-order of each node is annotated in a pass. Some refactoring of arm_backend.py. Signed-off-by: Oscar Andersson <[email protected]> Change-Id: I14691b51b99acb9e8605100fd25731ab45c55a9d
Signed-off-by: Oscar Andersson <[email protected]> Change-Id: Ica6addb95d6b925beef4696780334268821af608
Change-Id: Id0877faeddec21acdb918fc773bee410dbe6dbb5
@digantdesai merged this pull request in d3c92de.
Remove the temporary fix for memory format introduced in #2371. The dim order of each node is annotated in a pass. Also some refactoring of arm_backend.py.
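For reference, a consolidated sketch of the annotation pass reconstructed from the diff fragments quoted above. The class name is an assumption, the helper body is elided, and the real pass also branches on the permute_memory_to_nhwc compile spec and handles placeholders via their consumer node:

```python
import torch
from executorch.exir.pass_base import ExportPass
from torch.fx.passes.infra.pass_base import PassResult

NHWC_Order = (0, 2, 3, 1)  # rank-4 activations when permuting memory to NHWC
HWCM_Order = (2, 3, 0, 1)  # depthwise-conv weights: TOSA expects (H, W, C, M)

class AnnotateTosaDimOrderPass(ExportPass):  # name is an assumption
    def is_weight_node_for_dw_conv(self, node: torch.fx.Node) -> bool:
        ...  # elided: does this node feed a depthwise conv as its weight?

    def call(self, graph_module: torch.fx.GraphModule) -> PassResult:
        for node in graph_module.graph.nodes:
            if "val" not in node.meta:  # e.g. the lowered-module attribute
                continue
            data = node.meta["val"]
            if self.is_weight_node_for_dw_conv(node):
                dim_order = HWCM_Order
            elif data.dim() == 4:
                dim_order = NHWC_Order
            else:
                dim_order = tuple(range(data.dim()))
            node.meta["tosa_dim_order"] = dim_order
        return PassResult(graph_module, True)
```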