-
Notifications
You must be signed in to change notification settings - Fork 607
[ET-VK] Introduce virtual_clone
API to support view of view use cases + fix synchronization hazard with view tensors
#5753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…es + fix synchronization hazard with view tensors ## Context This diff fixes some hazards (not necessarily) bugs with view tensors. ### `virtual_clone` API Consider the following sequence of calls which may be common in the view of view use case. ``` t1 = graph.add_tensor(...); // t2 will have the same metadata as t1 t2 = graph.add_tensor_view(t1); // t3 will also have the same metadata as t2 at this point. t3 = graph.add_tensor_view(t2); // t2 metadata will be updated correctly. t2 = add_transpose_view_node(t1, 0, 1, t2); // Unfortunately, this node will have an assumption that t3 has the same metadata as t2 to start. However, this is not true. // As a result, t3 will have incorrect metadata after this node. t3 = add_transpose_view_node(t2, 1, 2, t3); ``` To address this, the `virtual_clone` API is introduced which will allow view nodes to set the metadata of the output equal to the input before modifying the output. ### WAW synchronization hazards `vTensorStorage` maintains a `last_access` state which facilitates inserting the correct memory barriers for the underlying `vkImage` or `vkBuffer`. However, when we create a tensor view, `last_access` is not shared between `vTensor` instances that use the same resource. As as result, writing into a `vTensor` will not update the `last_access` of its views, and vice versa. Therefore, sebsequent accesses of the other tensor that references the same resource will result in a synchronization hazard. This diff fixes this hazard in a bit of a crude way; if the `vTensor` is a copy, or has copies, then cowardly assume that it has been written to before the current access so that appropriate memory barriers are inserted. This was the selected solution because I thought that adding a map to track last access of tensors that share resources is a bit overkill when the assumption that the underlying resource has been written to before the current access should hold most of the time. Differential Revision: [D63642092](https://our.internmc.facebook.com/intern/diff/D63642092/) [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5753
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 23704a6 with merge base 0d96f75 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This pull request was exported from Phabricator. Differential Revision: D63642092 |
…iew use cases + fix synchronization hazard with view tensors" ## Context This diff fixes some hazards (not necessarily) bugs with view tensors. ### `virtual_clone` API Consider the following sequence of calls which may be common in the view of view use case. ``` t1 = graph.add_tensor(...); // t2 will have the same metadata as t1 t2 = graph.add_tensor_view(t1); // t3 will also have the same metadata as t2 at this point. t3 = graph.add_tensor_view(t2); // t2 metadata will be updated correctly. t2 = add_transpose_view_node(t1, 0, 1, t2); // Unfortunately, this node will have an assumption that t3 has the same metadata as t2 to start. However, this is not true. // As a result, t3 will have incorrect metadata after this node. t3 = add_transpose_view_node(t2, 1, 2, t3); ``` To address this, the `virtual_clone` API is introduced which will allow view nodes to set the metadata of the output equal to the input before modifying the output. ### WAW synchronization hazards `vTensorStorage` maintains a `last_access` state which facilitates inserting the correct memory barriers for the underlying `vkImage` or `vkBuffer`. However, when we create a tensor view, `last_access` is not shared between `vTensor` instances that use the same resource. As as result, writing into a `vTensor` will not update the `last_access` of its views, and vice versa. Therefore, sebsequent accesses of the other tensor that references the same resource will result in a synchronization hazard. This diff fixes this hazard in a bit of a crude way; if the `vTensor` is a copy, or has copies, then cowardly assume that it has been written to before the current access so that appropriate memory barriers are inserted. This was the selected solution because I thought that adding a map to track last access of tensors that share resources is a bit overkill when the assumption that the underlying resource has been written to before the current access should hold most of the time. Differential Revision: [D63642092](https://our.internmc.facebook.com/intern/diff/D63642092/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D63642092 |
…iew use cases + fix synchronization hazard with view tensors" ## Context This diff fixes some hazards (not necessarily) bugs with view tensors. ### `virtual_clone` API Consider the following sequence of calls which may be common in the view of view use case. ``` t1 = graph.add_tensor(...); // t2 will have the same metadata as t1 t2 = graph.add_tensor_view(t1); // t3 will also have the same metadata as t2 at this point. t3 = graph.add_tensor_view(t2); // t2 metadata will be updated correctly. t2 = add_transpose_view_node(t1, 0, 1, t2); // Unfortunately, this node will have an assumption that t3 has the same metadata as t2 to start. However, this is not true. // As a result, t3 will have incorrect metadata after this node. t3 = add_transpose_view_node(t2, 1, 2, t3); ``` To address this, the `virtual_clone` API is introduced which will allow view nodes to set the metadata of the output equal to the input before modifying the output. ### WAW synchronization hazards `vTensorStorage` maintains a `last_access` state which facilitates inserting the correct memory barriers for the underlying `vkImage` or `vkBuffer`. However, when we create a tensor view, `last_access` is not shared between `vTensor` instances that use the same resource. As as result, writing into a `vTensor` will not update the `last_access` of its views, and vice versa. Therefore, sebsequent accesses of the other tensor that references the same resource will result in a synchronization hazard. This diff fixes this hazard in a bit of a crude way; if the `vTensor` is a copy, or has copies, then cowardly assume that it has been written to before the current access so that appropriate memory barriers are inserted. This was the selected solution because I thought that adding a map to track last access of tensors that share resources is a bit overkill when the assumption that the underlying resource has been written to before the current access should hold most of the time. Differential Revision: [D63642092](https://our.internmc.facebook.com/intern/diff/D63642092/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D63642092 |
This pull request has been merged in a5a76f7. |
Stack from ghstack (oldest at bottom):
SymInt
andParamsBuffer
#5754virtual_clone
API to support view of view use cases + fix synchronization hazard with view tensors #5753Context
This diff fixes some hazards (not necessarily) bugs with view tensors.
virtual_clone
APIConsider the following sequence of calls which may be common in the view of view use case.
To address this, the
virtual_clone
API is introduced which will allow view nodes to set the metadata of the output equal to the input before modifying the output.WAW synchronization hazards
vTensorStorage
maintains alast_access
state which facilitates inserting the correct memory barriers for the underlyingvkImage
orvkBuffer
. However, when we create a tensor view,last_access
is not shared betweenvTensor
instances that use the same resource.As as result, writing into a
vTensor
will not update thelast_access
of its views, and vice versa. Therefore, sebsequent accesses of the other tensor that references the same resource will result in a synchronization hazard.This diff fixes this hazard in a bit of a crude way; if the
vTensor
is a copy, or has copies, then cowardly assume that it has been written to before the current access so that appropriate memory barriers are inserted. This was the selected solution because I thought that adding a map to track last access of tensors that share resources is a bit overkill when the assumption that the underlying resource has been written to before the current access should hold most of the time.Differential Revision: D63642092