-
Notifications
You must be signed in to change notification settings - Fork 606
[ET-VK][BE] vTensor cleanup 6/N - Do not use gpu_memory_layout
as a source of truth, use packed_dim_whcn_idx
directly
#5479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… source of truth, use `packed_dim_whcn_idx` directly ## Context `GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction of `axis_map`. For buffer backed tensors, the `strides` of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory. For texture backed tensors, the `axis_map` and `packed_dim_whcn_idx` is required to fully represent the layout of the tensor as an image texture. Furthermore, with the introduction of functions like `virtual_transpose()`, tensor layouts may be produced which cannot be captured cleanly by an enum. This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata. The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout. `GPUMemoryLayout` was helpful as a compact representation of the `packed_dim_whcn_idx` of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, whenever `GPUMemoryLayout` is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references to `memory_layout()` to reference `packed_dim_whcn_idx()` instead. Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/) [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5479
Note: Links to docs will display an error until the docs builds have been completed. ⏳ No Failures, 1 PendingAs of commit 6f363c6 with merge base 8ef6c79 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
… source of truth, use `packed_dim_whcn_idx` directly ## Context `GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction of `axis_map`. For buffer backed tensors, the `strides` of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory. For texture backed tensors, the `axis_map` and `packed_dim_whcn_idx` is required to fully represent the layout of the tensor as an image texture. Furthermore, with the introduction of functions like `virtual_transpose()`, tensor layouts may be produced which cannot be captured cleanly by an enum. This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata. The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout. `GPUMemoryLayout` was helpful as a compact representation of the `packed_dim_whcn_idx` of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, whenever `GPUMemoryLayout` is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references to `memory_layout()` to reference `packed_dim_whcn_idx()` instead. Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/) ghstack-source-id: 243424497 Pull Request resolved: #5479
This pull request was exported from Phabricator. Differential Revision: D62995121 |
…ayout` as a source of truth, use `packed_dim_whcn_idx` directly" ## Context `GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction of `axis_map`. For buffer backed tensors, the `strides` of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory. For texture backed tensors, the `axis_map` and `packed_dim_whcn_idx` is required to fully represent the layout of the tensor as an image texture. Furthermore, with the introduction of functions like `virtual_transpose()`, tensor layouts may be produced which cannot be captured cleanly by an enum. This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata. The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout. `GPUMemoryLayout` was helpful as a compact representation of the `packed_dim_whcn_idx` of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, whenever `GPUMemoryLayout` is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references to `memory_layout()` to reference `packed_dim_whcn_idx()` instead. Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D62995121 |
…ayout` as a source of truth, use `packed_dim_whcn_idx` directly" ## Context `GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction of `axis_map`. For buffer backed tensors, the `strides` of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory. For texture backed tensors, the `axis_map` and `packed_dim_whcn_idx` is required to fully represent the layout of the tensor as an image texture. Furthermore, with the introduction of functions like `virtual_transpose()`, tensor layouts may be produced which cannot be captured cleanly by an enum. This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata. The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout. `GPUMemoryLayout` was helpful as a compact representation of the `packed_dim_whcn_idx` of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, whenever `GPUMemoryLayout` is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references to `memory_layout()` to reference `packed_dim_whcn_idx()` instead. Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D62995121 |
This pull request has been merged in 7c6d58a. |
Stack from ghstack (oldest at bottom):
packed_dim_whcn_idx
withpacked_dim
#5484gpu_memory_layout
as a source of truth, usepacked_dim_whcn_idx
directly #5479Context
GPUMemoryLayout
is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction ofaxis_map
.For buffer backed tensors, the
strides
of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory.For texture backed tensors, the
axis_map
andpacked_dim_whcn_idx
is required to fully represent the layout of the tensor as an image texture.Furthermore, with the introduction of functions like
virtual_transpose()
, tensor layouts may be produced which cannot be captured cleanly by an enum.This diff decouples
GPUMemoryLayout
fromvTensor
. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata.The layout of a tensor can be estimated afterwards using
estimate_memory_layout()
, but this is only a "best effort" at producing a comparable memory layout.GPUMemoryLayout
was helpful as a compact representation of thepacked_dim_whcn_idx
of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, wheneverGPUMemoryLayout
is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references tomemory_layout()
to referencepacked_dim_whcn_idx()
instead.Differential Revision: D62995121