[ET-VK][BE] vTensor cleanup 6/N - Do not use gpu_memory_layout as a source of truth, use packed_dim_whcn_idx directly #5479


Closed
wants to merge 3 commits

Conversation

@SS-JIA (Contributor) commented Sep 18, 2024

Stack from ghstack (oldest at bottom):

Context

`GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer-backed tensors, this has been true ever since strides were introduced; for texture-backed tensors, it has been true since the introduction of `axis_map`.

For buffer-backed tensors, the `strides` of the tensor are required to fully represent how the tensor's data is laid out in GPU memory.
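
As a rough illustration (not the actual ExecuTorch implementation; `make_strides` and the dim-order convention are hypothetical), strides can be derived from the sizes plus an ordering of dims from outermost to innermost, and it is that ordering, not a single enum value, that fully pins down the buffer layout:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch: derive strides for a buffer-backed tensor from its
// sizes and a dim order (outermost to innermost). The dim order carries
// strictly more information than a GPUMemoryLayout enum value.
std::vector<int64_t> make_strides(
    const std::vector<int64_t>& sizes,
    const std::vector<int64_t>& dim_order) {
  std::vector<int64_t> strides(sizes.size());
  int64_t running = 1;
  // Walk from the innermost (fastest moving) dim outwards, accumulating
  // the number of elements a unit step along each dim skips over.
  for (auto it = dim_order.rbegin(); it != dim_order.rend(); ++it) {
    strides[*it] = running;
    running *= sizes[*it];
  }
  return strides;
}
```

For sizes `{2, 3, 4}`, dim order `{0, 1, 2}` yields strides `{12, 4, 1}`, while dim order `{0, 2, 1}` yields `{12, 1, 3}`; only the strides distinguish the two layouts.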

For texture-backed tensors, both the `axis_map` and the `packed_dim_whcn_idx` are required to fully represent the layout of the tensor as an image texture.
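
For intuition, here is a minimal sketch of how the two pieces interact when addressing a texture; the names and the WHC coordinate convention are illustrative, not the library's API. `axis_map` decides which tensor dim each texture axis walks over, and `packed_dim_whcn_idx` picks the dim that is folded four-to-a-texel:

```cpp
#include <array>
#include <cstdint>

// Hypothetical sketch: map a tensor coordinate (given in WHC order) to a
// texture position plus a component index inside the 4-wide texel.
struct TexelPos {
  int32_t x = 0, y = 0, z = 0; // texture position
  int32_t component = 0;       // index within the texel (0..3)
};

TexelPos to_texel_pos(
    const std::array<int32_t, 3>& whc_coord, // tensor coord, WHC order
    const std::array<int32_t, 3>& axis_map,  // texture axis -> tensor dim
    int32_t packed_dim_whcn_idx) {
  int32_t pos[3];
  TexelPos out;
  for (int32_t axis = 0; axis < 3; ++axis) {
    const int32_t dim = axis_map[axis]; // tensor dim this texture axis indexes
    int32_t v = whc_coord[dim];
    if (dim == packed_dim_whcn_idx) {
      out.component = v % 4; // position inside the 4-wide texel
      v = v / 4;             // texels along the packed dim are folded by 4
    }
    pos[axis] = v;
  }
  out.x = pos[0];
  out.y = pos[1];
  out.z = pos[2];
  return out;
}
```

The point is that neither piece alone determines the texture layout: permuting `axis_map` or changing the packed dim each produces a different image.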

Furthermore, functions like `virtual_transpose()` can produce tensor layouts that cannot be cleanly captured by an enum.
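
A hypothetical sketch of why: a virtual transpose only swaps layout metadata, and after a swap the combination of sizes, strides, and packed dim may match no predefined enum value (for simplicity, dims here are assumed to be indexed in the same WHCN order as the packed dim index):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a virtual transpose: exchange two dims in the
// layout metadata without touching the underlying data. The resulting
// (sizes, strides, packed dim) triple need not correspond to any
// GPUMemoryLayout enum value.
void virtual_transpose_sketch(
    std::vector<int64_t>& sizes,
    std::vector<int64_t>& strides,
    int32_t& packed_dim_whcn_idx,
    int32_t dim0,
    int32_t dim1) {
  std::swap(sizes[dim0], sizes[dim1]);
  std::swap(strides[dim0], strides[dim1]);
  // The packed dim travels with the transpose.
  if (packed_dim_whcn_idx == dim0) {
    packed_dim_whcn_idx = dim1;
  } else if (packed_dim_whcn_idx == dim1) {
    packed_dim_whcn_idx = dim0;
  }
}
```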

This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than being stored as a tensor property, it is used only during construction to determine the initial tensor layout metadata.

The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout.
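
A minimal sketch of what such a best-effort estimate might look like; the enum values and the mapping below are illustrative, not the actual ExecuTorch definitions:

```cpp
#include <cstdint>

// Illustrative enum mirroring the three packed-dim choices in WHCN order.
enum class GPUMemoryLayout : int32_t {
  TENSOR_WIDTH_PACKED = 0,    // packed_dim_whcn_idx == 0
  TENSOR_HEIGHT_PACKED = 1,   // packed_dim_whcn_idx == 1
  TENSOR_CHANNELS_PACKED = 2, // packed_dim_whcn_idx == 2
};

// Best effort: recover a comparable enum value from the packed dim alone.
// Any extra layout state (a permuted axis_map, non-standard strides) is
// simply not representable by the enum and is dropped.
GPUMemoryLayout estimate_memory_layout_sketch(int32_t packed_dim_whcn_idx) {
  return static_cast<GPUMemoryLayout>(packed_dim_whcn_idx);
}
```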

`GPUMemoryLayout` was helpful as a compact representation of the tensor's `packed_dim_whcn_idx`, which identifies the "fastest moving" dimension for buffer-backed tensors, or the dim packed along a texel for texture-backed tensors. Whenever `GPUMemoryLayout` is referenced, then, what is really of interest is the packed dim index, so this diff also replaces calls to `memory_layout()` with calls to `packed_dim_whcn_idx()`.
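
To make the replacement concrete, a hypothetical before/after call site; the tensor type and constant below are stand-ins, while the two accessor names come from this diff:

```cpp
#include <cstdint>

// Illustrative WHCN index of the channels dim (W = 0, H = 1, C = 2, N = 3).
constexpr int32_t kChannelsDimIdx = 2;

// Stand-in tensor handle exposing only what this example needs.
struct VTensorLike {
  int32_t packed_dim_whcn_idx() const { return packed_dim_; }
  int32_t packed_dim_ = kChannelsDimIdx;
};

// Before this diff, a check would compare memory_layout() against an enum
// value; after it, the call site asks for the packed dim index directly.
bool is_channels_packed(const VTensorLike& t) {
  return t.packed_dim_whcn_idx() == kChannelsDimIdx;
}
```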

Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/)


@pytorch-bot (bot) commented Sep 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5479

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit 6f363c6 with merge base 8ef6c79:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Sep 18, 2024
SS-JIA added a commit that referenced this pull request Sep 18, 2024
… source of truth, use `packed_dim_whcn_idx` directly

ghstack-source-id: 243424497
Pull Request resolved: #5479
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D62995121


@facebook-github-bot

This pull request has been merged in 7c6d58a.

@SS-JIA SS-JIA deleted the gh/SS-JIA/84/head branch January 24, 2025 19:43
Labels: CLA Signed, fb-exported, Merged