[ET-VK] Allow overwriting local workgroup size #4046


Closed
jorgep31415 wants to merge 2 commits

Conversation

jorgep31415 (Contributor) commented Jun 24, 2024

Stack from ghstack (oldest at bottom):

Introduce a `GraphConfig` toggle following the convention of `storage_type` and `memory_layout`.

Differential Revision: [D58957058](https://our.internmc.facebook.com/intern/diff/D58957058/)
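For context, the toggle can be pictured as an optional field on `GraphConfig`, mirroring how the `storage_type` and `memory_layout` overrides behave: unset means "use the default policy". The type and field names below are an illustrative sketch, not the actual ExecuTorch API:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Minimal stand-in for api::utils::uvec3 (illustrative only).
struct uvec3 {
  uint32_t x, y, z;
};

// Hypothetical sketch of the config convention: each backend knob is an
// optional override, and the graph falls back to a default when it is unset.
struct GraphConfig {
  std::optional<uvec3> local_wg_size_override;  // the new toggle
};

// Placeholder default used when no override is present.
uvec3 default_local_wg_size() {
  return {4, 4, 4};
}

uvec3 resolve_local_wg_size(const GraphConfig& config) {
  return config.local_wg_size_override.value_or(default_local_wg_size());
}
```

A caller that wants a flat workgroup could set `config.local_wg_size_override = uvec3{64, 1, 1};` before building the graph; leaving the field unset preserves the existing behavior.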

pytorch-bot bot commented Jun 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4046

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 1 New Failure

As of commit b7c4590 with merge base 398ce66:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Jun 24, 2024 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed)
jorgep31415 added a commit that referenced this pull request Jun 24, 2024
ghstack-source-id: 231379021
Pull Request resolved: #4046
facebook-github-bot commented

This pull request was exported from Phabricator. Differential Revision: D58957058

jorgep31415 added a commit that referenced this pull request Jun 24, 2024
Pull Request resolved: #4046
ghstack-source-id: 231385519
facebook-github-bot commented

This pull request has been merged in ae175c5.

jorgep31415 added a commit that referenced this pull request Jun 25, 2024
This will allow us to override local workgroup sizes with #4046.

## Before
```cpp
vTensorPtr t_out = graph.get_tensor(out);
api::utils::uvec3 global_size = t_out->image_extents();
api::utils::uvec3 local_size = adaptive_work_group_size(global_size);

graph.execute_nodes().emplace_back(new ExecuteNode(
      ...,
      global_size,
      local_size,
      ...));
```
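For reference, `adaptive_work_group_size` derives the local size from the global extents. Its actual logic lives in the Vulkan backend; the heuristic below is only an illustrative stand-in (flat 64-wide groups for effectively 1D dispatches, cubic ones otherwise), not the real implementation:

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for api::utils::uvec3 (illustrative only).
struct uvec3 {
  uint32_t x, y, z;
};

// Illustrative heuristic, not the backend's adaptive_work_group_size:
// keep the workgroup at 64 invocations either way, but shape it to the
// dimensionality of the dispatch.
uvec3 adaptive_work_group_size_sketch(const uvec3& global_size) {
  if (global_size.y == 1 && global_size.z == 1) {
    return {64, 1, 1};  // 1D dispatch: put all invocations along x
  }
  return {4, 4, 4};  // 3D dispatch: spread 64 invocations over all axes
}
```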

## After
```cpp
graph.execute_nodes().emplace_back(new ExecuteNode(
      ...,
      graph.create_global_wg_size(out),
      graph.create_local_wg_size(out),
      ...));
```

Note that we do not migrate cases where the global size is nontrivial (MatMul, Linear, Conv1D, Repeat) or where the image isn't a ValueRef (MaxPool2D, NativeLayerNorm). We should first align on an API design for those cases.
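Under this API, the two helpers can be imagined as methods on the graph that consult the config, with the local size honoring the override when one is set. The class below is a hypothetical sketch with placeholder values; only the helper names come from the PR:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Minimal stand-in for api::utils::uvec3 (illustrative only).
struct uvec3 {
  uint32_t x, y, z;
};

// Hypothetical sketch of how the graph-level helpers could consult the
// GraphConfig toggle; the bodies are illustrative, not the real backend code.
class ComputeGraphSketch {
 public:
  explicit ComputeGraphSketch(std::optional<uvec3> local_wg_size_override)
      : local_wg_size_override_(local_wg_size_override) {}

  // Derived from the output image extents in the real backend; fixed here.
  uvec3 create_global_wg_size() const { return {32, 32, 4}; }

  // Honors the override when present, else falls back to an adaptive choice.
  uvec3 create_local_wg_size() const {
    if (local_wg_size_override_.has_value()) {
      return *local_wg_size_override_;
    }
    return {4, 4, 4};  // placeholder for the adaptive default
  }

 private:
  std::optional<uvec3> local_wg_size_override_;
};
```

Keeping the override at the graph level means each op's registration code stays a one-liner, which is what makes the "After" form above possible.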

Differential Revision: [D59011492](https://our.internmc.facebook.com/intern/diff/D59011492/)

jorgep31415 added a commit that referenced this pull request Jun 25, 2024
ghstack-source-id: 231549021
Pull Request resolved: #4061
facebook-github-bot pushed a commit that referenced this pull request Jun 25, 2024
Summary:
Pull Request resolved: #4061

This will allow us to override local workgroup sizes with #4046.

bypass-github-export-checks
bypass-github-pytorch-ci-checks
bypass-github-executorch-ci-checks

Reviewed By: nathanaelsee

Differential Revision: D59011492

fbshipit-source-id: 40e75afdaecefa733bf1748a627877b48b5d949a