
Cleanup operator class #113


Closed · wants to merge 1 commit

Conversation

larryliu0820
Contributor

Summary:
This work is to improve memory efficiency on operator/kernel registration.

We deprecated `Operator` in favor of `Kernel` quite a while ago in the specialized kernels project. Since then, each `Operator` has contained a fixed number (8) of kernels, and the registry has contained a fixed number (250) of `Operator`s. This wastes memory in two ways:

  1. We register prim ops unconditionally as `Operator`s, but each of them has only 1 kernel, so 8× the needed space is reserved.
  2. We should be able to leverage selective build information about how many kernels are actually registered for a given build. That number can be passed in via a preprocessor flag.
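The second point can be illustrated with a minimal sketch of compile-time registry sizing. The macro name `MAX_KERNEL_NUM`, the default capacity, and the `Kernel` fields here are assumptions for illustration, not the actual ExecuTorch definitions; the idea is simply that selective build can override the capacity with a flag such as `-DMAX_KERNEL_NUM=32` instead of always reserving 250 × 8 slots:

```cpp
#include <cstddef>

// Capacity is fixed at compile time; a selective build can shrink it
// with a preprocessor flag (macro name assumed for illustration).
#ifndef MAX_KERNEL_NUM
#define MAX_KERNEL_NUM 256
#endif

struct Kernel {
  const char* name;  // fully qualified operator name this kernel implements
  // ... function pointer, kernel key, etc. would live here
};

// Flat, statically sized storage: one array of kernels, with no
// per-Operator block of 8 slots.
static Kernel registered_kernels[MAX_KERNEL_NUM];
static size_t num_registered_kernels = 0;
```

With this layout, a build that registers only a handful of kernels pays for exactly `MAX_KERNEL_NUM` slots rather than 2000.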

To address this, this diff flattens the two-layer `Operator`/`Kernel` structure and stores only kernels in the registry.

The downside is a possible slight regression in the static-init-time `register_kernels()` call, which now does a linear search to ensure no duplicate kernel is added to the registry. Across N registrations this is O(N^2).
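The duplicate check described above can be sketched as follows. This is not the actual ExecuTorch API: the names (`register_kernels`, `kMaxKernels`), the `Kernel` fields, and comparing kernels by name alone are all simplifying assumptions; the real code would also compare kernel keys. The point is the O(N) scan per registered kernel, giving O(N^2) total work at static-init time:

```cpp
#include <cstddef>
#include <cstring>

constexpr size_t kMaxKernels = 256;  // assumed compile-time capacity

struct Kernel {
  const char* name;  // operator name; real code would also hold a kernel key
};

static Kernel registry[kMaxKernels];
static size_t num_kernels = 0;

// Registers n kernels into the flat registry. Returns false on a
// duplicate or when the registry is full (earlier kernels in the batch
// may already have been inserted at that point).
bool register_kernels(const Kernel* kernels, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    // Linear search over everything registered so far: O(N) per kernel.
    for (size_t j = 0; j < num_kernels; ++j) {
      if (std::strcmp(registry[j].name, kernels[i].name) == 0) {
        return false;  // duplicate kernel
      }
    }
    if (num_kernels == kMaxKernels) {
      return false;  // out of space
    }
    registry[num_kernels++] = kernels[i];
  }
  return true;
}
```

Since registration happens once during static initialization and N is bounded by the build's kernel count, the quadratic cost is usually acceptable in exchange for the flat, memory-efficient layout.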

Reviewed By: JacobSzwejbka

Differential Revision: D48619001

@facebook-github-bot added the CLA Signed (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) and fb-exported labels on Aug 23, 2023
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D48619001


Pull Request resolved: pytorch/executorch#113

Reviewed By: JacobSzwejbka

Differential Revision: D48619001

fbshipit-source-id: 42cadc31aa266085519425d079db09dbe181141e

@facebook-github-bot
Contributor

This pull request has been merged in 792388f.

Gasoonjia pushed a commit that referenced this pull request Jul 30, 2024
Update readme

Update README.md (#113)

update README.md

Update README.md (#114)

Update README.md (#115)

Update Readme.md
Labels: CLA Signed, fb-exported, Merged

2 participants