Default to cores/2 threads in JNI layer #6042
Dr. CI: No failures as of commit c8c1650 with merge base 866b40c. Artifacts and rendered test results are at hud.pytorch.org/pr/pytorch/executorch/6042.
This pull request was exported from Phabricator. Differential Revision: D64107326
Summary: Default to using cores/2 threadpool threads. The long-term plan is to improve performant core detection in CPUInfo, but for now we can use cores/2 as a sane default. Based on testing, this is almost universally faster than using all cores, as efficiency cores can be quite slow. In extreme cases, using all cores can be 10x slower than using cores/2. This also matches Lite Interpreter's default behavior when it doesn't have a more precise heuristic for the target hardware. Differential Revision: D64107326
Force-pushed from 20ebaf3 to 4b5639d.
Force-pushed from 4b5639d to c43f3fc.
Force-pushed from c43f3fc to 2204a7e.
Please fix lintrunner.
Force-pushed from 2204a7e to b1cc8f7.
Force-pushed from b1cc8f7 to 064c095.
Reviewed By: kirklandsign
Force-pushed from 064c095 to bf88166.
Force-pushed from bf88166 to c8c1650.
This pull request has been merged in 27330f2.
Summary:
Default to using cores/2 threadpool threads. The long-term plan is to improve performant core detection in CPUInfo, but for now we can use cores/2 as a sane default.
Based on testing, this is almost universally faster than using all cores, as efficiency cores can be quite slow. In extreme cases, using all cores can be 10x slower than using cores/2.
This also matches Lite Interpreter's default behavior when it doesn't have a more precise heuristic for the target hardware.
Differential Revision: D64107326
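
For readers who want to see the cores/2 heuristic in code, here is a minimal sketch. It is not the code from this PR: the function names `default_thread_count` and `detect_default_thread_count` are hypothetical, and it uses the standard `std::thread::hardware_concurrency()` instead of CPUInfo. The clamp to at least one thread is an assumption added so that single-core or unknown core counts still yield a usable value.

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>

// Hypothetical helper (not an ExecuTorch API): use half of the cores,
// but never fewer than one thread.
constexpr uint32_t default_thread_count(uint32_t total_cores) {
  return std::max<uint32_t>(1u, total_cores / 2);
}

// Applies the heuristic to the core count reported by the standard library.
// hardware_concurrency() may return 0 when the count cannot be determined;
// the clamp above turns that into a single thread.
uint32_t detect_default_thread_count() {
  return default_thread_count(std::thread::hardware_concurrency());
}
```

On a device with 8 cores this selects 4 threads. A caller would pass the result to whatever threadpool setup its runtime exposes, and could still override it where a more precise per-device heuristic exists.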