Commit bf88166

GregoryComer authored and facebook-github-bot committed
Default to cores/2 threads in JNI layer (#6042)
Summary: Default to using cores/2 threadpool threads. The long-term plan is to improve performant core detection in CPUInfo, but for now we can use cores/2 as a sane default. Based on testing, this is almost universally faster than using all cores, as efficiency cores can be quite slow. In extreme cases, using all cores can be 10x slower than using cores/2. This also matches Lite Interpreter's default behavior when it doesn't have a more precise heuristic for the target hardware.

Reviewed By: kirklandsign

Differential Revision: D64107326
1 parent 866b40c commit bf88166

2 files changed: +27 -0 lines changed

extension/android/jni/BUCK

Lines changed: 3 additions & 0 deletions
@@ -1,4 +1,5 @@
 load("@fbsource//tools/build_defs/android:fb_android_cxx_library.bzl", "fb_android_cxx_library")
+load("@fbsource//xplat/executorch/backends/xnnpack/third-party:third_party_libs.bzl", "third_party_dep")
 load("@fbsource//xplat/executorch/codegen:codegen.bzl", "executorch_generated_lib")
 
 oncall("executorch")
@@ -41,6 +42,8 @@ fb_android_cxx_library(
         "//xplat/executorch/extension/module:module_static",
         "//xplat/executorch/extension/runner_util:inputs_static",
         "//xplat/executorch/extension/tensor:tensor_static",
+        "//xplat/executorch/extension/threadpool:threadpool_static",
+        third_party_dep("cpuinfo"),
     ],
 )
 

extension/android/jni/jni_layer.cpp

Lines changed: 24 additions & 0 deletions
@@ -25,6 +25,11 @@
 #include <executorch/runtime/platform/platform.h>
 #include <executorch/runtime/platform/runtime.h>
 
+#ifdef ET_USE_THREADPOOL
+#include <cpuinfo.h>
+#include <executorch/extension/threadpool/threadpool.h>
+#endif
+
 #include <fbjni/ByteBuffer.h>
 #include <fbjni/fbjni.h>
 
@@ -260,6 +265,25 @@ class ExecuTorchJni : public facebook::jni::HybridClass<ExecuTorchJni> {
     }
 
     module_ = std::make_unique<Module>(modelPath->toStdString(), load_mode);
+
+#ifdef ET_USE_THREADPOOL
+    // Default to using cores/2 threadpool threads. The long-term plan is to
+    // improve performant core detection in CPUInfo, but for now we can use
+    // cores/2 as a sane default.
+    //
+    // Based on testing, this is almost universally faster than using all
+    // cores, as efficiency cores can be quite slow. In extreme cases, using
+    // all cores can be 10x slower than using cores/2.
+    //
+    // TODO Allow overriding this default from Java.
+    auto threadpool = executorch::extension::threadpool::get_threadpool();
+    if (threadpool) {
+      int thread_count = cpuinfo_get_processors_count() / 2;
+      if (thread_count > 0) {
+        threadpool->_unsafe_reset_threadpool(thread_count);
+      }
+    }
+#endif
   }
 
   facebook::jni::local_ref<facebook::jni::JArrayClass<JEValue>> forward(
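
Note on the heuristic: the change above boils down to halving the reported core count and only resizing the pool when the result is positive. The sketch below isolates that arithmetic so it can be read and tested without cpuinfo or the ExecuTorch threadpool. The helper name default_thread_count and its std::optional return type are assumptions made for illustration; they are not part of this commit or of the ExecuTorch API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not in the commit): mirrors the cores/2 default.
// `logical_cores` stands in for the value the commit reads from
// cpuinfo_get_processors_count().
// Returns the thread count to use, or std::nullopt when cores/2 rounds
// down to zero, in which case the caller would leave the pool unchanged.
std::optional<std::uint32_t> default_thread_count(std::uint32_t logical_cores) {
  const std::uint32_t half = logical_cores / 2;  // integer division rounds down
  if (half == 0) {
    return std::nullopt;
  }
  return half;
}
```

Under these assumptions, an 8-core device gets 4 threads, a 3-core device gets 1, and a single-core device gets no override, which corresponds to the `thread_count > 0` guard in the diff.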
