-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[OpenMP] Add skewed iteration distribution on hybrid systems #69946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit adds skewed distribution of iterations in nonmonotonic:dynamic schedule (static steal) for hybrid systems when thread affinity is assigned. Currently, it distributes the iterations at 60:40 ratio. Consider this loop with dynamic schedule type, for (int i = 0; i < 100; ++i) In a hybrid system with 20 hardware threads (16 CORE and 4 ATOM core), 88 iterations will be assigned to performance cores and 12 iterations will be assigned to efficient cores. Each thread with CORE core will process 5 iterations + extras and with ATOM core will process 3 iterations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't try to review the "algorithm" but just browsed the code. There are minor issues but the idea seems solid.
openmp/runtime/src/kmp.h
Outdated
unsigned unused : 28; | ||
#endif | ||
#elif KMP_AFFINITY_SUPPORTED && KMP_STATIC_STEAL_ENABLED | ||
unsigned use_hybrid : 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really worth adding all these cases for 1 bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the macros here and put comments for relevant macros
openmp/runtime/src/kmp_dispatch.cpp
Outdated
@@ -90,6 +90,69 @@ static inline int __kmp_get_monotonicity(ident_t *loc, enum sched_type schedule, | |||
return monotonicity; | |||
} | |||
|
|||
#if KMP_WEIGHTED_ITERATIONS_SUPPORTED | |||
static inline float __kmp_get_float_val(float num) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment, get_float_val float->float is really not explaining much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment, and changed function name for clarity.
✅ With the latest revision this PR passed the C/C++ code formatter. |
@jdoerfert , friendly ping on this |
@jdoerfert , friendly ping on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This commit adds skewed distribution of iterations in
nonmonotonic:dynamic
schedule (static steal) for hybrid systems when thread affinity is assigned. Currently, it distributes the iterations at 60:40 ratio. Consider this loop with dynamic schedule type,for (int i = 0; i < 100; ++i)
. In a hybrid system with 20 hardware threads (16 CORE and 4 ATOM core), 88 iterations will be assigned to performance cores and 12 iterations will be assigned to efficient cores. Each thread with CORE core will process 5 iterations + extras and with ATOM core will process 3 iterations.Original Phabricator Patch: https://reviews.llvm.org/D152955