[OpenMP] Add skewed iteration distribution on hybrid systems #69946


Merged: 3 commits into llvm:main on Nov 8, 2023

Conversation

jpeyton52 (Contributor)

This commit adds skewed distribution of iterations in the nonmonotonic:dynamic schedule (static steal) for hybrid systems when thread affinity is assigned. Currently, it distributes the iterations at a 60:40 ratio. Consider a loop with the dynamic schedule type, `for (int i = 0; i < 100; ++i)`. On a hybrid system with 20 hardware threads (16 CORE and 4 ATOM cores), 88 iterations are assigned to the performance cores and 12 iterations to the efficiency cores: each thread on a CORE core processes 5 iterations plus extras, and each thread on an ATOM core processes 3 iterations.

Original Phabricator Patch: https://reviews.llvm.org/D152955

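The arithmetic in the description can be sketched as follows. This is a hedged illustration, not the actual libomp implementation: `skewed_split`, the 0.6/0.4 per-thread weights, and the leftovers-go-to-performance-cores rule are assumptions chosen to reproduce the 88/12 numbers stated above.

```cpp
// Hypothetical sketch (NOT the libomp code): split `iters` loop iterations
// across performance (CORE) and efficiency (ATOM) threads using an assumed
// 60:40 per-thread weighting, flooring each share and handing leftover
// iterations to the performance cores.
struct Split {
  int per_pcore; // base iterations per CORE thread
  int per_ecore; // base iterations per ATOM thread
  int extras;    // leftover iterations, assumed to go to CORE threads
  int p_total;   // total iterations on performance cores
  int e_total;   // total iterations on efficiency cores
};

Split skewed_split(int iters, int pcores, int ecores) {
  const double wp = 0.6, we = 0.4; // assumed per-thread weights (60:40)
  const double total_w = pcores * wp + ecores * we;
  int per_p = static_cast<int>(iters * wp / total_w); // floor of P share
  int per_e = static_cast<int>(iters * we / total_w); // floor of E share
  int extras = iters - (per_p * pcores + per_e * ecores);
  return {per_p, per_e, extras, per_p * pcores + extras, per_e * ecores};
}
```

With `skewed_split(100, 16, 4)` this yields 5 base iterations per CORE thread plus 8 extras, and 3 per ATOM thread, i.e. 88 versus 12 iterations, matching the example in the description.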
@llvmbot llvmbot added the openmp:libomp OpenMP host runtime label Oct 23, 2023

@jdoerfert jdoerfert left a comment


I didn't try to review the "algorithm" but just browsed the code. There are minor issues but the idea seems solid.

```c
    unsigned unused : 28;
#endif
#elif KMP_AFFINITY_SUPPORTED && KMP_STATIC_STEAL_ENABLED
    unsigned use_hybrid : 1;
```
Member

Is it really worth adding all these cases for 1 bit?

Contributor Author

Removed the macros here and added comments for the relevant macros.

```diff
@@ -90,6 +90,69 @@ static inline int __kmp_get_monotonicity(ident_t *loc, enum sched_type schedule,
   return monotonicity;
 }

 #if KMP_WEIGHTED_ITERATIONS_SUPPORTED
 static inline float __kmp_get_float_val(float num) {
```
Member

Add a comment, get_float_val float->float is really not explaining much

Contributor Author

Added comment, and changed function name for clarity.

@github-actions

github-actions bot commented Oct 25, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@jpeyton52
Contributor Author

@jdoerfert , friendly ping on this

@jpeyton52
Contributor Author

@jdoerfert , friendly ping on this.

@jpeyton52 jpeyton52 requested a review from jdoerfert November 7, 2023 18:50
Contributor

@hansangbae hansangbae left a comment


LGTM

@jpeyton52 jpeyton52 merged commit 5cc603c into llvm:main Nov 8, 2023
Labels
openmp:libomp OpenMP host runtime
4 participants