-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL][Reduction] Use "if constexpr" over SFINAE #6343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Just a prototype at this point to get feedback on the direction. |
Several reasons: * The code is easier to read/understand. * Ability to mix compiler/run-time conditions in a single instance of logic description, including those in future (e.g. discrete vs integrated GPU). * Eliminate duplication of similar parts of different algorithms so that distinctions are better highlighted. * Ease of experiments when trying to switch algorithm used by decoupling hard requirements from the peformance tuning logic (i.e., calling slower code must not result in a compile-time error). * Allows to stop encoding implementation selection logic into overloads, use distinct names instead (e.g., see forward declarations in handler.hpp). Drawbacks: * The feature is only enabled in C++17 mode now. The customers who need C++14 mode aren't expected to require it. Mitigation plan would be to change remaining SFINAE logic inside reduction.hpp into runtime asserts and have "constexpr" changed to a compile-time macro __SYCL_REDUCTION_CONSTEXPR to resort to run-time only code selection in C++14 mode.
24a32ac
to
ac3fd7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have 3 quick comments.
} | ||
|
||
template <class FunctorTy> | ||
static event withAuxHandler(std::shared_ptr<detail::queue_impl> Queue, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version is not used, if I am not missing anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inside reduSaveFinalResultToUserMemHelper
in reduction.hpp
: https://github.com/intel/llvm/pull/6343/files#diff-6fee47fe508539a049886253dcc0f49dea9210ab0c1151feed4cdca1632f184dR2489
sycl/include/CL/sycl/handler.hpp
Outdated
} // end while (NWorkItems > 1) | ||
|
||
auto CopyEvent = ext::oneapi::detail::reduSaveFinalResultToUserMem( | ||
QueueCopy, MIsHost, ReduTuple, ReduIndices); | ||
if (CopyEvent) | ||
MLastEvent = *CopyEvent; | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#endif | |
#endif // __cplusplus >= 201703L |
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a handful of very minor comments. Great work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Failing test looks unrelated and I think I've seen it in other PRs as well. @intel/llvm-gatekeepers , I believe this is ready. |
@aelovikov-intel, similar issues have been addressed... do you have recent references to similar issue? |
The flaky failure on the first pre-commit CI run has been recorded as #6416. |
Thanks @aelovikov-intel for a follow up. |
Three pieces: * Access to private static handler::withAuxHandler * Pow2WG -> HasUniformWG * Out accessor must be the last created (after other temps)
Some reduction code is used on non-reduction path and must be available in C++14 even though the rest of the feature is guarded to be available in C++17 and above only.
Several reasons:
description, including those in future (e.g. discrete vs integrated GPU).
distinctions are better highlighted.
requirements from the peformance tuning logic (i.e., calling slower code
must not result in a compile-time error).
distinct names instead (e.g., see forward declarations in handler.hpp).
Drawbacks:
The feature is only enabled in C++17 mode now. The customers who need C++14
mode aren't expected to require it.
Mitigation plan would be to change remaining SFINAE logic inside reduction.hpp
into runtime asserts and have "constexpr" changed to a compile-time macro
__SYCL_REDUCTION_CONSTEXPR to resort to run-time only code selection in
C++14 mode.