[SYCL] Fix two reduction bugs #7347

Merged: 1 commit, Nov 10, 2022

@aelovikov-intel (Contributor) commented Nov 10, 2022

The one in NDRangeReduction<reduction::strategy::basic> was introduced during the most recent reduction refactoring and most likely could only be triggered through internal API calls. The code path for public APIs is such that this strategy isn't auto-selected under the conditions leading to the bug.

I think the other one (strategy::group_reduce_and_last_wg_detection) was introduced earlier (a couple of months ago, maybe) and was probably user-visible.

Both bugs were caused by extra, erroneous conditions that made the code take the wrong branch.
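
To make this class of bug concrete, here is a purely hypothetical sketch in plain C++; it is not the code changed in this PR. The helper names (`partial_sums`, `reduce_buggy`, `reduce_fixed`) and the specific extra condition are invented for illustration, assuming a two-stage reduction where per-group partial results are combined in a final step.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: one partial sum per "work-group" of wg_size elements.
// Assumes wg_size > 0. None of these names exist in the SYCL runtime.
inline std::vector<int> partial_sums(const std::vector<int> &data,
                                     std::size_t wg_size) {
  std::vector<int> partials;
  for (std::size_t i = 0; i < data.size(); i += wg_size) {
    const std::size_t end = std::min(i + wg_size, data.size());
    const auto first = data.begin() + static_cast<std::ptrdiff_t>(i);
    const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
    partials.push_back(std::accumulate(first, last, 0));
  }
  return partials;
}

// Buggy variant: the extra (invented) condition `data.size() % wg_size == 0`
// sends inputs with a ragged last group down the "single group" branch,
// so only the first partial sum is returned.
inline int reduce_buggy(const std::vector<int> &data, std::size_t wg_size) {
  const std::vector<int> p = partial_sums(data, wg_size);
  if (p.size() > 1 && data.size() % wg_size == 0)
    return std::accumulate(p.begin(), p.end(), 0);
  return p.empty() ? 0 : p.front();
}

// Fixed variant: the branch depends only on whether there is more than one
// partial result to combine.
inline int reduce_fixed(const std::vector<int> &data, std::size_t wg_size) {
  const std::vector<int> p = partial_sums(data, wg_size);
  if (p.size() > 1)
    return std::accumulate(p.begin(), p.end(), 0);
  return p.empty() ? 0 : p.front();
}
```

In this sketch both variants agree whenever the input size is a multiple of `wg_size`, which is the kind of property that can keep such a bug hidden until an unusual configuration is exercised.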

@aelovikov-intel
Contributor Author

This will be tested by intel/llvm-test-suite#1373, but I need this, #7346 and one more future PR for reductions (ready locally, but dependent on #7346) to enable that test.

@steffenlarsen (Contributor) left a comment


Could you please describe the bugs in the description?

@aelovikov-intel
Contributor Author

Failures

SYCL :: Regression/device_num.cpp
SYCL :: regression/tanh_fix_test.cpp
SYCL :: Basic/barrier_order.cpp

are all known and are already being addressed in other PRs. @intel/llvm-gatekeepers, I believe this PR is ready.

@pvchupin
Contributor

SYCL :: Regression/device_num.cpp - Reported to owner in intel/llvm-test-suite#1354 (comment) with fix up in #7336.
SYCL :: regression/tanh_fix_test.cpp - Reported to owner in intel/llvm-test-suite#1361 (comment) with fix in intel/llvm-test-suite#1372.
SYCL :: Basic/barrier_order.cpp - disabled at intel/llvm-test-suite#1375

@pvchupin pvchupin merged commit 848be18 into intel:sycl Nov 10, 2022
@aelovikov-intel aelovikov-intel deleted the reduction-3 branch April 7, 2023 15:56