Skip to content

[SYCL][ESIMD] Introduce predicates for lsc_block_store/load #6688

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Sep 9, 2022

Conversation

fineg74
Copy link
Contributor

@fineg74 fineg74 commented Sep 1, 2022

No description provided.

@fineg74
Copy link
Contributor Author

fineg74 commented Sep 1, 2022

Complementary Test PR: intel/llvm-test-suite#1194

#ifdef __ESIMD_FORCE_STATELESS_MEM
return lsc_block_load<T, NElts, DS, L1H, L3H>(
__ESIMD_DNS::accessorToPointer<T>(acc, offset));
__ESIMD_DNS::accessorToPointer<T>(acc, offset, pred));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pred should apparently be an argument of lsc_block_load, not accessorToPointer.
This should have been caught by the intel/llvm-test-suite#1194 test, right? Do you have idea why it wasn't? Probably, the test should also compile/run with forced stateless memory access

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several reasons:

  1. We do not run PVC tests in CI pipeline
  2. The test doesn't use __ESIMD_FORCE_STATELESS_MEM so the issue went unnoticed

@@ -442,6 +442,7 @@ lsc_gather(AccessorTy acc, __ESIMD_NS::simd<uint32_t, N> offsets,
/// @tparam AccessorTy is the \ref sycl::accessor type.
/// @param acc is the SYCL accessor.
/// @param offset is the zero-based offset in bytes.
/// @param pred is predicate.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add description of the effect of the predicate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

#ifdef __ESIMD_FORCE_STATELESS_MEM
lsc_block_store<T, NElts, DS, L1H>(
__ESIMD_DNS::accessorToPointer<T>(acc, offset), vals);
__ESIMD_DNS::accessorToPointer<T>(acc, offset), vals, pred);
#else
detail::check_lsc_vector_size<NElts>();
Copy link
Contributor

@kbobrovs kbobrovs Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure new functionality also works on ESIMD emulator, as lsc accesses work on emulator. @dongkyunahn-intel can help with logistics.

Upd: I'm not sure if CI runs tests on emulator - probably only after merge as post-commit. So please verify your new tests work on emulator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I locally verified if two samples from intel/llvm-test-suite#1194 are working fine with this PR. While lsc_predicate.cpp passed, lsc_predicate_stateless.cpp failed during compilation.

Building dpcpp toolchain supporting ESIMD_EMULATOR and running ESIMD kernels on emulator requires some environment setup. Let me know if you need help in locally verifying your implementation with ESIMD_EMULATOR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After __ESIMD_FORCE_STATELESS_MEM fix, both tests are passing.

$ SYCL/ESIMD/lsc/Output/lsc_predicate.cpp.tmp.out
Device name: ESIMD_EMULATOR
USM lsc predicate test passed
USM lsc predicate test passed
USM lsc predicate test passed
Accessor lsc predicate test passed
Accessor lsc predicate test passed
Accessor lsc predicate test passed
$ SYCL/ESIMD/lsc/Output/lsc_predicate_stateless.cpp.tmp.out
Device name: ESIMD_EMULATOR
Accessor lsc predicate test passed
Accessor lsc predicate test passed
Accessor lsc predicate test passed

Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address comments.

@fineg74
Copy link
Contributor Author

fineg74 commented Sep 7, 2022

lint failure occurs in files that are not part of this PR.

/// @tparam L1H is L1 cache hint.
/// @tparam L3H is L3 cache hint.
/// @param p is the base pointer.
/// @param pred is predicate to enable the operation
Copy link
Contributor

@kbobrovs kbobrovs Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// @param pred is predicate to enable the operation
/// @param pred operation predicate. Zero means operation is skipped entirely, non-zero - operation is performed. The default is '1' - perform the operation.

@@ -442,6 +527,8 @@ lsc_gather(AccessorTy acc, __ESIMD_NS::simd<uint32_t, N> offsets,
/// @tparam AccessorTy is the \ref sycl::accessor type.
/// @param acc is the SYCL accessor.
/// @param offset is the zero-based offset in bytes.
/// @param pred is predicate to enable the operation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here please also expand description a bit telling which values mean what

kbobrovs
kbobrovs previously approved these changes Sep 8, 2022
# Conflicts:
#	sycl/test/basic_tests/stdcpp_compat.cpp
@kbobrovs
Copy link
Contributor

kbobrovs commented Sep 9, 2022

@fineg74, please file internal ticket (if it does not exist yet) for CUDA testing failures.

@kbobrovs kbobrovs merged commit f44edce into intel:sycl Sep 9, 2022
@fineg74 fineg74 deleted the lscPredicate branch September 9, 2022 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants