-
Notifications
You must be signed in to change notification settings - Fork 787
[ESIMD] Cast argument for atomic updates to integral type #7495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1492,7 +1492,7 @@ lsc_atomic_update(T *p, __ESIMD_NS::simd<uint32_t, N> offsets, | |||
__ESIMD_NS::simd<_MsgT, N> Tmp = | |||
__esimd_lsc_xatomic_stateless_1<_MsgT, _Op, L1H, L3H, _AddressScale, | |||
_ImmOffset, _DS, _VS, _Transposed, N>( | |||
pred.data(), addrs.data(), src0.data()); | |||
pred.data(), addrs.data(), convert<_MsgT>(src0).data()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- what is the type combination in the failing case?
- the corresponding test update does not seem to add any new test case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
providing float as an arguments breaks the compilation as _MsgT is an integral type (int32_t, uint32_t, int64_t, uint64_t) and it requires an explicit type conversion from non-integral type.
Corresponding test fails compilation without this change as it uses float type as an argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test that is actually failing is SYCL/ESIMD/lsc/atomic_cmpxchg.cpp (after the fix)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, but I don't understand this change-set. If atomic for float value causes a compilation, then why convert is the proper solution?
Let's say the argument if float32 values 2.3, then convert to uint32_t would change 2.3 to 2.0 right and then to uint32_t value 2, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
providing float as an arguments breaks the compilation
there is no legal user-level atomic API invocation which would translate into __esimd_lsc_xatomic_stateless_1 with floating-point argument. So I believe compilation failure is the right behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The additional fix replaced 'convert' with 'bit_cast'. Please answer the question asked above. How is that possible to have FP types in __esimd_lsc_xatomic_stateless_1()?
Why the bit_cast is needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per sycl/ext/intel/esimd/common.hpp get_num_args() fadd, fsub, fmin, fmax have 1 argument
Per sycl/ext/intel/esimd/memory.hpp check_atomic fadd, fsub, fmin, fmax have to accept either float or half types
So in case of these operator calls __esimd_lsc_xatomic_stateless_1() will be called with FP types as arguments.
The bitcast is required since __esimd_lsc_xatomic_stateless_1() accepts only integral type vector as an argument so FP data should be passed as integral data. The cast<> call may lead to loss of data bitcast will preserve the data while passing argument to intrinsic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this explanation helped. Thank you.
So, fadd is 2 arguments, but here it is considered as 1 argument because the other/memory operand does not count and this function being updated here accepts fp operations like fadd correctly.
Please add a test for the problem being fixed here. |
Actually..., sorry, it is my mistake. The test did not pass without this PR because that is a PVC test and we don't have PVC in CI. |
Before merging this fix, please check/confirm that the corresponding LIT test runs successfully on PVC (i.e. not only the compilation phase passes). BTW, can you please update that LIT test once again? |
Added PR for updated test: intel/llvm-test-suite#1458 |
Complementary test PR intel/llvm-test-suite#1407