-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Fix multi_ptr relational operators customised for nullptr. #13201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
// RUN: %{build} -o %t.out | ||
// RUN: %{run} %t.out | ||
|
||
#include <sycl/sycl.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard would it be to switch it to <sycl/detail/core.hpp>
+ something? I'd hope
#include <sycl/sycl.hpp> | |
#include <sycl/detail/core.hpp> | |
#include <sycl/multi_ptr.hpp> |
should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @aelovikov-intel, done.
bool output[OUTPUT_SIZE]; | ||
try { | ||
sycl::queue queue; | ||
sycl::buffer<bool, 1> buf(output, sycl::range<1>(OUTPUT_SIZE)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, my preference would be
sycl::buffer<bool, 1> buf(N);
/* do work */
for (bool res : sycl::host_accessor{buf})
assert(res);
until we will have ranges and all_of
on a range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @aelovikov-intel, done.
cgh.parallel_for(sycl::nd_range<1>{1, 1}, [=](sycl::id<1>) { | ||
locAcc[0] = 10; | ||
multi_ptr_t mp(locAcc); | ||
dev_acc[0] = std::less<multi_ptr_t>()(nullptr, mp) == nullptr < mp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also check the result using plain decorated pointers without multi_ptr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure I understand what you mean by "decorated pointers without multiptr." Could you please provide more context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decltype(multi_ptr_obj.get_decorated())
I would be something like int __attribute__((opencl)) *
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aelovikov-intel Testing plain decorated pointers is beyond the scope of this PR, as it exclusively addresses relational operators within multi_ptrs.
using multi_ptr_yes = | ||
sycl::multi_ptr<bool, sycl::access::address_space::local_space, | ||
sycl::access::decorated::yes>; | ||
using multi_ptr_no = | ||
sycl::multi_ptr<bool, sycl::access::address_space::local_space, | ||
sycl::access::decorated::no>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should also test the other address spaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @Naghasan , done.
sycl/include/sycl/multi_ptr.hpp
Outdated
@@ -1546,70 +1546,80 @@ template <typename ElementType, access::address_space Space, | |||
access::decorated DecorateAddress> | |||
bool operator==(const multi_ptr<ElementType, Space, DecorateAddress> &lhs, | |||
std::nullptr_t) { | |||
return lhs.get() == nullptr; | |||
return lhs.get() == | |||
multi_ptr<ElementType, Space, DecorateAddress>(nullptr).get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand that change here. With sycl tip, the following
__attribute__((sycl_device)) bool f(__attribute__((opencl_local))int* i) {
return i == nullptr;
}
generates this for nvptx
%cmp = icmp eq ptr addrspace(3) %i, addrspacecast (ptr null to ptr addrspace(3))
which is what we expect. If you have an issue lhs.get() == nullptr
, this probably hides a compiler issue we need logged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Naghasan
You are right regarding ==
operator. It fails to build for other relational operators such as lhs.get() > nullptr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @Naghasan , that particular change is reverted.
Improve test to cover other address spaces.
The fix is on review here: #13268 |
multi_ptr
relational operators taking astd::nullptr_t
are written in a way that assume it is the lowest possible value(example https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/multi_ptr.hpp#L1575). However the C++ specs states there is no ordering requirement (https://eel.is/c++draft/expr.rel#4.3).
In practice, this is causing issues in the CUDA and AMDGPU backend. For instance, in CUDA the nullptr in the local address space is a non
0
value and0
in this address space is the root of the allocated local memory.