Skip to content

[SYCL] Fix multi_ptr relational operators customised for nullptr. #13201

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 3, 2024

Conversation

mmoadeli
Copy link
Contributor

multi_ptr relational operators taking a std::nullptr_t are written in a way that assume it is the lowest possible value
(example https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/multi_ptr.hpp#L1575). However the C++ specs states there is no ordering requirement (https://eel.is/c++draft/expr.rel#4.3).

In practice, this is causing issues in the CUDA and AMDGPU backend. For instance, in CUDA the nullptr in the local address space is a non 0 value and 0 in this address space is the root of the allocated local memory.

@mmoadeli mmoadeli requested a review from a team as a code owner March 29, 2024 00:05
@mmoadeli mmoadeli linked an issue Mar 29, 2024 that may be closed by this pull request
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out

#include <sycl/sycl.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard would it be to switch it to <sycl/detail/core.hpp> + something? I'd hope

Suggested change
#include <sycl/sycl.hpp>
#include <sycl/detail/core.hpp>
#include <sycl/multi_ptr.hpp>

should be enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @aelovikov-intel, done.

bool output[OUTPUT_SIZE];
try {
sycl::queue queue;
sycl::buffer<bool, 1> buf(output, sycl::range<1>(OUTPUT_SIZE));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, my preference would be

  sycl::buffer<bool, 1> buf(N);
  /* do work */
  for (bool res : sycl::host_accessor{buf})
    assert(res);

until we will have ranges and all_of on a range.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @aelovikov-intel, done.

cgh.parallel_for(sycl::nd_range<1>{1, 1}, [=](sycl::id<1>) {
locAcc[0] = 10;
multi_ptr_t mp(locAcc);
dev_acc[0] = std::less<multi_ptr_t>()(nullptr, mp) == nullptr < mp;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check the result using plain decorated pointers without multi_ptr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure I understand what you mean by "decorated pointers without multiptr." Could you please provide more context?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decltype(multi_ptr_obj.get_decorated()) I would be something like int __attribute__((opencl)) *.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aelovikov-intel Testing plain decorated pointers is beyond the scope of this PR, as it exclusively addresses relational operators within multi_ptrs.

Comment on lines 46 to 51
using multi_ptr_yes =
sycl::multi_ptr<bool, sycl::access::address_space::local_space,
sycl::access::decorated::yes>;
using multi_ptr_no =
sycl::multi_ptr<bool, sycl::access::address_space::local_space,
sycl::access::decorated::no>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also test the other address spaces

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Naghasan , done.

@@ -1546,70 +1546,80 @@ template <typename ElementType, access::address_space Space,
access::decorated DecorateAddress>
bool operator==(const multi_ptr<ElementType, Space, DecorateAddress> &lhs,
std::nullptr_t) {
return lhs.get() == nullptr;
return lhs.get() ==
multi_ptr<ElementType, Space, DecorateAddress>(nullptr).get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand that change here. With sycl tip, the following

__attribute__((sycl_device)) bool f(__attribute__((opencl_local))int* i) {
    return i == nullptr;
}

generates this for nvptx

%cmp = icmp eq ptr addrspace(3) %i, addrspacecast (ptr null to ptr addrspace(3))

which is what we expect. If you have an issue lhs.get() == nullptr, this probably hides a compiler issue we need logged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Naghasan
You are right regarding == operator. It fails to build for other relational operators such as lhs.get() > nullptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Naghasan , that particular change is reverted.

Improve test to cover other address spaces.
@ldrumm ldrumm merged commit 4f91bbb into intel:sycl Apr 3, 2024
@aarongreig
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

multi_ptr relational operators assumes nullptr is 0
6 participants