-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL] SYCL 2020 callable device selectors #6486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] SYCL 2020 callable device selectors #6486
Conversation
Signed-off-by: Chris Perkins <[email protected]>
…redDevices in scoring and pi-trace disbled. Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
… return type (as opposed to devicePtr). Signed-off-by: Chris Perkins <[email protected]>
…evice) Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
/verify with intel/llvm-test-suite#1119 |
Signed-off-by: Chris Perkins <[email protected]>
/verify with intel/llvm-test-suite#1119 |
Signed-off-by: Chris Perkins <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to see at last some real SYCL 2020 device selectors. \o/
|
||
} // namespace detail | ||
|
||
device device_selector::select_device() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to implement the old SYCL 1.2.1 ugly Java-factory style API ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm no fan either, but this is still used if compiling with C++14. Just preserving the functionality that was here before.
explicit queue(const DeviceSelector &deviceSelector, | ||
const async_handler &AsyncHandler, | ||
const property_list &PropList = {}) | ||
: queue(detail::select_device(deviceSelector), AsyncHandler, PropList) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
: queue(detail::select_device(deviceSelector), AsyncHandler, PropList) {} | |
: queue(device { deviceSelector }, AsyncHandler, PropList) {} |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keryell I like this suggestion, and if you prefer it I'll change it.
However, consider that of the four constructors we are adding here, only two are amenable to the change. The other two queue constructors use continue to use detail::select_device
because they have the additional context argument. Personally, I prefer all four to be consistent with each other, each calling detail::select_device directly, rather than just passing it to the device constructor in two cases. This may simply be a matter of taste. Let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting you need this detail::select_device
at the first place.
Reading again the specification on queue, queue
creation throws if the device picked by the selector is not in the context. The device is not the best one picked from the devices in the context but it is picked independently. So it is unclear why this detail::select_device
takes all these parametersn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reading was that the device selector would be used to select the best device from the context (assuming an acceptable one could be found). It makes little sense to even provide that constructor otherwise.
FWIW, that is how the analog SYCL 1.2.1 queue constructor is implemented right now, and I used its behavior to model this one ( https://github.com/intel/llvm/blob/sycl/sycl/source/queue.cpp#L23-L30 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting feature, but this is not how the SYCL specification is defined:
Constructs a SYCL queue instance that is associated with the syclContext provided, using the device returned by the device selector provided. Must throw an exception with the errc::invalid error code if syclContext does not encapsulate the SYCL device returned by deviceSelector.
So either the specification has to be changed or DPC++ has to be updated. @bader any direction?
But anyway, this is outside of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @keryell that device selection is supposed to be "context-free". I don't know if logic currently implemented by DPC++ is used, but it can be implemented outside of the library.
To honest, I don't have enough data to recommend any direction here.
At first glace, DPC++ behavior sounds like a user friendly approach. The runtime emits less exceptions for the user to handle and tries to do it best to select the right device. On the other hand, it's just a syntax sugar, which might add an overhead and not sure if it's really used in the real word.
@intel/dpcpp-specification-reviewers, do you have any thoughts on that subject?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at all the places in the SYCL 2020 spec where a device selector is passed. In all other cases, it's clear that the implementation should pass every root device to the device selector, and the selector ranks each one.
This queue constructor is the only place where the set of valid device is limited by a context that the user passes in. It could make sense to redefine the spec to say that the implementation only passes devices from the context to the selector in this case. However, that is not how the spec is worded now, and we have the same wording in SYCL 1.2.1.
I actually wonder if this queue constructor is even useful. If the user is passing a custom context, maybe we should also force them to pass a specific device (and not a device selector). This avoids any ambiguity about which devices the implementation passes to the selector. We already have a form of the queue constructor like this (taking a context and a device). If we were doing this from scratch, I think I would not have added a form of the queue constructor that takes a context and a selector.
My feeling is that DPC++ should implement the existing spec, and maybe the committee should consider deprecating this form of the queue constructor.
// see [filter_selectors not "Callable"] in device_selectors.cpp | ||
explicit queue(const DeviceSelector &deviceSelector, | ||
const property_list &PropList = {}) | ||
: queue(detail::select_device(deviceSelector), async_handler{}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem?
<< "SYCL_PI_TRACE[all]: " | ||
<< " platform: " << PlatformName << std::endl | ||
<< "SYCL_PI_TRACE[all]: " | ||
<< " device: " << DeviceName << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not clear why you need these strings to be split.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's copy/pasta. This is the trace code we had before. I just wrapped it in a function, but preserved it otherwise.
sycl/source/device_selector.cpp
Outdated
<< " device: " << DeviceName << std::endl; | ||
} | ||
for (const auto &dev : Devices) { | ||
int dev_score = std::invoke(DeviceSelectorInvocable, dev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int dev_score = std::invoke(DeviceSelectorInvocable, dev); | |
int dev_score = DeviceSelectorInvocable(dev); |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keryell Since we are using std::is_invocable_r_v
when we screen it, I actually prefer the symmetry of using std::invoke
to invoke it. Maybe I'm in the minority on that. Would you mind if we kept it as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you kidding me or are you a member of the Church of the C++ Discouragement and Obfuscation? :-)
Why not rewriting all the code with Operator function objects from https://en.cppreference.com/w/cpp/utility/functional instead of using just +
or -
? ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to start a holy war. I'll change it.
The hallowed symmetrical beauty of std::is_invocable to std::invoke is perhaps not cherished by the unenlightened.
Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
Signed-off-by: Chris Perkins <[email protected]>
…ss tests AND not have merge conflicts.
/verify with intel/llvm-test-suite#1119 |
Signed-off-by: Chris Perkins <[email protected]>
…ps://github.com/cperkinsintel/llvm into cperkins-sycl2020-callable-device-selectors--02
@cperkinsintel, shared libs build fails in post-commit: https://github.com/intel/llvm/runs/7698490682?check_suite_focus=true |
intel#6486 added new queue constructors for the use of callable device selectors. In accordance with SYCL 2020 these constructors are marked explicit and as such implicit conversion from initializer lists does not work. These changes fixes the Wait.cpp unittest to use the explicit constructor. Signed-off-by: Larsen, Steffen <[email protected]>
#6540 should fix the build issue. |
#6486 added new queue constructors for the use of callable device selectors. In accordance with SYCL 2020 these constructors are marked explicit and as such implicit conversion from initializer lists does not work. These changes fixes the Wait.cpp unittest to use the explicit constructor. Signed-off-by: Larsen, Steffen <[email protected]>
In SYCL 2020 device selectors switch from subclasses to simply being any callable. This allows user defined device selector lambdas and more. This implements them for the queue, platform and device classes.
Tests are updated here: intel/llvm-test-suite#1119