Skip to content

[SYCL] Implement new env var SYCL_FORCE_PI and extend device_selector::select_device #2214

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 39 commits into from

Conversation

bso-intel
Copy link
Contributor

This PR responds to several requests for user controls to loading only specific plugins into SYCL RT, and a way to request a specific device regardless of SYCL_BE setting.
For the first request, this PR introduced a new env var SYCL_FORCE_PI. The possible values are any combination of {opencl, level0, cuda}. Only the specified plugins will be loaded into SYCL RT.
For the second request, this PR extends device_selector::select_device() to take optional parameters device_type and backend.
In current implementation, device availability is heavily depending on the env var SYCL_BE and SYCL_DEVICE_TYPE. The user wanted a control in API to pin down specific device regardless of SYCL_BE setting.
Note that SYCL_BE=PI_LEVEL0 prevents users from using CPU and ACC devices.

bso-intel added 16 commits July 26, 2020 22:13
Extended select_device to take optional parameters device_type and backend.
The current implementation of select_device returns different devices
depending on the env var such as SYCL_BE and SYCL_DEVICE_TYPE, and uses heuristic.
This extension of select_device will allow users to set the desired device type and possibly also backend if multiple devices are the same device type.

Signed-off-by: Byoungro So <[email protected]>
Extended select_device to take optional parameters device_type and backend.
The current implementation of select_device returns different devices
depending on the env var such as SYCL_BE and SYCL_DEVICE_TYPE, and uses heuristic.
This extension of select_device will allow users to set the desired device type and possibly also backend if multiple devices are the same device type.

Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
Extended select_device to take optional parameters device_type and backend.
The current implementation of select_device returns different devices
depending on the env var such as SYCL_BE and SYCL_DEVICE_TYPE, and uses heuristic.
This extension of select_device will allow users to set the desired device type and possibly also backend if multiple devices are the same device type.

Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
bso-intel and others added 4 commits July 29, 2020 16:43
Copy link
Contributor

@alexbatashev alexbatashev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is it going to affect user-defined selectors? Why don't users use custom selectors? If someone needs to control device selection with environment variables, why don't they create a special device selector?

Comment on lines 37 to 41
// deviceType is an optional parameter to set the desired device
// info::device_type::all means a heuristic is used to select a device with
// highest score
device select_device(info::device_type deviceType = info::device_type::all,
backend BE = backend::level0) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we implement it as a non-breaking change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change won't affect existing abi. The existing code will continue to work as before based on a heuristic since no parameters are passed.

Comment on lines 38 to 73
device device_selector::select_device(info::device_type deviceType,
backend be) const {
// return if a requested deviceType is found
if (deviceType != info::device_type::all) {
if (deviceType == info::device_type::host) {
vector_class<device> devices;
devices.resize(1);
return devices[0];
}

const vector_class<detail::plugin> &plugins = RT::initialize();
for (unsigned int i = 0; i < plugins.size(); i++) {
pi_uint32 numPlatforms = 0;
plugins[i].call<detail::PiApiKind::piPlatformsGet>(0, nullptr,
&numPlatforms);
if (numPlatforms) {
vector_class<RT::PiPlatform> piPlatforms(numPlatforms);
plugins[i].call<detail::PiApiKind::piPlatformsGet>(
numPlatforms, piPlatforms.data(), nullptr);
for (const auto &piPlatform : piPlatforms) {
platform pltf = detail::createSyclObjFromImpl<platform>(
std::make_shared<detail::platform_impl>(piPlatform, plugins[i]));
if (!pltf.is_host()) {
vector_class<device> devices = pltf.get_devices(deviceType);
for (uint32_t i = 0; i < devices.size(); i++) {
if (deviceType != info::device_type::gpu) {
return devices[i];
} else if (devices[i].is_gpu() && be == pltf.get_backend()) {
return devices[i];
}
}
}
}
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seem to violate the spirit of device_selector. Why not just give some boost to matching device scores?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that SYCL_BE=PI_LEVEL0 will hide CPU and ACC devices from available devices in the system. So giving bonus points won't work.
As I mentioned above, the existing calls to select_device() with no parameter specified will continue to work and does not violate the existing code.

Comment on lines 43 to 45
vector_class<device> devices;
devices.resize(1);
return devices[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
vector_class<device> devices;
devices.resize(1);
return devices[0];
return device{};

}

const vector_class<detail::plugin> &plugins = RT::initialize();
for (unsigned int i = 0; i < plugins.size(); i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Range-based for?

Comment on lines +63 to +67
if (deviceType != info::device_type::gpu) {
return devices[i];
} else if (devices[i].is_gpu() && be == pltf.get_backend()) {
return devices[i];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we ever have a non-OpenCL based backend for other devices, this will be a bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As described in the comment, the second parameter 'be' is meant to designate specific GPU device backend, and not used for other device types.

@@ -3858,7 +3858,7 @@ _ZNK2cl4sycl14interop_handle12getNativeMemEPNS0_6detail16AccessorImplHostE
_ZNK2cl4sycl14interop_handle14getNativeQueueEv
_ZNK2cl4sycl14interop_handle15getNativeDeviceEv
_ZNK2cl4sycl14interop_handle16getNativeContextEv
_ZNK2cl4sycl15device_selector13select_deviceEv
_ZNK2cl4sycl15device_selector13select_deviceENS0_4info11device_typeENS0_7backendE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you still want to break ABI, please, uplift DEV_ABI version.

// to be loaded. Note that SYCL_BE=PI_LEVEL0 prevents from other OCL CPU
// devices to be discovered.

const char *envVal = std::getenv("SYCL_FORCE_PI");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SYCLConfig?

@alexbatashev alexbatashev requested a review from romanovvlad July 30, 2020 08:11
Enables partial specialization of atomic_ref for pointer types.

Implementation assumes that both host and device pointers can be stored
in a uintptr_t, but uses compare_exchange to implement pointer arithmetic
rather than make assumptions about how pointers will be represented on
different devices.

Signed-off-by: John Pennycook <[email protected]>
@romanovvlad
Copy link
Contributor

This PR responds to several requests for user controls to loading only specific plugins into SYCL RT, and a way to request a specific device regardless of SYCL_BE setting.
For the first request, this PR introduced a new env var SYCL_FORCE_PI. The possible values are any combination of {opencl, level0, cuda}. Only the specified plugins will be loaded into SYCL RT.
For the second request, this PR extends device_selector::select_device() to take optional parameters device_type and backend.
In current implementation, device availability is heavily depending on the env var SYCL_BE and SYCL_DEVICE_TYPE. The user wanted a control in API to pin down specific device regardless of SYCL_BE setting.
Note that SYCL_BE=PI_LEVEL0 prevents users from using CPU and ACC devices.

Do I understand correctly that SYCL_FORCE_PI is the same as SYCL_BE but support several plugins specified?
If so, why we need two variables? Can't we just extend SYCL_BE to support multiple plugins?

Mike Kinsner and others added 3 commits July 30, 2020 17:04
…tel#2025)

PI_LEVEL0 -> PI_LEVEL_ZERO. 
PI_LEVEL0 is still accepted and handled correctly.

Signed-off-by: Gail Lyons <[email protected]>
This version supports side-by-side installation of OpenCL RT for
Intel CPU and OpenCL RT for Intel FPGA emulation.
@bso-intel
Copy link
Contributor Author

It seems I have to clarify the difference between SYCL_BE and SYCL_FORCE_PI.
SYCL_BE is designed to tell SYCL RT the choice of GPU backend. It has a side effect of hiding other device types such as CPU and ACC.
SYCL_FORCE_PI is limit SYCL RT to load only designated plugins.
Here is some information to help you understand why this makes a difference.
SYCL RT look for all possible plugins in the system, and load them all. And then, it provides platforms and devices supported from these plugins. SYCL_BE can affect the device discovery at this time.
SYCL_FORCE_PI limits SYCL RT to only look for specified plugins at the beginning. Unspecified plugins won't even be loaded to SYCL RT.
There was a request for this user control to avoid some problems on the user side.

Another clarification may be needed between select_device() vs specific device_selector.
Specific device_selectors are disabled when SYCL_BE=PI_LEVEL0 is set. For example, cpu_selector won't return the available device. Customers use SYCL_BE to switch between different GPU devices and they forget they set it once before, and they found CPU and ACC devices are not working.
Customers asked a way to select a particular device by themselves instead of relying on the heuristic point.
Select_device() still continue to work as before based on the heuristic if 2 optional parameters are not passed.
These optional parameters of select_device(device_type, backend) will allow the user to "select" their favorite device without worrying about env vars such as SYCL_BE or SYCL_DEVICE_TYPE.

@bso-intel
Copy link
Contributor Author

How is it going to affect user-defined selectors? Why don't users use custom selectors? If someone needs to control device selection with environment variables, why don't they create a special device selector?

All different kinds of device selectors are governed by SYCL_BE and SYCL_DEVICE_TYPE. For example, CPU and ACC devices are not available as soon as the user sets SYCL_BE=PI_LEVEL0 for GPU. There was a request for customer request to "select" their favorite device regardless of these env vars.

@bso-intel
Copy link
Contributor Author

This PR responds to several requests for user controls to loading only specific plugins into SYCL RT, and a way to request a specific device regardless of SYCL_BE setting.
For the first request, this PR introduced a new env var SYCL_FORCE_PI. The possible values are any combination of {opencl, level0, cuda}. Only the specified plugins will be loaded into SYCL RT.
For the second request, this PR extends device_selector::select_device() to take optional parameters device_type and backend.
In current implementation, device availability is heavily depending on the env var SYCL_BE and SYCL_DEVICE_TYPE. The user wanted a control in API to pin down specific device regardless of SYCL_BE setting.
Note that SYCL_BE=PI_LEVEL0 prevents users from using CPU and ACC devices.

Do I understand correctly that SYCL_FORCE_PI is the same as SYCL_BE but support several plugins specified?
If so, why we need two variables? Can't we just extend SYCL_BE to support multiple plugins?

I explained the difference between SYCL_FORCE_PI and SYCL_BE. I hope I clarified your questions.

bso-intel and others added 13 commits July 30, 2020 13:14
Extended select_device to take optional parameters device_type and backend.
The current implementation of select_device returns different devices
depending on the env var such as SYCL_BE and SYCL_DEVICE_TYPE, and uses heuristic.
This extension of select_device will allow users to set the desired device type and possibly also backend if multiple devices are the same device type.

Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
Extended select_device to take optional parameters device_type and backend.
The current implementation of select_device returns different devices
depending on the env var such as SYCL_BE and SYCL_DEVICE_TYPE, and uses heuristic.
This extension of select_device will allow users to set the desired device type and possibly also backend if multiple devices are the same device type.

Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
Signed-off-by: Byoungro So <[email protected]>
@bso-intel
Copy link
Contributor Author

I am closing this PR and will create a new one based on our discussion today.

@bso-intel bso-intel closed this Jul 31, 2020
@bso-intel bso-intel deleted the select-device branch August 4, 2020 20:11
jsji pushed a commit that referenced this pull request Nov 24, 2023
This PR aims to introduce entities related to OpCooperativeMatrixApplyFunctionINTEL in llvm-spirv translator, according to https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_joint_matrix.asciidoc.

Co-authored-by: Sidorov, Dmitry <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@467edf9
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants