Skip to content

Add sycl ext intel kernel queries extension #16834

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Mar 6, 2025

Conversation

kurapov-peter
Copy link
Contributor

@kurapov-peter kurapov-peter commented Jan 29, 2025

This adds a sycl_ext_intel_kernel_queries extension for target-specific low-level information retrieval. In a nutshell, the extension adds compiled kernel traits for the intel backend. The traits can be queried using the usual kernel::get_info() calls (and get_kernel_info). Each query gets its own aspect to tell whether the query is supported on the device. This PR adds ext_intel_spill_memory_size aspect and an implementation for querying the spilled memory size as reported by L0.

Comment on lines 151 to 155
inline ext::intel::info::kernel::spill_mem_size::return_type
get_kernel_device_specific_info<ext::intel::info::kernel::spill_mem_size>(
ur_kernel_handle_t Kernel, ur_device_handle_t Device,
const AdapterPtr &Adapter) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to check for the aspect here as well?

Comment on lines 173 to 190
// Retrieve the associated device list
size_t URDevicesSize = 0;
Adapter->call<UrApiKind::urProgramGetInfo>(Program, UR_PROGRAM_INFO_DEVICES,
0, nullptr, &URDevicesSize);

std::vector<ur_device_handle_t> URDevices(URDevicesSize /
sizeof(ur_device_handle_t));
Adapter->call<UrApiKind::urProgramGetInfo>(Program, UR_PROGRAM_INFO_DEVICES,
URDevicesSize, URDevices.data(),
nullptr);
assert(Result.size() == URDevices.size());

// Map the result back to the program devices
for (size_t idx = 0; idx < URDevices.size(); ++idx) {
if (URDevices[idx] == Device)
return Result[idx];
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapping to devices relies on UR providing the same order within a single application (see oneapi-src/unified-runtime#2614 for details). There was a problem inside UR implementation that demanded the API to return an array of values for spills instead of a single value (due to device pointer unavailability). This implementation hides that inconvenience from SYCL users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in UR v1 adapter

Comment on lines 190 to 192
throw exception(make_error_code(errc::runtime),
"ext::intel::info::kernel::spill_mem_size failed to retrieve "
"the requested value");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right type of exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a better way to test the thing. This would just return zero. Any suggestions?

@gmlueck
Copy link
Contributor

gmlueck commented Jan 31, 2025

Hi @kurapov-peter, I hope you don't mind, but I pushed a commit with my proposal for how the specification should look in b027f870d67e035e6476dc0a84bc3f5dfddbf99f. Feel free to revert this commit if you are opposed, or to add comments in this PR.

Here is my rationale for some of the changes:

  • Since this PR is implementing the extension, it can't be "proposed". It needs to be either "experimental" or "supported". I thought the extension seemed simple enough to go straight to "supported", but we can do "experimental" if you think the API might change.
  • I reworded the Overview to be a bit more general, allowing us to add more queries in the future.
  • I think each query needs to have its own matching aspect. This is similar to what we did in sycl_ext_intel_device_info.
  • I changed the names to spell out "memory" because the newer APIs in SYCL tend to prefer whole words rather than abbreviations.
  • I changed the return type of the query to size_t because that seemed more appropriate for a C++ API.
  • I changed the namespace of the info descriptor to kernel_device_specific because this is the correct namespace for kernel info descriptors that are used with the get_info(const device&) overload.
  • I updated the general style of the specification to use a newer format that we are moving to.

Note that I did not update the implementation, so that will need to be changed to match the spec.

@kurapov-peter
Copy link
Contributor Author

All the changes look reasonable to me. I'll update the implementation.

=== Spill memory size

This query returns the kernel's spill memory size that is allocated by the
compiler, as reported by Level Zero.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is L0 important here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, we would not define a SYCL API in terms of a particular backend like this. However, it seemed to me that "spill memory size" is too vaguely defined. The reality is that SYCL is just returning whatever value comes from Level Zero. If people have questions about what it means, I'd rather direct them to Level Zero.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough

@kurapov-peter
Copy link
Contributor Author

#16810 is merged, so this has no dependencies. Putting it to ready for review.

@@ -128,6 +128,7 @@ attributes #2 = { nounwind }
!77 = !{!"ext_oneapi_bindless_images_sample_2d_usm", i32 79}
!78 = !{!"ext_oneapi_atomic16", i32 80}
!79 = !{!"ext_oneapi_virtual_functions", i32 81}
!79 = !{!"ext_intel_spill_memory_size", i32 82}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt that this line has any effect on the test, so you may as well drop it. Moreover, I'm not sure what are the rules of redefining metadata, it may not be a valid change at all

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I agree with @AlexeySachkov on behalf of sanitizer reviewer.

Comment on lines 103 to 106
const size_t spillMemSz =
krn.get_info<ext::intel::info::kernel_device_specific::spill_memory_size>(
dev);
assert(spillMemSz >= 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be hidden under an if, the query is not guaranteed to be supported everywhere. The same goes for other change in this file.

@gmlueck
Copy link
Contributor

gmlueck commented Feb 4, 2025

@Pennycook could you review this for @intel/dpcpp-specification-reviewers ? I'm the author of the spec in this PR.

Copy link
Contributor

@Pennycook Pennycook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specification looks good, but a meta-comment: if we're going to have to end up exposing all the Level Zero kernel queries, it might be a good idea to just provide a single way to query things using the Level Zero enums and/or backend interop.

@aelovikov-intel
Copy link
Contributor

it might be a good idea to just provide a single way to query things using the Level Zero enums and/or backend interop

Why don't we push for that right now?

This is a draft for the extension and its implementation.

This needs an update, I assume

@kurapov-peter kurapov-peter requested review from a team as code owners March 5, 2025 15:01
@kurapov-peter kurapov-peter force-pushed the sycl_ext_intel_kernel_queries branch from ed41dfc to 7402507 Compare March 5, 2025 15:05
@kurapov-peter kurapov-peter removed request for a team, frasercrmck and reble March 5, 2025 15:06
@sarnex sarnex merged commit 65498a6 into intel:sycl Mar 6, 2025
33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants