-
Notifications
You must be signed in to change notification settings - Fork 789
[SYCL][DOC] Initial Draft of Extension for querying free device memory on Level Zero #3468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: James Brodman <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than creating a new extension for this, I think it should be added to the Level Zero backend specification (https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/LevelZeroBackend/LevelZeroBackend.md). For example, add a new section in that document called "Device information descriptors" (probably after the existing section 4).
I think it would be easier for our customer to have fewer extensions, so it probably makes sense to combine related features together into a single extension.
BTW, it's generally better to avoid long lines like you have in this document. Break lines at or before 80 columns. Among other benefits, this makes it easier to review when changes are made later.
Signed-off-by: James Brodman <[email protected]>
True, but why make it Level-Zero specific? I think we already have the right extension, which should add the new capability: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/IntelGPU/IntelGPUDeviceInfo.md |
The functionality is Level Zero specific. |
How do you conclude that? I can easily imagine CUDA or OpenCL (or any future backend) having this capability. |
OpenCL does not have it. CUDA has something similar. SYCL 2020 added a general capability to query things that one backend has that another may not. That's what we're using. |
I agree this is the way to go for things that we don't expect other backends to support. But if extension is very general like this one, then I'd prefer it be added for all backends (even though some still don't support it). I can live with Level-Zero extension in this case, but then so should be the things that we've already added to https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/IntelGPU/IntelGPUDeviceInfo.md. I am asking for some consistency. |
We should also migrate the IntelGPU things to the new backend info model in SYCL 2020. |
I'd be OK with that, so all info extensions are per backed. Other backends may create a copy it if they want. |
The problem with this logic is that the queries in IntelGPUDeviceInfo are specific to Intel hardware, whereas the Level Zero backend is intended to be generic, and could be implemented by any vendor. I think the query for free memory is intended to be a generic feature that Level Zero would support for any hardware, right? |
I don't think so, it is just Intel supported it for it's GPU HW only. Others are free to support them for their HW.
I don't think it should be mandatory that each HW implements it. |
Are EU's and subslices really concepts that make sense for other vendor's hardware? I thought these were Intel specific concepts. |
There is nothing in https://en.wikipedia.org/wiki/Execution_unit that makes EU specific to Intel GPU. Not so sure about slices. But then there is also "PCI address" there, which is certainly not Intel GPU specific. I guess it would be hard to judge about what is generic enough, so why not have everything into backend-interoperability specs? If someone would feel eager to not have their "extenstion" used for anything than Intel GPU, for example, then they would add "intel_gpu" into its name, for example. |
Yes. It's a Level Zero interface. |
Are we good to approve and merge now? |
I think we need to decide how this query relates to the IntelGPUDeviceInfo queries. Do we think they all belong in the Level Zero backend spec? Alternatively, should we put the free memory query into the IntelGPUDeviceInfo extension? |
Right, we need to decide. My vote would be for having new queries to be backend-agnostic to keep source code not specialized to particular backends. If not supported by a particular backend it would fail in run-time. This will also encourage other backends to mature their support. |
I don't think we can design the API so that it fails unless we also provide a way for the application to test (in advance) whether it will fail. If we go this route, we'd also have to provide some way for the application to test whether a device supports this query for free memory. I was looking at the IntelGPUDeviceInfo queries again just now. It seems like all those queries are supported for all devices on Level Zero and none of them are supported at all on OpenCL or CUDA. Therefore, it probably would have made more sense to expose them as Level Zero device info queries. In my opinion, we should:
This seems cleaner to me because we will not need some extra query to ask if each query is supported. Instead, applications can just test whether the backend is Level Zero. If other backends start supporting these queries in the future, we can always consider adding a new backend-neutral query that returns the same information. Until that happens, these queries are all specific to Level Zero, so we may as well expose them that way. |
sycl::queue Queue; | ||
auto Device = Queue.get_device(); | ||
|
||
uint64_t freeMemory = Device.get_backend_info<sycl::ext::oneapi::info::device::free_memory>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint64_t freeMemory = Device.get_backend_info<sycl::ext::oneapi::info::device::free_memory>(); | |
uint64_t freeMemory = Device.get_backend_info<sycl::ext::oneapi::level_zero::info::device::free_memory>(); | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gmlueck Do you think we need the extra level_zero here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we need it. Why do you say it is "extra"? It's the only occurrence of "level_zero" in that statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra in the sense of "in addition to oneapi::"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's take this as an opportunity to define the namespace for the Level Zero backend. The namspace naming convention for a SYCL extension is sycl::ext::<vendorstring>
. We've decided that we have two <vendorstrings>
: oneapi
and intel
, depending on whether the extension is tightly tied to Intel hardware vs. a general extension that we'd like promote into the SYCL spec someday.
At present the Level Zero backend is our own extension because we haven't attempted to add this backend to the SYCL spec in the same way as the OpenCL backend. Therefore, I think it needs to follow the extension naming guidelines. (Note, I'm just talking about the Level Zero backend to SYCL here, not Level Zero itself.)
Putting this together, we arrive at a namespace of sycl::ext::oneapi::level_zero
.
However, I admit that is a mouthful. If we wanted to make it shorter, one option would be to add another <vendorstring>
and say that level_zero
is itself a <vendorstring>
. This would shorten the namespace to sycl::ext::level_zero
.
Now let's consider the naming guidelines for new enumerated constants or new member functions that an extension adds. The guideline says that these should start with a prefix ext_<vendorstring>
. There's no particular guideline for including the name of the backend, but it seems wise to somehow include the name "level_zero" if the API is tied to Level Zero. We can imagine that Level Zero might want to add some new aspects at some point, and of course Level Zero will have an enumerated constant in the sycl::backend
enum. Here's a comparison of how those would look using each of the two <vendorstrings>
above:
// Vendor string is "oneapi"
namespace sycl {
enum class aspect {
// ...
ext_oneapi_level_zero_fancy
};
enum class backend {
// ...
ext_oneapi_level_zero
}:
} // namespace sycl
// Vendor string is "level_zero"
namespace sycl {
enum class aspect {
// ...
ext_level_zero_fancy
};
enum class backend {
// ...
ext_level_zero
}:
} // namespace sycl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need to resolve this naming convention.
} // namespace sycl | ||
``` | ||
|
||
The new struct ```free_memory``` is used in conjuction with the ```get_backend_info()``` method of the ```device``` class in SYCL 2020. The query will return the number of bytes of free memory for that device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest using a table here to describe each info descriptor because that will be easy to extend as we add more. The description for each query can be shorter since we talk about their general usage in the introduction above. For this one, I think the description can just be:
Returns the number of bytes of free memory for the device.
So, one would write this (1): uint64_t freeMemory = 0;
if (backend == level_zero) {
freeMemory = Device.get_backend_info<sycl::ext::oneapi::level_zero::info::device::free_memory>();
} as opposed to this (2): uint64_t freeMemory = 0;
if (backend.supports<sycl::ext::oneapi::level_zero::info::device::free_memory>) {
freeMemory = Device.get_backend_info<sycl::ext::oneapi::level_zero::info::device::free_memory>();
} I don't see any critical difference between the two above. try {
uint64_t freeMemory = Device.get_backend_info<sycl::ext::oneapi::info::device::free_memory>();
}
catch(...) {
} To me the later is the least backend-specific variant, for which I like it the most. |
One difference is that I don't think we have any other places in SYCL where the only way to find out if an API is supported is by catching an exception, so option (3) doesn't seem like the right direction to me. |
So, do you vote for (1) or (2)?
That's unfortunate that backend specifics are to be exposed to users without real need for that. |
Sorry, I mistyped. I meant "Option 1 ...".
I vote for (1). |
Signed-off-by: James Brodman <[email protected]>
After talking with @bashbaug, it seems like some (all?) of the queries in IntelGPUDeviceInfo do have equivalents in an OpenCL extension. If that is the case, it's probably better to keep them as generic queries, rather than making them Level Zero queries. I have not head of any plans to add the "free memory" query to any other backend, so I think that should remain a Level Zero query. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree to proceed with this PR adding the "free memory" as a Level-Zero only extension. Also let's keep https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/IntelGPU/IntelGPUDeviceInfo.md where it already is for now. But I still think we should tend to add new staff consistently one way or another.
Signed-off-by: James Brodman <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* upstream/sycl: [NFC][SYCL] Avoid -Wreorder warning about order of initialization (intel#3620) [SYCL][DOC] Initial Draft of Extension for querying free device memory on Level Zero (intel#3468) [SYCL][PI][L0] Submit open command batch on event status query (intel#3612) [NFC] Fix the comment (intel#3613) Rename misleading attribute flag (intel#3610) [SYCL] Generate an opt report of kernel arguments. (intel#3492) [SYCL] Support extra environment variables in LIT (intel#3598) [SYCL][Matrix] Make joint_matrix_mad return A*B+C's result instead of C=A*B+C (intel#3586)
* upstream/sycl: [SYCL][Doc] Add group sorting algorithms extension specification (intel#3514) [Buildbot] Update Windows GPU driver to 27.20.100.9466 (intel#3594) [SYCL][NFC] Update tests for FPGA attributes (intel#3632) [CODEOWNERS] Add @kbobrovs back to few projects (intel#3638) [NFC] Update codeowners (intel#3619) [SYCL] Support 3-, 16-elements vectors in SG load/store (intel#3617) [SYCL-PTX] Fix libclc dependencies (intel#3624) [SYCL] Add sycl::span for SYCL2020 support (intel#3569) [NFC][SYCL] Avoid -Wreorder warning about order of initialization (intel#3620) [SYCL][DOC] Initial Draft of Extension for querying free device memory on Level Zero (intel#3468) [SYCL][PI][L0] Submit open command batch on event status query (intel#3612) [NFC] Fix the comment (intel#3613) Rename misleading attribute flag (intel#3610) [SYCL] Generate an opt report of kernel arguments. (intel#3492) [SYCL] Support extra environment variables in LIT (intel#3598) [SYCL][Matrix] Make joint_matrix_mad return A*B+C's result instead of C=A*B+C (intel#3586)
Thank you for the design! When should we expect it available in DPC++? ETA? |
Is this functionality available anywhere? |
An implementation could look somewhat like kokkos/kokkos-kernels#1225. |
Signed-off-by: James Brodman [email protected]