Skip to content

[SYCL][DOC] Initial Draft of Extension for querying free device memory on Level Zero #3468

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 26, 2021

Conversation

jbrodman
Copy link
Contributor

@jbrodman jbrodman commented Apr 1, 2021

Signed-off-by: James Brodman [email protected]

@jbrodman jbrodman requested a review from a team as a code owner April 1, 2021 14:47
@jbrodman jbrodman requested a review from gmlueck April 1, 2021 14:47
@bader bader added the spec extension All issues/PRs related to extensions specifications label Apr 1, 2021
Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than creating a new extension for this, I think it should be added to the Level Zero backend specification (https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/LevelZeroBackend/LevelZeroBackend.md). For example, add a new section in that document called "Device information descriptors" (probably after the existing section 4).

I think it would be easier for our customer to have fewer extensions, so it probably makes sense to combine related features together into a single extension.

BTW, it's generally better to avoid long lines like you have in this document. Break lines at or before 80 columns. Among other benefits, this makes it easier to review when changes are made later.

@jbrodman jbrodman requested a review from smaslov-intel April 13, 2021 18:13
@smaslov-intel
Copy link
Contributor

I think it would be easier for our customer to have fewer extensions, so it probably makes sense to combine related features together into a single extension.

True, but why make it Level-Zero specific? I think we already have the right extension, which should add the new capability: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/IntelGPU/IntelGPUDeviceInfo.md

@jbrodman
Copy link
Contributor Author

I think it would be easier for our customer to have fewer extensions, so it probably makes sense to combine related features together into a single extension.

True, but why make it Level-Zero specific? I think we already have the right extension, which should add the new capability: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/IntelGPU/IntelGPUDeviceInfo.md

The functionality is Level Zero specific.

@jbrodman jbrodman requested a review from gmlueck April 15, 2021 20:22
@smaslov-intel
Copy link
Contributor

The functionality is Level Zero specific.

How do you conclude that? I can easily imagine CUDA or OpenCL (or any future backend) having this capability.

@jbrodman
Copy link
Contributor Author

OpenCL does not have it.

CUDA has something similar.

SYCL 2020 added a general capability to query things that one backend has that another may not. That's what we're using.

@smaslov-intel
Copy link
Contributor

SYCL 2020 added a general capability to query things that one backend has that another may not.

I agree this is the way to go for things that we don't expect other backends to support. But if extension is very general like this one, then I'd prefer it be added for all backends (even though some still don't support it). I can live with Level-Zero extension in this case, but then so should be the things that we've already added to https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/IntelGPU/IntelGPUDeviceInfo.md. I am asking for some consistency.

@jbrodman
Copy link
Contributor Author

We should also migrate the IntelGPU things to the new backend info model in SYCL 2020.

@smaslov-intel
Copy link
Contributor

We should also migrate the IntelGPU things to the new backend info model in SYCL 2020.

I'd be OK with that, so all info extensions are per backed. Other backends may create a copy it if they want.
@gmlueck , do you also agree to the direction in general?
I want it to be crystal clear because more such things are coming.

@gmlueck
Copy link
Contributor

gmlueck commented Apr 16, 2021

I'd be OK with that, so all info extensions are per backed. Other backends may create a copy it if they want.
@gmlueck , do you also agree to the direction in general?
I want it to be crystal clear because more such things are coming.

The problem with this logic is that the queries in IntelGPUDeviceInfo are specific to Intel hardware, whereas the Level Zero backend is intended to be generic, and could be implemented by any vendor.

I think the query for free memory is intended to be a generic feature that Level Zero would support for any hardware, right?

@smaslov-intel
Copy link
Contributor

the queries in IntelGPUDeviceInfo are specific to Intel hardware

I don't think so, it is just Intel supported it for it's GPU HW only. Others are free to support them for their HW.

I think the query for free memory is intended to be a generic feature that Level Zero would support for any hardware, right?

I don't think it should be mandatory that each HW implements it.

@gmlueck
Copy link
Contributor

gmlueck commented Apr 16, 2021

I don't think so, it is just Intel supported it for it's GPU HW only. Others are free to support them for their HW.

Are EU's and subslices really concepts that make sense for other vendor's hardware? I thought these were Intel specific concepts.

@smaslov-intel
Copy link
Contributor

I don't think so, it is just Intel supported it for it's GPU HW only. Others are free to support them for their HW.

Are EU's and subslices really concepts that make sense for other vendor's hardware? I thought these were Intel specific concepts.

There is nothing in https://en.wikipedia.org/wiki/Execution_unit that makes EU specific to Intel GPU. Not so sure about slices. But then there is also "PCI address" there, which is certainly not Intel GPU specific. I guess it would be hard to judge about what is generic enough, so why not have everything into backend-interoperability specs? If someone would feel eager to not have their "extenstion" used for anything than Intel GPU, for example, then they would add "intel_gpu" into its name, for example.

@jbrodman
Copy link
Contributor Author

I'd be OK with that, so all info extensions are per backed. Other backends may create a copy it if they want.
@gmlueck , do you also agree to the direction in general?
I want it to be crystal clear because more such things are coming.

The problem with this logic is that the queries in IntelGPUDeviceInfo are specific to Intel hardware, whereas the Level Zero backend is intended to be generic, and could be implemented by any vendor.

I think the query for free memory is intended to be a generic feature that Level Zero would support for any hardware, right?

Yes. It's a Level Zero interface.

@jbrodman
Copy link
Contributor Author

Are we good to approve and merge now?

@gmlueck
Copy link
Contributor

gmlueck commented Apr 19, 2021

Are we good to approve and merge now?

I think we need to decide how this query relates to the IntelGPUDeviceInfo queries. Do we think they all belong in the Level Zero backend spec? Alternatively, should we put the free memory query into the IntelGPUDeviceInfo extension?

@smaslov-intel
Copy link
Contributor

I think we need to decide how this query relates to the IntelGPUDeviceInfo queries. Do we think they all belong in the Level Zero backend spec? Alternatively, should we put the free memory query into the IntelGPUDeviceInfo extension?

Right, we need to decide. My vote would be for having new queries to be backend-agnostic to keep source code not specialized to particular backends. If not supported by a particular backend it would fail in run-time. This will also encourage other backends to mature their support.

@gmlueck
Copy link
Contributor

gmlueck commented Apr 19, 2021

Right, we need to decide. My vote would be for having new queries to be backend-agnostic to keep source code not specialized to particular backends. If not supported by a particular backend it would fail in run-time. This will also encourage other backends to mature their support.

I don't think we can design the API so that it fails unless we also provide a way for the application to test (in advance) whether it will fail. If we go this route, we'd also have to provide some way for the application to test whether a device supports this query for free memory.

I was looking at the IntelGPUDeviceInfo queries again just now. It seems like all those queries are supported for all devices on Level Zero and none of them are supported at all on OpenCL or CUDA. Therefore, it probably would have made more sense to expose them as Level Zero device info queries.

In my opinion, we should:

  • Add the free memory query as a Level Zero backend API as proposed by this MR.
  • Move the functionality from IntelGPUDeviceInfo also to the Level Zero backend API as additional device info descriptors.

This seems cleaner to me because we will not need some extra query to ask if each query is supported. Instead, applications can just test whether the backend is Level Zero. If other backends start supporting these queries in the future, we can always consider adding a new backend-neutral query that returns the same information. Until that happens, these queries are all specific to Level Zero, so we may as well expose them that way.

sycl::queue Queue;
auto Device = Queue.get_device();

uint64_t freeMemory = Device.get_backend_info<sycl::ext::oneapi::info::device::free_memory>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
uint64_t freeMemory = Device.get_backend_info<sycl::ext::oneapi::info::device::free_memory>();
uint64_t freeMemory = Device.get_backend_info<sycl::ext::oneapi::level_zero::info::device::free_memory>();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmlueck Do you think we need the extra level_zero here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we need it. Why do you say it is "extra"? It's the only occurrence of "level_zero" in that statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra in the sense of "in addition to oneapi::"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take this as an opportunity to define the namespace for the Level Zero backend. The namspace naming convention for a SYCL extension is sycl::ext::<vendorstring>. We've decided that we have two <vendorstrings>: oneapi and intel, depending on whether the extension is tightly tied to Intel hardware vs. a general extension that we'd like promote into the SYCL spec someday.

At present the Level Zero backend is our own extension because we haven't attempted to add this backend to the SYCL spec in the same way as the OpenCL backend. Therefore, I think it needs to follow the extension naming guidelines. (Note, I'm just talking about the Level Zero backend to SYCL here, not Level Zero itself.)

Putting this together, we arrive at a namespace of sycl::ext::oneapi::level_zero.

However, I admit that is a mouthful. If we wanted to make it shorter, one option would be to add another <vendorstring> and say that level_zero is itself a <vendorstring>. This would shorten the namespace to sycl::ext::level_zero.

Now let's consider the naming guidelines for new enumerated constants or new member functions that an extension adds. The guideline says that these should start with a prefix ext_<vendorstring>. There's no particular guideline for including the name of the backend, but it seems wise to somehow include the name "level_zero" if the API is tied to Level Zero. We can imagine that Level Zero might want to add some new aspects at some point, and of course Level Zero will have an enumerated constant in the sycl::backend enum. Here's a comparison of how those would look using each of the two <vendorstrings> above:

// Vendor string is "oneapi"

namespace sycl {

enum class aspect {
  // ...
  ext_oneapi_level_zero_fancy
};

enum class backend {
  // ...
  ext_oneapi_level_zero
}:

} // namespace sycl
// Vendor string is "level_zero"

namespace sycl {

enum class aspect {
  // ...
  ext_level_zero_fancy
};

enum class backend {
  // ...
  ext_level_zero
}:

} // namespace sycl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to resolve this naming convention.

} // namespace sycl
```

The new struct ```free_memory``` is used in conjuction with the ```get_backend_info()``` method of the ```device``` class in SYCL 2020. The query will return the number of bytes of free memory for that device.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest using a table here to describe each info descriptor because that will be easy to extend as we add more. The description for each query can be shorter since we talk about their general usage in the introduction above. For this one, I think the description can just be:

Returns the number of bytes of free memory for the device.

@smaslov-intel
Copy link
Contributor

.. we will not need some extra query to ask if each query is supported. Instead, applications can just test whether the backend is Level Zero

So, one would write this (1):

uint64_t freeMemory = 0;
if (backend == level_zero) {
  freeMemory = Device.get_backend_info<sycl::ext::oneapi::level_zero::info::device::free_memory>();
}

as opposed to this (2):

uint64_t freeMemory = 0;
if (backend.supports<sycl::ext::oneapi::level_zero::info::device::free_memory>) {
  freeMemory = Device.get_backend_info<sycl::ext::oneapi::level_zero::info::device::free_memory>();
}

I don't see any critical difference between the two above.
If we allow to just throw "unsupported" exception (3):

try {
  uint64_t freeMemory = Device.get_backend_info<sycl::ext::oneapi::info::device::free_memory>();
}
catch(...) {
}

To me the later is the least backend-specific variant, for which I like it the most.
But I can accept (1) or (2) if that's what consensus would be for.
Just let's be consistent, and follow the rule for all (most) of the things.

@gmlueck
Copy link
Contributor

gmlueck commented Apr 19, 2021

I don't see any critical difference between the two above.

One difference is that backend == level_zero is supported now, but we do not have any API like backend.supports currently. Option 2 also seems to fit in nicely with the SYCL 2020 core API. This seems like the exact scenario for which device::get_backend_info() was intended.

I don't think we have any other places in SYCL where the only way to find out if an API is supported is by catching an exception, so option (3) doesn't seem like the right direction to me.

@smaslov-intel
Copy link
Contributor

Option 2 also seems to fit in nicely with the SYCL 2020 core API.

So, do you vote for (1) or (2)?

I don't think we have any other places in SYCL where the only way to find out if an API is supported is by catching an exception, so option (3) doesn't seem like the right direction to me.

That's unfortunate that backend specifics are to be exposed to users without real need for that.

@gmlueck
Copy link
Contributor

gmlueck commented Apr 19, 2021

Option 2 also seems to fit in nicely with the SYCL 2020 core API.

Sorry, I mistyped. I meant "Option 1 ...".

So, do you vote for (1) or (2)?

I vote for (1).

@gmlueck
Copy link
Contributor

gmlueck commented Apr 21, 2021

In my opinion, we should:

  • Add the free memory query as a Level Zero backend API as proposed by this MR.
  • Move the functionality from IntelGPUDeviceInfo also to the Level Zero backend API as additional device info descriptors.

After talking with @bashbaug, it seems like some (all?) of the queries in IntelGPUDeviceInfo do have equivalents in an OpenCL extension. If that is the case, it's probably better to keep them as generic queries, rather than making them Level Zero queries.

I have not head of any plans to add the "free memory" query to any other backend, so I think that should remain a Level Zero query.

smaslov-intel
smaslov-intel previously approved these changes Apr 21, 2021
Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree to proceed with this PR adding the "free memory" as a Level-Zero only extension. Also let's keep https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/IntelGPU/IntelGPUDeviceInfo.md where it already is for now. But I still think we should tend to add new staff consistently one way or another.

Signed-off-by: James Brodman <[email protected]>
Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bader bader requested a review from smaslov-intel April 26, 2021 15:07
@bader bader merged commit fa428bf into intel:sycl Apr 26, 2021
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Apr 27, 2021
* upstream/sycl:
  [NFC][SYCL] Avoid -Wreorder warning about order of initialization (intel#3620)
  [SYCL][DOC] Initial Draft of Extension for querying free device memory on Level Zero (intel#3468)
  [SYCL][PI][L0] Submit open command batch on event status query (intel#3612)
  [NFC] Fix the comment (intel#3613)
  Rename misleading attribute flag (intel#3610)
  [SYCL] Generate an opt report of kernel arguments.  (intel#3492)
  [SYCL] Support extra environment variables in LIT (intel#3598)
  [SYCL][Matrix] Make joint_matrix_mad return A*B+C's result instead of C=A*B+C (intel#3586)
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Apr 28, 2021
* upstream/sycl:
  [SYCL][Doc] Add group sorting algorithms extension specification (intel#3514)
  [Buildbot] Update Windows GPU driver to 27.20.100.9466 (intel#3594)
  [SYCL][NFC] Update tests for FPGA attributes (intel#3632)
  [CODEOWNERS] Add @kbobrovs back to few projects (intel#3638)
  [NFC] Update codeowners (intel#3619)
  [SYCL] Support 3-, 16-elements vectors in SG load/store (intel#3617)
  [SYCL-PTX] Fix libclc dependencies (intel#3624)
  [SYCL] Add sycl::span for SYCL2020 support (intel#3569)
  [NFC][SYCL] Avoid -Wreorder warning about order of initialization (intel#3620)
  [SYCL][DOC] Initial Draft of Extension for querying free device memory on Level Zero (intel#3468)
  [SYCL][PI][L0] Submit open command batch on event status query (intel#3612)
  [NFC] Fix the comment (intel#3613)
  Rename misleading attribute flag (intel#3610)
  [SYCL] Generate an opt report of kernel arguments.  (intel#3492)
  [SYCL] Support extra environment variables in LIT (intel#3598)
  [SYCL][Matrix] Make joint_matrix_mad return A*B+C's result instead of C=A*B+C (intel#3586)
@altintel
Copy link

Thank you for the design! When should we expect it available in DPC++? ETA?

@masterleinad
Copy link
Contributor

Is this functionality available anywhere?

@masterleinad
Copy link
Contributor

An implementation could look somewhat like kokkos/kokkos-kernels#1225.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec extension All issues/PRs related to extensions specifications
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants