Skip to content

[SYCL] Add architecture info to sycl-ls --verbose #13976

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jun 20, 2024
Merged

Conversation

jzc
Copy link
Contributor

@jzc jzc commented May 30, 2024

This PR adds architecture info from the sycl_ext_oneapi_device_architecture to sycl-ls --verbose. Additionally, the E2E lit config is updated to parse this info and add these as features that can be used in lit testing (e.g. // REQUIRES: architecture-intel_gpu_pvc).

@jzc jzc requested a review from a team as a code owner May 30, 2024 18:17
auto arch = Device.get_info<syclex::info::device::architecture>();
switch (arch) {
#define __ARCH(ARCH) \
case syclex::architecture::ARCH: \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should refactor this into a .inc file that would be used both here and in device_info.hpp, so that updates would need to be made in a single place.

@@ -68,6 +68,106 @@ std::string getDeviceTypeName(const device &Device) {
}
}

namespace syclex = sycl::ext::oneapi::experimental;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be at least local to the getArchName function, if not fully inlined.

@@ -684,11 +688,17 @@
sg_sizes = set(dev_sg_sizes[0]).intersection(*dev_sg_sizes)
lit_config.note("SG sizes for {}: {}".format(sycl_device, ", ".join(sg_sizes)))

lit_config.note("Architectures for {}: {}".format(sycl_device, ", ".join(architectures)))
if len(architectures) != 1 or "unknown" in architectures:
architectures = set()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be safer to just raise an error and abandon testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we see more (or less) than one architecture returned, we should indeed abandon testing and report that as some fatal failure.

However, if we see unknown we should proceed, I think. This is important for our downstream where we launch those tests on new/unstable/unknown hardware and we don't really want to have a complete blocker like this in there. It is ok that some of the tests may be skipped because they require a certain architecture, but at least the rest of the testing will go through.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we require an explicit option telling us that "unknown" is fine? What I want to avoid is this auto-detection silently failing and then not running tests that we should be running on a particular runner/HW and thinking we have good test coverage there due to not seeing any failures when in reality the tests are simply skipped.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I want to avoid is this auto-detection silently failing and then not running tests that we should be running on a particular runner/HW and thinking we have good test coverage there due to not seeing any failures when in reality the tests are simply skipped.

We could have a test for the detection which will fail indicating us that. And accepting unknown could be done as a downstream customization.

Anyways, the opt-in to allow unknown looks good to me as well. I think it is up to you, @aelovikov-intel, I'm not a code owner in this area.

@jzc jzc temporarily deployed to WindowsCILock May 30, 2024 19:03 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock May 30, 2024 20:19 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock June 11, 2024 00:07 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock June 11, 2024 14:44 — with GitHub Actions Inactive
#undef __SYCL_ARCHITECTURE
#undef __SYCL_ARCHITECTURE_ALIAS
}
assert(false && "Unreachable!");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now have unknown architecture. I suppose that it should technically be a default: clause within the switch, but it is hard to achieve, so I suggest that we just return unknown here to avoid warnings that you see on Windows

@jzc jzc temporarily deployed to WindowsCILock June 20, 2024 14:43 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock June 20, 2024 16:23 — with GitHub Actions Inactive
@againull againull merged commit 2521c03 into intel:sycl Jun 20, 2024
13 checks passed
againull pushed a commit that referenced this pull request Jun 22, 2024
#13976 adds support for using the
`Architecture` output of `sycl-ls` when configuring `e2e` tests. This PR
adds an opt-in option to allow configuring `e2e` tests when the
architecture is `unknown`.
AlexeySachkov added a commit to AlexeySachkov/llvm that referenced this pull request Jul 4, 2024
Previously we had to explicitly specify that certain HW is present to
enable corresponding LIT features and make some tests run (or be
skipped). Forgetting to set that argument could lead to missing
coverage, or unrelated failures.

intel#13976 introduced automatic detection of HW architecture to
E2E tests. This patch is one in a series of transitioning our tests to
use that new functionality.

This patch transitions tests that use `gpu-intel-pvc` feature to use
auto-generated `architecture-intel_gpu_pvc`.

The patch does not completely removes `gpu-intel-pvc` usage in our LIT
scripts, because some other features depend on it and I would like to do
changes incrementally.
AlexeySachkov added a commit that referenced this pull request Jul 10, 2024
Previously we had to explicitly specify that certain HW is present to
enable corresponding LIT features and make some tests run (or be
skipped). Forgetting to set that argument could lead to missing
coverage, or unrelated failures.

#13976 introduced automatic detection of HW architecture to
E2E tests. This patch is one in a series of transitioning our tests to
use that new functionality.

This patch transitions tests that use `gpu-intel-pvc` feature to use
auto-generated `arch-intel_gpu_pvc`.

The patch does not completely removes `gpu-intel-pvc` usage in our LIT
scripts, because some other features depend on it and I would like to do
changes incrementally.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants