Skip to content

[CI] Automatically detect AMD architecture #16071

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Nov 14, 2024
Merged

Conversation

sarnex
Copy link
Contributor

@sarnex sarnex commented Nov 13, 2024

We can figure it out from the sycl-ls output. Confirmed working here

Closes: #16057

@sarnex sarnex force-pushed the sycl-devops-pr/sarnex/amd branch from bfdaf4e to 2d28503 Compare November 13, 2024 15:32
@sarnex sarnex marked this pull request as ready for review November 13, 2024 16:56
@sarnex sarnex requested a review from a team as a code owner November 13, 2024 16:56
@sarnex sarnex requested a review from uditagarwal97 November 13, 2024 16:56
@@ -284,11 +284,8 @@ jobs:
echo "opts=$CMAKE_EXTRA_ARGS" >> $GITHUB_OUTPUT
else
if [ "${{ contains(inputs.target_devices, 'ext_oneapi_hip') }}" == "true" ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am of the opinion that we should move all the runner specific code to the caller instead. For post-commit, in sycl-post-commit: https://github.com/intel/llvm/blob/sycl/.github/workflows/sycl-post-commit.yml#L85, just like what we do for docker image options. In this case, we can pass CMake build options to sycl-link-run-tests using extra_cmake_args.

Copy link
Contributor Author

@sarnex sarnex Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really runner specific code, it's target device specific code. Any time this workflow is called targeting HIP we will need this to run. Another downside of moving it up is we would have to duplicate this code in the pre and postcommit callsites, or make another workflow that implements this script in call it in both places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial thoughts were the same as @uditagarwal97 suggested. I'm not sure if that duplication is enough to convince me to do things differently, but I don't have strong opinion about it anyway.

Copy link
Contributor Author

@sarnex sarnex Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking now, even if we wanted to do it at the callsite, I don't think it will work if I understand what's happening correctly. So like here or below here, I don't think this runs on the runner that will run the test, I think it runs on the machine running the larger workflow. We need this command to run on the machine with the AMD GPU, so I'm not sure how we could move it up a layer. Also, since we need to run sycl-ls, we need the toolchain downloaded, which I don't think is available at the callsites either.

uditagarwal97
uditagarwal97 previously approved these changes Nov 13, 2024
Copy link
Contributor

@uditagarwal97 uditagarwal97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ayylol
Copy link
Contributor

ayylol commented Nov 13, 2024

Should this not be done in the e2e python scripts? that way this issue is not only resolved for the ci, but also when running the e2e tests locally?

also note we have this comment suggesting that the way we are setting amd_arch is a temporary solution. As will as this strange bit that seems to do autodetection for amd_arch, but only for adding the "gpu-amd-gfx90a feature.

@sarnex
Copy link
Contributor Author

sarnex commented Nov 13, 2024

Sorry, I should have thought about it more rather than do the minimum to get it working :)

I'll move it to the py script and update this PR, thanks for the idea.

@@ -6,10 +6,10 @@
//
//===----------------------------------------------------------------------===//

// RUN: %{build} -fsycl -fsycl-targets=amd_gpu_gfx90a %s -o %t.out
// RUN: %clangxx -fsycl -fsycl-targets=amd_gpu_gfx90a %s -o %t.out
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is a bug fix, we shouldn't use %{build} if we're setting -fsycl-targets/choosing file and output

// RUN: %{run} %t.out

// REQUIRES: gpu-amd-gfx90a
// REQUIRES: arch-amd_gpu_gfx90a
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change is just to remove a hack in the testing infra, should be exactly the same functionally

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey, just wondering if you were able to build and run these gfx90a tests. AFAIK we don't have these in our CI, so I don't think they ever ran.

Copy link
Contributor Author

@sarnex sarnex Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i actually tested on the one AMD machine i know not in our CI and it happens to be gfx90a (MI250)

amd_arch_prefix = "arch-amd_gpu_"
amd_device_arch = [i for i in config.sycl_dev_features["hip:gpu"] if amd_arch_prefix in i]
if len(amd_device_arch) == 0:
lit_config.error(
Copy link
Contributor Author

@sarnex sarnex Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i had to move this block down so the sycl-ls parsing we already do is available, but that meant i had to add a loop at the end to update the features

// REQUIRES: gpu-amd-gfx90a
// RUN: %{build} -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx90a -o %t.out
// REQUIRES: arch-amd_gpu_gfx90a
// RUN: %clangxx -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx90a -o %t.out
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add -fsycl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, fixed in latest commit

Signed-off-by: Sarnie, Nick <[email protected]>
Signed-off-by: Sarnie, Nick <[email protected]>
Signed-off-by: Sarnie, Nick <[email protected]>
@sarnex
Copy link
Contributor Author

sarnex commented Nov 13, 2024

@YuriPlyakhin Do you mind re-reviewing, I found a missed test fix.

@sarnex
Copy link
Contributor Author

sarnex commented Nov 13, 2024

@aelovikov-intel Just pushed the commit moving it into the existing loop, it's not too bad I don't think

Comment on lines 767 to 774
for a in architecture_feature:
arch = a
amd_arch_prefix = "arch-amd_gpu_"
if amd_arch_prefix not in arch or len(architecture_feature) != 1:
lit_config.error(
"Cannot detect architecture for AMD HIP device, specify it explicitly"
)
config.amd_arch = arch.replace(amd_arch_prefix, "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a bug with indentation here? Otherwise I don't see a reason behind the loop at line 767...

Copy link
Contributor Author

@sarnex sarnex Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because architecture_feature is a set and there's no easy way to get the first (and only, because of the error check) element out I think, indexing doesn't work. Another option is next(iter(architecture_feature)) I think.

But I'm not good at python so maybe there's something I missed

@uditagarwal97 @ayylol might know also

Copy link
Contributor

@aelovikov-intel aelovikov-intel Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Guaranteed to be a single element in the set
arch = [x for x in architecture_feature][0]

Or better yet not use set to start with :)

Copy link
Contributor Author

@sarnex sarnex Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, ill fix it.

the existing code is using a set for this existing var, probably for some reason, and i dont want to touch it :P

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it shouldn't have been a set at all. Someone probably just copied aspects/sg_sizes but that wasn't a good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tale as old as time

Signed-off-by: Sarnie, Nick <[email protected]>
Copy link
Contributor

@YuriPlyakhin YuriPlyakhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sarnex sarnex merged commit 73b0775 into sycl Nov 14, 2024
22 of 27 checks passed
@sarnex sarnex deleted the sycl-devops-pr/sarnex/amd branch November 14, 2024 20:30
KornevNikita pushed a commit that referenced this pull request Dec 5, 2024
We can figure it out from the `sycl-ls` output. Confirmed working
[here](https://github.com/intel/llvm/actions/runs/11841045635/job/32998817316?pr=16071)

Closes: #16057

---------

Signed-off-by: Sarnie, Nick <[email protected]>
aelovikov-intel added a commit to aelovikov-intel/llvm that referenced this pull request Jan 17, 2025
aelovikov-intel added a commit that referenced this pull request Jan 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] Use sycl-ls output to determine architecture for AMD HIP E2E testing
6 participants