Skip to content

[SYCL][NATIVECPU] Fix assert in MS name mangler #11103

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Oct 12, 2023

Conversation

uwedolinsky
Copy link
Contributor

This PR prevents the MS name mangler to assert on the unhandled CC_OpenCLKernel calling convention which can end up in the AST when targeting NativeCPU on Windows - see attached test case that triggers the assert. This assert is also triggered on a number of e2e tests on NativeCPU. This patch handles the CC_OpenCLKernel convention currently only for NativeCPU and the usual error is raised for other targets as before.

Making this change in an upstream llvm file may not be ideal - any suggestions for a better way to fix this are very welcome. Thanks!

@uwedolinsky uwedolinsky requested review from a team as code owners September 7, 2023 14:37
Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a front end change, can you add a frontend test instead?

@@ -2902,6 +2902,17 @@ void MicrosoftCXXNameMangler::mangleCallingConvention(CallingConv CC) {
else
Out << "w";
break;
case CC_OpenCLKernel:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ok to me but @premanandrao can you also please take a look?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a lot to learn in this area. I don't understand why OpenCL is relevant for the added test case. Is OpenCL always used for -fsycl-is-native-cpu?

Mangling also seems to be missing for IA64 ABIs. See CXXNameMangler::getCallingConvQualifierName() in clang/lib/AST/ItaniumMangle.cpp. Why isn't a similar change needed there? The test exercises Linux as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has anyone tried to determine what, if any, mangling is used by MSVC for OpenCL kernels?

Copy link
Contributor

@elizabethandrews elizabethandrews Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a lot to learn in this area. I don't understand why OpenCL is relevant for the added test case. Is OpenCL always used for -fsycl-is-native-cpu?

My understanding was that this is the calling convention for kernel. I think that when using -fsycl-is-native-cpu device compilation is done with the host triple and this results in us entering MS mangler for the kernel. @uwedolinsky did I get that correct?

Mangling also seems to be missing for IA64 ABIs. See CXXNameMangler::getCallingConvQualifierName() in clang/lib/AST/ItaniumMangle.cpp. Why isn't a similar change needed there? The test exercises Linux as well.

This is a good point. Why is this not an issue for Linux?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why OpenCL is relevant for the added test case.

OpenCL is not relevant for this test, but since the frontend attaches the clang CC_OpenCLKernel convention (independent from C++ ABI and target) to the kernel declaration here, we need to handle that convention in the currently used name mangler (which is the MS mangler when targeting Windows).

Is OpenCL always used for -fsycl-is-native-cpu?

-fsycl-is-native-cpu does not use or enable OpenCL in compiler or runtime. NativeCPU has no dependency on OpenCL.

Mangling also seems to be missing for IA64 ABIs. See CXXNameMangler::getCallingConvQualifierName() in clang/lib/AST/ItaniumMangle.cpp. Why isn't a similar change needed there?

CXXNameMangler::getCallingConvQualifierName() already handles the CC_OpenCLKernel and other conventions here (without providing mangling), but this doesn't even seem to be called on SYCL kernels when targeting the IA64 ABI, so that problem will not occur on that ABI.

The test exercises Linux as well.

We wanted the test to also cover other ABIs and different combinations of cpu triples (for possible cross compilation) to ensure they don't have similar problems. Some of these invocations may not exercise the fix but are still worth testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for explaining. I think I'm convinced this change is ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still isn't clear to me that this is the right change. Why is the CC_OpenCLKernel attached to SYCL kernels in the first place? Can we just stop attaching them to SYCL kernels? I think we should understand why they are needed before choosing to ignore them.

MSVC doesn't directly support the OpenCL language. From what I can tell, OpenCL kernels are always resolved by a call to clCreateKernel() that passes the name of the kernel as a string. Overloaded kernels are supported (at least by some implementations) by allowing a custom mangled name to be specified via an attribute. It seems any kind of mangling is therefore implementation dependent and thus, the Microsoft ABI doesn't specify one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still isn't clear to me that this is the right change. Why is the CC_OpenCLKernel attached to SYCL kernels in the first place?

Probably because of historical reasons since SYCL initially targeted OpenCL, many data structures and identifiers may have been inspired by it and SYCL kernels were implemented as OpenCL kernels. Also, querying the calling convention is one way to identify kernels.

Can we just stop attaching them to SYCL kernels?

Probably, but that would likely lead to a much larger, potentially intrusive SYCL frontend change I mentioned previously and I was trying to avoid that. If NativeCPU is the only SYCL target that has this issue, and there are no other issues elsewhere with using this convention "under the hood", it may not be worth making that large a change and instead it may be easier to just handle the assert.

MSVC doesn't directly support the OpenCL language. From what I can tell, OpenCL kernels are always resolved by a call to clCreateKernel() that passes the name of the kernel as a string. Overloaded kernels are supported (at least by some implementations) by allowing a custom mangled name to be specified via an attribute. It seems any kind of mangling is therefore implementation dependent and thus, the Microsoft ABI doesn't specify one.

Please note that this PR does not specify any MS mangling, it merely prevents the assert in the MS mangler triggered for the CC_OpenCLKernel convention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for being responsive to my comments. I won't hold this review up further; the change seems unlikely to cause any problems in the short term. However, we will have to revisit all of this as part of the SYCL upstreaming effort as the reuse of CC_OpenCLKernel for SYCL here is quite confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the clang infrastructure for OpenCL isn't used to implement SYCL kernels (not even when targeting OpenCL), then this "reuse" of CC_OpenCLKernel could probably be avoided. It may be possible to instead use the information from SYCLKernelAttr or other SYCL-related clang attributes that are currently attached to SYCL kernels in the AST. But such potentially larger refactoring is probably better done as a separate PR.

@uwedolinsky
Copy link
Contributor Author

Since this is a front end change, can you add a frontend test instead?

Thank you for your comments. I've converted this into a frontend test.
Would it be ok for us (the @intel/dpcpp-nativecpu-pi-reviewers ) to request ownership for clang/test/Frontend/sycl-native-cpu* to be able to maintain this and potentially other native-cpu -related frontend tests?

@elizabethandrews
Copy link
Contributor

elizabethandrews commented Sep 8, 2023

Thank you for your comments. I've converted this into a frontend test.

Sorry I didn't specify the folder earlier. Since this test is checking IR, I would except it to be in clang/test/CodeGenSYCL

Would it be ok for us (the @intel/dpcpp-nativecpu-pi-reviewers ) to request ownership for clang/test/Frontend/sycl-native-cpu* to be able to maintain this and potentially other native-cpu -related frontend tests?

I think this makes sense. I see there are several other which have similar ownership - #10096

@uwedolinsky
Copy link
Contributor Author

That's the test moved to CodeGenSYCL, and with the current naming we already have ownership over it. Thank you.

template <typename KernelName = name1, typename KernelType>
void parallel_for_work_group1(const KernelType &KernelFunc) {
kernel_parallel_for_work_group<KernelName, KernelType>(KernelFunc);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add the parallel_for_work_group1 function and the kernel_parall_for_work_group function to the mock header in the Inputs directory and use it from there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless this matches real header implementation, I don't think it should be added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having said that, if a barebones API to test nativecpu can be added to mock headers, that would be preferred. This is how we handle most new features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the test to reproduce the assert in the MS mangler using the existing mock API instead, so there is no need to add the reduced API from the test to the mock headers. Hope this is acceptable.

@uwedolinsky
Copy link
Contributor Author

Hello @intel/llvm-gatekeepers , this PR looks ready for merge. Thank you!

@bader bader changed the title [SYCLNATIVECPU] fix for assert in MS name mangler [SYCL][NATIVECPU] Fix assert in MS name mangler Sep 12, 2023
@bader
Copy link
Contributor

bader commented Sep 12, 2023

Making this change in an upstream llvm file may not be ideal

What does make you think so?

@uwedolinsky
Copy link
Contributor Author

Making this change in an upstream llvm file may not be ideal

What does make you think so?

Mainly because this issue seems to affect only the NativeCPU target. No other sycl or (upstream) clang/llvm component appears to use the CC_OpenCLKernel convention in C++ code for Windows targeting the Microsoft C++ ABI. Perhaps It may have been better to instead prevent CC_OpenCLKernel from "leaking" into code that reaches the Microsoft name mangler, but that would have likely required larger changes to DPCPP.
This change could also cause potential merge conflicts when updating MicrosoftMangle.cpp from upstream clang (especially if git histories change in either repo), which is why I was looking for alternatives.

@bader
Copy link
Contributor

bader commented Sep 13, 2023

Making this change in an upstream llvm file may not be ideal

What does make you think so?

Mainly because this issue seems to affect only the NativeCPU target. No other sycl or (upstream) clang/llvm component appears to use the CC_OpenCLKernel convention in C++ code for Windows targeting the Microsoft C++ ABI. Perhaps It may have been better to instead prevent CC_OpenCLKernel from "leaking" into code that reaches the Microsoft name mangler, but that would have likely required larger changes to DPCPP. This change could also cause potential merge conflicts when updating MicrosoftMangle.cpp from upstream clang (especially if git histories change in either repo), which is why I was looking for alternatives.

I'll leave it for @elizabethandrews to decide, but eventually this code is supposed to be upstreamed to clang. In my opinion, the earlier we get the feedback from the community, the better (especially if larger changes are required). The general rule is to avoid committing patches, which we think will be rejected in upstream.

@elizabethandrews
Copy link
Contributor

Perhaps It may have been better to instead prevent CC_OpenCLKernel from "leaking" into code that reaches the Microsoft name mangler, but that would have likely required larger changes to DPCPP.

Hmm....Just to make sure I am understanding this PR correctly - For native CPU we compile device code with target triple. In this case the target is Windows, and this is causing an issue because Microsoft mangler is being applied to openCL kernels and mangler cannot handle CC_OpenCLKernel calling convention. The solution in this PR is to explicitly handle CC_OpenCLKernel calling convention.

I guess the question is whether it correct that the kernels have this calling convention (I do not know the answer to this. @premanandrao do you?). If it is correct, I don't see a problem with explicitly handling it at the mangler, but if it is incorrect we should probably fix it before reaching the mangler. @AaronBallman can you also please take a look here?

@elizabethandrews
Copy link
Contributor

I guess the question is whether it correct that the kernels have this calling convention (I do not know the answer to this. @premanandrao do you?). If it is correct, I don't see a problem with explicitly handling it at the mangler, but if it is incorrect we should probably fix it before reaching the mangler. @AaronBallman can you also please take a look here?

@AaronBallman is on vacation this week. @tahonermann @premanandrao could you please weigh in here?

h.parallel_for_work_group<name1>(sycl::range<1>(2),sycl::range<1>(1), [=](sycl::group<1> G) {});
}

// CHECK: void @_ZTS5name1(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_ZTS5name1 is an Itanium mangled name. Shouldn't there be a check for a Microsoft mangled name? How does this check pass on Windows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All SYCL targets need to generate the same name for a kernel to enable lookup in different device binaries. All SYCL targets in DPC++ use the Itanium mangling for kernel names.

We still need the MS mangler fix though because when targeting NativeCPU to Windows the MS C++ ABI is used (causing the assert), but subsequently SemaSYCL generates the unique Itanium-mangled name for each kernel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm following. We generate a mangled name for the MS C++ ABI that is never used? If so, wouldn't a better fix be to skip generating such a mangled name in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm following. We generate a mangled name for the MS C++ ABI that is never used? If so, wouldn't a better fix be to skip generating such a mangled name in the first place?

We were considering this but it would probably be more intrusive to query/skip/change the kernel name mangling only for NativeCPU kernels targeting Windows than just handling the CC_OpenCLKernel convention which is already in core clang. We wanted to avoid changing mangling in the core clang Windows pipeline just for NativeCPU.

Also, SemaSYCL currently always generates a "stable name" (using the Itanium mangling) irrespective of what C++ ABI is used. NativeCPU just selects that existing name - it doesn't create a new name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to close out this thread... I've still been struggling to follow these explanations, but as mentioned elsewhere, I won't hold this review up since the change should be harmless, but we'll need to revisit this as part of SYCL upstreaming; the use of CC_OpenCLKernel in this context just doesn't make sense.

@bader bader merged commit a413a8c into intel:sycl Oct 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants