[SYCL] Diagnose uses of zero length arrays #4888

Fznamznon · 2021-11-03T18:16:16Z

Adds diagnosing on attempt to use zero length arrays, pointers, refs, arrays
of them and structs/classes containing all of it.
A couple of assertions were removed from SemaSYCL because the new
diagnostics are deferred and emitted later than SemaSYCL routines.
In case a struct/class with zero length array is used this emits a set
of notes pointing out how zero length array got into used struct, like
this:

struct ContainsArr {
  int A[0]; // note: field of illegal type declared here
};
struct Wrapper {
  ContainsArr F; // note: within field of type ContainsArr declared here
  // ...
}

// Device code
Wrapper W;
W.use(); // error: zero-length arrays are not permitted

Total deep check of each used declaration may result in double
diagnosing at the same location.

Adds diagnosing on attempt to use zero length arrays, pointers, refs, arrays of them and structs/classes containing all of it. A couple of assertions were removed from SemaSYCL because the new diagnostics are deferred and emitted later than SemaSYCL routines. In case a struct/class with zero length array is used this emits a set of notes pointing out how zero length array got into used struct, like this: ``` struct ContainsArr { int A[0]; // note: field of illegal type declared here }; struct Wrapper { ContainsArr F; // note: within field of type ContainsArr declared here // ... } // Device code Wrapper W; W.use(); // error: zero-length arrays are not permitted ``` Total deep check of each used declaration may result in double diagnosing at the same location.

erichkeane · 2021-11-03T18:27:47Z

clang/lib/Sema/SemaSYCL.cpp

+  };
+
+  // In case we have a Record used do the DFS for a bad field
+  SmallVector<const ValueDecl *, 4> StackForRecursion;


We shouldn't be doing this level of deep checking in the CFE. Walking the AST this deep is exactly the stuff @AaronBallman and I have been warning against/trying to stop us from doing.

Hmm... I guess at second look, this is going through the fields instead of calls, so it isn't as bad as I thought.

I'd still recommend doing this differently, I'd suggest diagnosing by calling RecordDecl::CheckSYCLTypes (or something, the point is that it loops through its fields and checks, plus recursing down CheckSYCLTypes when necessary), then having it return whether it diagnosed or not so that you can emit the note if you care to.

Printing EVERY error, or the history for every error is likely overly verbose anyway.

Hmm... I guess at second look, this is going through the fields instead of calls, so it isn't as bad as I thought.

Yes, it is going through fields. Actually I didn't invent that, it seems clang already does such thing for OpenCL kernel arguments

llvm/clang/lib/Sema/SemaDecl.cpp

Line 8958 in 35ccdd5

SmallVector<const Decl *, 4> VisitStack;

.

I'd suggest diagnosing by calling RecordDecl::CheckSYCLTypes (or something, the point is that it loops through its fields and checks, plus recursing down CheckSYCLTypes when necessary), then having it return whether it diagnosed or not so that you can emit the note if you care to.

Do you mean we should stop recursion once the first field of illegal type is found?

Ah, weird! I'm surprised that made it by community review, though I do know that the OpenCL code owners are a bit lax as far as code review effectiveness. That said, if instead of this patch we could just generalize the OpenCL version, at least we wouldn't be doing further harm (AND this would be a good opportunity to upstream this immediately instead of doing here).

I would probably give up on the notes after the 1st in each struct/path. Blitting out the 'how we got here' non stop is just going to be a giant mess of notes I suspect.

That said, if instead of this patch we could just generalize the OpenCL version

I will try that, but I'm pretty sure this will be pretty annoying to do (if even possible!), since OpenCL code seems very... OpenCL specific.

we wouldn't be doing further harm

If you think this does harm, I'm open for any ideas. Do recursive search for bad struct fields is the best way to solve the problem that I could figure out.

this would be a good opportunity to upstream this immediately instead of doing here

I believe the best way to upstream this is to ask developers of other single source offload models (CUDA/OMP/maybeHIP?) if they need such thing and generalize the function I've added across them. OpenCL folks already have their diagnostic, the generalization will likely complicate the code and I think they don't care about SYCL and therefore the change won't bring anything good for them, so this will end up rejected or it will take forever to merge.

C++ is not a multi-pass programming language.

But maybe SYCL is?

I got your point. I was just only trying to say that need to parse the entire TU to understand what is what is not only SYCL problem. Deferred diagnostics are widely used across clang, not only in checkTypeSupport(). I agree that SYCL could define device code entry more precisely like CUDA does with its "device/host/etc" attributes instead of "everything called from template function with this name magically becomes device code". But even CUDA has places where deferred diagnostics needed, that means even there it is not possible to understand where is device code until everything is parsed.

But maybe SYCL is?

SYCL is supposed to be an extension to C++ and I'd question whether redefining the compilation model is a conforming extension in practice.

I got your point. I was just only trying to say that need to parse the entire TU to understand what is what is not only SYCL problem. Deferred diagnostics are widely used across clang, not only in checkTypeSupport().

I agree! The difference is: deferred diagnostics typically do not require deferring to the end of the TU, but instead to the end of the function definition or class definition (or perhaps instantiation thereof). Function bodies and class bodies are generally far more bounded than an entire TU. (Consider how well this scales to TUs that are code generated; you could get 1 million lines of code that forms an AST we have to walk multiple times only to discover that there was little or no SYCL code anywhere in the TU to begin with.)

SYCL is supposed to be an extension to C++

Okay, I didn't see SYCL from this point of view and rather perceived it as a separate C++ based language.

@erichkeane , I've reduced number of emitted notes, all notes are emitted only when the first error is discovered. Please let me know if this became better.

I'm ok with it, I'll leave the decision on the rest to the code owners.

C++ is not a multi-pass programming language. [@AaronBallman]

But maybe SYCL is? [@Fznamznon]

Not really. This is an implementation detail.

I got your point. I was just only trying to say that need to parse the entire TU to understand what is what is not only SYCL problem. Deferred diagnostics are widely used across clang, not only in checkTypeSupport(). I agree that SYCL could define device code entry more precisely like CUDA does with its "device/host/etc" attributes instead of "everything called from template function with this name magically becomes device code".

Argh! I have a stroke! I need some oxygen! :-) I think you are ready to send your résumé to NVIDIA at that point. @brycelelbach you have a candidate! ;-)
Single-source and plain C++ without extension is the essence of SYCL.

Even handling templates in plain C++ requires some evaluation delay to provide better diagnostics.

But even CUDA has places where deferred diagnostics needed, that means even there it is not possible to understand where is device code until everything is parsed.

And I guess that if NVIDIA is up-streaming in the future their purer NVC++ C++ compiler they will have a somehow similar issue on this aspect. Any feedback from NVIDIA compiler team on this?

AaronBallman · 2021-11-08T14:03:41Z

clang/lib/Sema/Sema.cpp

+  // In addition such replacement doesn't seem to be possible for types without
+  // constant byte size like zero length arrays. So, do a deep check for SYCL.


What about flexible array members?

Can be diagnosed this way as well. I thought about that, but I left it for next patches to keep this one simpler/smaller.

AaronBallman · 2021-11-08T14:05:51Z

clang/lib/Sema/SemaSYCL.cpp

+  };
+
+  // In case we have a Record used do the DFS for a bad field
+  SmallVector<const ValueDecl *, 4> StackForRecursion;


Ultimately, where we want this diagnostic to live is within SemaDecl.cpp when processing the declaration of a zero-length array. The fact that we have to wait until the entire TU is complete before deciding whether we can or cannot diagnose a declaration elsewhere is a design flaw.

Fznamznon · 2021-11-10T17:55:37Z

I'm also thinking about adding a new error message that says "zero-length arrays are not permitted in SYCL" because right now it says "zero-length arrays are not permitted in C++" which might be confusing because the host code is also C++.
I know that I probably was the one who suggested to re-use the community diagnostic in similar check inside SemaSYCL, because re-using is always better for upstream, but still....
Reviewers, please let me know your thoughts.

AaronBallman · 2021-11-10T18:01:25Z

I'm also thinking about adding a new error message that says "zero-length arrays are not permitted in SYCL" because right now it says "zero-length arrays are not permitted in C++" which might be confusing because the host code is also C++. I know that I probably was the one who suggested to re-use the community diagnostic in similar check inside SemaSYCL, because re-using is always better for upstream, but still.... Reviewers, please let me know your thoughts.

I think the change in diagnostic text isn't a bad idea. Zero-length arrays are an extension in C++ that Clang supports by default, so the current wording could be confusing.

Fznamznon · 2021-11-10T18:17:40Z

I think the change in diagnostic text isn't a bad idea. Zero-length arrays are an extension in C++ that Clang supports by default, so the current wording could be confusing.

Done then, thank you.

AaronBallman

Just nits from me.

clang/include/clang/Basic/DiagnosticSemaKinds.td

clang/lib/Sema/SemaSYCL.cpp

elizabethandrews · 2021-11-12T14:56:40Z

clang/lib/Sema/SemaSYCL.cpp

  return false;
 }

+// NOTE: In case this function gets upstreamed and other single source


Is there a reason you're adding this code here, instead of trying to upstream it?

I think upstreaming of non-trivial SYCL-related patch is a very long shot that probably may not succeed anyway while our customers will face non-user friendly crashes/problems later in the toolchain, so this should be done as a separate task.
In addition, after Aaron's comments - #4888 (comment) , I don't think it is worth it at all.

I'm not sure what the required protocol here is. Unless we have have a non-negotiable strict deadline, I believe we should be upstreaming whatever can be upstreamed (even if it delays patch). If the patch in its current form is not up-streamable, that should be fixed. From what I can see we already have required infrastructure for this patch upstream (like checkTypeSupport), so I don't see what is blocking an attempt to upstream it.

@AaronBallman I am not sure I understand your comment here - #4888 (comment). This patch is building on top of something which has already been upstreamed. Are you suggesting that will be reverted in the future?

Tagging @bader and @zahiraam for their input as well since they drive upstreaming efforts

@AaronBallman I am not sure I understand your comment here - #4888 (comment). This patch is building on top of something which has already been upstreamed. Are you suggesting that will be reverted in the future?

That comment was to suggest that I'm not certain the original code should have landed upstream because it has design issues, and building on top of a questionable foundation is something we want to avoid. However, because it's already landed (for a reasonable amount of time) and downstreams are building on top of it, I think it would cause undue churn to just go and revert it. (That doesn't mean it won't get reverted in the future if someone upstream sees the code and decides it's a problem they won't live with.)

Our goal is to try to upstream everything we can and upstreaming this change fits that goal. To me, the only danger with upstreaming this patch would be that someone in community takes issue with the foundational bits and asks you to rewrite those (which would be a good technical outcome but is not likely planned or particularly trivial effort).

I'm curious what @bader and @zahiraam think, but I would say it's reasonable to upstream the patch (but it could lead to unplanned work).

I think whether this patch is up-streamed now or later there will always be un-planed work. It would be my preference to wait for it to be up-streamed only because it was a request from community to do a couple of things before SemaSYCL.cpp gets into an acceptable shape. The handling for special classes (this has already been done) but also the mix if "sema" and "codegen" work that's happening in SemaSYCL.cpp.

Just to make sure we are on the same page - this patch actually doesn't depend on things already implemented in SemaSYCL (like kernel generation). I could put the function that this patch adds to SemaSYCL anywhere else, I just did that because it is called only for SYCL. So nothing blocks from upstreaming this patch in parallel with kernel generation.

Then upstreaming it is fine. Thanks.

Okay then. I've moved it to https://reviews.llvm.org/D114080

clang/lib/Sema/SemaSYCL.cpp

clang/test/SemaSYCL/zero-length-arrays.cpp

clang/lib/Sema/SemaSYCL.cpp

Fznamznon · 2021-11-17T13:10:09Z

Moved to LLORG. Will close this PR in case this is categorically not accepted there.

Fznamznon requested review from elizabethandrews, premanandrao and smanna12 as code owners November 3, 2021 18:16

erichkeane reviewed Nov 3, 2021

View reviewed changes

AaronBallman reviewed Nov 8, 2021

View reviewed changes

Reduce number of notes

44d8956

Change diagnostic message

77280ce

Fznamznon requested review from AaronBallman and erichkeane November 11, 2021 15:18

AaronBallman reviewed Nov 11, 2021

View reviewed changes

clang/include/clang/Basic/DiagnosticSemaKinds.td Outdated Show resolved Hide resolved

clang/lib/Sema/SemaSYCL.cpp Outdated Show resolved Hide resolved

clang/lib/Sema/SemaSYCL.cpp Outdated Show resolved Hide resolved

clang/lib/Sema/SemaSYCL.cpp Outdated Show resolved Hide resolved

Fznamznon added 2 commits November 12, 2021 17:12

Merge branch 'sycl' into zero-len-arr

256d92c

Apply review feedback

73d3db1

Fznamznon requested a review from AaronBallman November 12, 2021 14:47

elizabethandrews reviewed Nov 12, 2021

View reviewed changes

smanna12 reviewed Nov 12, 2021

View reviewed changes

clang/lib/Sema/SemaSYCL.cpp Show resolved Hide resolved

Fznamznon closed this Nov 17, 2021

		// In addition such replacement doesn't seem to be possible for types without
		// constant byte size like zero length arrays. So, do a deep check for SYCL.

[SYCL] Diagnose uses of zero length arrays #4888

[SYCL] Diagnose uses of zero length arrays #4888

Uh oh!

Conversation

Fznamznon commented Nov 3, 2021

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Fznamznon Nov 3, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Fznamznon Nov 8, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

keryell Nov 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Fznamznon commented Nov 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AaronBallman commented Nov 10, 2021

Uh oh!

Fznamznon commented Nov 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AaronBallman left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Fznamznon commented Nov 17, 2021

Uh oh!

Fznamznon Nov 3, 2021 •

edited

Loading

Fznamznon Nov 8, 2021 •

edited

Loading

keryell Nov 11, 2021 •

edited

Loading

Fznamznon commented Nov 10, 2021 •

edited

Loading

Fznamznon commented Nov 10, 2021 •

edited

Loading