Skip to content

[Clang][Sema] Print more static_assert exprs #74852

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

sethp
Copy link
Contributor

@sethp sethp commented Dec 8, 2023

This change introspects more values involved in a static_assert, and extends the supported set of operators for introspection to include binary operator method calls.

It's intended to address the use-case where a small static_assert helper looks something like this (via constexpr-builtin-bit-cast.cpp):

struct int_splicer {
  unsigned x;
  unsigned y;

  constexpr bool operator==(const int_splicer &other) const {
    return other.x == x && other.y == y;
  }
};

When used like so:

constexpr int_splicer got{1, 2};
constexpr int_splicer want{3, 4};
static_assert(got == want);

Then we'd expect to get the error:

Static assertion failed due to requirement 'got == want'

And this change adds the helpful note:

Expression evaluates to '{1, 2} == {3, 4}'

This change introspects more values involved in a static_assert, and
extends the supported set of operators for introspection to include
binary operator method calls.

It's intended to address the use-case where a small static_assert helper
looks something like this (via `constexpr-builtin-bit-cast.cpp`):

```c++
struct int_splicer {
  unsigned x;
  unsigned y;

  constexpr bool operator==(const int_splicer &other) const {
    return other.x == x && other.y == y;
  }
};
```

When used like so:

```c++
constexpr int_splicer got{1, 2};
constexpr int_splicer want{3, 4};
static_assert(got == want);
```

Then we'd expect to get the error:

```
Static assertion failed due to requirement 'got == want'
```

And this change adds the helpful note:

```
Expression evaluates to '{1, 2} == {3, 4}'
```
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Dec 8, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 8, 2023

@llvm/pr-subscribers-clang

Author: None (sethp)

Changes

This change introspects more values involved in a static_assert, and extends the supported set of operators for introspection to include binary operator method calls.

It's intended to address the use-case where a small static_assert helper looks something like this (via constexpr-builtin-bit-cast.cpp):

struct int_splicer {
  unsigned x;
  unsigned y;

  constexpr bool operator==(const int_splicer &other) const {
    return other.x == x && other.y == y;
  }
};

When used like so:

constexpr int_splicer got{1, 2};
constexpr int_splicer want{3, 4};
static_assert(got == want);

Then we'd expect to get the error:

Static assertion failed due to requirement 'got == want'

And this change adds the helpful note:

Expression evaluates to '{1, 2} == {3, 4}'

Full diff: https://github.com/llvm/llvm-project/pull/74852.diff

5 Files Affected:

  • (modified) clang/lib/Sema/SemaDeclCXX.cpp (+23-8)
  • (modified) clang/test/CXX/class/class.compare/class.eq/p3.cpp (+10-10)
  • (modified) clang/test/CXX/class/class.compare/class.rel/p2.cpp (+5-5)
  • (modified) clang/test/CXX/over/over.match/over.match.funcs/over.match.oper/p9-2a.cpp (+1-1)
  • (modified) clang/test/SemaCXX/static-assert-cxx17.cpp (+1-1)
diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index c6218a491aece..e3d46c3140741 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -17219,6 +17219,13 @@ static bool ConvertAPValueToString(const APValue &V, QualType T,
     OS << "i)";
   } break;
 
+  case APValue::ValueKind::Array:
+  case APValue::ValueKind::Vector:
+  case APValue::ValueKind::Struct: {
+    llvm::raw_svector_ostream OS(Str);
+    V.printPretty(OS, Context, T);
+  } break;
+
   default:
     return false;
   }
@@ -17256,11 +17263,10 @@ static bool UsefulToPrintExpr(const Expr *E) {
 /// Try to print more useful information about a failed static_assert
 /// with expression \E
 void Sema::DiagnoseStaticAssertDetails(const Expr *E) {
-  if (const auto *Op = dyn_cast<BinaryOperator>(E);
-      Op && Op->getOpcode() != BO_LOr) {
-    const Expr *LHS = Op->getLHS()->IgnoreParenImpCasts();
-    const Expr *RHS = Op->getRHS()->IgnoreParenImpCasts();
-
+  const auto Diagnose = [&](const Expr *LHS, const Expr *RHS,
+                            const llvm::StringRef &OpStr) {
+    LHS = LHS->IgnoreParenImpCasts();
+    RHS = RHS->IgnoreParenImpCasts();
     // Ignore comparisons of boolean expressions with a boolean literal.
     if ((isa<CXXBoolLiteralExpr>(LHS) && RHS->getType()->isBooleanType()) ||
         (isa<CXXBoolLiteralExpr>(RHS) && LHS->getType()->isBooleanType()))
@@ -17287,10 +17293,19 @@ void Sema::DiagnoseStaticAssertDetails(const Expr *E) {
                                  DiagSide[I].ValueString, Context);
     }
     if (DiagSide[0].Print && DiagSide[1].Print) {
-      Diag(Op->getExprLoc(), diag::note_expr_evaluates_to)
-          << DiagSide[0].ValueString << Op->getOpcodeStr()
-          << DiagSide[1].ValueString << Op->getSourceRange();
+      Diag(E->getExprLoc(), diag::note_expr_evaluates_to)
+          << DiagSide[0].ValueString << OpStr << DiagSide[1].ValueString
+          << E->getSourceRange();
     }
+  };
+
+  if (const auto *Op = dyn_cast<BinaryOperator>(E);
+      Op && Op->getOpcode() != BO_LOr) {
+    Diagnose(Op->getLHS(), Op->getRHS(), Op->getOpcodeStr());
+  } else if (const auto *Op = dyn_cast<CXXOperatorCallExpr>(E);
+             Op && Op->isInfixBinaryOp()) {
+    Diagnose(Op->getArg(0), Op->getArg(1),
+             getOperatorSpelling(Op->getOperator()));
   }
 }
 
diff --git a/clang/test/CXX/class/class.compare/class.eq/p3.cpp b/clang/test/CXX/class/class.compare/class.eq/p3.cpp
index 04db022fe7302..53c4dda133301 100644
--- a/clang/test/CXX/class/class.compare/class.eq/p3.cpp
+++ b/clang/test/CXX/class/class.compare/class.eq/p3.cpp
@@ -6,11 +6,11 @@ struct A {
 };
 
 static_assert(A{1, 2, 3, 4, 5} == A{1, 2, 3, 4, 5});
-static_assert(A{1, 2, 3, 4, 5} == A{0, 2, 3, 4, 5}); // expected-error {{failed}}
-static_assert(A{1, 2, 3, 4, 5} == A{1, 0, 3, 4, 5}); // expected-error {{failed}}
-static_assert(A{1, 2, 3, 4, 5} == A{1, 2, 0, 4, 5}); // expected-error {{failed}}
-static_assert(A{1, 2, 3, 4, 5} == A{1, 2, 3, 0, 5}); // expected-error {{failed}}
-static_assert(A{1, 2, 3, 4, 5} == A{1, 2, 3, 4, 0}); // expected-error {{failed}}
+static_assert(A{1, 2, 3, 4, 5} == A{0, 2, 3, 4, 5}); // expected-error {{failed}} expected-note {{evaluates to}}
+static_assert(A{1, 2, 3, 4, 5} == A{1, 0, 3, 4, 5}); // expected-error {{failed}} expected-note {{evaluates to}}
+static_assert(A{1, 2, 3, 4, 5} == A{1, 2, 0, 4, 5}); // expected-error {{failed}} expected-note {{evaluates to}}
+static_assert(A{1, 2, 3, 4, 5} == A{1, 2, 3, 0, 5}); // expected-error {{failed}} expected-note {{evaluates to}}
+static_assert(A{1, 2, 3, 4, 5} == A{1, 2, 3, 4, 0}); // expected-error {{failed}} expected-note {{evaluates to}}
 
 struct B {
   int a, b[3], c;
@@ -18,8 +18,8 @@ struct B {
 };
 
 static_assert(B{1, 2, 3, 4, 5} == B{1, 2, 3, 4, 5});
-static_assert(B{1, 2, 3, 4, 5} == B{0, 2, 3, 4, 5}); // expected-error {{failed}}
-static_assert(B{1, 2, 3, 4, 5} == B{1, 0, 3, 4, 5}); // expected-error {{failed}}
-static_assert(B{1, 2, 3, 4, 5} == B{1, 2, 0, 4, 5}); // expected-error {{failed}}
-static_assert(B{1, 2, 3, 4, 5} == B{1, 2, 3, 0, 5}); // expected-error {{failed}}
-static_assert(B{1, 2, 3, 4, 5} == B{1, 2, 3, 4, 0}); // expected-error {{failed}}
+static_assert(B{1, 2, 3, 4, 5} == B{0, 2, 3, 4, 5}); // expected-error {{failed}} expected-note {{evaluates to}}
+static_assert(B{1, 2, 3, 4, 5} == B{1, 0, 3, 4, 5}); // expected-error {{failed}} expected-note {{evaluates to}}
+static_assert(B{1, 2, 3, 4, 5} == B{1, 2, 0, 4, 5}); // expected-error {{failed}} expected-note {{evaluates to}}
+static_assert(B{1, 2, 3, 4, 5} == B{1, 2, 3, 0, 5}); // expected-error {{failed}} expected-note {{evaluates to}}
+static_assert(B{1, 2, 3, 4, 5} == B{1, 2, 3, 4, 0}); // expected-error {{failed}} expected-note {{evaluates to}}
diff --git a/clang/test/CXX/class/class.compare/class.rel/p2.cpp b/clang/test/CXX/class/class.compare/class.rel/p2.cpp
index 90115284d2bd0..07501c6a08184 100644
--- a/clang/test/CXX/class/class.compare/class.rel/p2.cpp
+++ b/clang/test/CXX/class/class.compare/class.rel/p2.cpp
@@ -10,15 +10,15 @@ namespace Rel {
     friend bool operator>=(const A&, const A&) = default;
   };
   static_assert(A{0} < A{1});
-  static_assert(A{1} < A{1}); // expected-error {{failed}}
+  static_assert(A{1} < A{1}); // expected-error {{failed}} expected-note {{'{1} < {1}'}}
   static_assert(A{0} <= A{1});
   static_assert(A{1} <= A{1});
-  static_assert(A{2} <= A{1}); // expected-error {{failed}}
+  static_assert(A{2} <= A{1}); // expected-error {{failed}} expected-note {{'{2} <= {1}'}}
   static_assert(A{1} > A{0});
-  static_assert(A{1} > A{1}); // expected-error {{failed}}
+  static_assert(A{1} > A{1}); // expected-error {{failed}} expected-note {{'{1} > {1}'}}
   static_assert(A{1} >= A{0});
   static_assert(A{1} >= A{1});
-  static_assert(A{1} >= A{2}); // expected-error {{failed}}
+  static_assert(A{1} >= A{2}); // expected-error {{failed}} expected-note {{'{1} >= {2}'}}
 
   struct B {
     bool operator<=>(B) const = delete; // expected-note 4{{deleted here}} expected-note-re 8{{candidate {{.*}} deleted}}
@@ -49,7 +49,7 @@ namespace NotEqual {
     friend bool operator!=(const A&, const A&) = default;
   };
   static_assert(A{1} != A{2});
-  static_assert(A{1} != A{1}); // expected-error {{failed}}
+  static_assert(A{1} != A{1}); // expected-error {{failed}} expected-note {{'{1} != {1}'}}
 
   struct B {
     bool operator==(B) const = delete; // expected-note {{deleted here}} expected-note-re 2{{candidate {{.*}} deleted}}
diff --git a/clang/test/CXX/over/over.match/over.match.funcs/over.match.oper/p9-2a.cpp b/clang/test/CXX/over/over.match/over.match.funcs/over.match.oper/p9-2a.cpp
index 95d6a55aee66a..8f31e8947a768 100644
--- a/clang/test/CXX/over/over.match/over.match.funcs/over.match.oper/p9-2a.cpp
+++ b/clang/test/CXX/over/over.match/over.match.funcs/over.match.oper/p9-2a.cpp
@@ -33,7 +33,7 @@ struct Y {};
 constexpr bool operator==(X x, Y) { return x.equal; }
 
 static_assert(X{true} == Y{});
-static_assert(X{false} == Y{}); // expected-error {{failed}}
+static_assert(X{false} == Y{}); // expected-error {{failed}} expected-note{{'{false} == {}'}}
 
 // x == y -> y == x
 static_assert(Y{} == X{true});
diff --git a/clang/test/SemaCXX/static-assert-cxx17.cpp b/clang/test/SemaCXX/static-assert-cxx17.cpp
index 41a7b025d0eb7..1d78915aa13e1 100644
--- a/clang/test/SemaCXX/static-assert-cxx17.cpp
+++ b/clang/test/SemaCXX/static-assert-cxx17.cpp
@@ -94,7 +94,7 @@ void foo6() {
   // expected-error@-1{{static assertion failed due to requirement '(const X<int> *)nullptr'}}
   static_assert(static_cast<const X<typename T::T> *>(nullptr));
   // expected-error@-1{{static assertion failed due to requirement 'static_cast<const X<int> *>(nullptr)'}}
-  static_assert((const X<typename T::T>[]){} == nullptr);
+  static_assert((const X<typename T::T>[]){} == nullptr); // expected-note{{expression evaluates to '{} == nullptr'}}
   // expected-error@-1{{static assertion failed due to requirement '(const X<int>[0]){} == nullptr'}}
   static_assert(sizeof(X<decltype(X<typename T::T>().X<typename T::T>::~X())>) == 0);
   // expected-error@-1{{static assertion failed due to requirement 'sizeof(X<void>) == 0'}} \

Copy link
Collaborator

@shafik shafik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel like the changes are an improvement to the diagnostic but I would like to hear from others who focus on clang front-end reviews.

@tahonermann
Copy link
Contributor

This looks like a good change to me since coercing a compiler to reveal the values of compile-time objects can be frustrating. We could really use something like constexpr_print from P2758 (Emitting messages at compile time).

@tbaederr
Copy link
Contributor

I like this change, the only thing I'm a little afraid of is that the output will deteriorate for large structs

@sethp
Copy link
Contributor Author

sethp commented Dec 18, 2023

Thanks for the feedback! I've definitely found it helpful trying to puzzle out why one of my static_asserts is failing, though it's certainly a far less complete solution than the document @tahonermann shared.

The output from printPretty will definitely fall apart for a complex structure (especially one with e.g. multiple bases, or lots of similar fields). Still, this change seems like an overall improvement to me: it's still quite possible to cut a complex struct comparison up into multiple static_asserts. Right now I feel more or less compelled to cut every comparison down to asserts on built-in types, and this output was intended to raise that floor up to "a struct with a handful of fields and an operator==."

That said, I'm receiving approximately $0 from Big printPretty lobbying groups, so I'm happy to walk the structure however we feel ConvertAPValueToString ought to do the conversion. I do quite like how clangd does its inline initializer hints, so maybe something like:

Expression evaluates to '{x: 1, y: 2} == {x: 3, y: 4}'

for the original example, or for a int[2]:

Expression evaluates to '{[0]=1, [1]=2} == ...'

What do you think?

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too would like to see some more complicated dumps for 'complicated' structs (inheritance, multiple inheritance, etc), and vectors/arrays of complicated structs.

I am really concerned with the 'depth' that this will show and be unwieldy as a result.

sethp added 2 commits January 5, 2024 08:49
Oops, those were just things I was curious about, not meant to be left
in.
Prior to this commit, it was impossible to use the simple string
matching directives to look for most content that contains `{{`, such
as:

```
// expected-note {{my_struct{{1}, 2}}}
```

Which would parse like so:

```
             "nested" brace v
// expected-note {{my_struct{{1}, 2}}}
          closes the nested brace  ^ |
                            trailing }
```

And the frontend would complain 'cannot find end ('}}') of expected'.

At this snapshot, VerifyDiagnosticConsumer's parser now counts the
opening braces and looks for a matching length of closing sigils,
allowing the above to be written as:

```
// expected-note {{{my_struct{{1}, 2}}}}
   opening brace |-|                 |-|
  closing brace is '}}}', found here ^
```
@sethp
Copy link
Contributor Author

sethp commented Jan 9, 2024

In playing around with this some more, I found it was pretty effective for me to have an intializer-like expression (as in e769f15 ), because then I could copy/paste the output from clang directly into a static_assert() in the source file and start cutting it down interactively with clangd's guidance and my editor identifying common subexpressions:

// evaluates to 'A{0, {0, 3, 4}, 5} == A{1, {2, 3, 4}, 5}' ->
static_assert(A{0, {0, 3, 4}, 5} == A{1, {2, 3, 4}, 5}); // ->
static_assert(A{0, {0}, 5} == A{1, {2}, 5}); // ->
static_assert(A{0, {0}} == A{1, {2}});

gets me to a minimal "diff" of the complex struct plus some nice paths to the different elements:

image

I could reproduce that in the textual output, but I worry that'd be too noisy without the dimming/color text effects.

Comment on lines 43 to 44
// expected-note@+1 {{{evaluates to 'C{{1, {2, 3, 4}, 5}, {7, 6}, 0} == C{{0, {0, 3, 4}, 5}, {5, 0}, 1}'}}}
static_assert(cc == C{a0, {5}, C::E2}); // expected-error {{failed}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB:

static_assert(C{{1, {2, 3, 4}, 5}, {7, 6}, 0} == C{{0, {0, 3, 4}, 5}, {5, 0}, 1});
//                                         ^                                  ^

doesn't work as an example of two fully initialized Cs because the unnamed enum can't be initialized from the ints indicated above; if we like this idea, I'll look into how to print those out as C::E1 and C::E2.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea of printing enum values (though we obviously have to revert to the number in cases where there isn't a value).

However, these longer ones make me wonder if we should be printing JUST the differences here instead of all of this. I realize this is a divergence in the direction of this patch, but feels like it would be a MUCH nicer experience.

MAYBE it is a second note? And perhaps it should only happen when there is no user-defined == (and maybe the printing everythign should only happen in that case?).

Consider:

struct S {
int a,b,c;
};
constexpr bool operator==(const S&, const S&) { return false;}

static_assert(S{1,2,3} == S{1,2,3});

This diagnostic would be REALLY confusing as is, right? 'static-assert-failed' followed by 'S{1,2,3} != S{1,2,3}`.

Copy link
Contributor Author

@sethp sethp Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is confusing, but I don't think this change makes it any worse; the output I see from clang with my change is:

clang/test/SemaCXX/static-assert-diagnostics.cpp:177:15: error: static assertion failed due to requirement 'S{1, 2, 3} == S{1, 2, 3}'
  177 | static_assert(S{1,2,3} == S{1,2,3});
      |               ^~~~~~~~~~~~~~~~~~~~
clang/test/SemaCXX/static-assert-diagnostics.cpp:177:24: note: expression evaluates to 'S{1, 2, 3} == S{1, 2, 3}'
  177 | static_assert(S{1,2,3} == S{1,2,3});
      |               ~~~~~~~~~^~~~~~~~~~~

The note this change adds is redundant, sure, but I don't think it's more confusing than omitting it: either way, I'm going to go hunt for an operator== between two Ss and see if I can figure out why two things that sure feel like they ought to be == aren't.

A different example with the same operator== overload there might be something like this:

struct alignas(int) S {
unsigned char a, b, c;
};
constexpr bool operator==(const S&, const S&) { return false; }

static_assert(S{1, 2, 3} == std::bit_cast<S>(0x030201));

Which now outputs:

clang/test/SemaCXX/static-assert-diagnostics.cpp:177:15: error: static assertion failed due to requirement 'S{1, 2, 3} == std::bit_cast(197121)'
  177 | static_assert(S{1, 2, 3} == std::bit_cast<S>(0x030201));
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
clang/test/SemaCXX/static-assert-diagnostics.cpp:177:26: note: expression evaluates to 'S{1, 2, 3} == struct S{1, 2, 3}'
  177 | static_assert(S{1, 2, 3} == std::bit_cast<S>(0x030201));
      |               ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here the note is, I think, quite helpful: to me it very strongly suggests to me that the issue isn't the bit_cast or a endian-ness mismatch or anything like that, but the operator== implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I also wanted to add: I like the idea of adding extra notes here when possible to be more helpful in diagnosing the problem. To me, that's what this PR does: adds more context that will help in at least a few situations that I've found.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, adding notes when it doesn't add additional information is a net-loss. Folks already complain about the number of our error/notes when trying to figure something out, so I'm concerned this is actually making it worse when we don't print something 'different' in the note.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear your frustration there: it's a very goldilocks problem to get the feedback "just right," and not one made easier by the, ah, breadth of individual preferences (and the cartesian product thereof with the standards).

In this case, would it make sense to mirror what clang does with e.g. integer constants, and skip the note if they evaluate identically to as-written? So, we'd avoid noting the redundant info for static_assert(S{1, 2, 3} == S{1, 2, 3}), but still emit the result of the bit_cast for static_assert(S{1, 2, 3} == std::bit_cast<S>(0x030201))?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be a sensible improvement here. I think mixed with making sure we don't print arbitrarily sized structs/arrays would make this 'net improvement'.

erichkeane pushed a commit that referenced this pull request Jan 9, 2024
Prior to this commit, it was impossible to use the simple string
matching directives to look for any content that contains unbalanced
`{{` `}}` pairs, such as:

```
// expected-note {{my_struct{{1}, 2}}}
```

Which would parse like so:

```
             "nested" brace v
// expected-note {{my_struct{{1}, 2}}}
          closes the nested brace  ^ |
                            trailing }
```

And the frontend would complain 'cannot find end ('}}') of expected'.

At this snapshot, VerifyDiagnosticConsumer's parser now counts the
opening braces and looks for a matching length of closing sigils,
allowing the above to be written as:

```
// expected-note {{{my_struct{{1}, 2}}}}
   opening brace |-|                 |-|
  closing brace is '}}}', found here ^
```

This came about as a result of this discussion:
#74852 (comment)

cc @erichkeane
Teaches APValue::printPretty to print appropriate cast expressions for
enumeral types rather than just bare ints, and to respect the
pre-existing PrintingPolicy::UseEnumerators field (falling back to
printing a cast when the value doesn't match any named constant).
Comment on lines +722 to +725
if (ECD->isCXXClassMember())
ECD->printQualifiedName(Out, Policy);
else
ECD->printName(Out, Policy);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a weird heuristic: without it, we were trying to print a function-local enum constant as fn_name()::Constant.

It might be better to go back to just always using the QualifiedName though, I'm not sure.

Comment on lines 188 to 204
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
constexpr auto bits = 0x030201;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
constexpr auto bits = 0x01020300;
#else
#error "don't know what to do with mixed endianness"
#endif

struct alignas(decltype(bits)) S {
unsigned char a, b, c;
};
// confusing `==` on purpose
constexpr bool operator==(const S&, const S&) { return false; }

// the note should clearly implicate the `==` implementation
// expected-note@+1 {{{evaluates to 'S{1, 2, 3} == S{1, 2, 3}'}}}
static_assert(S{1, 2, 3} == std::bit_cast<S>(bits)); // expected-error {{failed}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and added this example to the test.

Also, writing it served as a small experience report: I got the big-endian bits wrong at first. The little endian constant's padding is in the implicit leading zeros, so when I flipped it I wrote 0x010203, and I saw:

clang/test/SemaCXX/static-assert-diagnostics.cpp:207:15: error: static assertion failed due to requirement '(anonymous namespace)::S{1, 2, 3} == std::bit_cast(bits)'
  207 | static_assert(S{1, 2, 3} == std::bit_cast<S>(bits)); // expected-error {{failed}}
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
clang/test/SemaCXX/static-assert-diagnostics.cpp:207:26: note: expression evaluates to 'S{1, 2, 3} == S{0, 1, 2}'
  207 | static_assert(S{1, 2, 3} == std::bit_cast<S>(bits)); // expected-error {{failed}}
      |               ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~

the new note's output 'S{1, 2, 3} == S{0, 1, 2}' effectively reminded me to add the padding bits to push the value "left."

There's plenty of cases where the new diagnostic breaks down, but it has been very helpful to me so far, and I wanted to make sure to highlight that.

@sethp
Copy link
Contributor Author

sethp commented Jan 14, 2024

I think this change is useful enough on its own that I'd like us to consider merging it: to me, teaching the compiler to do additional diffing or further clarifying the structure of complicated sub-objects feel like they'd benefit from having this work as a fallback. We could then build up from concrete examples of "it would be nice if the compiler diff'd here", and rely on the much more capable (and domain-aware) programmers to use the complete evaluation result for solving the general case.

That said, I'm curious if there's an upper limit we ought to set here on the size of the string: copy & paste work fine for hundreds or even thousands of characters, but if someone static_asserts on a multi-GB const char*, they might be in for a bit of trouble.

@erichkeane
Copy link
Collaborator

I think this change is useful enough on its own that I'd like us to consider merging it: to me, teaching the compiler to do additional diffing or further clarifying the structure of complicated sub-objects feel like they'd benefit from having this work as a fallback. We could then build up from concrete examples of "it would be nice if the compiler diff'd here", and rely on the much more capable (and domain-aware) programmers to use the complete evaluation result for solving the general case.

As I said before, I'm somewhat concerned this makes some situations worse by adding more noise, and is 'useless' in cases where the size is long enough to make the 'difference' invisible. There is a judgement of value here that I'd probably like to hear the opinion of @AaronBallman and @cor3ntin .

That said, I'm curious if there's an upper limit we ought to set here on the size of the string: copy & paste work fine for hundreds or even thousands of characters, but if someone static_asserts on a multi-GB const char*, they might be in for a bit of trouble.

We absolutely should limit the size of the notes, another concern I had though I thought we auto-snipped when printing large structures. If that doesn't happen, we need to NOT do this patch, crashing a user's terminal is worse than giving them insufficient information.

@sethp
Copy link
Contributor Author

sethp commented Jan 16, 2024

Yeah, definitely agree about adding no new issues like #71675 . The good news is that ought to make for an easy test case: something like struct { _BitInt(__BITINT_MAXWIDTH__ >> 6) F } should do it. I'll take a look!

Introduces size_limited_ostream which clips output sent to it to a
maximum number of leading/trailing bytes.

Makes use of that new facility to avoid emitting overly large struct or
vector/array values (unless explicitly requested with the new
`-fconstexpr-print-value-size-limit`.

Additionally, look a little deeper in the expr tree to try and  avoid
printing redundant notes when the expressions evaluate exactly
as-written.
Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs a 'merge' to get rid of the }}} patch changes from its diff, but I'm in favor of the patch otherwise. Probably could use a release note, and a look over @cor3ntin or @cjdb perhaps?

@sethp
Copy link
Contributor Author

sethp commented Jan 19, 2024

Yeah, I've still got some history editing to do: for now, the merge made the "all commits" diff look like it contains just the changes this PR represents. Thank you very much for your review feedback: I really like how it ended up coming out. I think we got something that's at a sweet spot for helpful but not noisy or overwhelming, and still lets static_assert authors apply more tools to creating more maintainable conditions.

justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
Prior to this commit, it was impossible to use the simple string
matching directives to look for any content that contains unbalanced
`{{` `}}` pairs, such as:

```
// expected-note {{my_struct{{1}, 2}}}
```

Which would parse like so:

```
             "nested" brace v
// expected-note {{my_struct{{1}, 2}}}
          closes the nested brace  ^ |
                            trailing }
```

And the frontend would complain 'cannot find end ('}}') of expected'.

At this snapshot, VerifyDiagnosticConsumer's parser now counts the
opening braces and looks for a matching length of closing sigils,
allowing the above to be written as:

```
// expected-note {{{my_struct{{1}, 2}}}}
   opening brace |-|                 |-|
  closing brace is '}}}', found here ^
```

This came about as a result of this discussion:
llvm#74852 (comment)

cc @erichkeane
@sethp
Copy link
Contributor Author

sethp commented Feb 5, 2024

Looks like I missed the branch for v18 with this one, hence the conflict (now resolved), too bad.

The test that's failing is looking for clang to emit some options that it's not; it looks like the same test is failing on main, so I haven't dug too deep into it. I believe I'm just waiting on one more reviewer's approval (possibly @cor3ntin or @cjdb) for this change, and then I'll rebase/squash the history & it will be ready.

Does that sound right to y'all?

@tbaederr
Copy link
Contributor

Ping @cor3ntin @cjdb

1 similar comment
@tbaederr
Copy link
Contributor

tbaederr commented Mar 7, 2024

Ping @cor3ntin @cjdb

@sethp
Copy link
Contributor Author

sethp commented Mar 15, 2024

Ping @cor3ntin @cjdb —this one definitely needs a rebase from me, but I don't want to do that if y'all have a review in progress.

@yronglin
Copy link
Contributor

FYI, Does this PR have some overlapping functions with __builtin_dump_struct in structure printing?

namespace {
struct BuiltinDumpStructGenerator {
Sema &S;
CallExpr *TheCall;
SourceLocation Loc = TheCall->getBeginLoc();
SmallVector<Expr *, 32> Actions;
DiagnosticErrorTrap ErrorTracker;
PrintingPolicy Policy;
BuiltinDumpStructGenerator(Sema &S, CallExpr *TheCall)
: S(S), TheCall(TheCall), ErrorTracker(S.getDiagnostics()),
Policy(S.Context.getPrintingPolicy()) {
Policy.AnonymousTagLocations = false;
}
Expr *makeOpaqueValueExpr(Expr *Inner) {
auto *OVE = new (S.Context)
OpaqueValueExpr(Loc, Inner->getType(), Inner->getValueKind(),
Inner->getObjectKind(), Inner);
Actions.push_back(OVE);
return OVE;
}
Expr *getStringLiteral(llvm::StringRef Str) {
Expr *Lit = S.Context.getPredefinedStringLiteralFromCache(Str);
// Wrap the literal in parentheses to attach a source location.
return new (S.Context) ParenExpr(Loc, Loc, Lit);
}
bool callPrintFunction(llvm::StringRef Format,
llvm::ArrayRef<Expr *> Exprs = {}) {
SmallVector<Expr *, 8> Args;
assert(TheCall->getNumArgs() >= 2);
Args.reserve((TheCall->getNumArgs() - 2) + /*Format*/ 1 + Exprs.size());
Args.assign(TheCall->arg_begin() + 2, TheCall->arg_end());
Args.push_back(getStringLiteral(Format));
Args.insert(Args.end(), Exprs.begin(), Exprs.end());
// Register a note to explain why we're performing the call.
Sema::CodeSynthesisContext Ctx;
Ctx.Kind = Sema::CodeSynthesisContext::BuildingBuiltinDumpStructCall;
Ctx.PointOfInstantiation = Loc;
Ctx.CallArgs = Args.data();
Ctx.NumCallArgs = Args.size();
S.pushCodeSynthesisContext(Ctx);
ExprResult RealCall =
S.BuildCallExpr(/*Scope=*/nullptr, TheCall->getArg(1),
TheCall->getBeginLoc(), Args, TheCall->getRParenLoc());
S.popCodeSynthesisContext();
if (!RealCall.isInvalid())
Actions.push_back(RealCall.get());
// Bail out if we've hit any errors, even if we managed to build the
// call. We don't want to produce more than one error.
return RealCall.isInvalid() || ErrorTracker.hasErrorOccurred();
}
Expr *getIndentString(unsigned Depth) {
if (!Depth)
return nullptr;
llvm::SmallString<32> Indent;
Indent.resize(Depth * Policy.Indentation, ' ');
return getStringLiteral(Indent);
}
Expr *getTypeString(QualType T) {
return getStringLiteral(T.getAsString(Policy));
}
bool appendFormatSpecifier(QualType T, llvm::SmallVectorImpl<char> &Str) {
llvm::raw_svector_ostream OS(Str);
// Format 'bool', 'char', 'signed char', 'unsigned char' as numbers, rather
// than trying to print a single character.
if (auto *BT = T->getAs<BuiltinType>()) {
switch (BT->getKind()) {
case BuiltinType::Bool:
OS << "%d";
return true;
case BuiltinType::Char_U:
case BuiltinType::UChar:
OS << "%hhu";
return true;
case BuiltinType::Char_S:
case BuiltinType::SChar:
OS << "%hhd";
return true;
default:
break;
}
}
analyze_printf::PrintfSpecifier Specifier;
if (Specifier.fixType(T, S.getLangOpts(), S.Context, /*IsObjCLiteral=*/false)) {
// We were able to guess how to format this.
if (Specifier.getConversionSpecifier().getKind() ==
analyze_printf::PrintfConversionSpecifier::sArg) {
// Wrap double-quotes around a '%s' specifier and limit its maximum
// length. Ideally we'd also somehow escape special characters in the
// contents but printf doesn't support that.
// FIXME: '%s' formatting is not safe in general.
OS << '"';
Specifier.setPrecision(analyze_printf::OptionalAmount(32u));
Specifier.toString(OS);
OS << '"';
// FIXME: It would be nice to include a '...' if the string doesn't fit
// in the length limit.
} else {
Specifier.toString(OS);
}
return true;
}
if (T->isPointerType()) {
// Format all pointers with '%p'.
OS << "%p";
return true;
}
return false;
}
bool dumpUnnamedRecord(const RecordDecl *RD, Expr *E, unsigned Depth) {
Expr *IndentLit = getIndentString(Depth);
Expr *TypeLit = getTypeString(S.Context.getRecordType(RD));
if (IndentLit ? callPrintFunction("%s%s", {IndentLit, TypeLit})
: callPrintFunction("%s", {TypeLit}))
return true;
return dumpRecordValue(RD, E, IndentLit, Depth);
}
// Dump a record value. E should be a pointer or lvalue referring to an RD.
bool dumpRecordValue(const RecordDecl *RD, Expr *E, Expr *RecordIndent,
unsigned Depth) {
// FIXME: Decide what to do if RD is a union. At least we should probably
// turn off printing `const char*` members with `%s`, because that is very
// likely to crash if that's not the active member. Whatever we decide, we
// should document it.
// Build an OpaqueValueExpr so we can refer to E more than once without
// triggering re-evaluation.
Expr *RecordArg = makeOpaqueValueExpr(E);
bool RecordArgIsPtr = RecordArg->getType()->isPointerType();
if (callPrintFunction(" {\n"))
return true;
// Dump each base class, regardless of whether they're aggregates.
if (const auto *CXXRD = dyn_cast<CXXRecordDecl>(RD)) {
for (const auto &Base : CXXRD->bases()) {
QualType BaseType =
RecordArgIsPtr ? S.Context.getPointerType(Base.getType())
: S.Context.getLValueReferenceType(Base.getType());
ExprResult BasePtr = S.BuildCStyleCastExpr(
Loc, S.Context.getTrivialTypeSourceInfo(BaseType, Loc), Loc,
RecordArg);
if (BasePtr.isInvalid() ||
dumpUnnamedRecord(Base.getType()->getAsRecordDecl(), BasePtr.get(),
Depth + 1))
return true;
}
}
Expr *FieldIndentArg = getIndentString(Depth + 1);
// Dump each field.
for (auto *D : RD->decls()) {
auto *IFD = dyn_cast<IndirectFieldDecl>(D);
auto *FD = IFD ? IFD->getAnonField() : dyn_cast<FieldDecl>(D);
if (!FD || FD->isUnnamedBitfield() || FD->isAnonymousStructOrUnion())
continue;
llvm::SmallString<20> Format = llvm::StringRef("%s%s %s ");
llvm::SmallVector<Expr *, 5> Args = {FieldIndentArg,
getTypeString(FD->getType()),
getStringLiteral(FD->getName())};
if (FD->isBitField()) {
Format += ": %zu ";
QualType SizeT = S.Context.getSizeType();
llvm::APInt BitWidth(S.Context.getIntWidth(SizeT),
FD->getBitWidthValue(S.Context));
Args.push_back(IntegerLiteral::Create(S.Context, BitWidth, SizeT, Loc));
}
Format += "=";
ExprResult Field =
IFD ? S.BuildAnonymousStructUnionMemberReference(
CXXScopeSpec(), Loc, IFD,
DeclAccessPair::make(IFD, AS_public), RecordArg, Loc)
: S.BuildFieldReferenceExpr(
RecordArg, RecordArgIsPtr, Loc, CXXScopeSpec(), FD,
DeclAccessPair::make(FD, AS_public),
DeclarationNameInfo(FD->getDeclName(), Loc));
if (Field.isInvalid())
return true;
auto *InnerRD = FD->getType()->getAsRecordDecl();
auto *InnerCXXRD = dyn_cast_or_null<CXXRecordDecl>(InnerRD);
if (InnerRD && (!InnerCXXRD || InnerCXXRD->isAggregate())) {
// Recursively print the values of members of aggregate record type.
if (callPrintFunction(Format, Args) ||
dumpRecordValue(InnerRD, Field.get(), FieldIndentArg, Depth + 1))
return true;
} else {
Format += " ";
if (appendFormatSpecifier(FD->getType(), Format)) {
// We know how to print this field.
Args.push_back(Field.get());
} else {
// We don't know how to print this field. Print out its address
// with a format specifier that a smart tool will be able to
// recognize and treat specially.
Format += "*%p";
ExprResult FieldAddr =
S.BuildUnaryOp(nullptr, Loc, UO_AddrOf, Field.get());
if (FieldAddr.isInvalid())
return true;
Args.push_back(FieldAddr.get());
}
Format += "\n";
if (callPrintFunction(Format, Args))
return true;
}
}
return RecordIndent ? callPrintFunction("%s}\n", RecordIndent)
: callPrintFunction("}\n");
}
Expr *buildWrapper() {
auto *Wrapper = PseudoObjectExpr::Create(S.Context, TheCall, Actions,
PseudoObjectExpr::NoResult);
TheCall->setType(Wrapper->getType());
TheCall->setValueKind(Wrapper->getValueKind());
return Wrapper;
}
};
} // namespace

@sethp
Copy link
Contributor Author

sethp commented Mar 19, 2024

@yronglin Thanks for introducing me to __builtin_dump_struct, that looks useful!

I'm not sure I see much in the way of overlap with this PR, though: would you say more about what you mean? Is there maybe some commonality that you're concerned might drift apart?

@yronglin
Copy link
Contributor

@yronglin Thanks for introducing me to __builtin_dump_struct, that looks useful!

I'm not sure I see much in the way of overlap with this PR, though: would you say more about what you mean? Is there maybe some commonality that you're concerned might drift apart?

Sorry for not describing it clearly before, I really like this improvements. __builtin_dump_struct has the dilemma for complicated dumps for 'complicated' structs too. Can they have similar restricts and output formats in the future (not in this PR).

@sethp
Copy link
Contributor Author

sethp commented Mar 19, 2024

Ah, I think I understand you: it sounds like you're imagining using __builtin_dump_struct to inspect/trace a value that ends up in a failing static_assert. Is that right?

I can definitely look into what combination of arguments to __builtin_dump_struct (if any) would produce similar output, that makes a lot of sense to me.

@sethp
Copy link
Contributor Author

sethp commented Apr 4, 2024

Ping @cor3ntin @cjdb

2 similar comments
@sethp
Copy link
Contributor Author

sethp commented Apr 30, 2024

Ping @cor3ntin @cjdb

@sethp
Copy link
Contributor Author

sethp commented Jun 18, 2024

Ping @cor3ntin @cjdb

@tbaederr
Copy link
Contributor

tbaederr commented Oct 4, 2024

Ping

Comment on lines 17181 to +17182

/// A raw_ostream that clips its output beyond a certain size, inserting a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I this we should move that to llvm/Support - either in raw_ostream.h or a new file

@@ -84,6 +84,7 @@ class DiagnosticOptions : public RefCountedBase<DiagnosticOptions>{
DefaultMacroBacktraceLimit = 6,
DefaultTemplateBacktraceLimit = 10,
DefaultConstexprBacktraceLimit = 10,
DefaultConstexprValueSizeLimit = 320,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

320?

Comment on lines +742 to +747
} else if (const NamedDecl *ND = dyn_cast_if_present<NamedDecl>(
Defn->getNextDeclInContext())) {
// if it's part of a declaration, then use `(decltype(...))7`
Out << "(decltype(";
ND->printQualifiedName(Out);
Out << "))";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it really make sense to support the unnamed case?

Comment on lines +17400 to +17413
if (RT->hasUnnamedOrLocalType()) {
OS << '(';
// e.g. `(unnamed struct at ...)`, unless...
if (const auto *Defn = RT->getDecl()->getDefinition()) {
if (const NamedDecl *ND =
dyn_cast_if_present<NamedDecl>(Defn->getNextDeclInContext())) {
// ... it's part of a declaration, then use `(decltype(...))`
OS << "decltype(";
ND->printQualifiedName(OS);
OS << ")";
} else
Defn->printQualifiedName(OS);
}
OS << ')';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There too, I think only supporting the named case would be a good start

Comment on lines +17414 to +17420
} else if (T.hasQualifiers()) {
OS << '(' << T << ')';
} else {
PrintingPolicy TyPolicy(Context.getPrintingPolicy());
TyPolicy.SuppressUnwrittenScope = true;
T.print(OS, TyPolicy);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these two cases so different?

Comment on lines +17423 to +17428
unsigned Limit = Context.getDiagnostics().getConstexprValueSizeLimit();
if (Limit) {
size_limited_ostream OSS(OS, Limit);
V.printPretty(OSS, Policy, T, &Context);
} else
V.printPretty(OS, Policy, T, &Context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do that in a few places, can you create a function or lambda?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants