Skip to content

[lldb] Analyze enum promotion type during parsing #115005

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Feb 13, 2025

Conversation

kuilpd
Copy link
Contributor

@kuilpd kuilpd commented Nov 5, 2024

The information about an enum's best promotion type is discarded after compilation and is not present in debug info. This patch repeats the same analysis of each enum value as in the front-end to determine the best promotion type during DWARF info parsing.

Fixes #86989

@kuilpd kuilpd added the lldb label Nov 5, 2024
@kuilpd kuilpd requested a review from JDevlieghere as a code owner November 5, 2024 15:07
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Nov 5, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 5, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-lldb

Author: Ilia Kuklin (kuilpd)

Changes

The information about an enum's best promotion type is discarded after compilation and is not present in debug info. This patch repeats the same analysis of each enum value as in the front-end to determine the best promotion type during DWARF info parsing.

Fixes #86989


Full diff: https://github.com/llvm/llvm-project/pull/115005.diff

6 Files Affected:

  • (modified) clang/include/clang/AST/Decl.h (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+95-1)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+5-24)
  • (added) lldb/test/API/lang/cpp/enum_promotion/Makefile (+3)
  • (added) lldb/test/API/lang/cpp/enum_promotion/TestCPPEnumPromotion.py (+37)
  • (added) lldb/test/API/lang/cpp/enum_promotion/main.cpp (+22)
diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 8c39ef3d5a9fa6..a78e6c92abb22b 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -3891,6 +3891,7 @@ class EnumDecl : public TagDecl {
   void setInstantiationOfMemberEnum(ASTContext &C, EnumDecl *ED,
                                     TemplateSpecializationKind TSK);
 
+public:
   /// Sets the width in bits required to store all the
   /// non-negative enumerators of this enum.
   void setNumPositiveBits(unsigned Num) {
@@ -3902,7 +3903,6 @@ class EnumDecl : public TagDecl {
   /// negative enumerators of this enum. (see getNumNegativeBits)
   void setNumNegativeBits(unsigned Num) { EnumDeclBits.NumNegativeBits = Num; }
 
-public:
   /// True if this tag declaration is a scoped enumeration. Only
   /// possible in C++11 mode.
   void setScoped(bool Scoped = true) { EnumDeclBits.IsScoped = Scoped; }
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index a30d898a93cc4d..520fd40eda4825 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -2248,6 +2248,8 @@ size_t DWARFASTParserClang::ParseChildEnumerators(
     return 0;
 
   size_t enumerators_added = 0;
+  unsigned NumNegativeBits = 0;
+  unsigned NumPositiveBits = 0;
 
   for (DWARFDIE die : parent_die.children()) {
     const dw_tag_t tag = die.Tag();
@@ -2299,11 +2301,103 @@ size_t DWARFASTParserClang::ParseChildEnumerators(
     }
 
     if (name && name[0] && got_value) {
-      m_ast.AddEnumerationValueToEnumerationType(
+      auto ECD = m_ast.AddEnumerationValueToEnumerationType(
           clang_type, decl, name, enum_value, enumerator_byte_size * 8);
       ++enumerators_added;
+
+      llvm::APSInt InitVal = ECD->getInitVal();
+      // Keep track of the size of positive and negative values.
+      if (InitVal.isUnsigned() || InitVal.isNonNegative()) {
+        // If the enumerator is zero that should still be counted as a positive
+        // bit since we need a bit to store the value zero.
+        unsigned ActiveBits = InitVal.getActiveBits();
+        NumPositiveBits = std::max({NumPositiveBits, ActiveBits, 1u});
+      } else {
+        NumNegativeBits =
+            std::max(NumNegativeBits, (unsigned)InitVal.getSignificantBits());
+      }
     }
   }
+
+  /// The following code follows the same logic as in Sema::ActOnEnumBody
+  /// clang/lib/Sema/SemaDecl.cpp
+  // If we have an empty set of enumerators we still need one bit.
+  // From [dcl.enum]p8
+  // If the enumerator-list is empty, the values of the enumeration are as if
+  // the enumeration had a single enumerator with value 0
+  if (!NumPositiveBits && !NumNegativeBits)
+    NumPositiveBits = 1;
+
+  clang::EnumDecl *enum_decl =
+      ClangUtil::GetQualType(clang_type)->getAs<clang::EnumType>()->getDecl();
+  enum_decl->setNumPositiveBits(NumPositiveBits);
+  enum_decl->setNumNegativeBits(NumNegativeBits);
+
+  // C++0x N3000 [conv.prom]p3:
+  //   An rvalue of an unscoped enumeration type whose underlying
+  //   type is not fixed can be converted to an rvalue of the first
+  //   of the following types that can represent all the values of
+  //   the enumeration: int, unsigned int, long int, unsigned long
+  //   int, long long int, or unsigned long long int.
+  // C99 6.4.4.3p2:
+  //   An identifier declared as an enumeration constant has type int.
+  // The C99 rule is modified by C23.
+  clang::QualType BestPromotionType;
+  unsigned BestWidth;
+
+  auto &Context = m_ast.getASTContext();
+  unsigned LongWidth = Context.getTargetInfo().getLongWidth();
+  unsigned IntWidth = Context.getTargetInfo().getIntWidth();
+  unsigned CharWidth = Context.getTargetInfo().getCharWidth();
+  unsigned ShortWidth = Context.getTargetInfo().getShortWidth();
+
+  bool is_cpp = Language::LanguageIsCPlusPlus(
+      SymbolFileDWARF::GetLanguage(*parent_die.GetCU()));
+
+  if (NumNegativeBits) {
+    // If there is a negative value, figure out the smallest integer type (of
+    // int/long/longlong) that fits.
+    if (NumNegativeBits <= CharWidth && NumPositiveBits < CharWidth) {
+      BestWidth = CharWidth;
+    } else if (NumNegativeBits <= ShortWidth && NumPositiveBits < ShortWidth) {
+      BestWidth = ShortWidth;
+    } else if (NumNegativeBits <= IntWidth && NumPositiveBits < IntWidth) {
+      BestWidth = IntWidth;
+    } else if (NumNegativeBits <= LongWidth && NumPositiveBits < LongWidth) {
+      BestWidth = LongWidth;
+    } else {
+      BestWidth = Context.getTargetInfo().getLongLongWidth();
+    }
+    BestPromotionType =
+        BestWidth <= IntWidth ? Context.IntTy : enum_decl->getIntegerType();
+  } else {
+    // If there is no negative value, figure out the smallest type that fits
+    // all of the enumerator values.
+    if (NumPositiveBits <= CharWidth) {
+      BestPromotionType = Context.IntTy;
+      BestWidth = CharWidth;
+    } else if (NumPositiveBits <= ShortWidth) {
+      BestPromotionType = Context.IntTy;
+      BestWidth = ShortWidth;
+    } else if (NumPositiveBits <= IntWidth) {
+      BestWidth = IntWidth;
+      BestPromotionType = (NumPositiveBits == BestWidth || !is_cpp)
+                              ? Context.UnsignedIntTy
+                              : Context.IntTy;
+    } else if (NumPositiveBits <= LongWidth) {
+      BestWidth = LongWidth;
+      BestPromotionType = (NumPositiveBits == BestWidth || !is_cpp)
+                              ? Context.UnsignedLongTy
+                              : Context.LongTy;
+    } else {
+      BestWidth = Context.getTargetInfo().getLongLongWidth();
+      BestPromotionType = (NumPositiveBits == BestWidth || !is_cpp)
+                              ? Context.UnsignedLongLongTy
+                              : Context.LongLongTy;
+    }
+  }
+  enum_decl->setPromotionType(BestPromotionType);
+
   return enumerators_added;
 }
 
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
index f5063175d6e070..68bc81f51e6a92 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
@@ -2341,7 +2341,6 @@ CompilerType TypeSystemClang::CreateEnumerationType(
   if (decl_ctx)
     decl_ctx->addDecl(enum_decl);
 
-  // TODO: check if we should be setting the promotion type too?
   enum_decl->setIntegerType(ClangUtil::GetQualType(integer_clang_type));
 
   enum_decl->setAccess(AS_public); // TODO respect what's in the debug info
@@ -8461,30 +8460,11 @@ bool TypeSystemClang::CompleteTagDeclarationDefinition(
   if (enum_decl->isCompleteDefinition())
     return true;
 
-  clang::ASTContext &ast = lldb_ast->getASTContext();
-
-  /// TODO This really needs to be fixed.
-
   QualType integer_type(enum_decl->getIntegerType());
   if (!integer_type.isNull()) {
-    unsigned NumPositiveBits = 1;
-    unsigned NumNegativeBits = 0;
-
-    clang::QualType promotion_qual_type;
-    // If the enum integer type is less than an integer in bit width,
-    // then we must promote it to an integer size.
-    if (ast.getTypeSize(enum_decl->getIntegerType()) <
-        ast.getTypeSize(ast.IntTy)) {
-      if (enum_decl->getIntegerType()->isSignedIntegerType())
-        promotion_qual_type = ast.IntTy;
-      else
-        promotion_qual_type = ast.UnsignedIntTy;
-    } else
-      promotion_qual_type = enum_decl->getIntegerType();
-
-    enum_decl->completeDefinition(enum_decl->getIntegerType(),
-                                  promotion_qual_type, NumPositiveBits,
-                                  NumNegativeBits);
+    enum_decl->completeDefinition(
+        enum_decl->getIntegerType(), enum_decl->getPromotionType(),
+        enum_decl->getNumPositiveBits(), enum_decl->getNumNegativeBits());
   }
   return true;
 }
@@ -8544,7 +8524,8 @@ clang::EnumConstantDecl *TypeSystemClang::AddEnumerationValueToEnumerationType(
   bool is_signed = false;
   underlying_type.IsIntegerType(is_signed);
 
-  llvm::APSInt value(enum_value_bit_size, is_signed);
+  // APSInt constructor's sign argument is isUnsigned
+  llvm::APSInt value(enum_value_bit_size, !is_signed);
   value = enum_value;
 
   return AddEnumerationValueToEnumerationType(enum_type, decl, name, value);
diff --git a/lldb/test/API/lang/cpp/enum_promotion/Makefile b/lldb/test/API/lang/cpp/enum_promotion/Makefile
new file mode 100644
index 00000000000000..99998b20bcb050
--- /dev/null
+++ b/lldb/test/API/lang/cpp/enum_promotion/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/lang/cpp/enum_promotion/TestCPPEnumPromotion.py b/lldb/test/API/lang/cpp/enum_promotion/TestCPPEnumPromotion.py
new file mode 100644
index 00000000000000..a4eed53c8d80f2
--- /dev/null
+++ b/lldb/test/API/lang/cpp/enum_promotion/TestCPPEnumPromotion.py
@@ -0,0 +1,37 @@
+"""
+Test LLDB type promotion of unscoped enums.
+"""
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+class TestCPPEnumPromotion(TestBase):
+
+    @skipIf(debug_info=no_match(["dwarf", "dwo"]))
+    def test(self):
+        self.build()
+        lldbutil.run_to_source_breakpoint(
+            self, "// break here", lldb.SBFileSpec("main.cpp")
+        )
+        UChar_promoted = self.frame().FindVariable("UChar_promoted")
+        UShort_promoted = self.frame().FindVariable("UShort_promoted")
+        UInt_promoted = self.frame().FindVariable("UInt_promoted")
+        SLong_promoted = self.frame().FindVariable("SLong_promoted")
+        ULong_promoted = self.frame().FindVariable("ULong_promoted")
+        NChar_promoted = self.frame().FindVariable("NChar_promoted")
+        NShort_promoted = self.frame().FindVariable("NShort_promoted")
+        NInt_promoted = self.frame().FindVariable("NInt_promoted")
+        NLong_promoted = self.frame().FindVariable("NLong_promoted")
+
+        # Check that LLDB's promoted type is the same as the compiler's
+        self.expect_expr("+EnumUChar::UChar", result_type=UChar_promoted.type.name)
+        self.expect_expr("+EnumUShort::UShort", result_type=UShort_promoted.type.name)
+        self.expect_expr("+EnumUInt::UInt", result_type=UInt_promoted.type.name)
+        self.expect_expr("+EnumSLong::SLong", result_type=SLong_promoted.type.name)
+        self.expect_expr("+EnumULong::ULong", result_type=ULong_promoted.type.name)
+        self.expect_expr("+EnumNChar::NChar", result_type=NChar_promoted.type.name)
+        self.expect_expr("+EnumNShort::NShort", result_type=NShort_promoted.type.name)
+        self.expect_expr("+EnumNInt::NInt", result_type=NInt_promoted.type.name)
+        self.expect_expr("+EnumNLong::NLong", result_type=NLong_promoted.type.name)
diff --git a/lldb/test/API/lang/cpp/enum_promotion/main.cpp b/lldb/test/API/lang/cpp/enum_promotion/main.cpp
new file mode 100644
index 00000000000000..fafb96f6c7b075
--- /dev/null
+++ b/lldb/test/API/lang/cpp/enum_promotion/main.cpp
@@ -0,0 +1,22 @@
+enum EnumUChar { UChar = 1 };
+enum EnumUShort { UShort = 0x101 };
+enum EnumUInt { UInt = 0x10001 };
+enum EnumSLong { SLong = 0x100000001 };
+enum EnumULong { ULong = 0xFFFFFFFFFFFFFFF0 };
+enum EnumNChar { NChar = -1 };
+enum EnumNShort { NShort= -0x101 };
+enum EnumNInt { NInt = -0x10001 };
+enum EnumNLong { NLong = -0x100000001 };
+
+int main() {
+  auto UChar_promoted = +EnumUChar::UChar;
+  auto UShort_promoted = +EnumUShort::UShort;
+  auto UInt_promoted = +EnumUInt::UInt;
+  auto SLong_promoted = +EnumSLong::SLong;
+  auto ULong_promoted = +EnumULong::ULong;
+  auto NChar_promoted = +EnumNChar::NChar;
+  auto NShort_promoted = +EnumNShort::NShort;
+  auto NInt_promoted = +EnumNInt::NInt;
+  auto NLong_promoted = +EnumNLong::NLong;
+  return 0; // break here
+}

@kuilpd
Copy link
Contributor Author

kuilpd commented Nov 5, 2024

My thought process for this patch: #86989 (comment)

Copy link

github-actions bot commented Nov 5, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

github-actions bot commented Nov 5, 2024

✅ With the latest revision this PR passed the Python code formatter.

@kuilpd kuilpd force-pushed the lldb-analyze-enum-promotion-type branch from 91f8293 to 19b98ad Compare November 5, 2024 15:21
@Michael137
Copy link
Member

Michael137 commented Nov 5, 2024

Thank you for adressing this. I haven't done an in-depth review of the patch yet but my first instinct here is that this looks like a lot of work for LLDB which the compiler has already done, so we ideally don't want to repeat. Where is this actually an issue from a user perspective? In the example you gave one could just cast to the appropriate enum type and get the bit-mask style presentation:

(lldb) expr (UnscopedEnum) (-enum_one)
(UnscopedEnum) $0 = One | 0xfffffffe

Not that this shouldn't be fixed, just weighing of the amount of complexity added here versus the benefit.

Also, is this promotion-type something that can be encoded in DWARF? GCC generates the same DWARF as Clang here. What happens if we make the type of the DW_TAG_variables be UnscopedEnum instead of int? I guess that wouldn't help with

(lldb) expr UnscopedEnum::One + 1

@labath
Copy link
Collaborator

labath commented Nov 5, 2024

I'm worried about the same thing as Michael. My question is: if the promotion type can be computed from the information in dwarf (can it always?), and clang already has code to compute it (not from DWARF, but from.. clang AST I guess), can we refactor that code somehow so that it is usable from lldb as well?

@kuilpd
Copy link
Contributor Author

kuilpd commented Nov 5, 2024

Not that this shouldn't be fixed, just weighing of the amount of complexity added here versus the benefit.

I don't really know how useful it is in general to know the actual promotion type of the enum, I guess only for using enum values in expressions without explicit casting.

Also, is this promotion-type something that can be encoded in DWARF? GCC generates the same DWARF as Clang here.

From what I could tell there is nothing in the DWARF spec that allows to hold a promotion type of an enum. We could just use some undocumented field, but then it wouldn't work with GCC binaries.

My question is: if the promotion type can be computed from the information in dwarf (can it always?), and clang already has code to compute it (not from DWARF, but from.. clang AST I guess), can we refactor that code somehow so that it is usable from lldb as well?

It can be computed if all the enum values are present in DWARF. I thought about reusing the code from Sema as well, but it uses different interface to iterate through every value, and does other analysis along the way which is not needed during debugging anymore. Plus the patch allows to remove some of the erroneous analysis in TypeSystemClang.cpp

@kuilpd
Copy link
Contributor Author

kuilpd commented Nov 14, 2024

Ping

@kuilpd
Copy link
Contributor Author

kuilpd commented Dec 5, 2024

So, is this patch worth pursuing or is it too much code for a too specific use case?

@Michael137
Copy link
Member

So, is this patch worth pursuing or is it too much code for a too specific use case?

Sorry I was out for a few weeks when you pinged. I'll have another pass/think about it early next week (if nobody else gets to it before me).

@@ -8544,7 +8524,8 @@ clang::EnumConstantDecl *TypeSystemClang::AddEnumerationValueToEnumerationType(
bool is_signed = false;
underlying_type.IsIntegerType(is_signed);

llvm::APSInt value(enum_value_bit_size, is_signed);
// APSInt constructor's sign argument is isUnsigned
llvm::APSInt value(enum_value_bit_size, !is_signed);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, how about we split this out into a separate PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, no need for the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a PR #120794, once it's merged I will rebase this branch on main and run tests again.

bool is_cpp = Language::LanguageIsCPlusPlus(
SymbolFileDWARF::GetLanguage(*parent_die.GetCU()));

if (NumNegativeBits) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be to factor out these two if blocks from Sema into one (or two) helpers. It could look like:

std::pair</*BestWidth=*/unsigned, /*BestPromotionType=*/clang::QualType>
ComputeBestEnumProperties(unsigned NumNegativeBits, unsigned NumPositiveBits, bool IsPacked) {
  if (NumNegativeBits) {
    ...
  } else {
    ...
  }
}

Then LLDB would always pass IsPacked=false for now (we don't encode it in DWARF, but maybe could derive it? we have this problem already with packed structures anyway).

With this in place i think this patch seems very reasonable!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood you correctly, I created this separate function in clang's SemaDecl.cpp, and then called it from both LLDB and Sema itself. Seems to be working, but it needed a lot of arguments, since when called from LLDB Sema doesn't have any information stored at all, so ASTContext, EnumDecl and other arguments have to come from LLDB, and it also returns 3 values.
This looks clunky because of it, any suggestions how to make it better?

Copy link
Member

@Michael137 Michael137 Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for trying this out! It doesn't look aweful in my opinion (considering the code duplication that it prevents). I think the best thing to do here is to split the Clang changes out into a separate PR. That would also give you more visibility from the Clang maintainers. If you didn't want to have all those output parameters you could return a tuple or maybe even a helper structure that encapsulates all those arguments. Either way, I'm sure they'll have some useful feedback on the Clang PR

@kuilpd
Copy link
Contributor Author

kuilpd commented Jan 27, 2025

@Michael137
Reused the code I merged in #120965 to get the best promotion type.

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better

Left some more clarification comments

@@ -3903,6 +3903,7 @@ class EnumDecl : public TagDecl {
void setInstantiationOfMemberEnum(ASTContext &C, EnumDecl *ED,
TemplateSpecializationKind TSK);

public:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this change anymore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left this in so that that I could simply do enum_decl->setNumPositiveBits in DWARFASTParserClang.cpp and then retrieve it via enum_decl->getNumPositiveBits in TypeSystemClang.cpp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not just leave this whole block in TypeSystemClang::CompleteTagDeclarationDefinition? That way you can just pass the NumNegativeBits/NumPositiveBits to EnumDecl::completeDefinition like we used to. Why do we need to move it over here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you moved it. It's because we're iterating over the enumerator values there. But i think it's still possible to do it from TypeSystemClang (see my other comment: https://github.com/llvm/llvm-project/pull/115005/files#r1931986605)

Comment on lines 2381 to 2382
unsigned ActiveBits = InitVal.getActiveBits();
NumPositiveBits = std::max({NumPositiveBits, ActiveBits, 1u});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unsigned ActiveBits = InitVal.getActiveBits();
NumPositiveBits = std::max({NumPositiveBits, ActiveBits, 1u});
NumPositiveBits = std::max(InitVal.getActiveBits(), 1u);

NumPositiveBits is 0 at this point anyway right?

Copy link
Contributor Author

@kuilpd kuilpd Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition checks that the value is just non negative or unsigned, so I think the comment means that it could be zero, in which case the max function would prefer 1u over it.

Copy link
Member

@Michael137 Michael137 Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't realize you're doing this over a loop. And you're mimicking the behaviour of SemaDecl. Hmmm again this feels like something we could share with Clang. Something like bool clang::ASTContext::computeEnumBits(unsigned &NumNegativeBits, unsigned &NumPositiveBits) (where the return value implies MembersRepresentableByInt). Then just call it from TypeSystemClang::CompleteTagDeclarationDefinition. What do you think?

I feel like that would reduce the diff in this patch quite nicely

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I can move the entire loop into a separate function, the way DWARFASTParserClang and SemaDecl iterate over enum values is completely different, so for now I just moved these few lines that determine the updated NumPositiveBits NumNegativeBits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment on my latest review on how you might be able to do that. Once we did that I think we're good to land this! Thanks for the patience

clang::QualType BestPromotionType;
clang::QualType BestType;
m_ast.getASTContext().computeBestEnumTypes(
false, NumNegativeBits, NumPositiveBits, BestType, BestPromotionType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
false, NumNegativeBits, NumPositiveBits, BestType, BestPromotionType);
/*IsPacked=*/false, NumNegativeBits, NumPositiveBits, BestType, BestPromotionType);



class TestCPPEnumPromotion(TestBase):
@skipIf(debug_info=no_match(["dwarf", "dwo"]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we're skipping this for dSYMs?



class TestCPPEnumPromotion(TestBase):
@skipIf(debug_info=no_match(["dwarf", "dwo", "dsym"]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@skipIf(debug_info=no_match(["dwarf", "dwo", "dsym"]))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to specify this at all

@@ -2367,11 +2369,36 @@ size_t DWARFASTParserClang::ParseChildEnumerators(
}

if (name && name[0] && got_value) {
m_ast.AddEnumerationValueToEnumerationType(
auto ECD = m_ast.AddEnumerationValueToEnumerationType(
Copy link
Member

@Michael137 Michael137 Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think you can move the iteration into a helper function on clang ASTContext. It would look something like this:

// In ASTContext class
template<typename RangeT>                                                  
bool computeEnumBits(unsigned &NumNegativeBits, unsigned &NumPositiveBits, 
                     RangeT Elements) {
   for (auto * Elem : Elements) {
       // blah
   }

  // etc.

  if (!NumPositiveBits && !NumNegativeBits)
     NumPositiveBits = 1;

   return MembersRepresentableByInt;
}

Then in SemaDecl.cpp in Sema::ActOnEnumBody you would call it as follows:

unsigned NumNegativeBits;                                                                            
unsigned NumPositiveBits;                                                                            
bool MembersRepresentableByInt =
    Context.computeEnumBits(NumNegativeBits, NumPositiveBits, Elements);

And in LLDB in CompleteTagDeclarationDefinition, you would call it like so:

ast.computeEnumBits(NumNegativeBits, NumPositiveBits, enum_decl->enumerators());
                                                                                
enum_decl->completeDefinition(enum_decl->getIntegerType(),                      
                              promotion_qual_type, NumPositiveBits,             
                              NumNegativeBits);                                 

Wdyt? And please do the Clang-side changes in a separate NFC PR

Let me know if you have questions about this approach

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well... I tried this, it works, but I encountered some problems:

  1. computeEnumBits definition has to be in the header file, otherwise template instantiation doesn't work. I tried making the elements argument an ArrayRef, but it's also not working well with enum_decl->enumerators() returning llvm::iterator_range<specific_decl_iterator<EnumConstantDecl>>
  2. Iteration over elements in SemaDecl.cpp checks for the element is a EnumConstantDecl, otherwise it's skipped, so I had to create a separate collection in case something is removed, and then pass that one to computeEnumBits
  3. isRepresentableIntegerValue is a static function in SemaDecl.cpp, I'd rather not move somewhere else as well, since it's not needed on LLDB side anyway

Mostly I'm just not sure if it's a problem or not that ASTContext.h contains a definition of a function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly I'm just not sure if it's a problem or not that ASTContext.h contains a definition of a function.

Well if it's a template that has to be where it lives. Unless the Clang maintainers find a way to pass the iterator/arrayref as a non-template parameter I think this is fine

isRepresentableIntegerValue is a static function in SemaDecl.cpp, I'd rather not move somewhere else as well, since it's not needed on LLDB side anyway

Yea I think that's also fine. If it's a private helper on ASTContext, I don't see that being a big deal. Especially because that doesn't need to be defined in the header.

Iteration over elements in SemaDecl.cpp checks for the element is a EnumConstantDecl, otherwise it's skipped, so I had to create a separate collection in case something is removed, and then pass that one to computeEnumBits

Could you elaborate on this? LLDB only ever puts EnumConstantDecls there right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iteration over elements in SemaDecl.cpp checks for the element is a EnumConstantDecl, otherwise it's skipped, so I had to create a separate collection in case something is removed, and then pass that one to computeEnumBits

Could you elaborate on this? LLDB only ever puts EnumConstantDecls there right?

This line: SemaDecl.cpp:20070, if it discards something, I cannot pass the original Elements to computeEnumBits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies I still don't follow. Why can't computeEnumBits just be:

  template <typename RangeT>
  bool computeEnumBits(RangeT EnumConstants, unsigned &NumNegativeBits,
                       unsigned &NumPositiveBits) {
    unsigned NumNegativeBits = 0;
    unsigned NumPositiveBits = 0;
    bool MembersRepresentableByInt = true;
    for (auto * Elem : EnumConstants) {
    EnumConstantDecl *ECD =
      cast_or_null<EnumConstantDecl>(Elem);
    if (!ECD) continue;  // Already issued a diagnostic.

      llvm::APSInt InitVal = ECD->getInitVal();
      if (InitVal.isUnsigned() || InitVal.isNonNegative()) {
        // If the enumerator is zero that should still be counted as a positive
        // bit since we need a bit to store the value zero.
        unsigned ActiveBits = InitVal.getActiveBits();
        NumPositiveBits = std::max({NumPositiveBits, ActiveBits, 1u});
      } else {
        NumNegativeBits =
            std::max(NumNegativeBits, (unsigned)InitVal.getSignificantBits());
      }

      MembersRepresentableByInt &=
          isRepresentableIntegerValue(InitVal, IntTy);
    }

    // If we have an empty set of enumerators we still need one bit.
    // From [dcl.enum]p8
    // If the enumerator-list is empty, the values of the enumeration are as if
    // the enumeration had a single enumerator with value 0
    if (!NumPositiveBits && !NumNegativeBits)
      NumPositiveBits = 1;

    return MembersRepresentableByInt;
  }

And SemaDecl.cpp just does:

  unsigned NumNegativeBits;
  unsigned NumPositiveBits;
  bool MembersRepresentableByInt = Context.computeEnumBits(Elements, NumNegativeBits, NumPositiveBits);

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's totally fine (in my opinion). Just make it a private member of ASTContext and it won't bother anyone. Could you prepare such a patch in a separate PR and see what the Clang maintainers say?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in 5 other places in SemaDecl.cpp, unfortunately, so it will have to be yet another ASTContext function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah true, I still think it's worth doing. Worst case they recommend a different place to put these APIs. If anyone is strongly against this we can resort to your previous version I suppose

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to link to this thread in that PR so they have context

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, will do, thank you!

@kuilpd
Copy link
Contributor Author

kuilpd commented Feb 8, 2025

@Michael137
Changed the first argument of computeEnumBits to an ArrayRef to avoid the template and so it can be still seamlessly used from Sema.
On LLDB side, I had to create a SmallVector and put enum constants there at the point of their creation (AddEnumerationValueToEnumerationType returns a pointer anyway) so that it can be passed to computeEnumBits as is. It's only a vector of pointers, and it's discarded after, so if it's not a problem here, I'll make the same changes in the Sema PR.

++enumerators_added;
}
}

clang::EnumDecl *enum_decl =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this back into TypeSystemClang::CompleteTagDeclarationDefinition so we don't need to make setNumNegativeBits/setNumPositiveBits public. And instead just pass the values to completeDefinition.

Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…#126096)

Move the code that computes `NumNegativeBits` and `NumPositiveBits` for
an enum to a separate function in `ASTContext.h`.
This function needs to be called from LLDB as well (llvm#115005)
@kuilpd kuilpd force-pushed the lldb-analyze-enum-promotion-type branch from a02074c to fe6a395 Compare February 12, 2025 17:28
@@ -8476,27 +8476,20 @@ bool TypeSystemClang::CompleteTagDeclarationDefinition(

clang::ASTContext &ast = lldb_ast->getASTContext();

/// TODO This really needs to be fixed.
unsigned NumNegativeBits = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this block back into the if (!integer_type.isNull()) { condition below?

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you for moving all this functionality into Clang

LGTM (just one minor comment left)

flovent pushed a commit to flovent/llvm-project that referenced this pull request Feb 13, 2025
…#126096)

Move the code that computes `NumNegativeBits` and `NumPositiveBits` for
an enum to a separate function in `ASTContext.h`.
This function needs to be called from LLDB as well (llvm#115005)
@kuilpd
Copy link
Contributor Author

kuilpd commented Feb 13, 2025

@Michael137
Thank you for seeing this through!

@Michael137
Copy link
Member

let me know if you need me to merge this for you

@kuilpd kuilpd merged commit f30c891 into llvm:main Feb 13, 2025
5 of 6 checks passed
@Michael137
Copy link
Member

FYI on macOS CI this is failing with:

FAIL: test_dsym (TestCPPEnumPromotion.TestCPPEnumPromotion)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1784, in test_method
    return attrvalue(self)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/lang/cpp/enum_promotion/TestCPPEnumPromotion.py", line 28, in test
    self.expect_expr("+EnumUChar::UChar", result_type=UChar_promoted.type.name)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2540, in expect_expr
    value_check.check_value(self, eval_result, str(eval_result))
  File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 299, in check_value
    test_base.assertSuccess(val.GetError())
  File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2575, in assertSuccess
    self.fail(self._formatMessage(msg, "'{}' is not success".format(error)))
AssertionError: 'error: <user expression 0>:1:2: use of undeclared identifier 'EnumUChar'
    1 | +EnumUChar::UChar
      |  ^
' is not success

But only for dSYMs. Looking at the dSYM, none of the enums are actually preserved in the debug-info. You have to actually use the enum types in source to get dsymutil to preserve them. I'll fix it

@kuilpd
Copy link
Contributor Author

kuilpd commented Feb 13, 2025

But only for dSYMs. Looking at the dSYM, none of the enums are actually preserved in the debug-info. You have to actually use the enum types in source to get dsymutil to preserve them. I'll fix it

Ah, I see. Thank you for handling this!

@Michael137
Copy link
Member

Fixed in 0feb00f17cbaac7428dcb7aed13d903b65974978

joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 14, 2025
…#126096)

Move the code that computes `NumNegativeBits` and `NumPositiveBits` for
an enum to a separate function in `ASTContext.h`.
This function needs to be called from LLDB as well (llvm#115005)
joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 14, 2025
The information about an enum's best promotion type is discarded after
compilation and is not present in debug info. This patch repeats the
same analysis of each enum value as in the front-end to determine the
best promotion type during DWARF info parsing.

Fixes llvm#86989
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
…#126096)

Move the code that computes `NumNegativeBits` and `NumPositiveBits` for
an enum to a separate function in `ASTContext.h`.
This function needs to be called from LLDB as well (llvm#115005)
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
The information about an enum's best promotion type is discarded after
compilation and is not present in debug info. This patch repeats the
same analysis of each enum value as in the front-end to determine the
best promotion type during DWARF info parsing.

Fixes llvm#86989
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category lldb
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[LLDB] Incorrect type promotion of an unscoped enum with the default type in arithmetic operations
4 participants