Skip to content

[lldb] Parse and display register field enums #95768

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 27, 2024

Conversation

DavidSpickett
Copy link
Collaborator

This teaches lldb to parse the enum XML elements sent by lldb-server, and make use of the information in register read and register info.

The format is described in
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Enum-Target-Types.html.

The target XML parser will drop any invalid enum or evalue. If we find multiple evalue for the same value, we will use the last one we find.

The order of evalues from the XML is preserved as there may be good reason they are not in numerical order.

@DavidSpickett DavidSpickett marked this pull request as ready for review June 17, 2024 11:21
@llvmbot llvmbot added the lldb label Jun 17, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 17, 2024

@llvm/pr-subscribers-lldb

Author: David Spickett (DavidSpickett)

Changes

This teaches lldb to parse the enum XML elements sent by lldb-server, and make use of the information in register read and register info.

The format is described in
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Enum-Target-Types.html.

The target XML parser will drop any invalid enum or evalue. If we find multiple evalue for the same value, we will use the last one we find.

The order of evalues from the XML is preserved as there may be good reason they are not in numerical order.


Patch is 30.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/95768.diff

8 Files Affected:

  • (modified) lldb/include/lldb/Target/RegisterFlags.h (+7)
  • (modified) lldb/source/Core/DumpRegisterInfo.cpp (+6-1)
  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp (+170-18)
  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h (+5)
  • (modified) lldb/source/Plugins/RegisterTypeBuilder/RegisterTypeBuilderClang.cpp (+27-3)
  • (modified) lldb/source/Target/RegisterFlags.cpp (+10)
  • (modified) lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py (+322)
  • (modified) lldb/unittests/Core/DumpRegisterInfoTest.cpp (+26)
diff --git a/lldb/include/lldb/Target/RegisterFlags.h b/lldb/include/lldb/Target/RegisterFlags.h
index 1c6bf5dcf4a7f..628c841c10d95 100644
--- a/lldb/include/lldb/Target/RegisterFlags.h
+++ b/lldb/include/lldb/Target/RegisterFlags.h
@@ -32,10 +32,15 @@ class FieldEnum {
         : m_value(value), m_name(std::move(name)) {}
 
     void ToXML(Stream &strm) const;
+
+    void log(Log *log) const;
   };
 
   typedef std::vector<Enumerator> Enumerators;
 
+  // GDB also includes a "size" that is the size of the underlying register.
+  // We will not store that here but instead use the size of the register
+  // this gets attached to when emitting XML.
   FieldEnum(std::string id, const Enumerators &enumerators);
 
   const Enumerators &GetEnumerators() const { return m_enumerators; }
@@ -44,6 +49,8 @@ class FieldEnum {
 
   void ToXML(Stream &strm, unsigned size) const;
 
+  void log(Log *log) const;
+
 private:
   std::string m_id;
   Enumerators m_enumerators;
diff --git a/lldb/source/Core/DumpRegisterInfo.cpp b/lldb/source/Core/DumpRegisterInfo.cpp
index 8334795416902..eccc6784cd497 100644
--- a/lldb/source/Core/DumpRegisterInfo.cpp
+++ b/lldb/source/Core/DumpRegisterInfo.cpp
@@ -111,6 +111,11 @@ void lldb_private::DoDumpRegisterInfo(
   };
   DumpList(strm, "    In sets: ", in_sets, emit_set);
 
-  if (flags_type)
+  if (flags_type) {
     strm.Printf("\n\n%s", flags_type->AsTable(terminal_width).c_str());
+
+    std::string enumerators = flags_type->DumpEnums(terminal_width);
+    if (enumerators.size())
+      strm << "\n\n" << enumerators;
+  }
 }
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
index a5a731981299f..4754698e6e88f 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
@@ -4179,21 +4179,124 @@ struct GdbServerTargetInfo {
   RegisterSetMap reg_set_map;
 };
 
-static std::vector<RegisterFlags::Field> ParseFlagsFields(XMLNode flags_node,
-                                                          unsigned size) {
+static FieldEnum::Enumerators ParseEnumEvalues(const XMLNode &enum_node) {
+  Log *log(GetLog(GDBRLog::Process));
+  // We will use the last instance of each value. Also we preserve the order
+  // of declaration in the XML, as it may not be numerical.
+  std::map<uint64_t, FieldEnum::Enumerator> enumerators;
+
+  enum_node.ForEachChildElementWithName(
+      "evalue", [&enumerators, &log](const XMLNode &enumerator_node) {
+        std::optional<llvm::StringRef> name;
+        std::optional<uint64_t> value;
+
+        enumerator_node.ForEachAttribute(
+            [&name, &value, &log](const llvm::StringRef &attr_name,
+                                  const llvm::StringRef &attr_value) {
+              if (attr_name == "name") {
+                if (attr_value.size())
+                  name = attr_value;
+                else
+                  LLDB_LOG(log, "ProcessGDBRemote::ParseEnumEvalues "
+                                "Ignoring empty name in evalue");
+              } else if (attr_name == "value") {
+                uint64_t parsed_value = 0;
+                if (llvm::to_integer(attr_value, parsed_value))
+                  value = parsed_value;
+                else
+                  LLDB_LOG(log,
+                           "ProcessGDBRemote::ParseEnumEvalues "
+                           "Invalid value \"{0}\" in "
+                           "evalue",
+                           attr_value.data());
+              } else
+                LLDB_LOG(log,
+                         "ProcessGDBRemote::ParseEnumEvalues Ignoring "
+                         "unknown attribute "
+                         "\"{0}\" in evalue",
+                         attr_name.data());
+
+              // Keep walking attributes.
+              return true;
+            });
+
+        if (value && name)
+          enumerators.insert_or_assign(
+              *value, FieldEnum::Enumerator(*value, name->str()));
+
+        // Find all evalue elements.
+        return true;
+      });
+
+  FieldEnum::Enumerators final_enumerators;
+  for (auto [_, enumerator] : enumerators)
+    final_enumerators.push_back(enumerator);
+
+  return final_enumerators;
+}
+
+static void
+ParseEnums(XMLNode feature_node,
+           llvm::StringMap<std::unique_ptr<FieldEnum>> &registers_enum_types) {
+  Log *log(GetLog(GDBRLog::Process));
+
+  // The top level element is "<enum...".
+  feature_node.ForEachChildElementWithName(
+      "enum", [log, &registers_enum_types](const XMLNode &enum_node) {
+        std::string id;
+
+        enum_node.ForEachAttribute([&id](const llvm::StringRef &attr_name,
+                                         const llvm::StringRef &attr_value) {
+          if (attr_name == "id")
+            id = attr_value;
+
+          // There is also a "size" attribute that is supposed to be the size in
+          // bytes of the register this applies to. However:
+          // * LLDB doesn't need this information.
+          // * It  is difficult to verify because you have to wait until the
+          //   enum is applied to a field.
+          //
+          // So we will emit this attribute in XML for GDB's sake, but will not
+          // bother ingesting it.
+
+          // Walk all attributes.
+          return true;
+        });
+
+        if (!id.empty()) {
+          FieldEnum::Enumerators enumerators = ParseEnumEvalues(enum_node);
+          if (!enumerators.empty()) {
+            LLDB_LOG(log,
+                     "ProcessGDBRemote::ParseEnums Found enum type \"{0}\"",
+                     id);
+            registers_enum_types.insert_or_assign(
+                id, std::make_unique<FieldEnum>(id, enumerators));
+          }
+        }
+
+        // Find all <enum> elements.
+        return true;
+      });
+}
+
+static std::vector<RegisterFlags::Field> ParseFlagsFields(
+    XMLNode flags_node, unsigned size,
+    const llvm::StringMap<std::unique_ptr<FieldEnum>> &registers_enum_types) {
   Log *log(GetLog(GDBRLog::Process));
   const unsigned max_start_bit = size * 8 - 1;
 
   // Process the fields of this set of flags.
   std::vector<RegisterFlags::Field> fields;
-  flags_node.ForEachChildElementWithName("field", [&fields, max_start_bit,
-                                                   &log](const XMLNode
-                                                             &field_node) {
+  flags_node.ForEachChildElementWithName("field", [&fields, max_start_bit, &log,
+                                                   &registers_enum_types](
+                                                      const XMLNode
+                                                          &field_node) {
     std::optional<llvm::StringRef> name;
     std::optional<unsigned> start;
     std::optional<unsigned> end;
+    std::optional<llvm::StringRef> type;
 
-    field_node.ForEachAttribute([&name, &start, &end, max_start_bit,
+    field_node.ForEachAttribute([&name, &start, &end, &type, max_start_bit,
                                  &log](const llvm::StringRef &attr_name,
                                        const llvm::StringRef &attr_value) {
       // Note that XML in general requires that each of these attributes only
@@ -4240,8 +4343,7 @@ static std::vector<RegisterFlags::Field> ParseFlagsFields(XMLNode flags_node,
                    attr_value.data());
         }
       } else if (attr_name == "type") {
-        // Type is a known attribute but we do not currently use it and it is
-        // not required.
+        type = attr_value;
       } else {
         LLDB_LOG(
             log,
@@ -4254,14 +4356,55 @@ static std::vector<RegisterFlags::Field> ParseFlagsFields(XMLNode flags_node,
     });
 
     if (name && start && end) {
-      if (*start > *end) {
+      if (*start > *end)
         LLDB_LOG(
             log,
             "ProcessGDBRemote::ParseFlagsFields Start {0} > end {1} in field "
             "\"{2}\", ignoring",
             *start, *end, name->data());
-      } else {
-        fields.push_back(RegisterFlags::Field(name->str(), *start, *end));
+      else {
+        if (RegisterFlags::Field::GetSizeInBits(*start, *end) > 64)
+          LLDB_LOG(log,
+                   "ProcessGDBRemote::ParseFlagsFields Ignoring field \"{2}\" "
+                   "that has "
+                   " size > 64 bits, this is not supported",
+                   name->data());
+        else {
+          // A field's type may be set to the name of an enum type.
+          const FieldEnum *enum_type = nullptr;
+          if (type && !type->empty()) {
+            auto found = registers_enum_types.find(*type);
+            if (found != registers_enum_types.end()) {
+              enum_type = found->second.get();
+
+              // No enumerator can exceed the range of the field itself.
+              uint64_t max_value =
+                  RegisterFlags::Field::GetMaxValue(*start, *end);
+              for (const auto &enumerator : enum_type->GetEnumerators()) {
+                if (enumerator.m_value > max_value) {
+                  enum_type = nullptr;
+                  LLDB_LOG(
+                      log,
+                      "ProcessGDBRemote::ParseFlagsFields In enum \"{0}\" "
+                      "evalue \"{1}\" with value {2} exceeds the maximum value "
+                      "of field \"{3}\" ({4}), ignoring enum",
+                      type->data(), enumerator.m_name, enumerator.m_value,
+                      name->data(), max_value);
+                  break;
+                }
+              }
+            } else {
+              LLDB_LOG(log,
+                       "ProcessGDBRemote::ParseFlagsFields Could not find type "
+                       "\"{0}\" "
+                       "for field \"{1}\", ignoring",
+                       type->data(), name->data());
+            }
+          }
+
+          fields.push_back(
+              RegisterFlags::Field(name->str(), *start, *end, enum_type));
+        }
       }
     }
 
@@ -4272,12 +4415,14 @@ static std::vector<RegisterFlags::Field> ParseFlagsFields(XMLNode flags_node,
 
 void ParseFlags(
     XMLNode feature_node,
-    llvm::StringMap<std::unique_ptr<RegisterFlags>> &registers_flags_types) {
+    llvm::StringMap<std::unique_ptr<RegisterFlags>> &registers_flags_types,
+    const llvm::StringMap<std::unique_ptr<FieldEnum>> &registers_enum_types) {
   Log *log(GetLog(GDBRLog::Process));
 
   feature_node.ForEachChildElementWithName(
       "flags",
-      [&log, &registers_flags_types](const XMLNode &flags_node) -> bool {
+      [&log, &registers_flags_types,
+       &registers_enum_types](const XMLNode &flags_node) -> bool {
         LLDB_LOG(log, "ProcessGDBRemote::ParseFlags Found flags node \"{0}\"",
                  flags_node.GetAttributeValue("id").c_str());
 
@@ -4310,7 +4455,7 @@ void ParseFlags(
         if (id && size) {
           // Process the fields of this set of flags.
           std::vector<RegisterFlags::Field> fields =
-              ParseFlagsFields(flags_node, *size);
+              ParseFlagsFields(flags_node, *size, registers_enum_types);
           if (fields.size()) {
             // Sort so that the fields with the MSBs are first.
             std::sort(fields.rbegin(), fields.rend());
@@ -4375,13 +4520,19 @@ void ParseFlags(
 bool ParseRegisters(
     XMLNode feature_node, GdbServerTargetInfo &target_info,
     std::vector<DynamicRegisterInfo::Register> &registers,
-    llvm::StringMap<std::unique_ptr<RegisterFlags>> &registers_flags_types) {
+    llvm::StringMap<std::unique_ptr<RegisterFlags>> &registers_flags_types,
+    llvm::StringMap<std::unique_ptr<FieldEnum>> &registers_enum_types) {
   if (!feature_node)
     return false;
 
   Log *log(GetLog(GDBRLog::Process));
 
-  ParseFlags(feature_node, registers_flags_types);
+  // Enums first because they are referenced by fields in the flags.
+  ParseEnums(feature_node, registers_enum_types);
+  for (const auto &enum_type : registers_enum_types)
+    enum_type.second->log(log);
+
+  ParseFlags(feature_node, registers_flags_types, registers_enum_types);
   for (const auto &flags : registers_flags_types)
     flags.second->log(log);
 
@@ -4643,7 +4794,7 @@ bool ProcessGDBRemote::GetGDBServerRegisterInfoXMLAndProcess(
     if (arch_to_use.IsValid()) {
       for (auto &feature_node : feature_nodes) {
         ParseRegisters(feature_node, target_info, registers,
-                       m_registers_flags_types);
+                       m_registers_flags_types, m_registers_enum_types);
       }
 
       for (const auto &include : target_info.includes) {
@@ -4708,13 +4859,14 @@ bool ProcessGDBRemote::GetGDBServerRegisterInfo(ArchSpec &arch_to_use) {
   if (!m_gdb_comm.GetQXferFeaturesReadSupported())
     return false;
 
-  // This holds register flags information for the whole of target.xml.
+  // These hold register type information for the whole of target.xml.
   // target.xml may include further documents that
   // GetGDBServerRegisterInfoXMLAndProcess will recurse to fetch and process.
   // That's why we clear the cache here, and not in
   // GetGDBServerRegisterInfoXMLAndProcess. To prevent it being cleared on every
   // include read.
   m_registers_flags_types.clear();
+  m_registers_enum_types.clear();
   std::vector<DynamicRegisterInfo::Register> registers;
   if (GetGDBServerRegisterInfoXMLAndProcess(arch_to_use, "target.xml",
                                             registers))
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
index 610a1ee0b34d2..b44ffefcd0d36 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
@@ -484,6 +484,11 @@ class ProcessGDBRemote : public Process,
   // entries are added. Which would invalidate any pointers set in the register
   // info up to that point.
   llvm::StringMap<std::unique_ptr<RegisterFlags>> m_registers_flags_types;
+
+  // Enum types are referenced by register fields. This does not store the data
+  // directly because the map may reallocate. Pointers to these are contained
+  // within instances of RegisterFlags.
+  llvm::StringMap<std::unique_ptr<FieldEnum>> m_registers_enum_types;
 };
 
 } // namespace process_gdb_remote
diff --git a/lldb/source/Plugins/RegisterTypeBuilder/RegisterTypeBuilderClang.cpp b/lldb/source/Plugins/RegisterTypeBuilder/RegisterTypeBuilderClang.cpp
index 067768537c068..556febc1f6cf1 100644
--- a/lldb/source/Plugins/RegisterTypeBuilder/RegisterTypeBuilderClang.cpp
+++ b/lldb/source/Plugins/RegisterTypeBuilder/RegisterTypeBuilderClang.cpp
@@ -43,8 +43,7 @@ CompilerType RegisterTypeBuilderClang::GetRegisterType(
       ScratchTypeSystemClang::GetForTarget(m_target);
   assert(type_system);
 
-  std::string register_type_name = "__lldb_register_fields_";
-  register_type_name += name;
+  std::string register_type_name = "__lldb_register_fields_" + name;
   // See if we have made this type before and can reuse it.
   CompilerType fields_type =
       type_system->GetTypeForIdentifier<clang::CXXRecordDecl>(
@@ -67,8 +66,33 @@ CompilerType RegisterTypeBuilderClang::GetRegisterType(
     // We assume that RegisterFlags has padded and sorted the fields
     // already.
     for (const RegisterFlags::Field &field : flags.GetFields()) {
+      CompilerType field_type = field_uint_type;
+
+      if (const FieldEnum *enum_type = field.GetEnum()) {
+        const FieldEnum::Enumerators &enumerators = enum_type->GetEnumerators();
+
+        if (enumerators.empty())
+          continue;
+
+        std::string enum_type_name =
+            register_type_name + "_" + field.GetName() + "_enum";
+        field_type = type_system->CreateEnumerationType(
+            enum_type_name, type_system->GetTranslationUnitDecl(),
+            OptionalClangModuleID(), Declaration(), field_uint_type, false);
+
+        type_system->StartTagDeclarationDefinition(field_type);
+
+        Declaration decl;
+        for (auto enumerator : enumerators)
+          type_system->AddEnumerationValueToEnumerationType(
+              field_type, decl, enumerator.m_name.c_str(), enumerator.m_value,
+              byte_size * 8);
+
+        type_system->CompleteTagDeclarationDefinition(field_type);
+      }
+
       type_system->AddFieldToRecordType(fields_type, field.GetName(),
-                                        field_uint_type, lldb::eAccessPublic,
+                                        field_type, lldb::eAccessPublic,
                                         field.GetSizeInBits());
     }
 
diff --git a/lldb/source/Target/RegisterFlags.cpp b/lldb/source/Target/RegisterFlags.cpp
index d8a87090a7a41..d6669a3edd003 100644
--- a/lldb/source/Target/RegisterFlags.cpp
+++ b/lldb/source/Target/RegisterFlags.cpp
@@ -370,6 +370,16 @@ void FieldEnum::Enumerator::ToXML(Stream &strm) const {
               escaped_name.c_str(), m_value);
 }
 
+void FieldEnum::Enumerator::log(Log *log) const {
+  LLDB_LOG(log, "  Name: \"{0}\" Value: {1}", m_name.c_str(), m_value);
+}
+
+void FieldEnum::log(Log *log) const {
+  LLDB_LOG(log, "ID: \"{0}\"", m_id.c_str());
+  for (const auto &enumerator : GetEnumerators())
+    enumerator.log(log);
+}
+
 void RegisterFlags::ToXML(Stream &strm) const {
   // Example XML:
   // <flags id="cpsr_flags" size="4">
diff --git a/lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py b/lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py
index e2c75970c2d2e..59beb1857ffd9 100644
--- a/lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py
+++ b/lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py
@@ -654,3 +654,325 @@ def test_flags_name_xml_reserved_characters(self):
             "register info cpsr",
             substrs=["| A< | B> | C' | D\" | E& |"],
         )
+
+    @skipIfXmlSupportMissing
+    @skipIfRemote
+    def test_no_enum(self):
+        """Check that lldb does not try to print an enum when there isn't one."""
+
+        self.setup_flags_test('<field name="E" start="0" end="0">' "</field>")
+
+        self.expect("register info cpsr", patterns=["E:.*$"], matching=False)
+
+    @skipIfXmlSupportMissing
+    @skipIfRemote
+    def test_enum_type_not_found(self):
+        """Check that lldb uses the default format if we don't find the enum type."""
+        self.setup_register_test(
+            """\
+          <flags id="cpsr_flags" size="4">
+            <field name="E" start="0" end="0" type="some_enum"/>
+          </flags>
+          <reg name="pc" bitsize="64"/>
+          <reg name="x0" regnum="0" bitsize="64"/>
+          <reg name="cpsr" regnum="33" bitsize="32" type="cpsr_flags"/>"""
+        )
+
+        self.expect("register read cpsr", patterns=["\(E = 1\)$"])
+
+    @skipIfXmlSupportMissing
+    @skipIfRemote
+    def test_enum_duplicated_evalue(self):
+        """Check that lldb only uses the last instance of a evalue for each
+        value."""
+        self.setup_register_test(
+            """\
+          <enum id="some_enum" size="4">
+            <evalue name="abc" value="1"/>
+            <evalue name="def" value="1"/>
+            <evalue name="geh" value="2"/>
+          </enum>
+          <flags id="cpsr_flags" size="4">
+            <field name="E" start="0" end="1" type="some_enum"/>
+          </flags>
+          <reg name="pc" bitsize="64"/>
+          <reg name="x0" regnum="0" bitsize="64"/>
+          <reg name="cpsr" regnum="33" bitsize="32" type="cpsr_flags"/>"""
+        )
+
+        self.expect("register info cpsr", patterns=["E: 1 = def, 2 = geh$"])
+        self.expect("register read cpsr", patterns=["\(E = def \| geh\)$"])
+
+    @skipIfXmlSupportMissing
+    @skipIfRemote
+    def test_enum_duplicated(self):
+        """Check that lldb only uses the last instance of enums with the same
+        id."""
+        self.setup_register_test(
+            """\
+          <enum id="some_enum" size="4">
+            <evalue name="abc" value="1"/>
+          </enum>
+          <enum id="some_enum" size="4">
+            <evalue name="def" value="1"/>
+          </enum>
+          <flags id="cpsr_flags" size="4">
+            <field name="E" start="0" end="0" type="some_enum"/>
+          </flags>
+          <reg name="pc" bitsize="64"/>
...
[truncated]

This teaches lldb to parse the enum XML elements sent by
lldb-server, and make use of the information in `register read`
and `register info`.

The format is described in
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Enum-Target-Types.html.

The target XML parser will drop any invalid enum or evalue. If
we find multiple evalue for the same value, we will use the last
one we find.

The order of evalues from the XML is preserved as there may be
good reason they are not in numerical order.
Copy link
Member

@medismailben medismailben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! LGTM with nits :)

* Key enum types by enum ID not register name, since they can be used
  across many registers.
* Add tests for sharing enums across registers and fields, even if
  the field names are identical.
Copy link
Collaborator

@jasonmolenda jasonmolenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. The only thing I could suggest is that we check the specified fields/size are less than 64 bits when reading the enum field definitions, but that means I can specify fields above 32 bits and then associate that with a 32-bit cpsr register and it won't be detected. I feel like it's critical we be that careful with our input handling, but I thought I'd mention it to see what you think.

@DavidSpickett
Copy link
Collaborator Author

The only thing I could suggest is that we check the specified fields/size are less than 64 bits when reading the enum field definitions, but that means I can specify fields above 32 bits and then associate that with a 32-bit cpsr register and it won't be detected.

I have updated the test test_evalue_value_limits to check that we are able to parse an evalue that is the maximum 64 bit unsigned value, and that anything above that is rejected.

The other part is covered by the test test_enum_value_range.

So you could define an enum with a > 32 bit evalue, then assign that to a 32 bit register that had a single 32 bit field. lldb would check that all the evalues of that enum fit into that field. They do not, so it will ignore the enum.

Any other 64 bit registers with fields large enough to be compatible can still use the enum.

<enum id="some_enum" size="8">
  <evalue name="min" value="0"/>
  <evalue name="max" value="0xffffffffffffffff"/>
</enum>
<flags id="x0_flags" size="8">
  <!-- This works because all the evalues of some_enum fit into this 64 bit field -->
  <field name="foo" start="0" end="63" type="some_enum"/>
 </flags>
<reg name="x0" regnum="0" bitsize="64" type="x0_flags"/>
<flags id="cpsr_flags" size="4">
  <!-- The type here is ignored because the evalue "max" won't fit into the field. The field will just be treated as an unsigned integer. -->
  <field name="foo" start="0" end="31" type="some_enum"/>
</flags>
<reg name="cpsr" regnum="33" bitsize="32" type="cpsr_flags"/>

There are log messages for both these scenarios.

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit late to the party, I was out sick last week. :p

I think I understand this change and it looks fine to me overall.
One thing I noticed is that a lot of things that could be considered errors are only written to the logs and then are silently glossed over. That information is contained only within these parse functions, so it can't really be percolated back up and displayed to developers in any meaningful way. From a developer's perspective, something may just not work because the debug stub gave you invalid data and only by hunting in the logs can you determine what went wrong. Perhaps this could be changed (not in this PR) to return not only the parsed data but also some error information? Like some union of (ParsedData, ParsingErrors) and the callers can determine what to do with the error? The error could then be surfaced back in a meaningful way, like "hey the server gave me some bunk data so pretty-printing registers might look weird". What do you think?

Log *log(GetLog(GDBRLog::Process));
// We will use the last instance of each value. Also we preserve the order
// of declaration in the XML, as it may not be numerical.
std::map<uint64_t, FieldEnum::Enumerator> enumerators;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know the density of the keys here? Might be a good candidate for llvm::IndexedMap?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory it's an enum so you'd start from 0 then 1, 2,3,4, etc. but this is hardware, so it's not going to be as regular as software enums.

The other problem with IndexedMap is that you've got to have a function uint64_t -> size_t to make the index, so for a 32 bit lldb the naive version ends up clipping off the top half of values. Are we likely to get a 64 bit enumerator value? No, but it's simpler to handle it than decide on another arbitrary cut off point.

If we ignore that issue, there is the issue of order of insertion. I'm assuming you iterate an IndexedMap using the indexes, but if the values of the enumerator are used for the indexes we may lose the original order.

Maybe if that requirement didn't come across clearly enough, I need to expand the comment with an example?

Or maybe I have the wrong end of the stick entirely.

Copy link
Collaborator Author

@DavidSpickett DavidSpickett Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've expanded the comment anyway.

I don't think anyone on the GDB side ever considered that order might matter, because I don't think they have a register info equivalent. You can print the type of the register but that's more of a C type, where enumerators being sorted isn't unexpected.

(where "matter" is defined as "I thought of some niche use case that would be nice to have" :) )

Gray code values is another use case for this. 00, 01, 11, 10 - you might list it that way and wouldn't want lldb to sort it into normal numerical order.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, was mostly asking to get an idea of what one can expect the data to look like. std::map is a fine choice here :)

@DavidSpickett
Copy link
Collaborator Author

The error could then be surfaced back in a meaningful way, like "hey the server gave me some bunk data so pretty-printing registers might look weird". What do you think?

IIRC GDB will do something like this, though it may just be (paraphrasing) "yo this XML be weird". Even so, we could say "... check the log for more detail". There may be a lot of log output even for relatively normal sessions, and I can tune that to some extent but even so, some users won't care.

I'll see how much output there is at the moment.

@DavidSpickett
Copy link
Collaborator Author

DavidSpickett commented Jun 27, 2024

I'm going to land this now but I can fix up the map type later if you want.

@DavidSpickett DavidSpickett merged commit ba60d8a into llvm:main Jun 27, 2024
6 checks passed
@DavidSpickett DavidSpickett deleted the lldb-enums-pr3 branch June 27, 2024 09:03
@chelcassanova
Copy link
Contributor

Hey David! It looks like this commit broke the TestXMLRegisterFlags.py test on macOS sanitized buildbots: https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake-sanitized/425/console

@DavidSpickett
Copy link
Collaborator Author

Thanks for the report, fixed in #97270.

lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
This teaches lldb to parse the enum XML elements sent by lldb-server,
and make use of the information in `register read` and `register info`.

The format is described in

https://sourceware.org/gdb/current/onlinedocs/gdb.html/Enum-Target-Types.html.

The target XML parser will drop any invalid enum or evalue. If we find
multiple evalue for the same value, we will use the last one we find.

The order of evalues from the XML is preserved as there may be good
reason they are not in numerical order.
lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
This teaches lldb to parse the enum XML elements sent by lldb-server,
and make use of the information in `register read` and `register info`.

The format is described in

https://sourceware.org/gdb/current/onlinedocs/gdb.html/Enum-Target-Types.html.

The target XML parser will drop any invalid enum or evalue. If we find
multiple evalue for the same value, we will use the last one we find.

The order of evalues from the XML is preserved as there may be good
reason they are not in numerical order.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants