Skip to content

[lldb] Improve summary string handling of dollar chars #98190

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

kastiglione
Copy link
Contributor

@kastiglione kastiglione commented Jul 9, 2024

This improves the handling of $ (dollar) characters in summary strings in the following ways:

  1. When a $ is not followed by an open paren ({), it should be treated as a literal character and preserved in the output. Previously, the dollar would be consumed by the parser and not shown in the output.
  2. When a $ is the last character of a format string, this change eliminates the infinite loop lldb would enter into.

rdar://131392446

@kastiglione kastiglione requested a review from JDevlieghere as a code owner July 9, 2024 17:08
@llvmbot llvmbot added the lldb label Jul 9, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 9, 2024

@llvm/pr-subscribers-lldb

Author: Dave Lee (kastiglione)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/98190.diff

4 Files Affected:

  • (modified) lldb/source/Core/FormatEntity.cpp (+126-132)
  • (added) lldb/test/API/functionalities/data-formatter/special-chars/Makefile (+2)
  • (added) lldb/test/API/functionalities/data-formatter/special-chars/TestSummaryStringSpecialChars.py (+19)
  • (added) lldb/test/API/functionalities/data-formatter/special-chars/main.c (+9)
diff --git a/lldb/source/Core/FormatEntity.cpp b/lldb/source/Core/FormatEntity.cpp
index fe95858f35c9f..4e1f37099148b 100644
--- a/lldb/source/Core/FormatEntity.cpp
+++ b/lldb/source/Core/FormatEntity.cpp
@@ -2170,154 +2170,148 @@ static Status ParseInternal(llvm::StringRef &format, Entry &parent_entry,
     } break;
 
     case '$':
-      if (format.size() == 1) {
-        // '$' at the end of a format string, just print the '$'
+      format = format.drop_front(); // Skip the '$'
+      if (format.empty() || format.front() != '{') {
+        // Print '$' when not followed by '{'.
         parent_entry.AppendText("$");
       } else {
-        format = format.drop_front(); // Skip the '$'
-
-        if (format[0] == '{') {
-          format = format.drop_front(); // Skip the '{'
-
-          llvm::StringRef variable, variable_format;
-          error = FormatEntity::ExtractVariableInfo(format, variable,
-                                                    variable_format);
-          if (error.Fail())
-            return error;
-          bool verify_is_thread_id = false;
-          Entry entry;
-          if (!variable_format.empty()) {
-            entry.printf_format = variable_format.str();
-
-            // If the format contains a '%' we are going to assume this is a
-            // printf style format. So if you want to format your thread ID
-            // using "0x%llx" you can use: ${thread.id%0x%llx}
-            //
-            // If there is no '%' in the format, then it is assumed to be a
-            // LLDB format name, or one of the extended formats specified in
-            // the switch statement below.
-
-            if (entry.printf_format.find('%') == std::string::npos) {
-              bool clear_printf = false;
-
-              if (entry.printf_format.size() == 1) {
-                switch (entry.printf_format[0]) {
-                case '@': // if this is an @ sign, print ObjC description
-                  entry.number = ValueObject::
-                      eValueObjectRepresentationStyleLanguageSpecific;
-                  clear_printf = true;
-                  break;
-                case 'V': // if this is a V, print the value using the default
-                          // format
-                  entry.number =
-                      ValueObject::eValueObjectRepresentationStyleValue;
-                  clear_printf = true;
-                  break;
-                case 'L': // if this is an L, print the location of the value
-                  entry.number =
-                      ValueObject::eValueObjectRepresentationStyleLocation;
-                  clear_printf = true;
-                  break;
-                case 'S': // if this is an S, print the summary after all
-                  entry.number =
-                      ValueObject::eValueObjectRepresentationStyleSummary;
-                  clear_printf = true;
-                  break;
-                case '#': // if this is a '#', print the number of children
-                  entry.number =
-                      ValueObject::eValueObjectRepresentationStyleChildrenCount;
-                  clear_printf = true;
-                  break;
-                case 'T': // if this is a 'T', print the type
-                  entry.number =
-                      ValueObject::eValueObjectRepresentationStyleType;
-                  clear_printf = true;
-                  break;
-                case 'N': // if this is a 'N', print the name
-                  entry.number =
-                      ValueObject::eValueObjectRepresentationStyleName;
-                  clear_printf = true;
-                  break;
-                case '>': // if this is a '>', print the expression path
-                  entry.number = ValueObject::
-                      eValueObjectRepresentationStyleExpressionPath;
-                  clear_printf = true;
-                  break;
-                }
+        format = format.drop_front(); // Skip the '{'
+
+        llvm::StringRef variable, variable_format;
+        error = FormatEntity::ExtractVariableInfo(format, variable,
+                                                  variable_format);
+        if (error.Fail())
+          return error;
+        bool verify_is_thread_id = false;
+        Entry entry;
+        if (!variable_format.empty()) {
+          entry.printf_format = variable_format.str();
+
+          // If the format contains a '%' we are going to assume this is a
+          // printf style format. So if you want to format your thread ID
+          // using "0x%llx" you can use: ${thread.id%0x%llx}
+          //
+          // If there is no '%' in the format, then it is assumed to be a
+          // LLDB format name, or one of the extended formats specified in
+          // the switch statement below.
+
+          if (entry.printf_format.find('%') == std::string::npos) {
+            bool clear_printf = false;
+
+            if (entry.printf_format.size() == 1) {
+              switch (entry.printf_format[0]) {
+              case '@': // if this is an @ sign, print ObjC description
+                entry.number = ValueObject::
+                    eValueObjectRepresentationStyleLanguageSpecific;
+                clear_printf = true;
+                break;
+              case 'V': // if this is a V, print the value using the default
+                        // format
+                entry.number =
+                    ValueObject::eValueObjectRepresentationStyleValue;
+                clear_printf = true;
+                break;
+              case 'L': // if this is an L, print the location of the value
+                entry.number =
+                    ValueObject::eValueObjectRepresentationStyleLocation;
+                clear_printf = true;
+                break;
+              case 'S': // if this is an S, print the summary after all
+                entry.number =
+                    ValueObject::eValueObjectRepresentationStyleSummary;
+                clear_printf = true;
+                break;
+              case '#': // if this is a '#', print the number of children
+                entry.number =
+                    ValueObject::eValueObjectRepresentationStyleChildrenCount;
+                clear_printf = true;
+                break;
+              case 'T': // if this is a 'T', print the type
+                entry.number = ValueObject::eValueObjectRepresentationStyleType;
+                clear_printf = true;
+                break;
+              case 'N': // if this is a 'N', print the name
+                entry.number = ValueObject::eValueObjectRepresentationStyleName;
+                clear_printf = true;
+                break;
+              case '>': // if this is a '>', print the expression path
+                entry.number =
+                    ValueObject::eValueObjectRepresentationStyleExpressionPath;
+                clear_printf = true;
+                break;
               }
+            }
 
-              if (entry.number == 0) {
-                if (FormatManager::GetFormatFromCString(
-                        entry.printf_format.c_str(), entry.fmt)) {
-                  clear_printf = true;
-                } else if (entry.printf_format == "tid") {
-                  verify_is_thread_id = true;
-                } else {
-                  error.SetErrorStringWithFormat("invalid format: '%s'",
-                                                 entry.printf_format.c_str());
-                  return error;
-                }
+            if (entry.number == 0) {
+              if (FormatManager::GetFormatFromCString(
+                      entry.printf_format.c_str(), entry.fmt)) {
+                clear_printf = true;
+              } else if (entry.printf_format == "tid") {
+                verify_is_thread_id = true;
+              } else {
+                error.SetErrorStringWithFormat("invalid format: '%s'",
+                                               entry.printf_format.c_str());
+                return error;
               }
-
-              // Our format string turned out to not be a printf style format
-              // so lets clear the string
-              if (clear_printf)
-                entry.printf_format.clear();
             }
-          }
 
-          // Check for dereferences
-          if (variable[0] == '*') {
-            entry.deref = true;
-            variable = variable.drop_front();
+            // Our format string turned out to not be a printf style format
+            // so lets clear the string
+            if (clear_printf)
+              entry.printf_format.clear();
           }
+        }
 
-          error = ParseEntry(variable, &g_root, entry);
-          if (error.Fail())
-            return error;
+        // Check for dereferences
+        if (variable[0] == '*') {
+          entry.deref = true;
+          variable = variable.drop_front();
+        }
 
-          llvm::StringRef entry_string(entry.string);
-          if (entry_string.contains(':')) {
-            auto [_, llvm_format] = entry_string.split(':');
-            if (!llvm_format.empty() && !LLVMFormatPattern.match(llvm_format)) {
-              error.SetErrorStringWithFormat("invalid llvm format: '%s'",
-                                             llvm_format.data());
-              return error;
-            }
+        error = ParseEntry(variable, &g_root, entry);
+        if (error.Fail())
+          return error;
+
+        llvm::StringRef entry_string(entry.string);
+        if (entry_string.contains(':')) {
+          auto [_, llvm_format] = entry_string.split(':');
+          if (!llvm_format.empty() && !LLVMFormatPattern.match(llvm_format)) {
+            error.SetErrorStringWithFormat("invalid llvm format: '%s'",
+                                           llvm_format.data());
+            return error;
           }
+        }
 
-          if (verify_is_thread_id) {
-            if (entry.type != Entry::Type::ThreadID &&
-                entry.type != Entry::Type::ThreadProtocolID) {
-              error.SetErrorString("the 'tid' format can only be used on "
-                                   "${thread.id} and ${thread.protocol_id}");
-            }
+        if (verify_is_thread_id) {
+          if (entry.type != Entry::Type::ThreadID &&
+              entry.type != Entry::Type::ThreadProtocolID) {
+            error.SetErrorString("the 'tid' format can only be used on "
+                                 "${thread.id} and ${thread.protocol_id}");
           }
+        }
 
-          switch (entry.type) {
-          case Entry::Type::Variable:
-          case Entry::Type::VariableSynthetic:
-            if (entry.number == 0) {
-              if (entry.string.empty())
-                entry.number =
-                    ValueObject::eValueObjectRepresentationStyleValue;
-              else
-                entry.number =
-                    ValueObject::eValueObjectRepresentationStyleSummary;
-            }
-            break;
-          default:
-            // Make sure someone didn't try to dereference anything but ${var}
-            // or ${svar}
-            if (entry.deref) {
-              error.SetErrorStringWithFormat(
-                  "${%s} can't be dereferenced, only ${var} and ${svar} can.",
-                  variable.str().c_str());
-              return error;
-            }
+        switch (entry.type) {
+        case Entry::Type::Variable:
+        case Entry::Type::VariableSynthetic:
+          if (entry.number == 0) {
+            if (entry.string.empty())
+              entry.number = ValueObject::eValueObjectRepresentationStyleValue;
+            else
+              entry.number =
+                  ValueObject::eValueObjectRepresentationStyleSummary;
+          }
+          break;
+        default:
+          // Make sure someone didn't try to dereference anything but ${var}
+          // or ${svar}
+          if (entry.deref) {
+            error.SetErrorStringWithFormat(
+                "${%s} can't be dereferenced, only ${var} and ${svar} can.",
+                variable.str().c_str());
+            return error;
           }
-          parent_entry.AppendEntry(std::move(entry));
         }
+        parent_entry.AppendEntry(std::move(entry));
       }
       break;
     }
diff --git a/lldb/test/API/functionalities/data-formatter/special-chars/Makefile b/lldb/test/API/functionalities/data-formatter/special-chars/Makefile
new file mode 100644
index 0000000000000..197a5cc71fd75
--- /dev/null
+++ b/lldb/test/API/functionalities/data-formatter/special-chars/Makefile
@@ -0,0 +1,2 @@
+C_SOURCES = main.c
+include Makefile.rules
diff --git a/lldb/test/API/functionalities/data-formatter/special-chars/TestSummaryStringSpecialChars.py b/lldb/test/API/functionalities/data-formatter/special-chars/TestSummaryStringSpecialChars.py
new file mode 100644
index 0000000000000..98de031ac3b36
--- /dev/null
+++ b/lldb/test/API/functionalities/data-formatter/special-chars/TestSummaryStringSpecialChars.py
@@ -0,0 +1,19 @@
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+
+class TestCase(TestBase):
+
+    def test_summary_string_with_bare_dollar_char(self):
+        self.build()
+        lldbutil.run_to_source_breakpoint(self, "break here", lldb.SBFileSpec("main.c"))
+        self.runCmd("type summary add --summary-string '$ $CASH $' --no-value Dollars")
+        self.expect("v cash", startstr="(Dollars) cash = $ $CASH $")
+
+    def test_summary_string_with_bare_dollar_char_before_var(self):
+        self.build()
+        lldbutil.run_to_source_breakpoint(self, "break here", lldb.SBFileSpec("main.c"))
+        self.runCmd("type summary add --summary-string '$${var}' --no-value Dollars")
+        self.expect("v cash", startstr="(Dollars) cash = $99")
diff --git a/lldb/test/API/functionalities/data-formatter/special-chars/main.c b/lldb/test/API/functionalities/data-formatter/special-chars/main.c
new file mode 100644
index 0000000000000..36e6ea91da1fb
--- /dev/null
+++ b/lldb/test/API/functionalities/data-formatter/special-chars/main.c
@@ -0,0 +1,9 @@
+#include <stdio.h>
+
+typedef int Dollars;
+
+int main() {
+  Dollars cash = 99;
+  printf("break here: %d\n", cash);
+  return 0;
+}

@kastiglione kastiglione requested a review from jimingham July 9, 2024 17:10
Copy link

github-actions bot commented Jul 9, 2024

✅ With the latest revision this PR passed the Python code formatter.

@kastiglione
Copy link
Contributor Author

Note that this is essentially a few lines of concrete change, most of the change is whitespace only. Use ?w=1 to filter out whitespace only changes: https://github.com/llvm/llvm-project/pull/98190/files?w=1

Copy link
Collaborator

@jimingham jimingham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kastiglione kastiglione merged commit 10f3f06 into llvm:main Jul 9, 2024
4 of 5 checks passed
@kastiglione kastiglione deleted the lldb-Improve-summary-string-handling-of-dollar-chars branch July 9, 2024 20:35
kastiglione added a commit to swiftlang/llvm-project that referenced this pull request Jul 9, 2024
This improves the handling of `$` (dollar) characters in summary strings in the
following ways:

1. When a `$` is not followed by an open paren (`{`), it should be treated as a literal
character and preserved in the output. Previously, the dollar would be consumed by the
parser and not shown in the output.
2. When a `$` is the last character of a format string, this change eliminates the
infinite loop lldb would enter into.

rdar://131392446
(cherry picked from commit 10f3f06)
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
This improves the handling of `$` (dollar) characters in summary strings in the 
following ways:

1. When a `$` is not followed by an open paren (`{`), it should be treated as a literal 
character and preserved in the output. Previously, the dollar would be consumed by the 
parser and not shown in the output.
2. When a `$` is the last character of a format string, this change eliminates the 
infinite loop lldb would enter into.

rdar://131392446
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants