Skip to content

Add register lookup as another fallback computation for address-expressions #85492

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 25, 2024

Conversation

jimingham
Copy link
Collaborator

The idea behind the address-expression is that it handles all the common expressions that produce addresses. It handles actual valid expressions that return a scalar, and it handles useful cases that the various source languages don't support. At present, the fallback handles:

<symbol_name>{+-}

which isn't valid C but is very handy.

This patch adds handling of:

$<reg_name>

and

$<reg_name>{+-}

That's kind of pointless in C because the C expression parser handles that expression already. But some languages don't have a straightforward way to represent register values like this (swift) so having this fallback is quite a quality of life improvement.

I added a test which tests that I didn't mess up either of these fallbacks, though it doesn't test the actually handling of registers that I added, since the expression parser for C succeeds in that case and returns before this code gets run.

I will add a test on the swift fork for that checks that this works the same way for a swift frame after this check.

…ssions

The idea behind the address-expression is that it handles all the common
expressions that produce addresses.  It handles actual valid expressions
that return a scalar, and it handles useful cases that the various source languages
don't support.  At present, the fallback handles:

<symbol_name>{+-}<offset>

which isn't valid C but is very handy.

This patch adds handling of:

$<reg_name>

and

$<reg_name>{+-}<offset>

That's kind of pointless in C because the C expression parser handles
that expression already.  But some languages don't have a straightforward
way to represent register values like this (swift) so having this fallback
is quite a quality of life improvement.

I added a test which tests that I didn't mess up either of these fallbacks,
though it doesn't test the actually handling of registers that I added, since
the expression parser for C succeeds in that case and returns before this code
gets run.

I will add a test on the swift fork for that checks that this works the same
way for a swift frame after this check.
@llvmbot
Copy link
Member

llvmbot commented Mar 16, 2024

@llvm/pr-subscribers-lldb

Author: None (jimingham)

Changes

The idea behind the address-expression is that it handles all the common expressions that produce addresses. It handles actual valid expressions that return a scalar, and it handles useful cases that the various source languages don't support. At present, the fallback handles:

<symbol_name>{+-}<offset>

which isn't valid C but is very handy.

This patch adds handling of:

$<reg_name>

and

$<reg_name>{+-}<offset>

That's kind of pointless in C because the C expression parser handles that expression already. But some languages don't have a straightforward way to represent register values like this (swift) so having this fallback is quite a quality of life improvement.

I added a test which tests that I didn't mess up either of these fallbacks, though it doesn't test the actually handling of registers that I added, since the expression parser for C succeeds in that case and returns before this code gets run.

I will add a test on the swift fork for that checks that this works the same way for a swift frame after this check.


Full diff: https://github.com/llvm/llvm-project/pull/85492.diff

4 Files Affected:

  • (modified) lldb/source/Interpreter/OptionArgParser.cpp (+42-7)
  • (added) lldb/test/API/commands/target/modules/lookup/Makefile (+4)
  • (added) lldb/test/API/commands/target/modules/lookup/TestImageLookupPCExpression.py (+27)
  • (added) lldb/test/API/commands/target/modules/lookup/main.c (+15)
diff --git a/lldb/source/Interpreter/OptionArgParser.cpp b/lldb/source/Interpreter/OptionArgParser.cpp
index 75ccad87467e95..dd914138fe0e6b 100644
--- a/lldb/source/Interpreter/OptionArgParser.cpp
+++ b/lldb/source/Interpreter/OptionArgParser.cpp
@@ -9,7 +9,9 @@
 #include "lldb/Interpreter/OptionArgParser.h"
 #include "lldb/DataFormatters/FormatManager.h"
 #include "lldb/Target/ABI.h"
+#include "lldb/Target/RegisterContext.h"
 #include "lldb/Target/Target.h"
+#include "lldb/Utility/RegisterValue.h"
 #include "lldb/Utility/Status.h"
 #include "lldb/Utility/StreamString.h"
 
@@ -233,24 +235,57 @@ OptionArgParser::DoToAddress(const ExecutionContext *exe_ctx, llvm::StringRef s,
   // Since the compiler can't handle things like "main + 12" we should try to
   // do this for now. The compiler doesn't like adding offsets to function
   // pointer types.
+  // Some languages also don't have a natural representation for register
+  // values (e.g. swift) so handle simple uses of them here as well.
   static RegularExpression g_symbol_plus_offset_regex(
-      "^(.*)([-\\+])[[:space:]]*(0x[0-9A-Fa-f]+|[0-9]+)[[:space:]]*$");
+      "^(\\$[^ +-]+)|(([^ +-]+)([-\\+])[[:space:]]*(0x[0-9A-Fa-f]+|[0-9]+)[[:space:]]*)$");
 
   llvm::SmallVector<llvm::StringRef, 4> matches;
   if (g_symbol_plus_offset_regex.Execute(sref, &matches)) {
     uint64_t offset = 0;
-    llvm::StringRef name = matches[1];
-    llvm::StringRef sign = matches[2];
-    llvm::StringRef str_offset = matches[3];
-    if (!str_offset.getAsInteger(0, offset)) {
+    llvm::StringRef name;
+    if (!matches[1].empty())
+      name = matches[1];
+    else
+      name = matches[3];
+
+    llvm::StringRef sign = matches[4];
+    llvm::StringRef str_offset = matches[5];
+    std::optional<lldb::addr_t> register_value;
+    StackFrame *frame = exe_ctx->GetFramePtr();
+    if (frame && !name.empty() && name[0] == '$') {
+      // Some languages don't have a natural type for register values, but it
+      // is still useful to look them up here:
+      RegisterContextSP reg_ctx_sp = frame->GetRegisterContext();
+      if (reg_ctx_sp) {
+        llvm::StringRef base_name = name.substr(1);
+        const RegisterInfo *reg_info = reg_ctx_sp->GetRegisterInfoByName(base_name);
+        if (reg_info) {
+          RegisterValue reg_val;
+          bool success = reg_ctx_sp->ReadRegister(reg_info, reg_val);
+          if (success && reg_val.GetType() != RegisterValue::eTypeInvalid) {
+            register_value = reg_val.GetAsUInt64(0, &success);
+            if (!success)
+              register_value.reset();
+          }
+        }
+      } 
+    }
+    if (!str_offset.empty() && !str_offset.getAsInteger(0, offset)) {
       Status error;
-      addr = ToAddress(exe_ctx, name, LLDB_INVALID_ADDRESS, &error);
+      if (register_value)
+        addr = register_value.value();
+      else
+        addr = ToAddress(exe_ctx, name, LLDB_INVALID_ADDRESS, &error);
       if (addr != LLDB_INVALID_ADDRESS) {
         if (sign[0] == '+')
           return addr + offset;
         return addr - offset;
       }
-    }
+    } else if (register_value)
+      // In the case of register values, someone might just want to get the 
+      // value in a language whose expression parser doesn't support registers.
+      return register_value.value();
   }
 
   if (error_ptr)
diff --git a/lldb/test/API/commands/target/modules/lookup/Makefile b/lldb/test/API/commands/target/modules/lookup/Makefile
new file mode 100644
index 00000000000000..695335e068c0c9
--- /dev/null
+++ b/lldb/test/API/commands/target/modules/lookup/Makefile
@@ -0,0 +1,4 @@
+C_SOURCES := main.c
+CFLAGS_EXTRAS := -std=c99
+
+include Makefile.rules
diff --git a/lldb/test/API/commands/target/modules/lookup/TestImageLookupPCExpression.py b/lldb/test/API/commands/target/modules/lookup/TestImageLookupPCExpression.py
new file mode 100644
index 00000000000000..9872e057cbbfba
--- /dev/null
+++ b/lldb/test/API/commands/target/modules/lookup/TestImageLookupPCExpression.py
@@ -0,0 +1,27 @@
+"""
+Make sure that "target modules lookup -va $pc" works
+"""
+
+
+import lldb
+import lldbsuite.test.lldbutil as lldbutil
+from lldbsuite.test.lldbtest import *
+
+
+class TestImageLookupPCInC(TestBase):
+    def test_sample_rename_this(self):
+        """There can be many tests in a test case - describe this test here."""
+        self.build()
+        self.main_source_file = lldb.SBFileSpec("main.c")
+        self.sample_test()
+
+    def sample_test(self):
+        """Make sure the address expression resolves to the right function"""
+
+        (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
+            self, "Set a breakpoint here", self.main_source_file
+        )
+
+        self.expect("target modules lookup -va $pc", substrs=["doSomething"])
+        self.expect("target modules lookup -va $pc+4", substrs=["doSomething"])
+
diff --git a/lldb/test/API/commands/target/modules/lookup/main.c b/lldb/test/API/commands/target/modules/lookup/main.c
new file mode 100644
index 00000000000000..afe962f30916c9
--- /dev/null
+++ b/lldb/test/API/commands/target/modules/lookup/main.c
@@ -0,0 +1,15 @@
+#include <stdio.h>
+
+void
+doSomething()
+{
+  printf ("Set a breakpoint here.\n");
+  printf ("Need a bit more code.\n");
+}
+
+int
+main()
+{
+  doSomething();
+  return 0;
+}

Copy link

github-actions bot commented Mar 16, 2024

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r 9405d5af65853ac548cce2656497195010db1d86...59fd0e9b4f598b56d79be9e75eeed0043c514f7b lldb/test/API/commands/target/modules/lookup/TestImageLookupPCExpression.py
View the diff from darker here.
--- TestImageLookupPCExpression.py	2024-03-16 00:56:20.000000 +0000
+++ TestImageLookupPCExpression.py	2024-03-18 22:21:26.358279 +0000
@@ -22,6 +22,5 @@
             self, "Set a breakpoint here", self.main_source_file
         )
 
         self.expect("target modules lookup -va $pc", substrs=["doSomething"])
         self.expect("target modules lookup -va $pc+4", substrs=["doSomething"])
-

Copy link

github-actions bot commented Mar 16, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 9405d5af65853ac548cce2656497195010db1d86 59fd0e9b4f598b56d79be9e75eeed0043c514f7b -- lldb/test/API/commands/target/modules/lookup/main.c lldb/source/Interpreter/OptionArgParser.cpp
View the diff from clang-format here.
diff --git a/lldb/source/Interpreter/OptionArgParser.cpp b/lldb/source/Interpreter/OptionArgParser.cpp
index 9a8275128e..860a1e4bc1 100644
--- a/lldb/source/Interpreter/OptionArgParser.cpp
+++ b/lldb/source/Interpreter/OptionArgParser.cpp
@@ -248,7 +248,8 @@ OptionArgParser::DoToAddress(const ExecutionContext *exe_ctx, llvm::StringRef s,
   // 4: +/-
   // 5: The offset value.
   static RegularExpression g_symbol_plus_offset_regex(
-      "^(\\$[^ +-]+)|(([^ +-]+)([-\\+])[[:space:]]*(0x[0-9A-Fa-f]+|[0-9]+)[[:space:]]*)$");
+      "^(\\$[^ +-]+)|(([^ "
+      "+-]+)([-\\+])[[:space:]]*(0x[0-9A-Fa-f]+|[0-9]+)[[:space:]]*)$");
 
   llvm::SmallVector<llvm::StringRef, 4> matches;
   if (g_symbol_plus_offset_regex.Execute(sref, &matches)) {
@@ -270,7 +271,8 @@ OptionArgParser::DoToAddress(const ExecutionContext *exe_ctx, llvm::StringRef s,
     if (frame && reg_name.consume_front("$")) {
       RegisterContextSP reg_ctx_sp = frame->GetRegisterContext();
       if (reg_ctx_sp) {
-        const RegisterInfo *reg_info = reg_ctx_sp->GetRegisterInfoByName(reg_name);
+        const RegisterInfo *reg_info =
+            reg_ctx_sp->GetRegisterInfoByName(reg_name);
         if (reg_info) {
           RegisterValue reg_val;
           bool success = reg_ctx_sp->ReadRegister(reg_info, reg_val);
@@ -280,7 +282,7 @@ OptionArgParser::DoToAddress(const ExecutionContext *exe_ctx, llvm::StringRef s,
               register_value.reset();
           }
         }
-      } 
+      }
     }
     if (!str_offset.empty() && !str_offset.getAsInteger(0, offset)) {
       Status error;
@@ -294,7 +296,7 @@ OptionArgParser::DoToAddress(const ExecutionContext *exe_ctx, llvm::StringRef s,
         return addr - offset;
       }
     } else if (register_value)
-      // In the case of register values, someone might just want to get the 
+      // In the case of register values, someone might just want to get the
       // value in a language whose expression parser doesn't support registers.
       return register_value.value();
   }
diff --git a/lldb/test/API/commands/target/modules/lookup/main.c b/lldb/test/API/commands/target/modules/lookup/main.c
index afe962f309..f2be761a72 100644
--- a/lldb/test/API/commands/target/modules/lookup/main.c
+++ b/lldb/test/API/commands/target/modules/lookup/main.c
@@ -1,15 +1,11 @@
 #include <stdio.h>
 
-void
-doSomething()
-{
-  printf ("Set a breakpoint here.\n");
-  printf ("Need a bit more code.\n");
+void doSomething() {
+  printf("Set a breakpoint here.\n");
+  printf("Need a bit more code.\n");
 }
 
-int
-main()
-{
+int main() {
   doSomething();
   return 0;
 }

llvm::StringRef str_offset = matches[5];
std::optional<lldb::addr_t> register_value;
StackFrame *frame = exe_ctx->GetFramePtr();
if (frame && !name.empty() && name[0] == '$') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (frame && !name.empty() && name[0] == '$') {
if (frame && name.starts_with("$") {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or even better:
StringRef regname = name.consume_front("$");
if (!regname.empty) ...

if (!matches[1].empty())
name = matches[1];
else
name = matches[3];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might help to comment with an example what match[1,3] are:

// $pc + 0  ... match[1] == "$pc"

@adrian-prantl
Copy link
Collaborator

This is going to be incredibly useful, thanks!

@jimingham jimingham merged commit 2c76e88 into llvm:main Mar 25, 2024
@jimingham jimingham deleted the reg-name-fallback branch March 25, 2024 22:17
jimingham added a commit to jimingham/from-apple-llvm-project that referenced this pull request Mar 25, 2024
…ssions (llvm#85492)

The idea behind the address-expression is that it handles all the common
expressions that produce addresses. It handles actual valid expressions
that return a scalar, and it handles useful cases that the various
source languages don't support. At present, the fallback handles:

<symbol_name>{+-}<offset>

which isn't valid C but is very handy.

This patch adds handling of:

$<reg_name>

and

$<reg_name>{+-}<offset>

That's kind of pointless in C because the C expression parser handles
that expression already. But some languages don't have a straightforward
way to represent register values like this (swift) so having this
fallback is quite a quality of life improvement.

I added a test which tests that I didn't mess up either of these
fallbacks, though it doesn't test the actually handling of registers
that I added, since the expression parser for C succeeds in that case
and returns before this code gets run.

I will add a test on the swift fork for that checks that this works the
same way for a swift frame after this check.

(cherry picked from commit 2c76e88)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants