[lldb] Don't invalid register context after setting thread pc's #109499

Merged

Conversation

jasonmolenda
Collaborator

Some gdb remote serial protocol stubs will send the thread IDs and PCs for all threads in a process in the stop-reply packet. lldb often needs to know the pc values for all threads while at a private stop, and that results in <n-1> read-register packets for <n> threads, and can be a big performance problem when this is a hot code path.
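
As a rough illustration of what "expedited" thread PCs look like on the wire, here is a minimal Python sketch that pulls a thread-ID-to-pc map out of a stop-reply packet. It assumes the stub reports the values under `threads` and `thread-pcs` keys as comma-separated hex lists; the sample packet, the function name, and the simplified parsing are hypothetical and are not lldb's actual implementation.

```python
def parse_expedited_thread_pcs(stop_reply):
    """Return {thread_id: pc} from a stop-reply packet, or {} if absent.

    Assumes 'threads' and 'thread-pcs' keys hold comma-separated hex values
    in matching order (an assumption about the stub's packet layout).
    """
    # Skip the leading 'T' and the two-hex-digit signal number, e.g. "T05".
    body = stop_reply[3:]
    fields = {}
    for pair in body.split(";"):
        if ":" in pair:
            key, value = pair.split(":", 1)
            fields[key] = value
    tids = [int(t, 16) for t in fields.get("threads", "").split(",") if t]
    pcs = [int(p, 16) for p in fields.get("thread-pcs", "").split(",") if p]
    if len(tids) != len(pcs):
        # Mismatched lists: treat the packet as carrying no usable pcs.
        return {}
    return dict(zip(tids, pcs))


# Hypothetical packet for a three-thread process.
sample = "T05thread:1a;threads:1a,1b,1c;thread-pcs:100003f00,100003f40,100003f80;"
thread_pcs = parse_expedited_thread_pcs(sample)
```

With a map like this in hand, a debugger can seed each thread's pc without sending a separate read-register packet per thread.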

GDBRemoteRegisterContext tracks the StopID of when its values were set, and when the thread's StopID has incremented, it marks all values it has as Invalid, and knows to refetch them.

We have a code path that resulted in setting the PCs for all the threads, and then ProcessGDBRemote::CalculateThreadStopInfo forcing an invalidation of all the register contexts, forcing us to re-read the pc values for all threads except the one that stopped.

There are times when it is valid to force an invalidation of the register cache - for instance, if the layout of the registers has changed because the processor state is different, or we've sent a write-all-registers packet to the inferior and we want to make sure we stay in sync with the inferior. But there was no reason for this method to be forcing the register context to be invalid.
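
The interaction described above can be modeled with a small Python sketch. It is a simplified stand-in, not the real `GDBRemoteRegisterContext`: the class, its method names, and the way the stop ID is passed in are all hypothetical, and it counts a re-read for every thread whose cache was forced rather than modeling the stopped thread separately.

```python
class RegisterCache:
    """Toy per-thread register cache keyed by the stop ID it was filled at."""

    def __init__(self, thread_index, read_log):
        self.thread_index = thread_index
        self.read_log = read_log  # shared list recording simulated packets
        self.values = {}
        self.stop_id = 0

    def invalidate_if_needed(self, process_stop_id, force=False):
        # Drop cached values when the process has stopped again, or
        # unconditionally when the caller forces it.
        if force or process_stop_id != self.stop_id:
            self.values.clear()
            self.stop_id = process_stop_id

    def set_expedited(self, name, value):
        # Value supplied by the stop-reply packet; no extra packet needed.
        self.values[name] = value

    def read(self, name):
        if name not in self.values:
            # Simulates sending a read-register packet to the stub.
            self.read_log.append((self.thread_index, name))
            self.values[name] = 0x1000 + self.thread_index
        return self.values[name]


def count_pc_reads(num_threads, force):
    read_log = []
    caches = [RegisterCache(i, read_log) for i in range(num_threads)]
    stop_id = 1
    for i, cache in enumerate(caches):
        cache.invalidate_if_needed(stop_id)        # new stop: old values go
        cache.set_expedited("pc", 0x1000 + i)      # seeded from stop-reply
    for cache in caches:
        cache.invalidate_if_needed(stop_id, force=force)
        cache.read("pc")
    return len(read_log)


reads_without_force = count_pc_reads(4, force=False)
reads_with_force = count_pc_reads(4, force=True)
```

In this model the stop-ID check alone keeps the expedited values, while the forced call throws them away and every later `read("pc")` turns into a simulated packet.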

I added a test that runs only on Darwin systems, where debugserver always sends the thread IDs and PCs. The test turns on packet logging and runs against an inferior which has 4 threads; it steps over a dlopen() call, steps in to a user function with debug info, steps-over and steps-in across source lines with multiple function calls, and then examines the packet log and flags it as an error if lldb asked for the pc value of any thread at any point in the debug session.

For this program and the operations we're doing, with debugserver that provides thread IDs and PCs, we should never ask for the value of a pc register.
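
The log check can be sketched as a standalone Python helper. The register numbers (`p20` for arm64 and arm64_32, `p10` for x86_64) and the sample log line are taken from the comments in the test in this PR; the function name and the regular expression are illustrative assumptions, not part of the test itself.

```python
import re

# Hex register number of the pc in a 'p' packet, per the test's comments.
PC_REGNUM_HEX = {"arm64": "20", "arm64_32": "20", "x86_64": "10"}


def find_pc_reads(log_lines, arch):
    """Return the thread IDs for which a pc read-register packet was sent."""
    regnum = PC_REGNUM_HEX[arch]
    pattern = re.compile(
        r"send packet: \$p" + regnum + r";thread:([0-9a-f]+);"
    )
    hits = []
    for line in log_lines:
        match = pattern.search(line)
        if match:
            hits.append(int(match.group(1), 16))
    return hits


log = [
    "<  22> send packet: $p20;thread:91698e;#70",  # pc read on arm64
    "<  22> send packet: $p21;thread:91698e;#70",  # some other register
]
arm64_hits = find_pc_reads(log, "arm64")
x86_hits = find_pc_reads(log, "x86_64")
```

An empty result for the session's log corresponds to the condition the test asserts: no pc fetches for any thread.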

rdar://136247381

@llvmbot
Member

llvmbot commented Sep 21, 2024

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

Full diff: https://github.com/llvm/llvm-project/pull/109499.diff

5 Files Affected:

  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp (-1)
  • (added) lldb/test/API/macosx/expedited-thread-pcs/Makefile (+11)
  • (added) lldb/test/API/macosx/expedited-thread-pcs/TestExpeditedThreadPCs.py (+91)
  • (added) lldb/test/API/macosx/expedited-thread-pcs/foo.c (+1)
  • (added) lldb/test/API/macosx/expedited-thread-pcs/main.cpp (+62)
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
index d5dfe79fd8862a..9e8c6046179631 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
@@ -1600,7 +1600,6 @@ bool ProcessGDBRemote::CalculateThreadStopInfo(ThreadGDBRemote *thread) {
     // If we have "jstopinfo" then we have stop descriptions for all threads
     // that have stop reasons, and if there is no entry for a thread, then it
     // has no stop reason.
-    thread->GetRegisterContext()->InvalidateIfNeeded(true);
     if (!GetThreadStopInfoFromJSON(thread, m_jstopinfo_sp)) {
       // If a thread is stopped at a breakpoint site, set that as the stop
       // reason even if it hasn't executed the breakpoint instruction yet.
diff --git a/lldb/test/API/macosx/expedited-thread-pcs/Makefile b/lldb/test/API/macosx/expedited-thread-pcs/Makefile
new file mode 100644
index 00000000000000..7799f06e770970
--- /dev/null
+++ b/lldb/test/API/macosx/expedited-thread-pcs/Makefile
@@ -0,0 +1,11 @@
+CXX_SOURCES := main.cpp
+
+.PHONY: build-libfoo
+all: build-libfoo a.out
+
+include Makefile.rules
+
+build-libfoo: foo.c
+	$(MAKE) -f $(MAKEFILE_RULES) \
+		DYLIB_C_SOURCES=foo.c DYLIB_NAME=foo DYLIB_ONLY=YES
+
diff --git a/lldb/test/API/macosx/expedited-thread-pcs/TestExpeditedThreadPCs.py b/lldb/test/API/macosx/expedited-thread-pcs/TestExpeditedThreadPCs.py
new file mode 100644
index 00000000000000..0611907a34b0d6
--- /dev/null
+++ b/lldb/test/API/macosx/expedited-thread-pcs/TestExpeditedThreadPCs.py
@@ -0,0 +1,91 @@
+"""Test that the expedited thread pc values are not re-fetched by lldb."""
+
+import subprocess
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+file_index = 0
+
+
+class TestExpeditedThreadPCs(TestBase):
+    NO_DEBUG_INFO_TESTCASE = True
+
+    @skipUnlessDarwin
+    def test_expedited_thread_pcs(self):
+        TestBase.setUp(self)
+
+        global file_index
+        file_index += 1
+        logfile = os.path.join(
+            self.getBuildDir(),
+            "packet-log-" + self.getArchitecture() + "-" + str(file_index) + ".txt",
+        )
+        self.runCmd("log enable -f %s gdb-remote packets" % (logfile))
+
+        def cleanup():
+            self.runCmd("log disable gdb-remote packets")
+            if os.path.exists(logfile):
+                os.unlink(logfile)
+
+        self.addTearDownHook(cleanup)
+
+        self.source = "main.cpp"
+        self.build()
+        (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
+            self, "break here", lldb.SBFileSpec(self.source, False)
+        )
+
+        # verify that libfoo.dylib hasn't loaded yet
+        for m in target.modules:
+            self.assertNotEqual(m.GetFileSpec().GetFilename(), "libfoo.dylib")
+
+        thread.StepInto()
+        thread.StepInto()
+
+        thread.StepInto()
+        thread.StepInto()
+        thread.StepInto()
+
+        # verify that libfoo.dylib has loaded
+        for m in target.modules:
+            if m.GetFileSpec().GetFilename() == "libfoo.dylib":
+                found_libfoo = True
+        self.assertTrue(found_libfoo)
+
+        thread.StepInto()
+        thread.StepInto()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+
+        process.Kill()
+
+        # Confirm that we never fetched the pc for any threads during
+        # this debug session.
+        if os.path.exists(logfile):
+            f = open(logfile)
+            lines = f.readlines()
+            num_errors = 0
+            for line in lines:
+                arch = self.getArchitecture()
+                if arch == "arm64" or arch == "arm64_32":
+                    #   <reg name="pc" regnum="32" offset="256" bitsize="64" group="general" group_id="1" ehframe_regnum="32" dwarf_regnum="32" generic="pc"/>
+                    # A fetch of $pc on arm64 looks like
+                    #  <  22> send packet: $p20;thread:91698e;#70
+                    self.assertNotIn("$p20;thread", line)
+                else:
+                    #   <reg name="rip" regnum="16" offset="128" bitsize="64" group="general" altname="pc" group_id="1" ehframe_regnum="16" dwarf_regnum="16" generic="pc"/>
+                    # A fetch of $pc on x86_64 looks like
+                    #  <  22> send packet: $p10;thread:91889c;#6f
+                    self.assertNotIn("$p10;thread", line)
+
+            f.close()
diff --git a/lldb/test/API/macosx/expedited-thread-pcs/foo.c b/lldb/test/API/macosx/expedited-thread-pcs/foo.c
new file mode 100644
index 00000000000000..de1cbc4c4648a1
--- /dev/null
+++ b/lldb/test/API/macosx/expedited-thread-pcs/foo.c
@@ -0,0 +1 @@
+int foo() { return 5; }
diff --git a/lldb/test/API/macosx/expedited-thread-pcs/main.cpp b/lldb/test/API/macosx/expedited-thread-pcs/main.cpp
new file mode 100644
index 00000000000000..d77c6793afb6b2
--- /dev/null
+++ b/lldb/test/API/macosx/expedited-thread-pcs/main.cpp
@@ -0,0 +1,62 @@
+#include <dlfcn.h>
+#include <stdio.h>
+#include <thread>
+#include <unistd.h>
+
+void f1() {
+  while (1)
+    sleep(1);
+}
+void f2() {
+  while (1)
+    sleep(1);
+}
+void f3() {
+  while (1)
+    sleep(1);
+}
+
+int main() {
+  std::thread t1{f1};
+  std::thread t2{f2};
+  std::thread t3{f3};
+
+  puts("break here");
+
+  void *handle = dlopen("libfoo.dylib", RTLD_LAZY);
+  int (*foo_ptr)() = (int (*)())dlsym(handle, "foo");
+  int c = foo_ptr();
+
+  // clang-format off
+  // multiple function calls on a single source line so 'step'
+  // and 'next' need to do multiple steps of work.
+  puts("1"); puts("2"); puts("3"); puts("4"); puts("5");
+  puts("6"); puts("7"); puts("8"); puts("9"); puts("10");
+  puts("11"); puts("12"); puts("13"); puts("14"); puts("15");
+  puts("16"); puts("17"); puts("18"); puts("19"); puts("20");
+  puts("21"); puts("22"); puts("23"); puts("24"); puts("24");
+  // clang-format on
+  puts("one");
+  puts("two");
+  puts("three");
+  puts("four");
+  puts("five");
+  puts("six");
+  puts("seven");
+  puts("eight");
+  puts("nine");
+  puts("ten");
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  return c;
+}

@labath
Collaborator

labath commented Sep 23, 2024

What's the exact situation that triggered these extra packets? Could it be simulated from a gdb-remote client test (i.e., by mocking server responses)?

@jasonmolenda
Collaborator Author

What's the exact situation that triggered these extra packets. Could it be simulated from a gdb-remote client test (i.e., by mocking server responses)?

We've always done it when we step over our dynamic loader notification breakpoint (when new solibs are loaded and lldb is informed). I think it depends on the order in which we invoke these methods in ProcessGDBRemote; I didn't try to debug what shifted in our call ordering to make this become a hotter problem recently.

(that's why my test case dlopen's a solib, because that was a specific issue)

Collaborator

@jimingham jimingham left a comment

LGTM.

Given InvalidateIfNeeded (which by name seems harmless) can have these undesirable effects (presumably only when you pass true for force), it might be nice to warn developers about this by adding an instructive comment to InvalidateIfNeeded?

But the content seems fine here.

@jasonmolenda jasonmolenda merged commit 6e6d5ea into llvm:main Sep 23, 2024
9 checks passed
@jasonmolenda jasonmolenda deleted the do-not-fetch-pc-values-with-debugserver branch September 23, 2024 19:13
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Sep 23, 2024
…#109499)

(cherry picked from commit 6e6d5ea)
jasonmolenda added a commit to swiftlang/llvm-project that referenced this pull request Sep 23, 2024
…register-context-unnecessarily

[lldb] Don't invalid register context after setting thread pc's (llvm#109499)
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Sep 23, 2024
…#109499)

(cherry picked from commit 6e6d5ea)
jasonmolenda added a commit to swiftlang/llvm-project that referenced this pull request Sep 24, 2024
…te-register-context-unnecessarily

[lldb] Don't invalid register context after setting thread pc's (llvm#109499)