[lldb] Don't invalid register context after setting thread pc's #109499
Some gdb remote serial protocol stubs send the thread IDs and PCs for all threads in a process in the stop-reply packet. lldb often needs to know the pc values for all threads while at a private stop. Fetching them separately takes <n-1> read-register packets for <n> threads, which can be a big performance problem when this is a hot code path.

GDBRemoteRegisterContext tracks the StopID at which its values were set. When the thread's StopID has incremented, it marks all the values it holds as Invalid and knows to refetch them.

We have a code path that set the PCs for all the threads, and then `ProcessGDBRemote::CalculateThreadStopInfo` *forced* an invalidation of all the register contexts, so we had to re-read the pc values for all threads except the one that stopped.

There are times when it is valid to force an invalidation of the register cache - for instance, if the layout of the registers has changed because the processor state is different, or we've sent a write-all-registers packet to the inferior and we want to make sure we stay in sync with the inferior. But there was no reason for this method to be forcing the register context to be invalid.

I added a test that turns on packet logging when running on Darwin systems, where debugserver always sends the thread IDs and PCs. The test runs against an inferior which has 4 threads; it steps over a dlopen() call, steps into a user function with debug info, steps over and steps in across source lines with multiple function calls, and then examines the packet log and flags it as an error if lldb asked for the pc value of any thread at any point in the debug session.

For this program and the operations we're doing, with a debugserver that provides thread IDs and PCs, we should never ask for the value of a pc register.

rdar://136247381
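The effect of the forced invalidation can be illustrated with a toy model. This is only a sketch under stated assumptions: `RegisterCache`, `invalidate_if_needed`, and `packets_for_stop` are hypothetical names, not lldb's classes or API, and the model simplifies the real behavior by counting one read per thread whose cache was forced invalid (it does not reproduce the detail that the stopping thread is exempt).

```python
class RegisterCache:
    """Toy per-thread register cache tagged with the stop ID of its values."""

    def __init__(self):
        self.values = {}      # register name -> cached value
        self.stop_id = 0      # stop ID at which the values were cached
        self.reads_sent = 0   # simulated read-register packets

    def invalidate_if_needed(self, current_stop_id, force=False):
        # Drop cached values when the process has stopped again since they
        # were cached, or when the caller explicitly forces it.
        if force or current_stop_id != self.stop_id:
            self.values.clear()
            self.stop_id = current_stop_id

    def set_expedited(self, name, value, current_stop_id):
        # Store a value that arrived with the stop reply.
        self.invalidate_if_needed(current_stop_id)
        self.values[name] = value

    def read(self, name, current_stop_id):
        self.invalidate_if_needed(current_stop_id)
        if name not in self.values:
            self.reads_sent += 1     # stands in for a read-register packet
            self.values[name] = 0    # placeholder value from the "stub"
        return self.values[name]


def packets_for_stop(num_threads, force):
    """Count simulated register reads for one stop with expedited PCs."""
    caches = [RegisterCache() for _ in range(num_threads)]
    stop_id = 1
    for i, cache in enumerate(caches):
        cache.set_expedited("pc", 0x1000 + i, stop_id)
    for cache in caches:
        # Models the forced invalidation that this change removes.
        cache.invalidate_if_needed(stop_id, force=force)
        cache.read("pc", stop_id)
    return sum(cache.reads_sent for cache in caches)
```

In this model, `packets_for_stop(4, force=True)` counts a read for every thread, while `packets_for_stop(4, force=False)` counts none, because the expedited values are still tagged with the current stop ID.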
@llvm/pr-subscribers-lldb
Author: Jason Molenda (jasonmolenda)
Full diff: https://github.com/llvm/llvm-project/pull/109499.diff
5 Files Affected:
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
index d5dfe79fd8862a..9e8c6046179631 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
@@ -1600,7 +1600,6 @@ bool ProcessGDBRemote::CalculateThreadStopInfo(ThreadGDBRemote *thread) {
   // If we have "jstopinfo" then we have stop descriptions for all threads
   // that have stop reasons, and if there is no entry for a thread, then it
   // has no stop reason.
-  thread->GetRegisterContext()->InvalidateIfNeeded(true);
   if (!GetThreadStopInfoFromJSON(thread, m_jstopinfo_sp)) {
     // If a thread is stopped at a breakpoint site, set that as the stop
     // reason even if it hasn't executed the breakpoint instruction yet.
diff --git a/lldb/test/API/macosx/expedited-thread-pcs/Makefile b/lldb/test/API/macosx/expedited-thread-pcs/Makefile
new file mode 100644
index 00000000000000..7799f06e770970
--- /dev/null
+++ b/lldb/test/API/macosx/expedited-thread-pcs/Makefile
@@ -0,0 +1,11 @@
+CXX_SOURCES := main.cpp
+
+.PHONY: build-libfoo
+all: build-libfoo a.out
+
+include Makefile.rules
+
+build-libfoo: foo.c
+	$(MAKE) -f $(MAKEFILE_RULES) \
+		DYLIB_C_SOURCES=foo.c DYLIB_NAME=foo DYLIB_ONLY=YES
+
diff --git a/lldb/test/API/macosx/expedited-thread-pcs/TestExpeditedThreadPCs.py b/lldb/test/API/macosx/expedited-thread-pcs/TestExpeditedThreadPCs.py
new file mode 100644
index 00000000000000..0611907a34b0d6
--- /dev/null
+++ b/lldb/test/API/macosx/expedited-thread-pcs/TestExpeditedThreadPCs.py
@@ -0,0 +1,91 @@
+"""Test that the expedited thread pc values are not re-fetched by lldb."""
+
+import os
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+file_index = 0
+
+
+class TestExpeditedThreadPCs(TestBase):
+    NO_DEBUG_INFO_TESTCASE = True
+
+    @skipUnlessDarwin
+    def test_expedited_thread_pcs(self):
+        TestBase.setUp(self)
+
+        global file_index
+        file_index += 1
+        logfile = os.path.join(
+            self.getBuildDir(),
+            "packet-log-" + self.getArchitecture() + "-" + str(file_index) + ".txt",
+        )
+        self.runCmd("log enable -f %s gdb-remote packets" % (logfile))
+
+        def cleanup():
+            self.runCmd("log disable gdb-remote packets")
+            if os.path.exists(logfile):
+                os.unlink(logfile)
+
+        self.addTearDownHook(cleanup)
+
+        self.source = "main.cpp"
+        self.build()
+        (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
+            self, "break here", lldb.SBFileSpec(self.source, False)
+        )
+
+        # verify that libfoo.dylib hasn't loaded yet
+        for m in target.modules:
+            self.assertNotEqual(m.GetFileSpec().GetFilename(), "libfoo.dylib")
+
+        thread.StepInto()
+        thread.StepInto()
+
+        thread.StepInto()
+        thread.StepInto()
+        thread.StepInto()
+
+        # verify that libfoo.dylib has loaded
+        found_libfoo = False
+        for m in target.modules:
+            if m.GetFileSpec().GetFilename() == "libfoo.dylib":
+                found_libfoo = True
+        self.assertTrue(found_libfoo)
+
+        thread.StepInto()
+        thread.StepInto()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+        thread.StepOver()
+
+        process.Kill()
+
+        # Confirm that we never fetched the pc for any threads during
+        # this debug session.
+        if os.path.exists(logfile):
+            f = open(logfile)
+            lines = f.readlines()
+            for line in lines:
+                arch = self.getArchitecture()
+                if arch == "arm64" or arch == "arm64_32":
+                    #   <reg name="pc" regnum="32" offset="256" bitsize="64" group="general" group_id="1" ehframe_regnum="32" dwarf_regnum="32" generic="pc"/>
+                    # A fetch of $pc on arm64 looks like
+                    # <  22> send packet: $p20;thread:91698e;#70
+                    self.assertNotIn("$p20;thread", line)
+                else:
+                    #   <reg name="rip" regnum="16" offset="128" bitsize="64" group="general" altname="pc" group_id="1" ehframe_regnum="16" dwarf_regnum="16" generic="pc"/>
+                    # A fetch of $pc on x86_64 looks like
+                    # <  22> send packet: $p10;thread:91889c;#6f
+                    self.assertNotIn("$p10;thread", line)
+
+            f.close()
diff --git a/lldb/test/API/macosx/expedited-thread-pcs/foo.c b/lldb/test/API/macosx/expedited-thread-pcs/foo.c
new file mode 100644
index 00000000000000..de1cbc4c4648a1
--- /dev/null
+++ b/lldb/test/API/macosx/expedited-thread-pcs/foo.c
@@ -0,0 +1 @@
+int foo() { return 5; }
diff --git a/lldb/test/API/macosx/expedited-thread-pcs/main.cpp b/lldb/test/API/macosx/expedited-thread-pcs/main.cpp
new file mode 100644
index 00000000000000..d77c6793afb6b2
--- /dev/null
+++ b/lldb/test/API/macosx/expedited-thread-pcs/main.cpp
@@ -0,0 +1,62 @@
+#include <dlfcn.h>
+#include <stdio.h>
+#include <thread>
+#include <unistd.h>
+
+void f1() {
+  while (1)
+    sleep(1);
+}
+void f2() {
+  while (1)
+    sleep(1);
+}
+void f3() {
+  while (1)
+    sleep(1);
+}
+
+int main() {
+  std::thread t1{f1};
+  std::thread t2{f2};
+  std::thread t3{f3};
+
+  puts("break here");
+
+  void *handle = dlopen("libfoo.dylib", RTLD_LAZY);
+  int (*foo_ptr)() = (int (*)())dlsym(handle, "foo");
+  int c = foo_ptr();
+
+  // clang-format off
+  // multiple function calls on a single source line so 'step'
+  // and 'next' need to do multiple steps of work.
+  puts("1"); puts("2"); puts("3"); puts("4"); puts("5");
+  puts("6"); puts("7"); puts("8"); puts("9"); puts("10");
+  puts("11"); puts("12"); puts("13"); puts("14"); puts("15");
+  puts("16"); puts("17"); puts("18"); puts("19"); puts("20");
+  puts("21"); puts("22"); puts("23"); puts("24"); puts("24");
+  // clang-format on
+  puts("one");
+  puts("two");
+  puts("three");
+  puts("four");
+  puts("five");
+  puts("six");
+  puts("seven");
+  puts("eight");
+  puts("nine");
+  puts("ten");
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  c++;
+  return c;
+}
What's the exact situation that triggered these extra packets? Could it be simulated from a gdb-remote client test (i.e., by mocking server responses)?
We've always done it when we step over our dynamic loader notification breakpoint (when new solibs are loaded and lldb is informed). I think it depends on the order in which we invoke these methods in ProcessGDBRemote; I didn't try to debug what shifted in our call ordering to make this a hotter problem recently. (That's why my test case dlopen's a solib: that was the specific issue.)
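Whether the packets come from a live debugserver or from mocked responses, the check itself reduces to scanning the packet log for read-register packets that name the pc register. A minimal sketch of that check follows, assuming the log line shape quoted in the test's comments; `find_register_reads` is a hypothetical helper, not part of lldbsuite.

```python
import re


def find_register_reads(log_lines, regnum):
    """Return the log lines that send a 'p' packet for register `regnum`.

    The register number appears in hex in the packet, so regnum 32 (pc on
    arm64) is matched as "$p20" and regnum 16 (rip on x86_64) as "$p10".
    """
    pattern = re.compile(r"send packet: \$p%x;thread:" % regnum)
    return [line for line in log_lines if pattern.search(line)]


# The two sample lines quoted in the test's comments.
sample_log = [
    "<  22> send packet: $p20;thread:91698e;#70",
    "<  22> send packet: $p10;thread:91889c;#6f",
]
arm64_pc_reads = find_register_reads(sample_log, 32)
x86_64_pc_reads = find_register_reads(sample_log, 16)
```

A test built on this would assert that the returned list is empty for the pc register number of the architecture under test.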
LGTM.
Given that InvalidateIfNeeded (which by name seems harmless) can have these undesirable effects (presumably only when you pass true for force), it might be nice to warn developers about this by adding an instructive comment to InvalidateIfNeeded.
But the content seems fine here.