-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lldb] Add pc check for thread-step-by-bp algorithms #108504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb] Add pc check for thread-step-by-bp algorithms #108504
Conversation
lldb-server built with NativeProcessLinux.cpp and NativeProcessFreeBSD.cpp can use breakpoints to implement instruction stepping on cores where there is no native instruction-step primitive. Currently these set a breakpoint, continue, and if we hit the breakpoint with the original thread, set the stop reason to be "trace". I am wrapping up a change to lldb's breakpoint algorithm where I change its current behavior of "if a thread stops at a breakpoint site, we set the thread's stop reason to breakpoint-hit, even if the breakpoint hasn't been executed" + "when resuming any thread at a breakpoint site, instruction-step past the breakpoint before resuming" to a behavior of "when a thread executes a breakpoint, set the stop reason to breakpoint-hit" + "when a thread has hit a breakpoint, when the thread resumes, we silently step past the breakpoint and then resume the thread". For these lldb-server targets doing breakpoint stepping, this means that if we are sitting on a breakpoint that has not yet executed, and instruction-step the thread, we will execute the breakpoint instruction at $pc (instead of $next-pc where it meant to go), and stop again -- at the same pc value. Then we will rewrite the stop reason to 'trace'. The higher level logic will see that we haven't hit the breakpoint instruction again, so it will try to instruction step again, hitting the breakpoint again forever. To fix this, I'm checking that the thread matches the one we are instruction-stepping-by-breakpoint AND that we've stopped at the breakpoint address we are stepping to. Only in that case will the stop reason be rewritten to "trace" hiding the implementation detail that the step was done by breakpoints.
@llvm/pr-subscribers-lldb Author: Jason Molenda (jasonmolenda) Changeslldb-server built with NativeProcessLinux.cpp and NativeProcessFreeBSD.cpp can use breakpoints to implement instruction stepping on cores where there is no native instruction-step primitive. Currently these set a breakpoint, continue, and if we hit the breakpoint with the original thread, set the stop reason to be "trace". I am wrapping up a change to lldb's breakpoint algorithm where I change its current behavior of "if a thread stops at a breakpoint site, we set to a behavior of "when a thread executes a breakpoint, set the stop reason to breakpoint-hit" + For these lldb-server targets doing breakpoint stepping, this means that if we are sitting on a breakpoint that has not yet executed, and instruction-step the thread, we will execute the breakpoint instruction at $pc (instead of $next-pc where it meant to go), and stop again -- at the same pc value. Then we will rewrite the stop reason to 'trace'. The higher level logic will see that we haven't hit the breakpoint instruction again, so it will try to instruction step again, hitting the breakpoint again forever. To fix this, I'm checking that the thread matches the one we are instruction-stepping-by-breakpoint AND that we've stopped at the breakpoint address we are stepping to. Only in that case will the stop reason be rewritten to "trace" hiding the implementation detail that the step was done by breakpoints. Full diff: https://github.com/llvm/llvm-project/pull/108504.diff 2 Files Affected:
diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeProcessFreeBSD.cpp b/lldb/source/Plugins/Process/FreeBSD/NativeProcessFreeBSD.cpp
index 97fff4b9f65a82..80b27571f43d55 100644
--- a/lldb/source/Plugins/Process/FreeBSD/NativeProcessFreeBSD.cpp
+++ b/lldb/source/Plugins/Process/FreeBSD/NativeProcessFreeBSD.cpp
@@ -319,9 +319,12 @@ void NativeProcessFreeBSD::MonitorSIGTRAP(lldb::pid_t pid) {
info.pl_siginfo.si_addr);
if (thread) {
+ auto ®ctx = static_cast<NativeRegisterContextFreeBSD &>(
+ thread->GetRegisterContext());
auto thread_info =
m_threads_stepping_with_breakpoint.find(thread->GetID());
- if (thread_info != m_threads_stepping_with_breakpoint.end()) {
+ if (thread_info != m_threads_stepping_with_breakpoint.end() &&
+ threads_info->second == regctx.GetPC()) {
thread->SetStoppedByTrace();
Status brkpt_error = RemoveBreakpoint(thread_info->second);
if (brkpt_error.Fail())
diff --git a/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp b/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
index 5c262db8db7fde..38b7092682873b 100644
--- a/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
+++ b/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
@@ -829,8 +829,11 @@ void NativeProcessLinux::MonitorBreakpoint(NativeThreadLinux &thread) {
thread.SetStoppedByBreakpoint();
FixupBreakpointPCAsNeeded(thread);
- if (m_threads_stepping_with_breakpoint.find(thread.GetID()) !=
- m_threads_stepping_with_breakpoint.end())
+ NativeRegisterContextLinux ®_ctx = thread.GetRegisterContext();
+ auto stepping_with_bp_it =
+ m_threads_stepping_with_breakpoint.find(thread.GetID());
+ if (stepping_with_bp_it != m_threads_stepping_with_breakpoint.end() &&
+ stepping_with_bp_it->second == reg_ctx.GetPC())
thread.SetStoppedByTrace();
StopRunningThreads(thread.GetID());
|
This is addressing the discussion about the CI failures (and David's repoing of them) from when I tried to land #96260 . I had this as a commit for the PR where I'm addressing the CI fails so I can re-land ( #105594 ) but this one has no dependance on my change and I'd like to land it separately, like I already did with the Dexter fix to work with current lldb or my new lldb stepping behavior. These plugins are only used for targets without an instruction step capability and I don't have access to any of those so I haven't tested it -- but I think it's straightforward enough that landing it and seeing the CI bots react is fine. |
Landing this in a separate PR llvm#108504 This reverts commit 48a0fc1.
What does it look like for a thread to be stopped at a breakpoint site but not to have executed the breakpoint? Is this just a timing thing, if one thread executes the breakpoint and that causes all others to stop, and one of those others happens to be about to execute the same breakpoint?
|
It can happen through timing, but also in single-threaded code as well. A thread could reach-but-not execute a breakpoint insn due to some activity on the previous insn. E.g. it could be single-stepping over it (so the stop reason is "i've completed a step", not "I've hit a breakpoint"), or (on x86, where watchpoint hits are reported after the fact), the previous insn could have triggered a watchpoint. None of this really matters for this patch though. The problem here was that the lldb-server implementation has assuming that the stop reason of a software-single-stepping thread must be "i've completed a step" , but that won't be the case if the instruction you are stepping over is actually a breakpoint insn -- in that case, the reason will be "i've hit a breakpoint (other than the temporary bkpt i've set to single-step)". It just so happens that without that Jason's patch, lldb would never actually try to step over a breakpoint insn it knew of, so we'd never hit this problem. |
Thanks I understand now. |
Yeah, the problem with our current algorithm is that you lose the actual stop reason, and sometimes that's important. For instance, if you have a store instruction that modifies a watchpointed memory, and you have a user breakpoint on the next instruction, on AArch64, the store triggers the watchpoint. lldb reads the old value, instruction steps, reads the new value and then tells the user. But now it's sitting on a breakpoint site so that overwrites the watchpoint stop reason and we never tell the user about that - we tell it we've hit the breakpoint. gdb behaves the same way. The user-visible down side to this new approach was seen with Dexter debuginfo engine, where it sets a breakpoint on every source line in a program, and then does step-in across the source lines. In this case, the step-in executes to the first instruction of the next source line (where there is a breakpoint). So we have a "step-in completed" stop reason. Then Dexter step-in's again, and now it executes the breakpoint instruction and stops, without having advanced the pc, with a "breakpoint hit" stop reason. then doing a step-in will advance to the next source line. Users will surely experience this (with only one breakpoint in code they're stepping over), and I'm sure they will scratch their heads about why they needed to Step twice to get past a breakpoint (if they notice at all), but that's the only bad fallout from this change to the algorithm. |
lldb-server built with NativeProcessLinux.cpp and NativeProcessFreeBSD.cpp can use breakpoints to implement instruction stepping on cores where there is no native instruction-step primitive. Currently these set a breakpoint, continue, and if we hit the breakpoint with the original thread, set the stop reason to be "trace". I am wrapping up a change to lldb's breakpoint algorithm where I change its current behavior of "if a thread stops at a breakpoint site, we set the thread's stop reason to breakpoint-hit, even if the breakpoint hasn't been executed" + "when resuming any thread at a breakpoint site, instruction-step past the breakpoint before resuming" to a behavior of "when a thread executes a breakpoint, set the stop reason to breakpoint-hit" + "when a thread has hit a breakpoint, when the thread resumes, we silently step past the breakpoint and then resume the thread". For these lldb-server targets doing breakpoint stepping, this means that if we are sitting on a breakpoint that has not yet executed, and instruction-step the thread, we will execute the breakpoint instruction at $pc (instead of $next-pc where it meant to go), and stop again -- at the same pc value. Then we will rewrite the stop reason to 'trace'. The higher level logic will see that we haven't hit the breakpoint instruction again, so it will try to instruction step again, hitting the breakpoint again forever. To fix this, I'm checking that the thread matches the one we are instruction-stepping-by-breakpoint AND that we've stopped at the breakpoint address we are stepping to. Only in that case will the stop reason be rewritten to "trace" hiding the implementation detail that the step was done by breakpoints. (cherry picked from commit 213c59d)
…lvm#109643) Apparently a typo is causing compile error, added by llvm#108504. (cherry picked from commit 85220a0)
lldb-server built with NativeProcessLinux.cpp and NativeProcessFreeBSD.cpp can use breakpoints to implement instruction stepping on cores where there is no native instruction-step primitive. Currently these set a breakpoint, continue, and if we hit the breakpoint with the original thread, set the stop reason to be "trace". I am wrapping up a change to lldb's breakpoint algorithm where I change its current behavior of "if a thread stops at a breakpoint site, we set the thread's stop reason to breakpoint-hit, even if the breakpoint hasn't been executed" + "when resuming any thread at a breakpoint site, instruction-step past the breakpoint before resuming" to a behavior of "when a thread executes a breakpoint, set the stop reason to breakpoint-hit" + "when a thread has hit a breakpoint, when the thread resumes, we silently step past the breakpoint and then resume the thread". For these lldb-server targets doing breakpoint stepping, this means that if we are sitting on a breakpoint that has not yet executed, and instruction-step the thread, we will execute the breakpoint instruction at $pc (instead of $next-pc where it meant to go), and stop again -- at the same pc value. Then we will rewrite the stop reason to 'trace'. The higher level logic will see that we haven't hit the breakpoint instruction again, so it will try to instruction step again, hitting the breakpoint again forever. To fix this, I'm checking that the thread matches the one we are instruction-stepping-by-breakpoint AND that we've stopped at the breakpoint address we are stepping to. Only in that case will the stop reason be rewritten to "trace" hiding the implementation detail that the step was done by breakpoints. (cherry picked from commit 213c59d)
…lvm#109643) Apparently a typo is causing compile error, added by llvm#108504. (cherry picked from commit 85220a0)
lldb-server built with NativeProcessLinux.cpp and NativeProcessFreeBSD.cpp can use breakpoints to implement instruction stepping on cores where there is no native instruction-step primitive. Currently these set a breakpoint, continue, and if we hit the breakpoint with the original thread, set the stop reason to be "trace". I am wrapping up a change to lldb's breakpoint algorithm where I change its current behavior of "if a thread stops at a breakpoint site, we set the thread's stop reason to breakpoint-hit, even if the breakpoint hasn't been executed" + "when resuming any thread at a breakpoint site, instruction-step past the breakpoint before resuming" to a behavior of "when a thread executes a breakpoint, set the stop reason to breakpoint-hit" + "when a thread has hit a breakpoint, when the thread resumes, we silently step past the breakpoint and then resume the thread". For these lldb-server targets doing breakpoint stepping, this means that if we are sitting on a breakpoint that has not yet executed, and instruction-step the thread, we will execute the breakpoint instruction at $pc (instead of $next-pc where it meant to go), and stop again -- at the same pc value. Then we will rewrite the stop reason to 'trace'. The higher level logic will see that we haven't hit the breakpoint instruction again, so it will try to instruction step again, hitting the breakpoint again forever. To fix this, I'm checking that the thread matches the one we are instruction-stepping-by-breakpoint AND that we've stopped at the breakpoint address we are stepping to. Only in that case will the stop reason be rewritten to "trace" hiding the implementation detail that the step was done by breakpoints. (cherry picked from commit 213c59d)
…lvm#109643) Apparently a typo is causing compile error, added by llvm#108504. (cherry picked from commit 85220a0)
lldb-server built with NativeProcessLinux.cpp and NativeProcessFreeBSD.cpp can use breakpoints to implement instruction stepping on cores where there is no native instruction-step primitive. Currently these set a breakpoint, continue, and if we hit the breakpoint with the original thread, set the stop reason to be "trace".
I am wrapping up a change to lldb's breakpoint algorithm where I change its current behavior of
"if a thread stops at a breakpoint site, we set
the thread's stop reason to breakpoint-hit, even if the breakpoint hasn't been executed" +
"when resuming any thread at a breakpoint site, instruction-step past the breakpoint before resuming"
to a behavior of
"when a thread executes a breakpoint, set the stop reason to breakpoint-hit" +
"when a thread has hit a breakpoint, when the thread resumes, we silently step past the breakpoint and then resume the thread".
For these lldb-server targets doing breakpoint stepping, this means that if we are sitting on a breakpoint that has not yet executed, and instruction-step the thread, we will execute the breakpoint instruction at $pc (instead of $next-pc where it meant to go), and stop again -- at the same pc value. Then we will rewrite the stop reason to 'trace'. The higher level logic will see that we haven't hit the breakpoint instruction again, so it will try to instruction step again, hitting the breakpoint again forever.
To fix this, I'm checking that the thread matches the one we are instruction-stepping-by-breakpoint AND that we've stopped at the breakpoint address we are stepping to. Only in that case will the stop reason be rewritten to "trace" hiding the implementation detail that the step was done by breakpoints.