[lldb][debugserver] Interrupt should reset outstanding SIGSTOP (#132128) #10309
This fixes an uncommon bug in debugserver when it controls an inferior process that is hitting an internal breakpoint and continuing while multiple interrupts are sent to lldb through the SB API. The result is that lldb never stops the inferior process, ignoring the interrupts/stops sent by the driver layer (Xcode, in this case).
In the reproducing setup (which required a machine with unique timing characteristics), lldb is sent SBProcess::Stop and then, shortly after, SBProcess::SendAsyncInterrupt. The driver process only sees that the inferior is publicly running at this point, even though it is hitting an internal breakpoint (a new dylib being loaded), disabling the breakpoint, instruction stepping, re-enabling the breakpoint, then continuing.
The packet sequence lldb sends to debugserver looks like
When debugserver needs to interrupt a running process (`MachProcess::Interrupt`), the main thread in debugserver sends a SIGSTOP posix signal to the inferior process, and notes that it has sent this signal by setting `m_sent_interrupt_signo`.
When we send the first async interrupt while instruction stepping, the signal is sent (probably after the inferior has already stopped) but lldb can only receive the mach exception that includes the SIGSTOP when the process is running. So at the point of step (3), we have a SIGSTOP outstanding in the kernel, and `m_sent_interrupt_signo` is set to SIGSTOP.
When we resume the inferior (`c` in step 4), debugserver sees that `m_sent_interrupt_signo` is still set for an outstanding SIGSTOP, but at this point we've already stopped so it's an unnecessary stop. It records that (1) we've got a SIGSTOP still coming that debugserver sent and (2) we should ignore it by also setting `m_auto_resume_signo` to the same signal value.
Once we've resumed the process, the mach exception thread (`MachTask::ExceptionThread`) receives the outstanding mach exception, adds it to a queue to be processed (`MachProcess::ExceptionMessageReceived`) and when we've collected all outstanding mach exceptions, it calls `MachProcess::ExceptionMessageBundleComplete` to evaluate them.
`MachProcess::ExceptionMessageBundleComplete` halts the process (without updating the MachProcess `m_state`) while evaluating them. It sees that this incoming SIGSTOP was meant to be ignored (`m_auto_resume_signo` is set), so it `MachProcess::PrivateResume`'s the process again.
At the same time that `MachTask::ExceptionThread` is receiving and processing the mach exception, `MachProcess::Interrupt` is called with another interrupt that debugserver received. This method checks that we're still `eStateRunning` (we are) but then sees that we have an outstanding SIGSTOP already (`m_sent_interrupt_signo`) and does nothing, assuming that we will stop shortly from that one. It then returns to call `RNBRemote::HandlePacket_last_signal` to print the status -- but because the process is still `eStateRunning`, this does nothing.
So the first ^c (resulting in a pending SIGSTOP) is received and we resume the process silently. And the second ^c is ignored because we've got one interrupt already being processed.
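The sequence above can be replayed with a small toy model. This is only a sketch, not debugserver's code: `ToyProcess`, its members, `kStopSignal` and `ReplayScenario` are hypothetical names, and the kernel's queued SIGSTOP is reduced to a single boolean. The two integer fields stand in for `m_sent_interrupt_signo` and `m_auto_resume_signo`.

```cpp
// Toy model of the flag handling described above. All names here are
// illustrative stand-ins, not debugserver's real declarations.
#include <cassert>

constexpr int kStopSignal = 1; // placeholder for SIGSTOP's signal number

enum class State { Stopped, Running };

struct ToyProcess {
  State state = State::Stopped;
  int sent_interrupt_signo = 0; // models m_sent_interrupt_signo
  int auto_resume_signo = 0;    // models m_auto_resume_signo
  bool signal_pending = false;  // a SIGSTOP queued but not yet delivered

  // Models the interrupt path before the fix.
  // Returns true only if a new signal was sent.
  bool Interrupt() {
    if (state != State::Running)
      return false;
    if (sent_interrupt_signo != 0)
      return false; // a SIGSTOP is already outstanding, so do nothing
    sent_interrupt_signo = kStopSignal;
    signal_pending = true;
    return true;
  }

  // Models the resume path: an outstanding SIGSTOP from before this
  // resume is marked as one to ignore.
  void Resume() {
    if (sent_interrupt_signo != 0)
      auto_resume_signo = sent_interrupt_signo;
    state = State::Running;
  }

  // Models the exception handler seeing the delivered SIGSTOP.
  // Returns true if the stop is reported, false if it is auto-resumed.
  bool DeliverPendingSignal() {
    assert(signal_pending);
    signal_pending = false;
    const bool ignore = (auto_resume_signo == kStopSignal);
    sent_interrupt_signo = 0;
    auto_resume_signo = 0;
    if (ignore)
      return false; // process keeps running
    state = State::Stopped;
    return true;
  }
};

// Replays the sequence from the description and returns the final state.
State ReplayScenario() {
  ToyProcess p;
  p.state = State::Running;  // instruction stepping over the breakpoint
  p.Interrupt();             // first interrupt: SIGSTOP sent
  p.state = State::Stopped;  // the step completes before SIGSTOP arrives
  p.Resume();                // continue: stale SIGSTOP marked to be ignored
  p.Interrupt();             // second interrupt: no-op, one is outstanding
  p.DeliverPendingSignal();  // SIGSTOP arrives and is silently resumed
  return p.state;
}
```

In this model `ReplayScenario()` ends in `State::Running`: both interrupts have been consumed and nothing stopped the process.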
The fix was very simple. In `MachProcess::Interrupt`, when we detect that we have a SIGSTOP out in the wild (`m_sent_interrupt_signo`), we need to clear `m_auto_resume_signo`, which is used to indicate that this SIGSTOP is meant to be ignored because it was from before our most recent resume.
`MachProcess::Interrupt` holds the `m_exception_and_signal_mutex` mutex already (after Jonas's commit last week), and all of `MachProcess::ExceptionMessageBundleComplete` holds that same mutex, so we know we can modify `m_auto_resume_signo` here and it will be handled correctly when the outstanding mach exception is finally processed.
rdar://145872120
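As a rough illustration of the fix and of why the shared mutex makes it safe, here is a minimal sketch. It is not the actual patch: `ToySignalState`, `kStopSignal`, `HandleStopSignal` and `SecondInterruptStopsProcess` are hypothetical names, sending the signal is reduced to setting a field, and a `std::mutex` stands in for `m_exception_and_signal_mutex`.

```cpp
// Sketch of the fixed interrupt path under a shared mutex. Names are
// illustrative assumptions, not debugserver's real declarations.
#include <cassert>
#include <mutex>

constexpr int kStopSignal = 1; // placeholder for SIGSTOP's signal number

struct ToySignalState {
  std::mutex exception_and_signal_mutex; // models m_exception_and_signal_mutex
  bool running = true;
  int sent_interrupt_signo = 0; // models m_sent_interrupt_signo
  int auto_resume_signo = 0;    // models m_auto_resume_signo

  // Interrupt path with the fix applied.
  void Interrupt() {
    std::lock_guard<std::mutex> guard(exception_and_signal_mutex);
    if (!running)
      return;
    if (sent_interrupt_signo == 0) {
      // The real code would send the signal to the inferior here.
      sent_interrupt_signo = kStopSignal;
    } else {
      // The fix: a SIGSTOP is already outstanding, but this interrupt is
      // newer than the last resume, so the stop must no longer be ignored.
      auto_resume_signo = 0;
    }
  }

  // Exception-handling path for the delivered SIGSTOP, under the same mutex.
  // Returns true if the stop is reported, false if it is auto-resumed.
  bool HandleStopSignal() {
    std::lock_guard<std::mutex> guard(exception_and_signal_mutex);
    assert(sent_interrupt_signo == kStopSignal);
    const bool ignore = (auto_resume_signo == kStopSignal);
    sent_interrupt_signo = 0;
    auto_resume_signo = 0;
    running = ignore;
    return !ignore;
  }
};

// Starts from the state right after the resume: one SIGSTOP outstanding
// and marked to be ignored. A second interrupt then arrives.
bool SecondInterruptStopsProcess() {
  ToySignalState s;
  s.sent_interrupt_signo = kStopSignal;
  s.auto_resume_signo = kStopSignal;
  s.Interrupt(); // clears the ignore marker
  return s.HandleStopSignal();
}
```

Because both paths take the same lock, the handler sees the marker either before or after the interrupt clears it, never a partial update. In this sketch, if the handler runs first the stale stop is still auto-resumed and the later interrupt then sends a fresh signal.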
(cherry picked from commit e60e064)