
[lldb] Make SBProcess thread related actions listen to StopLocker #134339


Merged: 3 commits, Apr 15, 2025

Conversation

@kusmour (Contributor) commented Apr 4, 2025

Summary

This PR updates SBProcess::GetNumThreads() and SBProcess::GetThreadAtIndex() to listen to the stop locker. SBProcess::GetNumThreads() will return 0 if the process is running.

Problem Description

Recently, while debugging a program with thousands of threads in VS Code, lldb-dap would hang at a threads request sent right after receiving the configurationDone response. Soon after, it would end the debug session with the following error:

Process <pid> exited with status = -1 (0xffffffff) lost connection

This is because LLDB is still in the middle of resuming all the threads, and requesting threads ends up interrupting the process on Linux. From the gdb-remote log, it ended up getting lldb::StateType::eStateInvalid and simply exited with status -1.

I don't think it's reasonable to allow getting threads from a running process. There are a few approaches to fix this:

  1. Send the stopped event to the IDE after configurationDone. This aligns with the CLI behavior.
  2. However, the above approach would break the existing user-facing behavior. The alternative is to reject the threads request if the process is not stopped.
  3. Improve the run lock. This is a synchronization issue where the process was in the middle of resuming while lldb-dap attempted to interrupt it.

This PR implements option 3.

HOWEVER

This fixes the "lost connection" issue above, but a new issue has surfaced. From testing, and also from checking the VS Code source code, the client expects a threadID to perform a pause. So after attaching, with no threads reported to the client, the user will not be able to pause the attached process. setBreakpoint will still work, and once we stop at the breakpoint (or any stop that reports threads), the client can perform a pause again.

NEXT

  1. Make an attempt to return the initial thread list so that VS Code can pause (second commit in the PR)
  2. Investigate why the threads request triggers unwinding the second frame of a thread, which leads to sending the interrupt
  3. Decide whether we want to support stopOnEntry for attaching, given:
    i. This is not part of the official specification
    ii. If we enable stopOnEntry, we need to fix attaching on Linux to send only one stopped event. Currently, all threads upon attaching will have stop reason SIGSTOP, and lldb-dap will send a stopped event for each one of them. Every stopped event will trigger a client request for threads.
    iii. Alternatively, we can support auto-continue, corresponding to (lldb) process attach --continue. This requires (ii) above.

Additionally

lldb-dap will not send a continued event after configurationDone because it checks dap.focus_tid == LLDB_INVALID_THREAD_ID (so that we don't send it for the launch request). Note that dap.focus_tid only gets assigned when handling a stop or stepping.

According to DAP

Please note: a debug adapter is not expected to send this event in response to a request that implies that execution continues, e.g. launch or continue.
It is only necessary to send a continued event if there was no previous request that implied this.

So I don't think we are violating DAP if we don't send a continued event, but I'd like to get some feedback on this.

Test Plan

Used the following program for testing: https://gist.github.com/kusmour/1729d2e07b7b1063897db77de194e47d
NOTE: Use stdin to get the pid and attach AFTER hitting enter. The attach should happen while all the threads are running.

DAP messages before the change

DAP message after the change - report zero threads after attaching

@llvmbot (Member) commented Apr 4, 2025

@llvm/pr-subscribers-lldb

Author: Wanyi (kusmour)

Changes

Recently upon debugging a program with thousands of threads in VS Code, lldb-dap would hang at a threads request sent right after receiving the configurationDone response. Soon after it will end the debug session with the following error

Process <pid> exited with status = -1 (0xffffffff) lost connection

This is because LLDB is still in the middle of resuming all the threads, and requesting threads requires stopping the process. From the gdb-remote log, it ended up getting lldb::StateType::eStateInvalid and simply exited with status -1.

I don't think it's reasonable to allow getting threads from a running process. The alternative would be to reject the threads request if the process is not stopped.


Full diff: https://github.com/llvm/llvm-project/pull/134339.diff

2 Files Affected:

  • (added) lldb/test/API/tools/lldb-dap/attach/multithreading.cpp (+102)
  • (modified) lldb/tools/lldb-dap/Handler/AttachRequestHandler.cpp (+5-3)
diff --git a/lldb/test/API/tools/lldb-dap/attach/multithreading.cpp b/lldb/test/API/tools/lldb-dap/attach/multithreading.cpp
new file mode 100644
index 0000000000000..0e9c340916ece
--- /dev/null
+++ b/lldb/test/API/tools/lldb-dap/attach/multithreading.cpp
@@ -0,0 +1,102 @@
+#include <condition_variable>
+#include <functional>
+#include <iostream>
+#include <mutex>
+#include <queue>
+#include <thread>
+
+int plus_one(int n) {
+  std::cout << "In plus_one" << std::endl;
+  return n + 1;
+}
+
+class ThreadPool {
+ public:
+  ThreadPool(size_t num_threads = std::thread::hardware_concurrency()) {
+    for (uint32_t i = 0; i < num_threads; ++i) {
+      threads.emplace_back(std::thread(&ThreadPool::ThreadLoop, this));
+    }
+  }
+  ~ThreadPool() {
+    {
+      std::unique_lock<std::mutex> lock(queue_mutex);
+      terminated = true;
+    }
+    condition.notify_all();
+    for (std::thread& thread : threads) {
+      thread.join();
+    }
+  }
+  void enqueue(const std::function<void()>& job) {
+    {
+      std::unique_lock<std::mutex> lock(queue_mutex);
+      if (terminated)
+        throw std::runtime_error("enqueue on stopped ThreadPool");
+      jobs.push(job);
+    }
+    condition.notify_one();
+  }
+
+ private:
+  void ThreadLoop() {
+    while (true) {
+      std::function<void()> job;
+      {
+        std::unique_lock<std::mutex> lock(queue_mutex);
+        condition.wait(lock, [this] { return !jobs.empty() || terminated; });
+        if (terminated && jobs.empty())
+          return;
+        job = jobs.front();
+        jobs.pop();
+      }
+      job();
+    }
+  }
+
+  bool terminated = false;
+  std::mutex queue_mutex;
+  std::condition_variable condition;
+  std::vector<std::thread> threads;
+  std::queue<std::function<void()>> jobs;
+};
+
+void thread_func(int job_index) {
+  std::cout << __FUNCTION__ << " (job index = " << job_index
+            << ") running on thread " << std::this_thread::get_id() << "\n";
+  std::cout << "Calling function plus_one(int)\nResult = "
+            << plus_one(job_index) << "\n";
+}
+
+void thread_sleep(int job_index, int sleep_sec) {
+  std::cout << __FUNCTION__ << " (job index = " << job_index
+            << ") starting on thread " << std::this_thread::get_id() << "\n";
+  std::this_thread::sleep_for(std::chrono::seconds(sleep_sec));
+  std::cout << __FUNCTION__ << " (job index = " << job_index
+            << ") finished on thread " << std::this_thread::get_id() << "\n";
+}
+
+int main() {
+  ThreadPool tp1(2000);
+  ThreadPool tp2;
+  std::cout << "main() running on thread " << std::this_thread::get_id()
+            << "\n";
+  std::cout
+      << "Program is expecting stdin. Please attach debugger and hit enter to continue\n";
+  std::cin.get();
+  // At least one of the threads in tp1 will be scheduled with a task twice or
+  // more if num_jobs is larger than the number of threads in tp1.
+  int num_jobs = 3000;
+  for (int i = 0; i < num_jobs; ++i) {
+    tp1.enqueue(std::bind(thread_func, i));
+  }
+  // We may or may not hit the breakpoint in thread_sleep here, as the thread
+  // pool might finish executing before the debugger releases the bp for tp1.
+  // To make sure we hit the bp, we can increase the sleep time in the call.
+  for (int i = 0; i < num_jobs; ++i) {
+    tp2.enqueue([i] { thread_sleep(i, 1); });
+  }
+  for (int i = 0; i < num_jobs; ++i) {
+    tp1.enqueue(std::bind(thread_sleep, i, 1));
+  }
+  return 0;
+}
diff --git a/lldb/tools/lldb-dap/Handler/AttachRequestHandler.cpp b/lldb/tools/lldb-dap/Handler/AttachRequestHandler.cpp
index 5e622f3d3dcd4..aa7f3c0d57f9d 100644
--- a/lldb/tools/lldb-dap/Handler/AttachRequestHandler.cpp
+++ b/lldb/tools/lldb-dap/Handler/AttachRequestHandler.cpp
@@ -73,9 +73,11 @@ void AttachRequestHandler::operator()(const llvm::json::Object &request) const {
   llvm::StringRef core_file = GetString(arguments, "coreFile").value_or("");
   const uint64_t timeout_seconds =
       GetInteger<uint64_t>(arguments, "timeout").value_or(30);
-  dap.stop_at_entry = core_file.empty()
-                          ? GetBoolean(arguments, "stopOnEntry").value_or(false)
-                          : true;
+  // Clients like VS Code send a threads request right after receiving the
+  // configurationDone response, when the process might be resuming.
+  // Getting the thread list from a running process is not supported by LLDB.
+  // Always stop the process after attaching.
+  dap.stop_at_entry = true;
   dap.configuration.postRunCommands = GetStrings(arguments, "postRunCommands");
   const llvm::StringRef debuggerRoot =
       GetString(arguments, "debuggerRoot").value_or("");

@kusmour kusmour requested a review from clayborg April 4, 2025 02:53

github-actions bot commented Apr 4, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@kusmour kusmour force-pushed the lldb-dap-attaching-to-multithreads branch from ba22857 to 18e4af1 Compare April 4, 2025 05:04
@kusmour kusmour requested a review from bulbazord April 4, 2025 05:43
Comment on lines 76 to 80
// Clients like VS Code send a threads request right after receiving the
// configurationDone response, when the process might be resuming.
// Getting the thread list from a running process is not supported by LLDB.
// Always stop the process after attaching.
dap.stop_at_entry = true;
Contributor

Does this mean that the 'stop on entry' setting of VSCode will now have no effect on LLDB when debugging? The user will have no choice but to stop now?

Contributor Author

Yes, for attaching, users will always see the stop. If you read the launch.json config for attaching, it actually does not mention this flag at all. So by default it was attaching without sending a stopped event to the IDE. However, it still accepts this flag if it's present in the launch.json config.

@jeffreytan81 jeffreytan81 self-requested a review April 4, 2025 17:03
@JDevlieghere (Member)

Reading the description, it seems like the third option would be the right one:

3. Improve the run lock. This is a synchronize issue where process was in the middle of resuming while lldb-dap attempts to interrupt it.

and this patch is working around that while breaking existing behavior in the process. What would it take to go with (3)?

@jeffreytan81 (Contributor)

lost connection usually indicates lldb-server crashing. Is that what's happening in this situation?
If so, we should harden lldb-server to not crash.

@kusmour (Contributor Author) commented Apr 4, 2025

lost connection usually indicates lldb-server crashing. Is that what's happening in this situation? If so, we should harden lldb-server to not crash.

I have debugged lldb-server and it didn't crash. Instead, it finished and exited with status 0. From my investigation, this is because the gdb-remote packet vCont;c was sent to resume the process; then, when asking for threads, I saw it trigger \x03 (ctrl-c). At that point the process became unresponsive, ProcessGDBRemote got eStateInvalid, which leads to a "lost connection", and then it just quit.

@kusmour (Contributor Author) commented Apr 4, 2025

Reading the description, it seems like the third option would be the right one:

3. Improve the run lock. This is a synchronize issue where process was in the middle of resuming while lldb-dap attempts to interrupt it.

and this patch is working around that while and breaking existing behavior in the process. What would it take to go with (3)?

I made a fix attempt but failed. I am exploring related code around GDBRemoteClientBase::Lock right now. Before moving forward with this, I also want to mention that we should probably not accept the threads request when the process is not stopped (option 2). Even if this is fixed, resuming all threads, then interrupting again to get a thread list that is very likely to change, might not be a good idea.

@JDevlieghere (Member) commented Apr 4, 2025

In the PR description you said "I don't think it's reasonable to allow getting threads from a running process." and I agree with that. However, that's not what lldb-dap does right now. The request handler calls SBProcess::GetNumThreads and SBProcess::GetThreadAtIndex without checking that the process is stopped, or trying to interrupt the process. Under the hood these two methods use the StopLocker to check if the process is stopped and only update the thread list if it is.

The current implementation is incorrect and unsafe in multiple ways:

  1. If we get a Thread request while the process is running, we're going to return stale data from the last stop. The protocol doesn't actually specify that this request returns the current threads (as opposed to the last threads the adapter knows about), so maybe this part is fine.
  2. If we get a Threads request while the process is running, and it stops (e.g. crashes) while we're handling the request, we're going to reply with partially stale data. The threads we iterated over before the stop will be stale and the ones after the stop will be current. And if the number of threads changed, we'll get that wrong too.
  3. The same problem exists if the process were to resume, but I don't think that's possible, except maybe due to a stop hook or expression evaluation.

I created a draft PR to stop the process before we get the threads, but that's not sufficient. We need proper locking, like we added for #131242, but this time to protect the process control. That probably requires exposing the StopLocker through the SB API and having a method that halts the process and returns the StopLocker, which lldb-dap then can hold onto until it's done.

@jimingham (Collaborator)

Note, the current stop locker is a pthread_rwlock_t. These locks according to the docs I could find are recursive for reading, but not recursive for writing. We need them to be recursive for reading because you want to do:

locker = process.GetStopLocker()
process.GetThreadAtIndex()

and the latter will also need to acquire the stop locker's reader end.

That should be okay since we would only expose the reader side, and we would always set the state in the locker on some lldb private thread before running any code. But we might want to make sure that you can't Continue the process while still holding the stop locker to keep from having the reader still in force when the writer needs to change the value.

@clayborg (Collaborator) commented Apr 7, 2025

So the issue, from only the lldb-dap side, is that VS Code is sending a threads request, but we have resumed the process upon attach after responding to configurationDone. So the race condition happens like:

  • send the configurationDone response
  • send continue to the process if stop on entry is false
  • start of race condition...
    • Our event thread will at some point consume the eStateRunning event and cause our public API to not allow stop-locker requests to succeed, but any API calls that come through before we consume this event will succeed.
    • At the same time, we get a "threads" request and start calling SBProcess::GetNumThreads() and SBProcess::GetThreadAtIndex(...), which might return valid results if the run locker is still allowing stopped-process calls. So we may start getting the number of threads and some threads by index, but any of these calls can fail as soon as we consume the eStateRunning process event in lldb-dap.

So from a DAP only perspective, we need to sync with our lldb-dap event thread. Current code looks like:

void ConfigurationDoneRequestHandler::operator()(
    const llvm::json::Object &request) const {
  llvm::json::Object response;
  FillResponse(request, response);
  dap.SendJSON(llvm::json::Value(std::move(response)));
  dap.configuration_done_sent = true;
  if (dap.stop_at_entry)
    SendThreadStoppedEvent(dap);
  else
    dap.target.GetProcess().Continue();
}

But probably should look like:

void ConfigurationDoneRequestHandler::operator()(
    const llvm::json::Object &request) const {
  llvm::json::Object response;
  FillResponse(request, response);
  dap.SendJSON(llvm::json::Value(std::move(response)));
  dap.configuration_done_sent = true;
  if (dap.stop_at_entry)
    SendThreadStoppedEvent(dap);
  else {
    dap.target.GetProcess().Continue();
    dap.WaitForProcessResumedEvent(); /// New sync point here!
  }
}

The call to dap.WaitForProcessResumedEvent() will make sure we have received the eStateRunning process event before continuing. Then the "threads" call can fail as the process stop locker won't succeed for the SBProcess::GetNumThreads() and SBProcess::GetThreadAtIndex(....) calls. Once we get past this race, we are good, but again, only from the DAP perspective.

The fact that GetNumThreads() is able to interrupt (send a CTRL+C) to the lldb-server is another bug that needs fixing. I thought that only the breakpoint setting code would interrupt a process and send GDB remote packets while running. Why are we allowing the threads request to interrupt?

@kusmour (Contributor Author) commented Apr 7, 2025

Why are we allowing the threads request to interrupt?

If it's helpful, here's a stack trace from debugging lldb-dap. We got to frame #7 in GDBRemoteClientBase::Lock::SyncWithContinueThread, where it sends the interrupt because it's trying to read a register (which, I believe, is for getting the frame pc).

* thread #1, name = 'lldb-dap', stop reason = signal SIGSTOP
  * frame #0: 0x00007f7e8be972c1 libc.so.6`__GI___futex_abstimed_wait_cancelable64 [inlined] __futex_abstimed_wait_common64(private=<unavailable>, cancel=true, abstime=0x0000000000000000, op=393, expected=0, futex_word=0x000056064dcc00e0) at futex-internal.c:57:12
    frame #1: 0x00007f7e8be97297 libc.so.6`__GI___futex_abstimed_wait_cancelable64 [inlined] __futex_abstimed_wait_common(cancel=true, private=<unavailable>, abstime=0x0000000000000000, clockid=<unavailable>, expected=0, futex_word=0x000056064dcc00e0) at futex-internal.c:87:9
    frame #2: 0x00007f7e8be97297 libc.so.6`__GI___futex_abstimed_wait_cancelable64(futex_word=0x000056064dcc00e0, expected=0, clockid=<unavailable>, abstime=0x0000000000000000, private=<unavailable>) at futex-internal.c:139:10
    frame #3: 0x00007f7e8be99bb1 libc.so.6`___pthread_cond_wait [inlined] __pthread_cond_wait_common(abstime=0x0000000000000000, clockid=0, mutex=<unavailable>, cond=0x000056064dcc00b8) at pthread_cond_wait.c:504:10
    frame #4: 0x00007f7e8be99ad0 libc.so.6`___pthread_cond_wait(cond=0x000056064dcc00b8, mutex=<unavailable>) at pthread_cond_wait.c:619:10
    frame #5: 0x00007f7e8c2d80f0 libstdc++.so.6`std::condition_variable::wait(std::unique_lock<std::mutex>&) [inlined] __gthread_cond_wait(__mutex=<unavailable>, __cond=<unavailable>) at gthr-default.h:865:38
    frame #6: 0x00007f7e8c2d80eb libstdc++.so.6`std::condition_variable::wait(std::unique_lock<std::mutex>&) [inlined] std::__condvar::wait(__m=<unavailable>, this=<unavailable>) at std_mutex.h:155:23
    frame #7: 0x00007f7e914f0c5e liblldb.so.15`void std::condition_variable::wait<lldb_private::process_gdb_remote::GDBRemoteClientBase::Lock::SyncWithContinueThread()::$_0>(this=0x000056064dcc00b8, __lock=0x00007ffccc86c918, __p=(unnamed class) @ 0x00007ffccc86c880) at condition_variable:103:4
    frame #8: 0x00007f7e914f0b4a liblldb.so.15`lldb_private::process_gdb_remote::GDBRemoteClientBase::Lock::SyncWithContinueThread(this=0x00007ffccc86c9f0) at GDBRemoteClientBase.cpp:390:17
    frame #9: 0x00007f7e914f08d0 liblldb.so.15`lldb_private::process_gdb_remote::GDBRemoteClientBase::Lock::Lock(this=0x00007ffccc86c9f0, comm=0x000056064dcbff28, interrupt_timeout=(__r = 20)) at GDBRemoteClientBase.cpp:360:3
    frame #10: 0x00007f7e914efefb liblldb.so.15`lldb_private::process_gdb_remote::GDBRemoteClientBase::SendPacketAndWaitForResponse(this=0x000056064dcbff28, payload=(Data = "x7f37eadff400,200", Length = 17), response=0x00007ffccc86cbc8, interrupt_timeout=(__r = 20)) at GDBRemoteClientBase.cpp:184:8
    frame #11: 0x00007f7e9152ec6a liblldb.so.15`lldb_private::process_gdb_remote::ProcessGDBRemote::DoReadMemory(this=0x000056064dcbf2a0, addr=139878140474368, buf=0x000056064ddabbd0, size=512, error=0x00007ffccc86d100) at ProcessGDBRemote.cpp:2673:18
    frame #12: 0x00007f7e90eb7895 liblldb.so.15`lldb_private::Process::ReadMemoryFromInferior(this=0x000056064dcbf2a0, addr=139878140474368, buf=0x000056064ddabbd0, size=512, error=0x00007ffccc86d100) at Process.cpp:2234:9
    frame #13: 0x00007f7e90ee8fe9 liblldb.so.15`lldb_private::MemoryCache::GetL2CacheLine(this=0x000056064dcbfb58, line_base_addr=139878140474368, error=0x00007ffccc86d100) at Memory.cpp:138:41
    frame #14: 0x00007f7e90ee94a4 liblldb.so.15`lldb_private::MemoryCache::Read(this=0x000056064dcbfb58, addr=139878140474376, dst=0x00007ffccc86cfe8, dst_len=8, error=0x00007ffccc86d100) at Memory.cpp:209:35
    frame #15: 0x00007f7e90eb770d liblldb.so.15`lldb_private::Process::ReadMemory(this=0x000056064dcbf2a0, addr=139878140474376, buf=0x00007ffccc86cfe8, size=8, error=0x00007ffccc86d100) at Process.cpp:2018:27
    frame #16: 0x00007f7e90f03748 liblldb.so.15`lldb_private::RegisterContext::ReadRegisterValueFromMemory(this=0x000056064db19a80, reg_info=0x00007f7e540f1610, src_addr=139878140474376, src_len=8, reg_value=0x00007ffccc86d288) at RegisterContext.cpp:342:21
    frame #17: 0x00007f7e90ff4ff4 liblldb.so.15`lldb_private::RegisterContextUnwind::ReadRegisterValueFromRegisterLocation(this=0x000056064db19a80, regloc=ConcreteRegisterLocation @ 0x00007ffccc86d200, reg_info=0x00007f7e540f1610, value=0x00007ffccc86d288) at RegisterContextUnwind.cpp:1146:18
    frame #18: 0x00007f7e90ff4459 liblldb.so.15`lldb_private::RegisterContextUnwind::ReadGPRValue(this=0x000056064db19a80, register_kind=eRegisterKindGeneric, regnum=0, value=0x00007ffccc86d920) at RegisterContextUnwind.cpp:2200:7
    frame #19: 0x00007f7e90feeffb liblldb.so.15`lldb_private::RegisterContextUnwind::InitializeNonZerothFrame(this=0x000056064db19a80) at RegisterContextUnwind.cpp:349:8
    frame #20: 0x00007f7e90fede9c liblldb.so.15`lldb_private::RegisterContextUnwind::RegisterContextUnwind(this=0x000056064db19a80, thread=0x00007f7e4cc63700, next_frame=ptr = 0x56064e297ac0, sym_ctx=0x000056064db199b0, frame_number=1, unwind_lldb=0x000056064e293f70) at RegisterContextUnwind.cpp:75:5
    frame #21: 0x00007f7e90fead8d liblldb.so.15`lldb_private::UnwindLLDB::GetOneMoreFrame(this=0x000056064e293f70, abi=0x00007f7e4c7e4d80) at UnwindLLDB.cpp:129:40
    frame #22: 0x00007f7e90fea566 liblldb.so.15`lldb_private::UnwindLLDB::AddOneMoreFrame(this=0x000056064e293f70, abi=0x00007f7e4c7e4d80) at UnwindLLDB.cpp:337:17
    frame #23: 0x00007f7e90feab3e liblldb.so.15`lldb_private::UnwindLLDB::UpdateUnwindPlanForFirstFrameIfInvalid(this=0x000056064e293f70, abi=0x00007f7e4c7e4d80) at UnwindLLDB.cpp:311:3
    frame #24: 0x00007f7e90fea3b1 liblldb.so.15`lldb_private::UnwindLLDB::AddFirstFrame(this=0x000056064e293f70) at UnwindLLDB.cpp:100:3
    frame #25: 0x00007f7e90fec0ef liblldb.so.15`lldb_private::UnwindLLDB::DoGetFrameInfoAtIndex(this=0x000056064e293f70, idx=0, cfa=0x00007ffccc86e368, pc=0x00007ffccc86e370, behaves_like_zeroth_frame=0x00007ffccc86e367) at UnwindLLDB.cpp:400:10
    frame #26: 0x00007f7e90f1bee2 liblldb.so.15`lldb_private::Unwind::GetFrameInfoAtIndex(this=0x000056064e293f70, frame_idx=0, cfa=0x00007ffccc86e368, pc=0x00007ffccc86e370, behaves_like_zeroth_frame=0x00007ffccc86e367) at Unwind.h:53:12
    frame #27: 0x00007f7e90f1b42f liblldb.so.15`lldb_private::StackFrameList::FetchFramesUpTo(this=0x000056064e297cd0, end_idx=0, allow_interrupt=AllowInterruption) at StackFrameList.cpp:434:41
    frame #28: 0x00007f7e90f1b0b5 liblldb.so.15`lldb_private::StackFrameList::GetFramesUpTo(this=0x000056064e297cd0, end_idx=0, allow_interrupt=AllowInterruption) at StackFrameList.cpp:367:21
    frame #29: 0x00007f7e90f1c698 liblldb.so.15`lldb_private::StackFrameList::GetFrameAtIndex(this=0x000056064e297cd0, idx=0) at StackFrameList.cpp:632:7
    frame #30: 0x00007f7e90f992a7 liblldb.so.15`lldb_private::Thread::GetStackFrameAtIndex(this=0x00007f7e4cc63700, idx=0) at Thread.h:432:33
    frame #31: 0x00007f7e90f960d8 liblldb.so.15`lldb_private::Thread::DumpUsingFormat(this=0x00007f7e4cc63700, strm=0x000056064db224a0, frame_idx=0, format=0x000056064d5c9e00) at Thread.cpp:1656:16
    frame #32: 0x00007f7e909b7f15 liblldb.so.15`lldb::SBThread::GetDescriptionWithFormat(this=0x00007ffccc86ebb0, format=0x00007ffccc86f6f0, output=0x00007ffccc86e9b8) at SBThread.cpp:1264:33
    frame #33: 0x000056064d0f2c27 lldb-dap`lldb_dap::CreateThread(thread=0x00007ffccc86ebb0, format=0x00007ffccc86f6f0) at JSONUtils.cpp:877:24
    frame #34: 0x000056064d140ece lldb-dap`lldb_dap::ThreadsRequestHandler::operator()(this=0x000056064d3ad5e0, request=0x00007ffccc86ec48) const at ThreadsRequestHandler.cpp:61:26
    frame #35: 0x000056064d0ad21f lldb-dap`lldb_dap::LegacyRequestHandler::operator()(this=0x000056064d3ad5e0, request=0x00007ffccc86efb0) const at RequestHandler.h:88:5
    frame #36: 0x000056064d0cd576 lldb-dap`lldb_dap::DAP::HandleObject(this=0x00007ffccc86f280, M= Active Type = lldb_dap::protocol::Request ) at DAP.cpp:685:7
    frame #37: 0x000056064d0ce6f3 lldb-dap`lldb_dap::DAP::Loop(this=0x00007ffccc86f280) at DAP.cpp:811:10
    frame #38: 0x000056064d09e491 lldb-dap`main(argc=3, argv=0x00007ffccc870138) at lldb-dap.cpp:615:22
    frame #39: 0x00007f7e8be2c657 libc.so.6`__libc_start_call_main(main=(lldb-dap`main at lldb-dap.cpp:402), argc=3, argv=0x00007ffccc870138) at libc_start_call_main.h:58:16
    frame #40: 0x00007f7e8be2c718 libc.so.6`__libc_start_main_impl(main=(lldb-dap`main at lldb-dap.cpp:402), argc=3, argv=0x00007ffccc870138, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffccc870128) at libc-start.c:409:3
    frame #41: 0x000056064d09ced1 lldb-dap`_start at start.S:116

@clayborg (Collaborator) commented Apr 8, 2025

So after looking further into this, the issue is that SBProcess::GetNumThreads() and SBProcess::GetThreadAtIndex() ignore the process stop locker and return stale data. If the run lock can't be acquired, they don't allow the thread list to update, but they still return the previous stop's thread list. Most APIs return nothing if we are running; I'm not sure why we would ever want stale data, but that is what we are returning. So we need to avoid calling any such APIs by making lldb-dap manually track, in the DAP global structure, whether the process is running, check lldb-dap's notion of whether we are running, and if so, return nothing for threads.

@jimingham (Collaborator)

That code has been there at least since the dread reformatting.

I have no idea why past one of us thought it was a good idea to return the previous number of threads or the old thread list when you weren't stopped. We don't do that with most anything else you ask about threads, and I can't think of any way that would be useful.

Whatever we do with DAP, we should stop doing this in SBThread...

@clayborg (Collaborator) commented Apr 8, 2025

That code has been there at least since the dread reformatting.

I have no idea why past one of us thought it was a good idea to return the previous number of threads or the old thread list when you weren't stopped. We don't do that with most anything else you ask about threads, and I can't think of any way that would be useful.

Whatever we do with DAP, we should stop doing this in SBThread...

I am totally fine updating SBProcess::GetNumThreads() and SBProcess::GetThreadAtIndex() to fully listen to the stop locker and return 0 and empty SBThread objects.

So the fix now should be: we modify SBProcess::GetNumThreads() and SBProcess::GetThreadAtIndex() to listen to the stop locker. Old code:

SBThread SBProcess::GetThreadAtIndex(size_t index) {
  LLDB_INSTRUMENT_VA(this, index);

  SBThread sb_thread;
  ThreadSP thread_sp;
  ProcessSP process_sp(GetSP());
  if (process_sp) {
    Process::StopLocker stop_locker;
    const bool can_update = stop_locker.TryLock(&process_sp->GetRunLock());
    std::lock_guard<std::recursive_mutex> guard(
        process_sp->GetTarget().GetAPIMutex());
    thread_sp = process_sp->GetThreadList().GetThreadAtIndex(index, can_update);
    sb_thread.SetThread(thread_sp);
  }

  return sb_thread;
}

New code should be:

SBThread SBProcess::GetThreadAtIndex(size_t index) {
  LLDB_INSTRUMENT_VA(this, index);

  SBThread sb_thread;
  ThreadSP thread_sp;
  ProcessSP process_sp(GetSP());
  if (process_sp) {
    Process::StopLocker stop_locker;
    if (stop_locker.TryLock(&process_sp->GetRunLock())) {
      std::lock_guard<std::recursive_mutex> guard(
          process_sp->GetTarget().GetAPIMutex());
      thread_sp = process_sp->GetThreadList().GetThreadAtIndex(index, false);
      sb_thread.SetThread(thread_sp);
    }
  }
  return sb_thread;
}

Then we also need to sync in :

Old code:

void ConfigurationDoneRequestHandler::operator()(
    const llvm::json::Object &request) const {
  llvm::json::Object response;
  FillResponse(request, response);
  dap.SendJSON(llvm::json::Value(std::move(response)));
  dap.configuration_done_sent = true;
  if (dap.stop_at_entry)
    SendThreadStoppedEvent(dap);
  else
    dap.target.GetProcess().Continue();
}

New code:

void ConfigurationDoneRequestHandler::operator()(
    const llvm::json::Object &request) const {
  llvm::json::Object response;
  FillResponse(request, response);
  dap.SendJSON(llvm::json::Value(std::move(response)));
  dap.configuration_done_sent = true;
  if (dap.stop_at_entry)
    SendThreadStoppedEvent(dap);
  else
    dap.SynchronousContinue();
}

Where dap.SynchronousContinue() will call dap.target.GetProcess().Continue() and then make sure we consume the eStateRunning event, ensuring the process is running and that the run locker will work as expected.

Does that sound ok to everyone?

@jimingham (Collaborator)

That sounds right to me.

@jimingham (Collaborator)

I wonder if we should add a test about the things we DON'T hand out when a process is running just so we claim the behavior concretely.

@kusmour (Contributor Author) commented Apr 8, 2025

I wonder if we should add a test about the things we DON'T hand out when a process is running just so we claim the behavior concretely.

Unit tests on the SB API, or DAP API tests? I am happy to add tests as follow-ups.

@jimingham (Collaborator)

> I wonder if we should add a test about the things we DON'T hand out when a process is running just so we claim the behavior concretely.
>
> Unit tests on the SB API or DAP API tests? I am happy to add tests as follow-ups

I meant SB API tests, to assert that we don't hand out this stale data when the process is running.

@JDevlieghere
Member

JDevlieghere commented Apr 9, 2025

@clayborg's suggestion sounds good to me 👍 If we make GetNumThreads() return 0 if the process is running, then lldb-dap will automatically do the right thing and report success = false for the threads request.

@kusmour kusmour force-pushed the lldb-dap-attaching-to-multithreads branch from 18e4af1 to 9c6a379 Compare April 9, 2025 02:38
@kusmour kusmour changed the title [RFC][lldb-dap] Always stop on enrty for attaching [lldb] Make SBProcess thread related actions listen to StopLocker Apr 9, 2025
@kusmour kusmour force-pushed the lldb-dap-attaching-to-multithreads branch 2 times, most recently from 811b06d to 9618197 Compare April 10, 2025 01:45

github-actions bot commented Apr 10, 2025

✅ With the latest revision this PR passed the Python code formatter.

@kusmour kusmour force-pushed the lldb-dap-attaching-to-multithreads branch from 9618197 to 0db4c3f Compare April 10, 2025 05:28
// Client requests the baseline of currently existing threads after
// a successful launch or attach.
// A 'threads' request will be sent after the configurationDone response.
// Obtain the list of threads before we resume the process.
Collaborator

Maybe expand on this a bit saying something like:

// If a 'threads' request returns no threads, then the interrupt 
// button won't be available in the GUI debugger controls, so
// we need to report valid threads back to the IDE. We also need
// a valid thread for the "continued" packet to be sent.

Contributor Author

The pause button relies on having some thread information, and VSCode always sends the threadID in the pause request. I have also checked the VSCode source code around fetching threads: if no threads are returned in the threads response, VSCode won't update the known thread list.

Which means we really just need to return a baseline of threads for the debug session to carry on.

Contributor Author

As for the continued event, VSCode will ignore the threadID if allThreadsContinued: true, so it looks like we can send an invalid threadID when resuming all threads.
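
For reference, this is the shape of a DAP `continued` event: with `allThreadsContinued` set to `true` the client disregards `threadId`, so the value below is purely illustrative:

```json
{
  "type": "event",
  "event": "continued",
  "body": {
    "threadId": 0,
    "allThreadsContinued": true
  }
}
```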

Collaborator

Yeah, we just need to change our code to always send it, even if dap.focus_tid is not valid.

Contributor Author

According to DAP

> Please note: a debug adapter is not expected to send this event in response to a request that implies that execution continues, e.g. launch or continue.
> It is only necessary to send a continued event if there was no previous request that implied this.

In this case, if attach implies that execution continues, then we should treat it the same way as launch. I am happy to discuss this further, as well as whether we should support stopOnEntry for attaching going forward :)

@kusmour kusmour force-pushed the lldb-dap-attaching-to-multithreads branch 2 times, most recently from 44af671 to 931b3fb Compare April 11, 2025 03:06
@kusmour kusmour force-pushed the lldb-dap-attaching-to-multithreads branch from 6e3ec6c to 38bfc5b Compare April 11, 2025 20:19
@kusmour kusmour requested a review from clayborg April 11, 2025 21:03
@kusmour kusmour force-pushed the lldb-dap-attaching-to-multithreads branch from 38bfc5b to 37340cb Compare April 15, 2025 01:51
@kusmour kusmour merged commit 7a41761 into llvm:main Apr 15, 2025
10 checks passed
@kusmour kusmour deleted the lldb-dap-attaching-to-multithreads branch April 15, 2025 17:46
@llvm-ci
Collaborator

llvm-ci commented Apr 15, 2025

LLVM Buildbot has detected a new failure on builder lldb-aarch64-ubuntu running on linaro-lldb-aarch64-ubuntu while building lldb at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/16066

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
UNSUPPORTED: lldb-api :: tools/lldb-server/TestGdbRemoteFork.py (1199 of 2123)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteAttach.py (1200 of 2123)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteCompletion.py (1201 of 2123)
UNSUPPORTED: lldb-api :: tools/lldb-server/TestGdbRemoteForkNonStop.py (1202 of 2123)
UNSUPPORTED: lldb-api :: tools/lldb-server/TestGdbRemoteForkResume.py (1203 of 2123)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteExitCode.py (1204 of 2123)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteHostInfo.py (1205 of 2123)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteModuleInfo.py (1206 of 2123)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteAuxvSupport.py (1207 of 2123)
UNRESOLVED: lldb-api :: tools/lldb-dap/variables/TestDAP_variables.py (1208 of 2123)
******************** TEST 'lldb-api :: tools/lldb-dap/variables/TestDAP_variables.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --arch aarch64 --build-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/variables -p TestDAP_variables.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 7a41761407c485d18b7d48232b308556b3b43934)
  clang revision 7a41761407c485d18b7d48232b308556b3b43934
  llvm revision 7a41761407c485d18b7d48232b308556b3b43934
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_darwin_dwarf_missing_obj (TestDAP_variables.TestDAP_variables) (requires one of macosx, darwin, ios, tvos, watchos, bridgeos, iphonesimulator, watchsimulator, appletvsimulator) 
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_darwin_dwarf_missing_obj_with_symbol_ondemand_enabled (TestDAP_variables.TestDAP_variables) (requires one of macosx, darwin, ios, tvos, watchos, bridgeos, iphonesimulator, watchsimulator, appletvsimulator) 
========= DEBUG ADAPTER PROTOCOL LOGS =========
1744739844.540583611 --> (stdin/stdout) {"command":"initialize","type":"request","arguments":{"adapterID":"lldb-native","clientID":"vscode","columnsStartAt1":true,"linesStartAt1":true,"locale":"en-us","pathFormat":"path","supportsRunInTerminalRequest":true,"supportsVariablePaging":true,"supportsVariableType":true,"supportsStartDebuggingRequest":true,"supportsProgressReporting":true,"$__lldb_sourceInitFile":false},"seq":1}
1744739844.542667866 <-- (stdin/stdout) {"body":{"$__lldb_version":"lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 7a41761407c485d18b7d48232b308556b3b43934)\n  clang revision 7a41761407c485d18b7d48232b308556b3b43934\n  llvm revision 7a41761407c485d18b7d48232b308556b3b43934","completionTriggerCharacters":["."," ","\t"],"exceptionBreakpointFilters":[{"default":false,"filter":"cpp_catch","label":"C++ Catch"},{"default":false,"filter":"cpp_throw","label":"C++ Throw"},{"default":false,"filter":"objc_catch","label":"Objective-C Catch"},{"default":false,"filter":"objc_throw","label":"Objective-C Throw"}],"supportTerminateDebuggee":true,"supportsBreakpointLocationsRequest":true,"supportsCancelRequest":true,"supportsCompletionsRequest":true,"supportsConditionalBreakpoints":true,"supportsConfigurationDoneRequest":true,"supportsDataBreakpoints":true,"supportsDelayedStackTraceLoading":true,"supportsDisassembleRequest":true,"supportsEvaluateForHovers":true,"supportsExceptionInfoRequest":true,"supportsExceptionOptions":true,"supportsFunctionBreakpoints":true,"supportsHitConditionalBreakpoints":true,"supportsInstructionBreakpoints":true,"supportsLogPoints":true,"supportsModulesRequest":true,"supportsReadMemoryRequest":true,"supportsRestartRequest":true,"supportsSetVariable":true,"supportsStepInTargetsRequest":true,"supportsSteppingGranularity":true,"supportsValueFormattingOptions":true},"command":"initialize","request_seq":1,"seq":0,"success":true,"type":"response"}
1744739844.542915344 --> (stdin/stdout) {"command":"launch","type":"request","arguments":{"program":"/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/variables/TestDAP_variables.test_indexedVariables/a.out","initCommands":["settings clear -all","settings set symbols.enable-external-lookup false","settings set target.inherit-tcc true","settings set target.disable-aslr false","settings set target.detach-on-error false","settings set target.auto-apply-fixits false","settings set plugin.process.gdb-remote.packet-timeout 60","settings set symbols.clang-modules-cache-path \"/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api\"","settings set use-color false","settings set show-statusline false"],"disableASLR":false,"enableAutoVariableSummaries":false,"enableSyntheticChildDebugging":false,"displayExtendedBacktrace":false,"commandEscapePrefix":null},"seq":2}
1744739844.543168545 <-- (stdin/stdout) {"body":{"category":"console","output":"Running initCommands:\n"},"event":"output","seq":0,"type":"event"}
1744739844.543248892 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings clear -all\n"},"event":"output","seq":0,"type":"event"}
1744739844.543294907 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set symbols.enable-external-lookup false\n"},"event":"output","seq":0,"type":"event"}
1744739844.543327093 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.inherit-tcc true\n"},"event":"output","seq":0,"type":"event"}
1744739844.543357134 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.disable-aslr false\n"},"event":"output","seq":0,"type":"event"}
1744739844.543386459 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.detach-on-error false\n"},"event":"output","seq":0,"type":"event"}
1744739844.543414116 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.auto-apply-fixits false\n"},"event":"output","seq":0,"type":"event"}
1744739844.543444395 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set plugin.process.gdb-remote.packet-timeout 60\n"},"event":"output","seq":0,"type":"event"}
1744739844.543486834 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set symbols.clang-modules-cache-path \"/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api\"\n"},"event":"output","seq":0,"type":"event"}
1744739844.543516636 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set use-color false\n"},"event":"output","seq":0,"type":"event"}
1744739844.543544531 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set show-statusline false\n"},"event":"output","seq":0,"type":"event"}
1744739844.621682405 <-- (stdin/stdout) {"command":"launch","request_seq":2,"seq":0,"success":true,"type":"response"}
1744739844.621733427 <-- (stdin/stdout) {"body":{"isLocalProcess":true,"name":"/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/variables/TestDAP_variables.test_indexedVariables/a.out","startMethod":"launch","systemProcessId":1787121},"event":"process","seq":0,"type":"event"}
1744739844.621743202 <-- (stdin/stdout) {"event":"initialized","seq":0,"type":"event"}
1744739844.622123957 --> (stdin/stdout) {"command":"setBreakpoints","type":"request","arguments":{"source":{"name":"main.cpp","path":"main.cpp"},"sourceModified":false,"lines":[40],"breakpoints":[{"line":40}]},"seq":3}
1744739844.623628378 <-- (stdin/stdout) {"body":{"breakpoints":[{"column":1,"id":1,"instructionReference":"0xAAAAC43E0C54","line":41,"source":{"name":"main.cpp","path":"main.cpp"},"verified":true}]},"command":"setBreakpoints","request_seq":3,"seq":0,"success":true,"type":"response"}

var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…vm#134339)

# Summary

This PR updates `SBProcess::GetNumThreads()` and
`SBProcess::GetThreadAtIndex()` to listen to the stop locker.
`SBProcess::GetNumThreads()` will return 0 if the process is running.

## Problem Description

Recently upon debugging a program with thousands of threads in VS Code,
lldb-dap would hang at a `threads` request sent right after receiving
the `configurationDone` response. Soon after it will end the debug
session with the following error
```
Process <pid> exited with status = -1 (0xffffffff) lost connection
```

This is because LLDB is still in the middle of resuming all the threads.
And requesting threads will end up interrupt the process on Linux. From
the gdb-remote log it ended up getting `lldb::StateType::eStateInvalid`
and just exit with status -1.

I don't think it's reasonable to allow getting threads from a running
process. There are a few approaches to fix this:
1) Send the stopped event to IDE after `configurationDone`. This aligns
with the CLI behavior.
2) However, the above approach will break the existing user-facing
behavior. The alternative would be to reject the `threads` request if the
process is not stopped.
3) Improve the run lock. This is a synchronization issue where the process
was in the middle of resuming while lldb-dap attempted to interrupt it.

**This PR implements the option 3**

## HOWEVER

This fixed the "lost connection" issue below, but a new issue has surfaced.
From testing, and also from checking the [VSCode source
code](https://github.com/microsoft/vscode/blob/174af221c9ea2ccdb64abe4aab8e1a805e77beae/src/vs/workbench/contrib/debug/browser/debugSession.ts#L791),
it expects having a threadID to perform `pause`. So after attaching,
without any threads reported to the client, the user will not be able to
pause the attached process. `setBreakpoint` will still work, and once we
make a stop at the bp (or any stop that reports threads), the client can
perform `pause` again.

## NEXT
1) Made an attempt to return initial thread list so that VSCode can
pause (second commit in the PR)
2) Investigate why threads will trigger unwinding the second frame of a
thread, which leads to sending the interrupt
3) Decide whether we want to support `stopOnEntry` for attaching, given
  i. This is not an official specification
ii. If we enable stopOnEntry, we need to fix attaching on Linux to send
only one stopped event. Currently, all threads upon attaching will have
stop reason `SIGSTOP`, and lldb-dap will send a `stopped` event for each
one of them. Every `stopped` event will trigger a client request for
threads.
iii. Alternatively, we can support auto-continue corresponding to `(lldb)
process attach --continue`. This requires ii above.


### Additionally

lldb-dap will not send a `continued` event after `configurationDone`
because it checks `dap.focus_tid == LLDB_INVALID_THREAD_ID` (so that we
don't send it for the `launch` request). Note that `dap.focus_tid` only
gets assigned when handling a stop or a step.

According to DAP

> Please note: a debug adapter is not expected to send this event in
response to a request that implies that execution continues, e.g. launch
or continue.
It is only necessary to send a continued event if there was no previous
request that implied this.

So I guess we are not violating DAP if we don't send a `continued` event,
but I'd like to get some feedback on this.


## Test Plan
Used following program for testing:
https://gist.github.com/kusmour/1729d2e07b7b1063897db77de194e47d
**NOTE: Utilize stdin to get pid and attach AFTER hitting enter. Attach
should happen when all the threads start running.**

DAP messages before the change
<img width="1165" alt="image"
src="https://github.com/user-attachments/assets/a9ad85fb-81ce-419c-95e5-612639905c66"
/>

DAP message after the change - report zero threads after attaching
<img width="1165" alt="image"
src="https://github.com/user-attachments/assets/a1179e18-6844-437a-938c-0383702294cd"
/>

---------

Co-authored-by: Jonas Devlieghere <[email protected]>