Initial step in targets DAP support #86623

jeffreytan81 · 2024-03-26T03:58:58Z

This patch provides the initial implementation for the "Step Into Specific/Step In Targets" feature in VSCode DAP.

The implementation disassembles all the call instructions in step range and try to resolve operand name (assuming one operand) using debug info. Later, the call target function name is chosen by end user and specified in the StepInto() API call.

It is v1 because of using the existing step in target function name API. This implementation has several limitations:

Won't for indirect/virtual function call -- in most cases, our disassembler won't be able to solve the indirect call target address/name.
Won't work for target function without debug info -- if the target function has symbol but not debug info, the existing ThreadPlanStepInRange won't stop.
Relying on function names can be fragile -- if there is some middle glue/thunk code, our disassembler can only resolve the glue/thunk code's name not the real target function name. It can be fragile to depend compiler/linker emits the same names for both.
Does not support step into raw address call sites -- it is a valid scenario that in Visual Studio debugger, user can explicitly choose a raw address to step into which land in the function without debug info/symbol, then choose UI to load the debug info on-demand for that module/frame to continue exploring.

A more reliable design could be extending the ThreadPlanStepInRange to support step in based on call-site instruction offset/PC which I will propose in next iteration.

github-actions · 2024-03-26T04:02:09Z

✅ With the latest revision this PR passed the C/C++ code formatter.

github-actions · 2024-03-26T04:02:09Z

✅ With the latest revision this PR passed the Python code formatter.

llvmbot · 2024-03-26T16:20:50Z

@llvm/pr-subscribers-lldb

Author: None (jeffreytan81)

Changes

This patch provides the initial implementation for the "Step Into Specific/Step In Targets" feature in VSCode DAP.

The implementation disassembles all the call instructions in step range and try to resolve operand name (assuming one operand) using debug info. Later, the call target function name is chosen by end user and specified in the StepInto() API call.

It is v1 because of using the existing step in target function name API. This implementation has several limitations:

Won't for indirect/virtual function call -- in most cases, our disassembler won't be able to solve the indirect call target address/name.
Won't work for target function without debug info -- if the target function has symbol but not debug info, the existing ThreadPlanStepInRange won't stop.
Relying on function names can be fragile -- if there is some middle glue/thunk code, our disassembler can only resolve the glue/thunk code's name not the real target function name. It can be fragile to depend compiler/linker emits the same names for both.
Does not support step into raw address call sites -- it is a valid scenario that in Visual Studio debugger, user can explicitly choose a raw address to step into which land in the function without debug info/symbol, then choose UI to load the debug info on-demand for that module/frame to continue exploring.

A more reliable design could be extending the ThreadPlanStepInRange to support step in based on call-site instruction offset/PC which I will propose in next iteration.

Full diff: https://github.com/llvm/llvm-project/pull/86623.diff

10 Files Affected:

(modified) lldb/include/lldb/API/SBLineEntry.h (+3)
(modified) lldb/include/lldb/API/SBSymbolContextList.h (+1)
(modified) lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py (+16-5)
(modified) lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py (+2-2)
(modified) lldb/source/API/SBLineEntry.cpp (+15)
(added) lldb/test/API/tools/lldb-dap/stepInTargets/Makefile (+6)
(added) lldb/test/API/tools/lldb-dap/stepInTargets/TestDAP_stepInTargets.py (+66)
(added) lldb/test/API/tools/lldb-dap/stepInTargets/main.cpp (+11)
(modified) lldb/tools/lldb-dap/DAP.h (+2)
(modified) lldb/tools/lldb-dap/lldb-dap.cpp (+148-2)

diff --git a/lldb/include/lldb/API/SBLineEntry.h b/lldb/include/lldb/API/SBLineEntry.h
index 7c2431ba3c8a51..d70c4fac6ec717 100644
--- a/lldb/include/lldb/API/SBLineEntry.h
+++ b/lldb/include/lldb/API/SBLineEntry.h
@@ -29,6 +29,9 @@ class LLDB_API SBLineEntry {
 
   lldb::SBAddress GetEndAddress() const;
 
+  lldb::SBAddress
+  GetSameLineContiguousAddressRangeEnd(bool include_inlined_functions) const;
+
   explicit operator bool() const;
 
   bool IsValid() const;
diff --git a/lldb/include/lldb/API/SBSymbolContextList.h b/lldb/include/lldb/API/SBSymbolContextList.h
index 4026afc213571c..95100d219df20f 100644
--- a/lldb/include/lldb/API/SBSymbolContextList.h
+++ b/lldb/include/lldb/API/SBSymbolContextList.h
@@ -44,6 +44,7 @@ class LLDB_API SBSymbolContextList {
 protected:
   friend class SBModule;
   friend class SBTarget;
+  friend class SBCompileUnit;
 
   lldb_private::SymbolContextList *operator->() const;
 
diff --git a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
index 27a76a652f4063..45e48a10c7611d 100644
--- a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
+++ b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
@@ -811,23 +811,34 @@ def request_next(self, threadId):
         command_dict = {"command": "next", "type": "request", "arguments": args_dict}
         return self.send_recv(command_dict)
 
-    def request_stepIn(self, threadId):
+    def request_stepInTargets(self, frameId):
         if self.exit_status is not None:
-            raise ValueError("request_continue called after process exited")
-        args_dict = {"threadId": threadId}
+            raise ValueError("request_stepInTargets called after process exited")
+        args_dict = {"frameId": frameId}
+        command_dict = {
+            "command": "stepInTargets",
+            "type": "request",
+            "arguments": args_dict,
+        }
+        return self.send_recv(command_dict)
+    
+    def request_stepIn(self, threadId, targetId):
+        if self.exit_status is not None:
+            raise ValueError("request_stepIn called after process exited")
+        args_dict = {"threadId": threadId, "targetId": targetId}
         command_dict = {"command": "stepIn", "type": "request", "arguments": args_dict}
         return self.send_recv(command_dict)
 
     def request_stepOut(self, threadId):
         if self.exit_status is not None:
-            raise ValueError("request_continue called after process exited")
+            raise ValueError("request_stepOut called after process exited")
         args_dict = {"threadId": threadId}
         command_dict = {"command": "stepOut", "type": "request", "arguments": args_dict}
         return self.send_recv(command_dict)
 
     def request_pause(self, threadId=None):
         if self.exit_status is not None:
-            raise ValueError("request_continue called after process exited")
+            raise ValueError("request_pause called after process exited")
         if threadId is None:
             threadId = self.get_thread_id()
         args_dict = {"threadId": threadId}
diff --git a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py
index 23f650d2d36fdd..d56ea5dca14beb 100644
--- a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py
+++ b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py
@@ -218,8 +218,8 @@ def set_global(self, name, value, id=None):
         """Set a top level global variable only."""
         return self.dap_server.request_setVariable(2, name, str(value), id=id)
 
-    def stepIn(self, threadId=None, waitForStop=True):
-        self.dap_server.request_stepIn(threadId=threadId)
+    def stepIn(self, threadId=None, targetId=None, waitForStop=True):
+        self.dap_server.request_stepIn(threadId=threadId, targetId=targetId)
         if waitForStop:
             return self.dap_server.wait_for_stopped()
         return None
diff --git a/lldb/source/API/SBLineEntry.cpp b/lldb/source/API/SBLineEntry.cpp
index 99a7b8fe644cb5..216ea6d18eab89 100644
--- a/lldb/source/API/SBLineEntry.cpp
+++ b/lldb/source/API/SBLineEntry.cpp
@@ -67,6 +67,21 @@ SBAddress SBLineEntry::GetEndAddress() const {
   return sb_address;
 }
 
+SBAddress SBLineEntry::GetSameLineContiguousAddressRangeEnd(
+    bool include_inlined_functions) const {
+  LLDB_INSTRUMENT_VA(this);
+
+  SBAddress sb_address;
+  if (m_opaque_up) {
+    AddressRange line_range = m_opaque_up->GetSameLineContiguousAddressRange(
+        include_inlined_functions);
+
+    sb_address.SetAddress(line_range.GetBaseAddress());
+    sb_address.OffsetAddress(line_range.GetByteSize());
+  }
+  return sb_address;
+}
+
 bool SBLineEntry::IsValid() const {
   LLDB_INSTRUMENT_VA(this);
   return this->operator bool();
diff --git a/lldb/test/API/tools/lldb-dap/stepInTargets/Makefile b/lldb/test/API/tools/lldb-dap/stepInTargets/Makefile
new file mode 100644
index 00000000000000..f772575cd5613b
--- /dev/null
+++ b/lldb/test/API/tools/lldb-dap/stepInTargets/Makefile
@@ -0,0 +1,6 @@
+	
+ENABLE_THREADS := YES
+
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/tools/lldb-dap/stepInTargets/TestDAP_stepInTargets.py b/lldb/test/API/tools/lldb-dap/stepInTargets/TestDAP_stepInTargets.py
new file mode 100644
index 00000000000000..3fa955305037cf
--- /dev/null
+++ b/lldb/test/API/tools/lldb-dap/stepInTargets/TestDAP_stepInTargets.py
@@ -0,0 +1,66 @@
+"""
+Test lldb-dap stepInTargets request
+"""
+
+import dap_server
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+import lldbdap_testcase
+from lldbsuite.test import lldbutil
+
+
+class TestDAP_stepInTargets(lldbdap_testcase.DAPTestCaseBase):
+    @skipIf(archs=no_match(["x86_64"]))  # ARM flow kind is not supported yet.
+    def test_basic(self):
+        """
+        Tests the basic stepping in targets with directly calls.
+        """
+        program = self.getBuildArtifact("a.out")
+        self.build_and_launch(program)
+        source = "main.cpp"
+
+        breakpoint_line = line_number(source, "// set breakpoint here")
+        lines = [breakpoint_line]
+        # Set breakpoint in the thread function so we can step the threads
+        breakpoint_ids = self.set_source_breakpoints(source, lines)
+        self.assertEqual(
+            len(breakpoint_ids), len(lines), "expect correct number of breakpoints"
+        )
+        self.continue_to_breakpoints(breakpoint_ids)
+
+        threads = self.dap_server.get_threads()
+        self.assertEqual(len(threads), 1, "expect one thread")
+        tid = threads[0]["id"]
+
+        leaf_frame = self.dap_server.get_stackFrame()
+        self.assertIsNotNone(leaf_frame, "expect a leaf frame")
+
+        # Request all step in targets list and verify the response.
+        step_in_targets_response = self.dap_server.request_stepInTargets(
+            leaf_frame["id"]
+        )
+        self.assertEqual(step_in_targets_response["success"], True, "expect success")
+        self.assertIn(
+            "body", step_in_targets_response, "expect body field in response body"
+        )
+        self.assertIn(
+            "targets",
+            step_in_targets_response["body"],
+            "expect targets field in response body",
+        )
+
+        step_in_targets = step_in_targets_response["body"]["targets"]
+        self.assertEqual(len(step_in_targets), 3, "expect 3 step in targets")
+
+        # Verify the target names are correct.
+        self.assertEqual(step_in_targets[0]["label"], "bar()", "expect bar()")
+        self.assertEqual(step_in_targets[1]["label"], "bar2()", "expect bar2()")
+        self.assertEqual(
+            step_in_targets[2]["label"], "foo(int, int)", "expect foo(int, int)"
+        )
+
+        # Choose to step into second target and verify that we are in bar2()
+        self.stepIn(threadId=tid, targetId=step_in_targets[1]["id"], waitForStop=True)
+        leaf_frame = self.dap_server.get_stackFrame()
+        self.assertIsNotNone(leaf_frame, "expect a leaf frame")
+        self.assertEqual(leaf_frame["name"], "bar2()")
diff --git a/lldb/test/API/tools/lldb-dap/stepInTargets/main.cpp b/lldb/test/API/tools/lldb-dap/stepInTargets/main.cpp
new file mode 100644
index 00000000000000..d3c3dbcc139ef0
--- /dev/null
+++ b/lldb/test/API/tools/lldb-dap/stepInTargets/main.cpp
@@ -0,0 +1,11 @@
+
+int foo(int val, int extra) { return val + extra; }
+
+int bar() { return 22; }
+
+int bar2() { return 54; }
+
+int main(int argc, char const *argv[]) {
+  foo(bar(), bar2()); // set breakpoint here
+  return 0;
+}
diff --git a/lldb/tools/lldb-dap/DAP.h b/lldb/tools/lldb-dap/DAP.h
index 8015dec9ba8fe6..5c70a056fea4bf 100644
--- a/lldb/tools/lldb-dap/DAP.h
+++ b/lldb/tools/lldb-dap/DAP.h
@@ -162,6 +162,8 @@ struct DAP {
   std::vector<std::string> exit_commands;
   std::vector<std::string> stop_commands;
   std::vector<std::string> terminate_commands;
+  // Map step in target id to list of function targets that user can choose.
+  llvm::DenseMap<lldb::addr_t, std::string> step_in_targets;
   // A copy of the last LaunchRequest or AttachRequest so we can reuse its
   // arguments if we get a RestartRequest.
   std::optional<llvm::json::Object> last_launch_or_attach_request;
diff --git a/lldb/tools/lldb-dap/lldb-dap.cpp b/lldb/tools/lldb-dap/lldb-dap.cpp
index 55f8c920e60016..30cba6d23f542d 100644
--- a/lldb/tools/lldb-dap/lldb-dap.cpp
+++ b/lldb/tools/lldb-dap/lldb-dap.cpp
@@ -1645,7 +1645,7 @@ void request_initialize(const llvm::json::Object &request) {
   // The debug adapter supports the gotoTargetsRequest.
   body.try_emplace("supportsGotoTargetsRequest", false);
   // The debug adapter supports the stepInTargetsRequest.
-  body.try_emplace("supportsStepInTargetsRequest", false);
+  body.try_emplace("supportsStepInTargetsRequest", true);
   // The debug adapter supports the completions request.
   body.try_emplace("supportsCompletionsRequest", true);
   // The debug adapter supports the disassembly request.
@@ -3180,14 +3180,159 @@ void request_stepIn(const llvm::json::Object &request) {
   llvm::json::Object response;
   FillResponse(request, response);
   auto arguments = request.getObject("arguments");
+
+  std::string step_in_target;
+  uint64_t target_id = GetUnsigned(arguments, "targetId", 0);
+  auto it = g_dap.step_in_targets.find(target_id);
+  if (it != g_dap.step_in_targets.end())
+    step_in_target = it->second;
+
+  const bool single_thread = GetBoolean(arguments, "singleThread", false);
+  lldb::RunMode run_mode =
+      single_thread ? lldb::eOnlyThisThread : lldb::eOnlyDuringStepping;
   lldb::SBThread thread = g_dap.GetLLDBThread(*arguments);
   if (thread.IsValid()) {
     // Remember the thread ID that caused the resume so we can set the
     // "threadCausedFocus" boolean value in the "stopped" events.
     g_dap.focus_tid = thread.GetThreadID();
-    thread.StepInto();
+    thread.StepInto(step_in_target.c_str(), run_mode);
+  } else {
+    response["success"] = llvm::json::Value(false);
+  }
+  g_dap.SendJSON(llvm::json::Value(std::move(response)));
+}
+
+// "StepInTargetsRequest": {
+//   "allOf": [ { "$ref": "#/definitions/Request" }, {
+//     "type": "object",
+//     "description": "This request retrieves the possible step-in targets for
+//     the specified stack frame.\nThese targets can be used in the `stepIn`
+//     request.\nClients should only call this request if the corresponding
+//     capability `supportsStepInTargetsRequest` is true.", "properties": {
+//       "command": {
+//         "type": "string",
+//         "enum": [ "stepInTargets" ]
+//       },
+//       "arguments": {
+//         "$ref": "#/definitions/StepInTargetsArguments"
+//       }
+//     },
+//     "required": [ "command", "arguments"  ]
+//   }]
+// },
+// "StepInTargetsArguments": {
+//   "type": "object",
+//   "description": "Arguments for `stepInTargets` request.",
+//   "properties": {
+//     "frameId": {
+//       "type": "integer",
+//       "description": "The stack frame for which to retrieve the possible
+//       step-in targets."
+//     }
+//   },
+//   "required": [ "frameId" ]
+// },
+// "StepInTargetsResponse": {
+//   "allOf": [ { "$ref": "#/definitions/Response" }, {
+//     "type": "object",
+//     "description": "Response to `stepInTargets` request.",
+//     "properties": {
+//       "body": {
+//         "type": "object",
+//         "properties": {
+//           "targets": {
+//             "type": "array",
+//             "items": {
+//               "$ref": "#/definitions/StepInTarget"
+//             },
+//             "description": "The possible step-in targets of the specified
+//             source location."
+//           }
+//         },
+//         "required": [ "targets" ]
+//       }
+//     },
+//     "required": [ "body" ]
+//   }]
+// }
+void request_stepInTargets(const llvm::json::Object &request) {
+  llvm::json::Object response;
+  FillResponse(request, response);
+  auto arguments = request.getObject("arguments");
+
+  g_dap.step_in_targets.clear();
+  lldb::SBFrame frame = g_dap.GetLLDBFrame(*arguments);
+  if (frame.IsValid()) {
+    lldb::SBAddress pc_addr = frame.GetPCAddress();
+    lldb::addr_t line_end_addr = pc_addr.GetLineEntry()
+                                     .GetSameLineContiguousAddressRangeEnd(
+                                         /*include_inlined_functions=*/true)
+                                     .GetLoadAddress(g_dap.target);
+
+    int max_inst_count = line_end_addr - pc_addr.GetLoadAddress(g_dap.target);
+    lldb::SBInstructionList insts =
+        g_dap.target.ReadInstructions(pc_addr, max_inst_count);
+
+    if (!insts.IsValid()) {
+      response["success"] = false;
+      response["message"] = "Failed to get instructions for frame.";
+      g_dap.SendJSON(llvm::json::Value(std::move(response)));
+      return;
+    }
+
+    llvm::json::Array step_in_targets;
+    const auto num_insts = insts.GetSize();
+    for (size_t i = 0; i < num_insts; ++i) {
+      lldb::SBInstruction inst = insts.GetInstructionAtIndex(i);
+      if (!inst.IsValid())
+        break;
+
+      lldb::addr_t inst_addr = inst.GetAddress().GetLoadAddress(g_dap.target);
+      // line_end_addr is exclusive of the line range.
+      if (inst_addr >= line_end_addr)
+        break;
+
+      // Note: currently only x86/x64 supports flow kind.
+      lldb::InstructionControlFlowKind flow_kind =
+          inst.GetControlFlowKind(g_dap.target);
+      if (flow_kind == lldb::eInstructionControlFlowKindCall) {
+        // Use call site instruction address as id which is easy to debug.
+        llvm::json::Object step_in_target;
+        step_in_target["id"] = inst_addr;
+
+        llvm::StringRef call_operand_name = inst.GetOperands(g_dap.target);
+        lldb::addr_t call_target_addr;
+        if (call_operand_name.getAsInteger(0, call_target_addr))
+          continue;
+
+        lldb::SBAddress call_target_load_addr =
+            g_dap.target.ResolveLoadAddress(call_target_addr);
+        if (!call_target_load_addr.IsValid())
+          continue;
+
+        lldb::SBSymbolContext sc = g_dap.target.ResolveSymbolContextForAddress(
+            call_target_load_addr, lldb::eSymbolContextFunction);
+
+        // Step in targets only supports functions with debug info.
+        std::string step_in_target_name;
+        if (sc.IsValid() && sc.GetFunction().IsValid())
+          step_in_target_name = sc.GetFunction().GetDisplayName();
+
+        // Skip call sites if we fail to resolve its symbol name.
+        if (step_in_target_name.empty())
+          continue;
+
+        g_dap.step_in_targets.try_emplace(inst_addr, step_in_target_name);
+        step_in_target.try_emplace("label", step_in_target_name);
+        step_in_targets.emplace_back(std::move(step_in_target));
+      }
+    }
+    llvm::json::Object body;
+    body.try_emplace("targets", std::move(step_in_targets));
+    response.try_emplace("body", std::move(body));
   } else {
     response["success"] = llvm::json::Value(false);
+    response["message"] = "Failed to get frame for input frameId.";
   }
   g_dap.SendJSON(llvm::json::Value(std::move(response)));
 }
@@ -3904,6 +4049,7 @@ void RegisterRequestCallbacks() {
   g_dap.RegisterRequestCallback("source", request_source);
   g_dap.RegisterRequestCallback("stackTrace", request_stackTrace);
   g_dap.RegisterRequestCallback("stepIn", request_stepIn);
+  g_dap.RegisterRequestCallback("stepInTargets", request_stepInTargets);
   g_dap.RegisterRequestCallback("stepOut", request_stepOut);
   g_dap.RegisterRequestCallback("threads", request_threads);
   g_dap.RegisterRequestCallback("variables", request_variables);

walter-erquinigo · 2024-03-26T16:38:20Z

lldb/test/API/tools/lldb-dap/stepInTargets/TestDAP_stepInTargets.py

+
+
+class TestDAP_stepInTargets(lldbdap_testcase.DAPTestCaseBase):
+    @skipIf(archs=no_match(["x86_64"]))  # ARM flow kind is not supported yet.


mention that this doesn't work on arm because of lack of InstructionControlFlowKind support

walter-erquinigo · 2024-03-26T16:41:54Z

lldb/tools/lldb-dap/lldb-dap.cpp

+    int max_inst_count = line_end_addr - pc_addr.GetLoadAddress(g_dap.target);
+    lldb::SBInstructionList insts =
+        g_dap.target.ReadInstructions(pc_addr, max_inst_count);


Please write a new ReadInstructions API that is range based. Internally you can use Disassembler::DisassembleRange.
That way you'll decode only the minimum necessary instructions.

walter-erquinigo · 2024-03-26T16:45:11Z

lldb/tools/lldb-dap/lldb-dap.cpp

+          continue;
+
+        lldb::SBSymbolContext sc = g_dap.target.ResolveSymbolContextForAddress(
+            call_target_load_addr, lldb::eSymbolContextFunction);


would be good to add a comment here explaining why eSymbolContextSymbol is not being considered, even when seemingly it should just work

I explained the line below:

// Step in targets only supports functions with debug info

But can repeat here as well.

But you don't explain why

clayborg · 2024-03-27T00:08:02Z

lldb/include/lldb/API/SBLineEntry.h

+  lldb::SBAddress
+  GetSameLineContiguousAddressRangeEnd(bool include_inlined_functions) const;
+


We don't need this API right? We spoke about using existing APIs for this. I believe we can find the index the SBLineEntry in the line table using:

uint32_t SBCompileUnit::FindLineEntryIndex(lldb::SBLineEntry &line_entry, bool exact = false) const;

And then you can use:

lldb::SBLineEntry SBCompileUnit::GetLineEntryAtIndex(uint32_t idx) const;

With the next index until the line doesn't match.

@clayborg, right, I know I can do that but I did not use it for two reasons:

That seems to be what GetSameLineContiguousAddressRangeEnd API is doing internally so I prefer to reuse the API than implement myself.

This seems to depend on the dwarf line table encoding which may not work for other platforms like PDB? Hiding behind GetSameLineContiguousAddressRangeEnd makes the code more resilient?

Anyway, if you still prefer us calling "GetLineEntryAtIndex" repeat, I will change it.

I take a second look at GetSameLineContiguousAddressRangeEnd implementation, it seems to take care of compiler generate code and inline functions which I find it is tedious to duplicate the implementation in lldb-dap:

llvm-project/lldb/source/Symbol/LineEntry.cpp

Lines 202 to 233 in a6d932b

if (*original_file_sp == *next_line_sc.line_entry.original_file_sp &&

(next_line_sc.line_entry.line == 0 ||

line == next_line_sc.line_entry.line)) {

// Include any line 0 entries - they indicate that this is compiler-

// generated code that does not correspond to user source code.

// next_line_sc is the same file & line as this LineEntry, so extend

// our AddressRange by its size and continue to see if there are more

// LineEntries that we can combine. However, if there was nothing to

// extend we're done.

if (!complete_line_range.Extend(next_line_sc.line_entry.range))

break;

continue;

}

if (include_inlined_functions && next_line_sc.block &&

next_line_sc.block->GetContainingInlinedBlock() != nullptr) {

// The next_line_sc might be in a different file if it's an inlined

// function. If this is the case then we still want to expand our line

// range to include them if the inlined function is at the same call site

// as this line entry. The current block could represent a nested inline

// function call so we need to need to check up the block tree to see if

// we find one.

auto inlined_parent_block =

next_line_sc.block->GetContainingInlinedBlockWithCallSite(

start_call_site);

if (!inlined_parent_block)

// We didn't find any parent inlined block with a call site at this line

// entry so this inlined function is probably at another line.

break;

// Extend our AddressRange by the size of the inlined block, but if there

// was nothing to add then we're done.

if (!complete_line_range.Extend(next_line_sc.line_entry.range))

So I think it makes more sense to reuse it. Please take a further review. Thanks.

clayborg · 2024-03-27T00:13:34Z

lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py

+    def request_stepInTargets(self, frameId):
        if self.exit_status is not None:
-            raise ValueError("request_continue called after process exited")
-        args_dict = {"threadId": threadId}
+            raise ValueError("request_stepInTargets called after process exited")
+        args_dict = {"frameId": frameId}
+        command_dict = {
+            "command": "stepInTargets",
+            "type": "request",
+            "arguments": args_dict,
+        }
+        return self.send_recv(command_dict)


Can you move request_stepInTargets after the request_stepIn(self, threadID): So the diff doesn't get muddled up? Right now it thinks you rewrite request_stepIn, but if you add request_stepInTargets after this function the diff will be easier to read.

clayborg · 2024-03-27T00:14:10Z

lldb/source/API/SBLineEntry.cpp

+SBAddress SBLineEntry::GetSameLineContiguousAddressRangeEnd(
+    bool include_inlined_functions) const {
+  LLDB_INSTRUMENT_VA(this);
+
+  SBAddress sb_address;
+  if (m_opaque_up) {
+    AddressRange line_range = m_opaque_up->GetSameLineContiguousAddressRange(
+        include_inlined_functions);
+
+    sb_address.SetAddress(line_range.GetBaseAddress());
+    sb_address.OffsetAddress(line_range.GetByteSize());
+  }
+  return sb_address;
+}
+


remove this API and use existing ones I spoke about in the inline comment in the header file?

clayborg · 2024-03-27T00:18:44Z

lldb/tools/lldb-dap/lldb-dap.cpp

+    lldb::SBAddress line_end_addr =
+        pc_addr.GetLineEntry().GetSameLineContiguousAddressRangeEnd(true);


We can use existing APIs for this without adding GetSameLineContiguousAddressRangeEnd() as mentioned above in inline comments.

clayborg · 2024-03-27T00:22:22Z

lldb/tools/lldb-dap/lldb-dap.cpp

+        llvm::StringRef call_operand_name = inst.GetOperands(g_dap.target);
+        lldb::addr_t call_target_addr;
+        if (call_operand_name.getAsInteger(0, call_target_addr))
+          continue;


Not sure how portable this will as different disassemblers will have different things in the operands. Also many disassemblers will have the function name already in the comment:

const char *func_name = inst.GetComment(g_dap.target);

Might be worth checking into this?

I have looked into using the comment. But it is not very portable:

0x55555555a426 <+934>: callq 0x555555558990 ; bar at main.cpp:64 0x55555555a42b <+939>: movl %eax, -0xb0(%rbp) 0x55555555a431 <+945>: callq 0x5555555589a0 ; bar2 at main.cpp:68 0x55555555a436 <+950>: movl -0xb0(%rbp), %edi 0x55555555a43c <+956>: movl %eax, %esi 0x55555555a43e <+958>: callq 0x555555558970 ; foo at main.cpp:60

We will have to manually parse the comment to get the target function name.

The function name does not include the function overloaded parameters like "foo" vs "for(int, int)".

jimingham · 2024-03-27T00:33:37Z

This patch provides the initial implementation for the "Step Into Specific/Step In Targets" feature in VSCode DAP.

The implementation disassembles all the call instructions in step range and try to resolve operand name (assuming one operand) using debug info. Later, the call target function name is chosen by end user and specified in the StepInto() API call.

It is v1 because of using the existing step in target function name API. This implementation has several limitations:

Won't for indirect/virtual function call -- in most cases, our disassembler won't be able to solve the indirect call target address/name.

Targeted step-in's don't predict anything. The algorithm works by stepping in and then matching the target name against the function you've stopped in, and if it doesn't match step out and keep going. So you don't ever need to resolve any names up front. Are you saying for indirect/virtual function calls you can't figure out the name once you've arrived at the target?

Won't work for target function without debug info -- if the target function has symbol but not debug info, the existing ThreadPlanStepInRange won't stop.

That's just a matter of adding the --step-in-avoids-nodebug option to the SB API's. That works from the command line
already.

Relying on function names can be fragile -- if there is some middle glue/thunk code, our disassembler can only resolve the glue/thunk code's name not the real target function name. It can be fragile to depend compiler/linker emits the same names for both.

lldb should always step through glue or thunks automatically. That's going to get handled by the trampoline handler's "step through" plan so the targeted StepIn plan won't ever stop on a thunk. If you have a system where there aren't trampoline handlers for your indirect thunks or other glue code, you would be better served adding those than trying to get the step in plans to handle them directly.

Does not support step into raw address call sites -- it is a valid scenario that in Visual Studio debugger, user can explicitly choose a raw address to step into which land in the function without debug info/symbol, then choose UI to load the debug info on-demand for that module/frame to continue exploring.

That would be trivial to add.

A more reliable design could be extending the ThreadPlanStepInRange to support step in based on call-site instruction offset/PC which I will propose in next iteration.

For setting step-in targets based on call-site instruction, you definitely will need to add some support for a pc rather than a symbol match. That should be quite easy.

But I don't think that's a viable method for the "step in target" that people really want to use. They see:

some_func(other_func(method_call()));

and they want to step into method_call. I don't think there's any way you are going to know what address "method_call" will resolve to in this context. You'd have to know what "this" actually is, and deal with overload resolution, etc. That does not sound very doable to me.

Note that this feature as implemented is robust because it doesn't try to predict anything. It just steps through the range you told it to, stepping one level deep (not counting trampolines) and stops if the place it stopped matches the target. I don't think switching to a "predict the target up front and go there" method will be nearly as reliable. The expression you are stepping through could be way more complicated than the one I showed, and trying to figure out what some text snippet is actually going to resolve to could be quite complex.

Adding the address match in this algorithm is easy, since instead of asking "does the name match my target" you would ask "am I at the target address".

The problem of thunks and trampolines should be handled by having a trampoline handler that handles them. That will benefit all of your step-in behavior since otherwise you'll be stopping in the thunk, which no-one really wants. And it will mean you can ignore them for the purposes of targeted step in.

For the other problems you mentioned, I think what is needed is a richer way to provide the match between what the user sees in their code, and the symbol they are actually going to stop at. We could allow regex matches, or be smarter about stripping out irrelevant bits to get to a base name as we do in other instances.

jeffreytan81 · 2024-03-27T21:56:45Z

@jimingham, thanks for detailed comment.

"Step in targets" or "Step Into Specific" is an IDE feature that works in two steps:

Show a list of call sites before stepping to allow users specify which call to step into
Perform step into with user choosing call site function to stop.
https://www.tabsoverspaces.com/233742-step-into-a-specific-call-when-debugging-in-visual-studio contains a good screenshot.

Note: in step 1, we will have to show users the callsite target function before performing stepping. For indirect/virtual call, you can't reliable resolve the target function name without running a smaller interpreter to simulate the execution.

I don't think there's any way you are going to know what address "method_call"
will resolve to in this context. You'd have to know what "this" actually is, and deal with overload resolution, etc. That does > not sound very doable to me.

I am not proposing using the function target PC, but the call instruction address in the call site side. For example:

    0x55555555a426 <+934>:  callq  0x555555558990 ; bar at main.cpp:64
    0x55555555a42b <+939>:  movl   %eax, -0xb0(%rbp)
    0x55555555a431 <+945>:  callq  0x5555555589a0 ; bar2 at main.cpp:68
    0x55555555a436 <+950>:  movl   -0xb0(%rbp), %edi
    0x55555555a43c <+956>:  movl   %eax, %esi
    0x55555555a43e <+958>:  callq  0x555555558970 ; foo at main.cpp:60

For the above step range above, if user wants to step into "bar2()" function, I propose us pass 0x55555555a431 as an extra callsite PC to underlying ThreadPlanStepInRange, so ThreadPlanStepInRange::DoWillResume() can detect the current IP would match the user specified call-site PC 0x55555555a431. Then ThreadPlanStepInRange::DefaultShouldStopHereCallback() can detect the state that there is a previous matching callsite PC, so return true to stop when eFrameCompareYounger.

Hope this makes sense.

jimingham · 2024-03-27T22:32:27Z

@jimingham, thanks for detailed comment.

"Step in targets" or "Step Into Specific" is an IDE feature that works in two steps:

Show a list of call sites before stepping to allow users specify which call to step into

Perform step into with user choosing call site function to stop.
https://www.tabsoverspaces.com/233742-step-into-a-specific-call-when-debugging-in-visual-studio contains a good screenshot.

Note: in step 1, we will have to show users the callsite target function before performing stepping. For indirect/virtual call, you can't reliable resolve the target function name without running a smaller interpreter to simulate the execution.

I don't think there's any way you are going to know what address "method_call"
will resolve to in this context. You'd have to know what "this" actually is, and deal with overload resolution, etc. That does > not sound very doable to me.

I am not proposing using the function target PC, but the call instruction address in the call site side. For example:
    0x55555555a426 <+934>:  callq  0x555555558990 ; bar at main.cpp:64
    0x55555555a42b <+939>:  movl   %eax, -0xb0(%rbp)
    0x55555555a431 <+945>:  callq  0x5555555589a0 ; bar2 at main.cpp:68
    0x55555555a436 <+950>:  movl   -0xb0(%rbp), %edi
    0x55555555a43c <+956>:  movl   %eax, %esi
    0x55555555a43e <+958>:  callq  0x555555558970 ; foo at main.cpp:60
For the above step range above, if user wants to step into "bar2()" function, I propose us pass 0x55555555a431 as an extra callsite PC to underlying ThreadPlanStepInRange, so ThreadPlanStepInRange::DoWillResume() can detect the current IP would match the user specified call-site PC 0x55555555a431. Then ThreadPlanStepInRange::DefaultShouldStopHereCallback() can detect the state that there is a previous matching callsite PC, so return true to stop when eFrameCompareYounger.

Hope this makes sense.

That sounds fine. I try to stay out of the game of telling people what is going to run in any given range of code because it is pretty hard to get that right from assembly, whereas if you watch execution you are always right. But if that's what the DAP requires, that seems to be your fate.

It would be fine to add an m_step_into_target_addr to ThreadPlanStepInRange and check that in its ShouldStopHereCallback. Not sure why you mention DoWillResume, that's not where ShouldStopHere gets called, and deciding your step was done then would be way too late in the whole "I've stopped, what should I do next" negotiation. But if you put it in the ShouldStopHereCallback it will get called at the right time.

kusmour

Most comments are addressed. Stamping to unblock. We can iterate with another PR :D

jeffreytan81 and others added 6 commits March 6, 2024 12:07

Fix strcmp build error on buildbot

6cccde2

Merge branch 'llvm:main' into main

2918dcf

Merge branch 'llvm:main' into main

42c2d34

Merge branch 'llvm:main' into main

dc9f5db

Initial support for stepInTargets feature

06ec5e2

Remove unnecessary changes

b2dfdb5

jeffreytan81 requested review from clayborg, kusmour and jimingham March 26, 2024 16:17

jeffreytan81 marked this pull request as ready for review March 26, 2024 16:20

jeffreytan81 requested a review from JDevlieghere as a code owner March 26, 2024 16:20

llvmbot added the lldb label Mar 26, 2024

Fix formatter

f7f8a69

jeffreytan81 requested a review from zhyty March 26, 2024 16:21

Remove empty space

55a9f70

walter-erquinigo reviewed Mar 26, 2024

View reviewed changes

Address review feedback

d51bf46

clayborg reviewed Mar 27, 2024

View reviewed changes

Address review comments

13ffb1c

kusmour approved these changes Apr 25, 2024

View reviewed changes

jeffreytan81 merged commit 2f2e31c into llvm:main Apr 25, 2024



		class TestDAP_stepInTargets(lldbdap_testcase.DAPTestCaseBase):
		@skipIf(archs=no_match(["x86_64"])) # ARM flow kind is not supported yet.

		lldb::SBAddress
		GetSameLineContiguousAddressRangeEnd(bool include_inlined_functions) const;

	if (original_file_sp == next_line_sc.line_entry.original_file_sp &&
	(next_line_sc.line_entry.line == 0 \|\|
	line == next_line_sc.line_entry.line)) {
	// Include any line 0 entries - they indicate that this is compiler-
	// generated code that does not correspond to user source code.
	// next_line_sc is the same file & line as this LineEntry, so extend
	// our AddressRange by its size and continue to see if there are more
	// LineEntries that we can combine. However, if there was nothing to
	// extend we're done.
	if (!complete_line_range.Extend(next_line_sc.line_entry.range))
	break;
	continue;
	}

	if (include_inlined_functions && next_line_sc.block &&
	next_line_sc.block->GetContainingInlinedBlock() != nullptr) {
	// The next_line_sc might be in a different file if it's an inlined
	// function. If this is the case then we still want to expand our line
	// range to include them if the inlined function is at the same call site
	// as this line entry. The current block could represent a nested inline
	// function call so we need to need to check up the block tree to see if
	// we find one.
	auto inlined_parent_block =
	next_line_sc.block->GetContainingInlinedBlockWithCallSite(
	start_call_site);
	if (!inlined_parent_block)
	// We didn't find any parent inlined block with a call site at this line
	// entry so this inlined function is probably at another line.
	break;
	// Extend our AddressRange by the size of the inlined block, but if there
	// was nothing to add then we're done.
	if (!complete_line_range.Extend(next_line_sc.line_entry.range))

		lldb::SBAddress line_end_addr =
		pc_addr.GetLineEntry().GetSameLineContiguousAddressRangeEnd(true);

Initial step in targets DAP support #86623

Initial step in targets DAP support #86623

Uh oh!

Conversation

jeffreytan81 commented Mar 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Mar 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Mar 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Mar 26, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jimingham commented Mar 27, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jeffreytan81 commented Mar 27, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jimingham commented Mar 27, 2024

Uh oh!

kusmour left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jeffreytan81 commented Mar 26, 2024 •

edited

Loading

github-actions bot commented Mar 26, 2024 •

edited

Loading

github-actions bot commented Mar 26, 2024 •

edited

Loading

jimingham commented Mar 27, 2024 •

edited

Loading

jeffreytan81 commented Mar 27, 2024 •

edited

Loading