Skip to content

[lldb-dap] Support inspecting memory #104317

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 16, 2024
Merged

Conversation

vogelsgesang
Copy link
Member

Adds support for the readMemory request which allows VS-Code to inspect memory. Also, add memoryReference to variablesa and evaluate responses, such that the binary view can be opened from the variables view and from the "watch" pane.

@llvmbot
Copy link
Member

llvmbot commented Aug 14, 2024

@llvm/pr-subscribers-lldb

Author: Adrian Vogelsgesang (vogelsgesang)

Changes

Adds support for the readMemory request which allows VS-Code to inspect memory. Also, add memoryReference to variablesa and evaluate responses, such that the binary view can be opened from the variables view and from the "watch" pane.


Full diff: https://github.com/llvm/llvm-project/pull/104317.diff

7 Files Affected:

  • (modified) lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py (+15)
  • (added) lldb/test/API/tools/lldb-dap/memory/Makefile (+3)
  • (added) lldb/test/API/tools/lldb-dap/memory/TestDAP_memory.py (+121)
  • (added) lldb/test/API/tools/lldb-dap/memory/main.cpp (+10)
  • (modified) lldb/tools/lldb-dap/JSONUtils.cpp (+28-2)
  • (modified) lldb/tools/lldb-dap/JSONUtils.h (+6)
  • (modified) lldb/tools/lldb-dap/lldb-dap.cpp (+171-1)
diff --git a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
index a324af57b61df3..35d792351e6bfc 100644
--- a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
+++ b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
@@ -691,6 +691,21 @@ def request_disassemble(
         for inst in instructions:
             self.disassembled_instructions[inst["address"]] = inst
 
+    def request_read_memory(
+        self, memoryReference, offset, count
+    ):
+        args_dict = {
+            "memoryReference": memoryReference,
+            "offset": offset,
+            "count": count,
+        }
+        command_dict = {
+            "command": "readMemory",
+            "type": "request",
+            "arguments": args_dict,
+        }
+        return self.send_recv(command_dict)
+
     def request_evaluate(self, expression, frameIndex=0, threadId=None, context=None):
         stackFrame = self.get_stackFrame(frameIndex=frameIndex, threadId=threadId)
         if stackFrame is None:
diff --git a/lldb/test/API/tools/lldb-dap/memory/Makefile b/lldb/test/API/tools/lldb-dap/memory/Makefile
new file mode 100644
index 00000000000000..99998b20bcb050
--- /dev/null
+++ b/lldb/test/API/tools/lldb-dap/memory/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/tools/lldb-dap/memory/TestDAP_memory.py b/lldb/test/API/tools/lldb-dap/memory/TestDAP_memory.py
new file mode 100644
index 00000000000000..9ff4dbd3138428
--- /dev/null
+++ b/lldb/test/API/tools/lldb-dap/memory/TestDAP_memory.py
@@ -0,0 +1,121 @@
+"""
+Test lldb-dap memory support
+"""
+
+
+from base64 import b64decode
+import dap_server
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+import lldbdap_testcase
+import os
+
+
+class TestDAP_memory(lldbdap_testcase.DAPTestCaseBase):
+    def test_read_memory(self):
+        """
+        Tests the 'read_memory' request
+        """
+        program = self.getBuildArtifact("a.out")
+        self.build_and_launch(program)
+        source = "main.cpp"
+        self.source_path = os.path.join(os.getcwd(), source)
+        self.set_source_breakpoints(
+            source,
+            [
+                line_number(source, "// Breakpoint"),
+            ],
+        )
+        self.continue_to_next_stop()
+
+        locals = {l['name']: l for l in self.dap_server.get_local_variables()}
+        rawptr_ref = locals['rawptr']['memoryReference']
+
+        # We can read the complete string
+        mem = self.dap_server.request_read_memory(rawptr_ref, 0, 5)["body"]
+        self.assertEqual(mem["unreadableBytes"], 0);
+        self.assertEqual(b64decode(mem["data"]), b"dead\0")
+
+        # Use an offset
+        mem = self.dap_server.request_read_memory(rawptr_ref, 2, 3)["body"]
+        self.assertEqual(b64decode(mem["data"]), b"ad\0")
+
+        # Use a negative offset
+        mem = self.dap_server.request_read_memory(rawptr_ref, -1, 6)["body"]
+        self.assertEqual(b64decode(mem["data"])[1:], b"dead\0")
+
+        # Reads of size 0 are successful
+        # VS-Code sends those in order to check if a `memoryReference` can actually be dereferenced.
+        mem = self.dap_server.request_read_memory(rawptr_ref, 0, 0)
+        self.assertEqual(mem["success"], True)
+        self.assertEqual(mem["body"]["data"], "")
+
+        # Reads at offset 0x0 fail
+        mem = self.dap_server.request_read_memory("0x0", 0, 6)
+        self.assertEqual(mem["success"], False)
+        self.assertTrue(mem["message"].startswith("Unable to read memory: "))
+
+
+    def test_memory_refs_variables(self):
+        """
+        Tests memory references for evaluate
+        """
+        program = self.getBuildArtifact("a.out")
+        self.build_and_launch(program)
+        source = "main.cpp"
+        self.source_path = os.path.join(os.getcwd(), source)
+        self.set_source_breakpoints(
+            source, [line_number(source, "// Breakpoint")],
+        )
+        self.continue_to_next_stop()
+
+        locals = {l['name']: l for l in self.dap_server.get_local_variables()}
+
+        # Pointers should have memory-references
+        self.assertIn("memoryReference", locals['rawptr'].keys())
+        # Smart-pointers also have memory-references
+        self.assertIn("memoryReference", self.dap_server.get_local_variable_child("smartptr", "pointer").keys())
+        # Non-pointers should not have memory-references
+        self.assertNotIn("memoryReference", locals['not_a_ptr'].keys())
+
+
+    def test_memory_refs_evaluate(self):
+        """
+        Tests memory references for evaluate
+        """
+        program = self.getBuildArtifact("a.out")
+        self.build_and_launch(program)
+        source = "main.cpp"
+        self.source_path = os.path.join(os.getcwd(), source)
+        self.set_source_breakpoints(
+            source, [line_number(source, "// Breakpoint")],
+        )
+        self.continue_to_next_stop()
+
+        # Pointers contain memory references
+        self.assertIn("memoryReference", self.dap_server.request_evaluate("rawptr + 1")["body"].keys())
+
+        # Non-pointer expressions don't include a memory reference
+        self.assertNotIn("memoryReference", self.dap_server.request_evaluate("1 + 3")["body"].keys())
+
+
+    def test_memory_refs_set_variable(self):
+        """
+        Tests memory references for `setVariable`
+        """
+        program = self.getBuildArtifact("a.out")
+        self.build_and_launch(program)
+        source = "main.cpp"
+        self.source_path = os.path.join(os.getcwd(), source)
+        self.set_source_breakpoints(
+            source, [line_number(source, "// Breakpoint")],
+        )
+        self.continue_to_next_stop()
+
+        # Pointers contain memory references
+        ptr_value = self.get_local_as_int("rawptr")
+        self.assertIn("memoryReference", self.dap_server.request_setVariable(1, "rawptr", ptr_value + 2)["body"].keys())
+
+        # Non-pointer expressions don't include a memory reference
+        self.assertNotIn("memoryReference", self.dap_server.request_setVariable(1, "not_a_ptr", 42)["body"].keys())
diff --git a/lldb/test/API/tools/lldb-dap/memory/main.cpp b/lldb/test/API/tools/lldb-dap/memory/main.cpp
new file mode 100644
index 00000000000000..95ecde82ea2798
--- /dev/null
+++ b/lldb/test/API/tools/lldb-dap/memory/main.cpp
@@ -0,0 +1,10 @@
+#include <memory>
+#include <iostream>
+
+int main() {
+    int not_a_ptr = 666;
+    const char* rawptr = "dead";
+    std::unique_ptr<int> smartptr(new int(42));
+    // Breakpoint
+    return 0;
+}
diff --git a/lldb/tools/lldb-dap/JSONUtils.cpp b/lldb/tools/lldb-dap/JSONUtils.cpp
index a8b85f55939e17..dc8932d5f3181d 100644
--- a/lldb/tools/lldb-dap/JSONUtils.cpp
+++ b/lldb/tools/lldb-dap/JSONUtils.cpp
@@ -1085,6 +1085,19 @@ std::string VariableDescription::GetResult(llvm::StringRef context) {
   return description.trim().str();
 }
 
+lldb::addr_t GetMemoryReference(lldb::SBValue v) {
+  if (!v.GetType().IsPointerType() && !v.GetType().IsArrayType()) {
+    return LLDB_INVALID_ADDRESS;
+  }
+
+  lldb::SBValue deref = v.Dereference();
+  if (!deref.IsValid()) {
+    return LLDB_INVALID_ADDRESS;
+  }
+
+  return deref.GetLoadAddress();
+}
+
 // "Variable": {
 //   "type": "object",
 //   "description": "A Variable is a name/value pair. Optionally a variable
@@ -1144,8 +1157,16 @@ std::string VariableDescription::GetResult(llvm::StringRef context) {
 //                       can use this optional information to present the
 //                       children in a paged UI and fetch them in chunks."
 //     }
-//
-//
+//     "memoryReference": {
+//        "type": "string",
+//        "description": "A memory reference associated with this variable.
+//                        For pointer type variables, this is generally a
+//                        reference to the memory address contained in the
+//                        pointer. For executable data, this reference may later
+//                        be used in a `disassemble` request. This attribute may
+//                        be returned by a debug adapter if corresponding
+//                        capability `supportsMemoryReferences` is true."
+//      },
 //     "$__lldb_extensions": {
 //       "description": "Unofficial extensions to the protocol",
 //       "properties": {
@@ -1253,6 +1274,11 @@ llvm::json::Value CreateVariable(lldb::SBValue v, int64_t variablesReference,
   else
     object.try_emplace("variablesReference", (int64_t)0);
 
+  
+  if (lldb::addr_t addr = GetMemoryReference(v); addr != LLDB_INVALID_ADDRESS) {
+    object.try_emplace("memoryReference", "0x" + llvm::utohexstr(addr));
+  }
+
   object.try_emplace("$__lldb_extensions", desc.GetVariableExtensionsJSON());
   return llvm::json::Value(std::move(object));
 }
diff --git a/lldb/tools/lldb-dap/JSONUtils.h b/lldb/tools/lldb-dap/JSONUtils.h
index 1515f5ba2e5f4d..c619712c2c682b 100644
--- a/lldb/tools/lldb-dap/JSONUtils.h
+++ b/lldb/tools/lldb-dap/JSONUtils.h
@@ -411,6 +411,12 @@ struct VariableDescription {
   std::string GetResult(llvm::StringRef context);
 };
 
+/// Get the corresponding `memoryReference` for a value.
+///
+/// According to the DAP documentation, the `memoryReference` should
+/// refer to the pointee, not to the address of the pointer itself.
+lldb::addr_t GetMemoryReference(lldb::SBValue v);
+
 /// Create a "Variable" object for a LLDB thread object.
 ///
 /// This function will fill in the following keys in the returned
diff --git a/lldb/tools/lldb-dap/lldb-dap.cpp b/lldb/tools/lldb-dap/lldb-dap.cpp
index ea84f31aec3a6c..d138a79d4488e5 100644
--- a/lldb/tools/lldb-dap/lldb-dap.cpp
+++ b/lldb/tools/lldb-dap/lldb-dap.cpp
@@ -9,6 +9,7 @@
 #include "DAP.h"
 #include "Watchpoint.h"
 #include "lldb/API/SBMemoryRegionInfo.h"
+#include "llvm/Support/Base64.h"
 
 #include <cassert>
 #include <climits>
@@ -1346,6 +1347,16 @@ void request_completions(const llvm::json::Object &request) {
 //                              present the variables in a paged UI and fetch
 //                              them in chunks."
 //            }
+//            "memoryReference": {
+//               "type": "string",
+//                "description": "A memory reference to a location appropriate
+//                                for this result. For pointer type eval
+//                                results, this is generally a reference to the
+//                                memory address contained in the pointer. This
+//                                attribute may be returned by a debug adapter
+//                                if corresponding capability
+//                                `supportsMemoryReferences` is true."
+//             },
 //          },
 //          "required": [ "result", "variablesReference" ]
 //        }
@@ -1411,6 +1422,10 @@ void request_evaluate(const llvm::json::Object &request) {
       } else {
         body.try_emplace("variablesReference", (int64_t)0);
       }
+      if (lldb::addr_t addr = GetMemoryReference(value);
+          addr != LLDB_INVALID_ADDRESS) {
+        body.try_emplace("memoryReference", "0x" + llvm::utohexstr(addr));
+      }
     }
   }
   response.try_emplace("body", std::move(body));
@@ -1718,6 +1733,8 @@ void request_initialize(const llvm::json::Object &request) {
   body.try_emplace("supportsLogPoints", true);
   // The debug adapter supports data watchpoints.
   body.try_emplace("supportsDataBreakpoints", true);
+  // The debug adapter supports the `readMemory` request.
+  body.try_emplace("supportsReadMemoryRequest", true);
 
   // Put in non-DAP specification lldb specific information.
   llvm::json::Object lldb_json;
@@ -3600,8 +3617,12 @@ void request_setVariable(const llvm::json::Object &request) {
       if (variable.MightHaveChildren())
         newVariablesReference = g_dap.variables.InsertExpandableVariable(
             variable, /*is_permanent=*/false);
-
       body.try_emplace("variablesReference", newVariablesReference);
+
+      if (lldb::addr_t addr = GetMemoryReference(variable);
+          addr != LLDB_INVALID_ADDRESS) {
+        body.try_emplace("memoryReference", "0x" + llvm::utohexstr(addr));
+      }
     } else {
       EmplaceSafeString(body, "message", std::string(error.GetCString()));
     }
@@ -4028,6 +4049,154 @@ void request_disassemble(const llvm::json::Object &request) {
   response.try_emplace("body", std::move(body));
   g_dap.SendJSON(llvm::json::Value(std::move(response)));
 }
+
+// "ReadMemoryRequest": {
+//   "allOf": [ { "$ref": "#/definitions/Request" }, {
+//     "type": "object",
+//     "description": "Reads bytes from memory at the provided location. Clients
+//                     should only call this request if the corresponding
+//                     capability `supportsReadMemoryRequest` is true.",
+//     "properties": {
+//       "command": {
+//         "type": "string",
+//         "enum": [ "readMemory" ]
+//       },
+//       "arguments": {
+//         "$ref": "#/definitions/ReadMemoryArguments"
+//       }
+//     },
+//     "required": [ "command", "arguments" ]
+//   }]
+// },
+// "ReadMemoryArguments": {
+//   "type": "object",
+//   "description": "Arguments for `readMemory` request.",
+//   "properties": {
+//     "memoryReference": {
+//       "type": "string",
+//       "description": "Memory reference to the base location from which data
+//                       should be read."
+//     },
+//     "offset": {
+//       "type": "integer",
+//       "description": "Offset (in bytes) to be applied to the reference
+//                       location before reading data. Can be negative."
+//     },
+//     "count": {
+//       "type": "integer",
+//       "description": "Number of bytes to read at the specified location and
+//                       offset."
+//     }
+//   },
+//   "required": [ "memoryReference", "count" ]
+// },
+// "ReadMemoryResponse": {
+//   "allOf": [ { "$ref": "#/definitions/Response" }, {
+//     "type": "object",
+//     "description": "Response to `readMemory` request.",
+//     "properties": {
+//       "body": {
+//         "type": "object",
+//         "properties": {
+//           "address": {
+//             "type": "string",
+//             "description": "The address of the first byte of data returned.
+//                             Treated as a hex value if prefixed with `0x`, or
+//                             as a decimal value otherwise."
+//           },
+//           "unreadableBytes": {
+//             "type": "integer",
+//             "description": "The number of unreadable bytes encountered after
+//                             the last successfully read byte.\nThis can be
+//                             used to determine the number of bytes that should
+//                             be skipped before a subsequent
+//             `readMemory` request succeeds."
+//           },
+//           "data": {
+//             "type": "string",
+//             "description": "The bytes read from memory, encoded using base64.
+//                             If the decoded length of `data` is less than the
+//                             requested `count` in the original `readMemory`
+//                             request, and `unreadableBytes` is zero or
+//                             omitted, then the client should assume it's
+//                             reached the end of readable memory."
+//           }
+//         },
+//         "required": [ "address" ]
+//       }
+//     }
+//   }]
+// },
+void request_readMemory(const llvm::json::Object &request) {
+  llvm::json::Object response;
+  FillResponse(request, response);
+  auto arguments = request.getObject("arguments");
+
+  lldb::SBProcess process = g_dap.target.GetProcess();
+  if (!process.IsValid()) {
+    response["success"] = false;
+    response["message"] = "No process running";
+    g_dap.SendJSON(llvm::json::Value(std::move(response)));
+    return;
+  }
+
+  auto memoryReference = GetString(arguments, "memoryReference");
+  lldb::addr_t addr;
+  if (memoryReference.consumeInteger(0, addr)) {
+    response["success"] = false;
+    response["message"] =
+        "Malformed memory reference: " + memoryReference.str();
+    g_dap.SendJSON(llvm::json::Value(std::move(response)));
+    return;
+  }
+
+  addr += GetSigned(arguments, "offset", 0);
+  const auto requested_count = GetUnsigned(arguments, "count", 0);
+  lldb::SBMemoryRegionInfo region_info;
+  lldb::SBError memRegError = process.GetMemoryRegionInfo(addr, region_info);
+  if (memRegError.Fail()) {
+    response["success"] = false;
+    EmplaceSafeString(response, "message",
+                      "Unable to find memory region: " +
+                          std::string(memRegError.GetCString()));
+    g_dap.SendJSON(llvm::json::Value(std::move(response)));
+    return;
+  }
+  const auto available_count =
+      std::min(requested_count, region_info.GetRegionEnd() - addr);
+  const auto unavailable_count = requested_count - available_count;
+
+  std::vector<uint8_t> buf;
+  buf.resize(available_count);
+  if (available_count > 0) {
+    lldb::SBError memReadError;
+    auto bytes_read =
+        process.ReadMemory(addr, buf.data(), available_count, memReadError);
+    if (memReadError.Fail()) {
+      response["success"] = false;
+      EmplaceSafeString(response, "message",
+                        "Unable to read memory: " +
+                            std::string(memReadError.GetCString()));
+      g_dap.SendJSON(llvm::json::Value(std::move(response)));
+      return;
+    }
+    if (bytes_read != available_count) {
+      response["success"] = false;
+      EmplaceSafeString(response, "message", "Unexpected, short read");
+      g_dap.SendJSON(llvm::json::Value(std::move(response)));
+      return;
+    }
+  }
+
+  llvm::json::Object body;
+  std::string formatted_addr = "0x" + llvm::utohexstr(addr);
+  body.try_emplace("address", formatted_addr);
+  body.try_emplace("data", llvm::encodeBase64(buf));
+  body.try_emplace("unreadableBytes", unavailable_count);
+  response.try_emplace("body", std::move(body));
+  g_dap.SendJSON(llvm::json::Value(std::move(response)));
+}
+
 // A request used in testing to get the details on all breakpoints that are
 // currently set in the target. This helps us to test "setBreakpoints" and
 // "setFunctionBreakpoints" requests to verify we have the correct set of
@@ -4078,6 +4247,7 @@ void RegisterRequestCallbacks() {
   g_dap.RegisterRequestCallback("threads", request_threads);
   g_dap.RegisterRequestCallback("variables", request_variables);
   g_dap.RegisterRequestCallback("disassemble", request_disassemble);
+  g_dap.RegisterRequestCallback("readMemory", request_readMemory);
   // Custom requests
   g_dap.RegisterRequestCallback("compileUnits", request_compileUnits);
   g_dap.RegisterRequestCallback("modules", request_modules);

Copy link

github-actions bot commented Aug 14, 2024

✅ With the latest revision this PR passed the Python code formatter.

Copy link

github-actions bot commented Aug 14, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@walter-erquinigo
Copy link
Member

@clayborg you'll be happy to review this

@walter-erquinigo
Copy link
Member

@vogelsgesang can you share a screenshot?

@vogelsgesang
Copy link
Member Author

VSCode-memory-screenrecording2.mov

lldb::SBError memReadError;
auto bytes_read =
process.ReadMemory(addr, buf.data(), available_count, memReadError);
if (memReadError.Fail()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should check if bytes_read > 0 instead of the memReadError.Fail(). Some process plug-ins might return "error: couldn't read all bytes" but they might return some bytes. Best to check bytes_read. So maybe:

if (bytes_read == 0) {
  response["success"] = false;
  if (memReadError.Fail()) {
    EmplaceSafeString(response, "message",
                        "Unable to read memory: " +
                            std::string(memReadError.GetCString()));
  } else {
    EmplaceSafeString(response, "message",
                        "Unable to read memory);
  }
  g_dap.SendJSON(llvm::json::Value(std::move(response)));
  return;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have the even more restrictive check if (bytes_read != available_count) { a couple of lines down. Is that check already sufficient?

@vogelsgesang
Copy link
Member Author

Thanks for the super-fast review, @clayborg! I addressed all the straightforward issues and replied to the non-straightforward ones

Comment on lines 1088 to 1090
if (!v.GetType().IsPointerType() && !v.GetType().IsArrayType())
return std::nullopt;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that we only return a valid memory reference if we have a pointer or an array. Do we really only want a memory reference for pointers and arrays? I would think we would want a valid memory reference for any variable that lives in memory. For example:

void foo() {
  int x = 12;
}

Will create a local variable, and with no optimizations with -O0, the variable x will live on the stack. We can return a value memory reference for this even though it isn't a pointer or an array. So my suggestion for this function was:

std::optional<lldb::addr_t> GetMemoryReference(lldb::SBValue v) {
  SBType v_type = v.GetType();
  if (v_type.IsPointerType() || v_type.IsReferenceType())
    v = v.Dereference();
  lldb::addr_t load_addr = deref.GetLoadAddress();
  if (load_addr != LLDB_INVALID_ADDRESS)
    return load_addr;
  return std::nullopt;
}

This will transparently look through both pointers and references to get the pointee type's location, and for all other types, including arrays, it will get the addresss of the data for the value.

Copy link
Member Author

@vogelsgesang vogelsgesang Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed code would led to my_value and my_ptr having the same memoryReference in the following code snippet:

   int my_value;
   int* my_ptr = &my_value;

Also, when entering my_value and &my_value into the debug console, I would get the same memory references back. As a user, I would find this very confusing.

As such, I do prefer the current semantics, where my_value does not have a memory reference. If users want to get to that memory location, they can use &my_value

(Also, I noticed that the current code did not work for arrays. Hence, I removed the IsArrayType check)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used to love being able to see a value in memory with the Metrowerks IDE many years ago. Do we want to just alway return the load address?

std::optional<lldb::addr_t> GetMemoryReference(lldb::SBValue v) {
  lldb::addr_t load_addr = v.GetLoadAddress();
  if (load_addr != LLDB_INVALID_ADDRESS)
    return load_addr;
  return std::nullopt;
}

This way we always show the address of any value. If the value is a pointer, we will show the address of pointer itself in memory, not what is being pointed to. This way these will have different memory addresses:

int my_value;
int* my_ptr = &my_value;

The specification states:

    /**
     * A memory reference to a location appropriate for this result.
     * For pointer type eval results, this is generally a reference to the
     * memory address contained in the pointer.
     * This attribute may be returned by a debug adapter if corresponding
     * capability `supportsMemoryReferences` is true.
     */
    memoryReference?: string;

So A memory reference to a location appropriate for this result. in my mind is where the value lives in memory and I would like to see any value and where it is in memory. Even though the next sentence mentions a pointer should be dereferenced. Seems like a shame to only allow us to view memory for pointers as they are the easiest to view in memory with memory read commands already because we can copy the value of the pointer from the variable view and then type memory read <paste> and see the memory in the debug console already. But if we have a structure that lives in memory, it is harder to see that value in memory as we would always need to first type &myStruct in the debug console just to get a pointer so that we could then open up the memory viewer.

Happy to hear opinions on this.

@clayborg
Copy link
Collaborator

Let me know what you think of my idea to always just show the load address of any SBValue. I think that is the best for LLDB, but I am open to opinions.

@clayborg
Copy link
Collaborator

clayborg commented Sep 3, 2024

Ping for my comments?

@vogelsgesang
Copy link
Member Author

Hi @clayborg,

sorry for the late reply - got knocked out by Covid for a while 😷

Let me know what you think of my idea to always just show the load address of any SBValue. I think that is the best for LLDB, but I am open to opinions.

I did check the other lldb-based debug adapter. See https://github.com/vadimcn/codelldb/blob/05502bf75e4e7878a99b0bf0a7a81bba2922cbe3/adapter/codelldb/src/debug_session/variables.rs#L280. It turns out, they also always use the load address of the value itself, without dereferencing pointers. Given this prior art, I will update this commit to do the same.

I will hopefully find time to update this pull request early next week

@vogelsgesang
Copy link
Member Author

@clayborg the code is now up-to-date and ready for another review 🙂

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing the modification. LGTM

@vogelsgesang vogelsgesang merged commit 3acb1ea into llvm:main Sep 16, 2024
7 checks passed
@vogelsgesang vogelsgesang deleted the lldb-dap-memory branch September 17, 2024 06:47
@DavidSpickett
Copy link
Collaborator

DavidSpickett commented Sep 17, 2024

The test case is failing on Windows, for reasons not related to it being Windows but due to data layout.

https://lab.llvm.org/buildbot/#/builders/141/builds/2456

The const string is placed at the start of a region:

(lldb) p rawptr
(const char *) 0x00007ff77fcb5000 "dead"
(lldb) memory region --all
<...>
[0x00007ff77fc61000-0x00007ff77fcb5000) r-x .text
[0x00007ff77fcb5000-0x00007ff77fcc2000) r-- .rdata

This means that when you read with offset -1 it starts 1 byte into the .text section, reads that single byte, then returns.

(same thing would happen if the memory was the last part of a region, a positive offset would make the read end early when it hit the end of the region)

Your code needs to be aware that this may happen, or just do the memory read without looking up the region and assume that the lower layers do the right thing (I don't know if they do).

DavidSpickett added a commit that referenced this pull request Sep 17, 2024
This isn't strictly a Windows issue but for now it's the only bot
that was hit by this failure.

It can happen on Linux too but I expect we'll fix it and remove the
skip soon anyway.

Test was added in #104317.
@DavidSpickett
Copy link
Collaborator

I've disabled this on Windows for now but please address the issue generally.

Testing memory region overlaps is tricky. I did it for memory tagging by hoping that the kernel would allocate pages in the right order, which worked for at least simulated systems. Pavel recently added lldb/test/API/functionalities/memory/holes/main.cpp which has Windows and Linux code for allocating regions.

It's tricky though, because you need to split a region here instead of making a gap. Maybe you can allocate a larger area, unmap part of it, then remap that part again.

If you don't have a Windows machine to do that on, I'm happy for the test to be Linux specific because:

  • The functionality is pretty generic, if it works there, it'll work on Windows.
  • I'm happy to add the Windows support code myself.

@vogelsgesang
Copy link
Member Author

I think there is actually an even simpler fix for this test failure...
#109057

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being late to the party. I don't usually follow lldb-dap patches, but the mention of memory reads and windows errors intrigued me.

I don't think this is a good approach to reading memory. You shouldn't trust GetMemoryRegionInfo over ReadMemory. The data returned by the first may not be always there, and (in case of linux, at least), it will return the memory permissions in effect for the debugged application. The debugger can bypass (most of) those restrictions and also access memory that's unreadable to the debugged process.

The method I'd recommend is to call ReadMemory, and then if (and only if) it returns a partial (or no) result, consult the memory region info, to find the size of the unreadable region (by querying for the memory region containing the first byte that could not be read).

It's possible you had to do it this way, since before #106532 (and it looks like you started this patch before that), truncated reads did not work (on linux, at least), but it should be possible to do it now.

That should also avoid all of the business with being limited to a single memory region, since crossing from one readable region to another would be handled inside ReadMemory.

tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
This isn't strictly a Windows issue but for now it's the only bot
that was hit by this failure.

It can happen on Linux too but I expect we'll fix it and remove the
skip soon anyway.

Test was added in llvm#104317.
@vogelsgesang
Copy link
Member Author

Thanks for that thoughtful write-up!

#109485 should address your points

The method I'd recommend is to call ReadMemory

I assume I should prefer Target::ReadMemory over Process::readMemory, right?

vogelsgesang added a commit that referenced this pull request Sep 25, 2024
The `readMemory` request used the `MemoryRegionInfo` so it could also
support short reads. Since #106532, this is no longer necessary, as
mentioned by @labath in a comment on #104317.

With this commit, we no longer set the `unreadableBytes` in the result.
But this is optional, anyway, according to the spec, and afaik the
VS Code UI does not make good use of `unreadableBytes`, anyway.

We prefer `SBTarget::ReadMemory` over `SBProcess::ReadMemory`, because
the `memory read` command also reads memory through the target instead
of the process, and because users would expect the UI view and the
results from memory read to be in-sync.
JDevlieghere pushed a commit to swiftlang/llvm-project that referenced this pull request Oct 16, 2024
Add support for the `readMemory` request which allows VS-Code to
inspect memory. Also, add `memoryReference` to variables and `evaluate`
responses, such that the binary view can be opened from the variables
view and from the "watch" pane.

(cherry picked from commit 3acb1ea)
JDevlieghere pushed a commit to swiftlang/llvm-project that referenced this pull request Oct 16, 2024
The `readMemory` request used the `MemoryRegionInfo` so it could also
support short reads. Since llvm#106532, this is no longer necessary, as
mentioned by @labath in a comment on llvm#104317.

With this commit, we no longer set the `unreadableBytes` in the result.
But this is optional, anyway, according to the spec, and afaik the
VS Code UI does not make good use of `unreadableBytes`, anyway.

We prefer `SBTarget::ReadMemory` over `SBProcess::ReadMemory`, because
the `memory read` command also reads memory through the target instead
of the process, and because users would expect the UI view and the
results from memory read to be in-sync.

(cherry picked from commit 786dc5a)
adrian-prantl pushed a commit to adrian-prantl/llvm-project that referenced this pull request Dec 20, 2024
This isn't strictly a Windows issue but for now it's the only bot
that was hit by this failure.

It can happen on Linux too but I expect we'll fix it and remove the
skip soon anyway.

Test was added in llvm#104317.

(cherry picked from commit ab38ec9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants