Skip to content

[lldb][lldb-dap] Added readMemory and WriteMemory, var type correction #108036

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jennphilqc
Copy link

Added debug adapter support for read memory and write memory. This in turn required adding Base64 variable type. We need this support to expand debugging capability.

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added the lldb label Sep 10, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 10, 2024

@llvm/pr-subscribers-lldb

Author: Jennifer Phillips (jennphilqc)

Changes

Added debug adapter support for read memory and write memory. This in turn required adding Base64 variable type. We need this support to expand debugging capability.


Full diff: https://github.com/llvm/llvm-project/pull/108036.diff

1 Files Affected:

  • (modified) lldb/tools/lldb-dap/lldb-dap.cpp (+237)
diff --git a/lldb/tools/lldb-dap/lldb-dap.cpp b/lldb/tools/lldb-dap/lldb-dap.cpp
index c5c4b09f15622b..8aed6d1a432b84 100644
--- a/lldb/tools/lldb-dap/lldb-dap.cpp
+++ b/lldb/tools/lldb-dap/lldb-dap.cpp
@@ -13,6 +13,7 @@
 #include <cassert>
 #include <climits>
 #include <cstdarg>
+#include <cstdint>
 #include <cstdio>
 #include <cstdlib>
 #include <cstring>
@@ -39,6 +40,7 @@
 #include <sys/prctl.h>
 #endif
 
+#include "lldb/lldb-types.h"
 #include <algorithm>
 #include <array>
 #include <chrono>
@@ -61,6 +63,7 @@
 #include "llvm/Option/ArgList.h"
 #include "llvm/Option/OptTable.h"
 #include "llvm/Option/Option.h"
+#include "llvm/Support/Base64.h"
 #include "llvm/Support/Errno.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/InitLLVM.h"
@@ -1725,6 +1728,10 @@ void request_initialize(const llvm::json::Object &request) {
   body.try_emplace("supportsDataBreakpoints", true);
   // The debug adapter support for instruction breakpoint.
   body.try_emplace("supportsInstructionBreakpoints", true);
+  // The debug adapter support for read.
+  body.try_emplace("supportsReadMemoryRequest", true);
+  // The debug adapter support for write.
+  body.try_emplace("supportsWriteMemoryRequest", true);
 
   // Put in non-DAP specification lldb specific information.
   llvm::json::Object lldb_json;
@@ -4332,6 +4339,232 @@ void request_setInstructionBreakpoints(const llvm::json::Object &request) {
   g_dap.SendJSON(llvm::json::Value(std::move(response)));
 }
 
+// "ReadMemoryRequest": {
+//   "allOf": [ { "$ref": "#/definitions/Request" }, {
+//     "type": "object",
+//     "description": "Reads bytes from memory at the provided location.\n
+//     Clients should only call this request if the corresponding capability
+//     `supportsReadMemoryRequest` is true.",
+//     "properties": {
+//       "command": {
+//         "type": "string",
+//         "enum": [ "readMemory" ]
+//       },
+//       "arguments": {
+//         "$ref": "#/definitions/ReadMemoryArguments"
+//       }
+//     },
+//     "required": [ "command", "arguments" ]
+//   }]
+// },
+// "ReadMemoryArguments": {
+//   "type": "object",
+//   "description": "Arguments for `readMemory` request.",
+//   "properties": {
+//     "memoryReference": {
+//       "type": "string",
+//       "description": "Memory reference to the base location from which data
+//       should be read."
+//     },
+//     "offset": {
+//       "type": "integer",
+//       "description": "Offset (in bytes) to be applied to the reference
+//       location before reading data. Can be negative."
+//     },
+//     "count": {
+//       "type": "integer",
+//       "description": "Number of bytes to read at the specified location and
+//       offset."
+//     }
+//   },
+//   "required": [ "memoryReference", "count" ]
+// },
+// "ReadMemoryResponse": {
+//   "allOf": [ { "$ref": "#/definitions/Response" }, {
+//     "type": "object",
+//     "description": "Response to `readMemory` request.",
+//     "properties": {
+//       "body": {
+//         "type": "object",
+//         "properties": {
+//           "address": {
+//             "type": "string",
+//             "description": "The address of the first byte of data returned.
+//             \nTreated as a hex value if prefixed with `0x`, or as a decimal
+//             value otherwise."
+//           },
+//           "unreadableBytes": {
+//             "type": "integer",
+//             "description": "The number of unreadable bytes encountered after
+//             the last successfully read byte.\nThis can be used to determine
+//             the number of bytes that should be skipped before a subsequent
+//             `readMemory` request succeeds."
+//           },
+//           "data": {
+//             "type": "string",
+//             "description": "The bytes read from memory, encoded using
+//             base64. If the decoded length of `data` is less than the
+//             requested `count` in the original `readMemory` request, and
+//             `unreadableBytes` is zero or omitted, then the client should
+//             assume it's reached the end of readable memory."
+//           }
+//         },
+//         "required": [ "address" ]
+//       }
+//     }
+//   }]
+// },
+void request_readMemory(const llvm::json::Object &request) {
+  llvm::json::Object response;
+  llvm::json::Array response_readMemory;
+  llvm::json::Object body;
+  lldb::SBError error;
+  FillResponse(request, response);
+
+  auto arguments = request.getObject("arguments");
+  llvm::StringRef memoryReference = GetString(arguments, "memoryReference");
+  const size_t count = GetUnsigned(arguments, "count", 0);
+  const auto offset = GetSigned(arguments, "offset", 0);
+
+  lldb::addr_t address;
+  memoryReference.getAsInteger(0, address);
+  lldb::addr_t address_offset = address + offset;
+  // address as string
+  std::string address_offset_str(std::to_string(address_offset));
+
+  // make storage for the memory to read
+  //  need to use a vector because VC++ 2019 errors out on char buffer[count]
+  std::vector<char> buffer(count);
+
+  // read the memory add to storage
+  lldb::SBProcess process = g_vsc.target.GetProcess();
+  size_t countRead =
+      process.ReadMemory(address_offset, buffer.data(), count, error);
+  const llvm::StringRef inputBytes = {buffer.data(), countRead};
+  // return the read memory in base64
+  std::string data = llvm::encodeBase64(inputBytes);
+
+  // body: address, data? in spec formats
+  body.try_emplace("address", std::move(address_offset_str));
+  body.try_emplace("data", std::move(data));
+
+  response.try_emplace("body", std::move(body));
+  g_vsc.SendJSON(llvm::json::Value(std::move(response)));
+}
+
+// "WriteMemoryRequest": {
+//       "allOf": [ { "$ref": "#/definitions/Request" }, {
+//         "type": "object",
+//         "description": "Writes bytes to memory at the provided location.\n
+//         Clients should only call this request if the corresponding
+//         capability `supportsWriteMemoryRequest` is true.",
+//         "properties": {
+//           "command": {
+//             "type": "string",
+//             "enum": [ "writeMemory" ]
+//           },
+//           "arguments": {
+//             "$ref": "#/definitions/WriteMemoryArguments"
+//           }
+//         },
+//         "required": [ "command", "arguments" ]
+//       }]
+//     },
+//     "WriteMemoryArguments": {
+//       "type": "object",
+//       "description": "Arguments for `writeMemory` request.",
+//       "properties": {
+//         "memoryReference": {
+//           "type": "string",
+//           "description": "Memory reference to the base location to which
+//           data should be written."
+//         },
+//         "offset": {
+//           "type": "integer",
+//           "description": "Offset (in bytes) to be applied to the reference
+//           location before writing data. Can be negative."
+//         },
+//         "allowPartial": {
+//           "type": "boolean",
+//           "description": "Property to control partial writes. If true, the
+//           debug adapter should attempt to write memory even if the entire
+//           memory region is not writable. In such a case the debug adapter
+//           should stop after hitting the first byte of memory that cannot be
+//           written and return the number of bytes written in the response
+//           via the `offset` and `bytesWritten` properties.\nIf false or
+//           missing, a debug adapter should attempt to verify the region is
+//           writable before writing, and fail the response if it is not."
+//         },
+//         "data": {
+//           "type": "string",
+//           "description": "Bytes to write, encoded using base64."
+//         }
+//       },
+//       "required": [ "memoryReference", "data" ]
+//     },
+//     "WriteMemoryResponse": {
+//       "allOf": [ { "$ref": "#/definitions/Response" }, {
+//         "type": "object",
+//         "description": "Response to `writeMemory` request.",
+//         "properties": {
+//           "body": {
+//             "type": "object",
+//             "properties": {
+//               "offset": {
+//                 "type": "integer",
+//                 "description": "Property that should be returned when
+//                 `allowPartial` is true to indicate the offset of the first
+//                 byte of data successfully written. Can be negative."
+//               },
+//               "bytesWritten": {
+//                 "type": "integer",
+//                 "description": "Property that should be returned when
+//                 `allowPartial` is true to indicate the number of bytes
+//                 starting from address that were successfully written."
+//               }
+//             }
+//           }
+//         }
+//       }]
+//     },
+
+void request_writeMemory(const llvm::json::Object &request) {
+  llvm::json::Object response;
+  llvm::json::Array response_writeMemory;
+  llvm::json::Object body;
+  FillResponse(request, response);
+
+  auto arguments = request.getObject("arguments");
+  llvm::StringRef memoryReference = GetString(arguments, "memoryReference");
+  const auto offset = GetSigned(arguments, "offset", 0);
+  llvm::StringRef data64 = GetString(arguments, "data");
+
+  lldb::addr_t address;
+  memoryReference.getAsInteger(0, address);
+  lldb::addr_t address_offset = address + offset;
+
+  std::vector<char> output;
+  lldb::SBError writeError;
+  int64_t countWrite = 0;
+
+  llvm::Error decodeError = llvm::decodeBase64(data64, output);
+
+  int64_t dataLen = output.size();
+
+  // write the memory
+  if (decodeError.success()) {
+    lldb::SBProcess process = g_vsc.target.GetProcess();
+    countWrite =
+        process.WriteMemory(address_offset, static_cast<void *>(output.data()),
+                            dataLen, writeError);
+  }
+
+  body.try_emplace("bytesWritten", std::move(countWrite));
+
+  response.try_emplace("body", std::move(body));
+  g_vsc.SendJSON(llvm::json::Value(std::move(response)));
+}
+
 void RegisterRequestCallbacks() {
   g_dap.RegisterRequestCallback("attach", request_attach);
   g_dap.RegisterRequestCallback("completions", request_completions);
@@ -4367,6 +4600,10 @@ void RegisterRequestCallbacks() {
   // Instruction breakpoint request
   g_dap.RegisterRequestCallback("setInstructionBreakpoints",
                                 request_setInstructionBreakpoints);
+  // ReadMemory request
+  g_vsc.RegisterRequestCallback("readMemory", request_readMemory);
+  // WriteMemory request
+  g_vsc.RegisterRequestCallback("writeMemory", request_writeMemory);
   // Custom requests
   g_dap.RegisterRequestCallback("compileUnits", request_compileUnits);
   g_dap.RegisterRequestCallback("modules", request_modules);

@jennphilqc
Copy link
Author

Please add @clayborg as Reviewer.

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fairly straight forward, but it needs a test case.

Comment on lines +4435 to +4436
// make storage for the memory to read
// need to use a vector because VC++ 2019 errors out on char buffer[count]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable-length arrays (a C++ extension) are also forbidden by the coding standards, so there's no need to justify this choice.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed justification

Comment on lines +4445 to +4449
std::string data = llvm::encodeBase64(inputBytes);

// body: address, data? in spec formats
body.try_emplace("address", std::move(address_offset_str));
body.try_emplace("data", std::move(data));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::string data = llvm::encodeBase64(inputBytes);
// body: address, data? in spec formats
body.try_emplace("address", std::move(address_offset_str));
body.try_emplace("data", std::move(data));
// body: address, data? in spec formats
body.try_emplace("address", std::move(address_offset_str));
body.try_emplace("data", llvm::encodeBase64(inputBytes));

If you don't make data a variable, then there's no need to move from it. You can do the same thing with address_offset_str

Copy link
Member

@walter-erquinigo walter-erquinigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.
Could you please write some tests?

lldb::SBError writeError;
int64_t countWrite = 0;

llvm::Error decodeError = llvm::decodeBase64(data64, output);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to return a failed response if there was a decoding error, otherwise the debug adapter will crash because you didn't handle the Error case.

Comment on lines +4557 to +4559
countWrite =
process.WriteMemory(address_offset, static_cast<void *>(output.data()),
dataLen, writeError);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this behave if the dataLen is much bigger than the writable memory?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to this, can you handle allowPartial correctly? If LLDB can't support easily a true case, at least write some documentation about it

@vogelsgesang
Copy link
Member

Hi @jennphilqc,

Great minds think alike - I actually also implemented the readMemory request in #104317.

Thankfully, we didn't literally do the same, duplicated work

  • your PR also implements writeMemory which my PR is still lacking. Here your PR is clearly further along
  • my PR already includes test cases which still seem to be lacking in your PR. When it comes to supporting readMemory, my PR seems to be further along

Given this, I think we should take the best parts of both PRs 🙂 What do you think about the following path forward?

  • rebase this PR on top of [lldb-dap] Support inspecting memory #104317
  • scope this PR down to writeMemory
  • modify this PR to also extend the test cases in TestDAP_memory.py (added by my PR) to introduce test coverage also for writeMemory

Also, I noticed that you did not introduce any memoryReferences as part of your pull request. With your PR, how can a user even open the hex-viewer in VS-Code? Is there some way which I am not aware of to open the hex-viewer without a memoryReference? Asking, because I was considering introducing a "View memory address" via TypeScript, similar to what VSCodeLLDB does - but if there already is some other way to pop up the memory-viewer, doing so would be unnecessary

@jennphilqc
Copy link
Author

Given this, I think we should take the best parts of both PRs 🙂 What do you think about the following path forward?

  • rebase this PR on top of [lldb-dap] Support inspecting memory #104317
  • scope this PR down to writeMemory
  • modify this PR to also extend the test cases in TestDAP_memory.py (added by my PR) to introduce test coverage also for writeMemory

It sounds like a great idea to work together on this, @vogelsgesang ! Let me set up the rebase and work on all the suggested edits above.

Also, I noticed that you did not introduce any memoryReferences as part of your pull request. With your PR, how can a user even open the hex-viewer in VS-Code? Is there some way which I am not aware of to open the hex-viewer without a memoryReference? Asking, because I was considering introducing a "View memory address" via TypeScript, similar to what VSCodeLLDB does - but if there already is some other way to pop up the memory-viewer, doing so would be unnecessary

I actually wrote this for a team that already has their own viewer. We've suggested that they upstream that as well.

@vogelsgesang
Copy link
Member

I actually wrote this for a team that already has their own viewer. We've suggested that they upstream that as well.

Is that viewer in VS-Code or in some other editor? (Trying to avoid duplicating effort, so I don't accidentally implement the same thing)

@jennphilqc
Copy link
Author

I actually wrote this for a team that already has their own viewer. We've suggested that they upstream that as well.

Is that viewer in VS-Code or in some other editor? (Trying to avoid duplicating effort, so I don't accidentally implement the same thing)

It is in VSCode, but I don't have control to upstream it (yet)

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading memory is already in upstream with:

commit 3acb1eac5eb6ef4e60dd64b7845615e076cc6a3e
Author: Adrian Vogelsgesang <[email protected]>
Date:   Mon Sep 16 22:56:20 2024 +0200

    [lldb-dap] Support inspecting memory (#104317)
    
    Add support for the `readMemory` request which allows VS-Code to
    inspect memory. Also, add `memoryReference` to variables and `evaluate`
    responses, such that the binary view can be opened from the variables
    view and from the "watch" pane.

I would suggest following the same kind of code format and testing much like this patch did.

memoryReference.getAsInteger(0, address);
lldb::addr_t address_offset = address + offset;
// address as string
std::string address_offset_str(std::to_string(address_offset));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this isn't hex (decimal), lets encode this as hex for readability when looking at packet logs.

llvm::StringRef data64 = GetString(arguments, "data");

lldb::addr_t address;
memoryReference.getAsInteger(0, address);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add error checking in case this fails? What if "memoryReference" contains an invalid string? We should detect and return a value error and add a unit test for this.

This function returns true to indicate an error so this can be:

if (memoryReference.getAsInteger(0, address)) {
  // return error response
}

const auto offset = GetSigned(arguments, "offset", 0);

lldb::addr_t address;
memoryReference.getAsInteger(0, address);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add error checking in case this fails? What if "memoryReference" contains an invalid string? We should detect and return a value error and add a unit test for this.

This function returns true to indicate an error so this can be:

if (memoryReference.getAsInteger(0, address)) {
  // return error response
}


auto arguments = request.getObject("arguments");
llvm::StringRef memoryReference = GetString(arguments, "memoryReference");
const size_t count = GetUnsigned(arguments, "count", 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use size_t, it is 32 bits on a 32 bit CPUs. Use uint64_t. Also Count can't be zero, if it is return an error. We might also want to watch for a huge value being passed into this and return less bytes if the value is too large and set "unreadableBytes" to zero so it knows it can ask for more. If this value isn't present we should return an error. We might want to add a new GetUnsigned() overload like:

llvm::Expected<uint64_t> GetUnsigned(const llvm::json::Object &obj, llvm::StringRef key);

And have it return an error if:

  • the key/value pair doesn't exist
  • if the decoding fails for some reason (overflow)

We can just take the returned error and convert it to a string and return the error.

Comment on lines +4426 to +4427
const size_t count = GetUnsigned(arguments, "count", 0);
const auto offset = GetSigned(arguments, "offset", 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "offset" isn't optional, so if this value isn't in the arguments, we should return an error. Also if we fail to decode this key/value pair, we should not reset the offset to zero, but return an error. We might want to add a new GetSigned() overload like:

llvm::Expected<int64_t> GetSigned(const llvm::json::Object &obj, llvm::StringRef key);

And have it return an error if:

  • the key/value pair doesn't exist
  • if the decoding fails for some reason (overflow)

If it is optional, then this code is fine.


auto arguments = request.getObject("arguments");
llvm::StringRef memoryReference = GetString(arguments, "memoryReference");
const auto offset = GetSigned(arguments, "offset", 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the new llvm::Expected<uint64_t> version of GetSigned() as mentioned above. This can't be missing from what I read from the spec if the arg isn't optional. If it is optional, then this code is fine.

santhoshe447 added a commit to santhoshe447/llvm-project that referenced this pull request Mar 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants