Skip to content

[lldb-dap] Fix disassembled ranges for disassemble request #105446

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -674,11 +674,11 @@ def request_disconnect(self, terminateDebuggee=None):
return self.send_recv(command_dict)

def request_disassemble(
self, memoryReference, offset=-50, instructionCount=200, resolveSymbols=True
self, memoryReference, instructionOffset=0, instructionCount=10, resolveSymbols=True
):
args_dict = {
"memoryReference": memoryReference,
"offset": offset,
"instructionOffset": instructionOffset,
"instructionCount": instructionCount,
"resolveSymbols": resolveSymbols,
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,14 @@ def test_disassemble(self):
)
self.continue_to_next_stop()

stackFrames = self.get_stackFrames(
threadId=threadId, startFrame=frameIndex, levels=1
)
self.assertIsNotNone(stackFrames)

# XXX finish updating test case
memoryReference = stackFrames[0]["instructionPointerReference"]

pc_assembly = self.disassemble(frameIndex=0)
self.assertIn("location", pc_assembly, "Source location missing.")
self.assertIn("instruction", pc_assembly, "Assembly instruction missing.")
Expand Down
68 changes: 61 additions & 7 deletions lldb/tools/lldb-dap/lldb-dap.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3910,18 +3910,36 @@ void request_disassemble(const llvm::json::Object &request) {
return;
}

addr_ptr += GetSigned(arguments, "instructionOffset", 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to use the "instructionOffset" here if it is specified. Seems like it isn't optional, so if it isn't here, then maybe return an error if it is required. We might want to add a overload of GetSigned like:

llvm::Expected<int64_t> GetSigned(arguments, "instructionOffset");

This would return an error if it isn't in the args, or if the decoding failed. We can just return a string version of the error as an error to this command.

lldb::SBAddress addr(addr_ptr, g_dap.target);
addr_ptr += GetSigned(arguments, "offset", 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "offset" required? If so we don't want to set it to zero if this fails, we would wan't to return an error. Use the above GetSigned function that returns an llvm::Expected

lldb::SBAddress addr = g_dap.target.ResolveLoadAddress(addr_ptr);
if (!addr.IsValid()) {
response["success"] = false;
response["message"] = "Memory reference not found in the current binary.";
g_dap.SendJSON(llvm::json::Value(std::move(response)));
return;
}

const auto inst_count = GetUnsigned(arguments, "instructionCount", 0);
lldb::SBInstructionList insts =
g_dap.target.ReadInstructions(addr, inst_count);
int64_t inst_offset = GetSigned(arguments, "instructionOffset", 0);
const int64_t inst_count = GetSigned(arguments, "instructionCount", 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the llvm::Expected<int64_t> GetSigned(...)? If this is required. Same thing for "instructionOffset" above.

if (inst_count < 0) {
response["success"] = false;
response["message"] = "Negative instruction count.";
g_dap.SendJSON(llvm::json::Value(std::move(response)));
return;
}

lldb::SBInstructionList insts;
if (lldb::SBFunction func = addr.GetFunction()) {
// First try to identify the function boundaries through debug information
insts = func.GetInstructions(g_dap.target);
} else if (lldb::SBSymbol sym = addr.GetSymbol()) {
// Try to identify the function boundaries through the symbol table
insts = sym.GetInstructions(g_dap.target);
} else {
// We could not identify the function. Just disassemble starting from the
// provided address.
insts = g_dap.target.ReadInstructions(addr, inst_offset + inst_count);
}

if (!insts.IsValid()) {
response["success"] = false;
Expand All @@ -3930,10 +3948,46 @@ void request_disassemble(const llvm::json::Object &request) {
return;
}

// Find the start offset in the disassembled function
ssize_t offset = 0;
while (insts.GetInstructionAtIndex(offset).GetAddress().GetLoadAddress(
g_dap.target) < addr.GetLoadAddress(g_dap.target))
++offset;
offset += inst_offset;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to watch for "offset" exceeding the size of the insts instruction list here. If we exceed, we need to get the next function and continue the search and update the "inst_offset" and "offset" correctly if we exceed the boundary


// Format the disassembled instructions
const bool resolveSymbols = GetBoolean(arguments, "resolveSymbols", false);
llvm::json::Array instructions;
const auto num_insts = insts.GetSize();
for (size_t i = 0; i < num_insts; ++i) {
for (ssize_t i = offset; i < inst_count + offset; ++i) {
// Before the range of disassembled instructions?
if (i < 0) {
auto fake_addr =
insts.GetInstructionAtIndex(0).GetAddress().GetLoadAddress(
g_dap.target) -
offset + i;
llvm::json::Object disassembled_inst{
{"address", "0x" + llvm::utohexstr(fake_addr)},
{"instruction", "Instruction before disassembled address range"},
{"presentationHint", "invalid"},
};
instructions.emplace_back(std::move(disassembled_inst));
continue;
}
// Past the range of disassembled instructions?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will only have 1 function's worth of instructions in "insts" and we will need to update it. I would keep trying to resolve a new Symbol or Function by adding the last instruction byte size the last load address of the last instruction in the list and keep searching for a function or symbol after each resolve call. We will bail if we reach the end of the bytes that were specified.

if (static_cast<uint64_t>(i) >= insts.GetSize()) {
auto fake_addr = insts.GetInstructionAtIndex(insts.GetSize() - 1)
.GetAddress()
.GetLoadAddress(g_dap.target) -
offset + i - insts.GetSize();
llvm::json::Object disassembled_inst{
{"address", "0x" + llvm::utohexstr(fake_addr)},
{"instruction", "Instruction after disassembled address range"},
{"presentationHint", "invalid"},
};
instructions.emplace_back(std::move(disassembled_inst));
continue;
}

lldb::SBInstruction inst = insts.GetInstructionAtIndex(i);
auto addr = inst.GetAddress();
const auto inst_addr = addr.GetLoadAddress(g_dap.target);
Expand Down
Loading