Skip to content

[lldb] Implement (SB)Function::GetInstructions for discontinuous functions #122933

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 15, 2025

Conversation

labath
Copy link
Collaborator

@labath labath commented Jan 14, 2025

The main change is to permit the disassembler class to process/store multiple (discontinuous) ranges of addresses. The result is not ambiguous because each instruction knows its size (in addition to its address), so we can check for discontinuity by looking at whether the next instruction begins where the previous ends.

This patch doesn't handle the "disassemble" CLI command, which uses a more elaborate mechanism for disassembling and printing instructions.

…tions

The main change is to permit the disassembler class to process/store
multiple (discontinuous) ranges of addresses. The result is not
ambiguous because each instruction knows its size (in addition to its
address), so we can check for discontinuity by looking at whether the
next instruction begins where the previous ends.

This patch doesn't handle the "disassemble" CLI command, which uses a
more elaborate mechanism for dissassembling and printing instructions.
@llvmbot
Copy link
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

The main change is to permit the disassembler class to process/store multiple (discontinuous) ranges of addresses. The result is not ambiguous because each instruction knows its size (in addition to its address), so we can check for discontinuity by looking at whether the next instruction begins where the previous ends.

This patch doesn't handle the "disassemble" CLI command, which uses a more elaborate mechanism for disassembling and printing instructions.


Full diff: https://github.com/llvm/llvm-project/pull/122933.diff

6 Files Affected:

  • (modified) lldb/include/lldb/Core/Disassembler.h (+9-2)
  • (modified) lldb/source/API/SBFunction.cpp (+1-1)
  • (modified) lldb/source/API/SBInstructionList.cpp (+8)
  • (modified) lldb/source/Core/Disassembler.cpp (+11-16)
  • (modified) lldb/source/Symbol/Function.cpp (+1-1)
  • (modified) lldb/test/Shell/ScriptInterpreter/Python/sb_function_ranges.s (+11)
diff --git a/lldb/include/lldb/Core/Disassembler.h b/lldb/include/lldb/Core/Disassembler.h
index e0ad4316e02497..21bacb14f9b25b 100644
--- a/lldb/include/lldb/Core/Disassembler.h
+++ b/lldb/include/lldb/Core/Disassembler.h
@@ -428,7 +428,7 @@ class Disassembler : public std::enable_shared_from_this<Disassembler>,
   static lldb::DisassemblerSP
   DisassembleRange(const ArchSpec &arch, const char *plugin_name,
                    const char *flavor, const char *cpu, const char *features,
-                   Target &target, const AddressRange &disasm_range,
+                   Target &target, llvm::ArrayRef<AddressRange> disasm_ranges,
                    bool force_live_memory = false);
 
   static lldb::DisassemblerSP
@@ -460,7 +460,11 @@ class Disassembler : public std::enable_shared_from_this<Disassembler>,
 
   size_t ParseInstructions(Target &target, Address address, Limit limit,
                            Stream *error_strm_ptr,
-                           bool force_live_memory = false);
+                           bool force_live_memory = false) {
+    m_instruction_list.Clear();
+    return AppendInstructions(target, address, limit, error_strm_ptr,
+                              force_live_memory);
+  }
 
   virtual size_t DecodeInstructions(const Address &base_addr,
                                     const DataExtractor &data,
@@ -480,6 +484,9 @@ class Disassembler : public std::enable_shared_from_this<Disassembler>,
                                       const char *flavor) = 0;
 
 protected:
+  size_t AppendInstructions(Target &target, Address address, Limit limit,
+                            Stream *error_strm_ptr, bool force_live_memory);
+
   // SourceLine and SourceLinesToDisplay structures are only used in the mixed
   // source and assembly display methods internal to this class.
 
diff --git a/lldb/source/API/SBFunction.cpp b/lldb/source/API/SBFunction.cpp
index 414eccc357c0e4..d07594c2e8c010 100644
--- a/lldb/source/API/SBFunction.cpp
+++ b/lldb/source/API/SBFunction.cpp
@@ -127,7 +127,7 @@ SBInstructionList SBFunction::GetInstructions(SBTarget target,
       sb_instructions.SetDisassembler(Disassembler::DisassembleRange(
           module_sp->GetArchitecture(), nullptr, flavor,
           target_sp->GetDisassemblyCPU(), target_sp->GetDisassemblyFeatures(),
-          *target_sp, m_opaque_ptr->GetAddressRange(), force_live_memory));
+          *target_sp, m_opaque_ptr->GetAddressRanges(), force_live_memory));
     }
   }
   return sb_instructions;
diff --git a/lldb/source/API/SBInstructionList.cpp b/lldb/source/API/SBInstructionList.cpp
index 3f37b984cb4627..5747ae0366ec30 100644
--- a/lldb/source/API/SBInstructionList.cpp
+++ b/lldb/source/API/SBInstructionList.cpp
@@ -151,6 +151,10 @@ bool SBInstructionList::GetDescription(Stream &sref) {
       FormatEntity::Parse("${addr}: ", format);
       SymbolContext sc;
       SymbolContext prev_sc;
+
+      // Expected address of the next instruction. Used to print an empty line
+      // for non-contiguous blocks of insns.
+      std::optional<Address> next_addr;
       for (size_t i = 0; i < num_instructions; ++i) {
         Instruction *inst =
             m_opaque_sp->GetInstructionList().GetInstructionAtIndex(i).get();
@@ -165,10 +169,14 @@ bool SBInstructionList::GetDescription(Stream &sref) {
               addr, eSymbolContextEverything, sc);
         }
 
+        if (next_addr && addr != next_addr)
+          sref.EOL();
         inst->Dump(&sref, max_opcode_byte_size, true, false,
                    /*show_control_flow_kind=*/false, nullptr, &sc, &prev_sc,
                    &format, 0);
         sref.EOL();
+        next_addr = addr;
+        next_addr->Slide(inst->GetOpcode().GetByteSize());
       }
       return true;
     }
diff --git a/lldb/source/Core/Disassembler.cpp b/lldb/source/Core/Disassembler.cpp
index 68e52144eb6ef8..ed6a008a29fef5 100644
--- a/lldb/source/Core/Disassembler.cpp
+++ b/lldb/source/Core/Disassembler.cpp
@@ -123,22 +123,19 @@ static Address ResolveAddress(Target &target, const Address &addr) {
 lldb::DisassemblerSP Disassembler::DisassembleRange(
     const ArchSpec &arch, const char *plugin_name, const char *flavor,
     const char *cpu, const char *features, Target &target,
-    const AddressRange &range, bool force_live_memory) {
-  if (range.GetByteSize() <= 0)
-    return {};
-
-  if (!range.GetBaseAddress().IsValid())
-    return {};
-
+    llvm::ArrayRef<AddressRange> disasm_ranges, bool force_live_memory) {
   lldb::DisassemblerSP disasm_sp = Disassembler::FindPluginForTarget(
       target, arch, flavor, cpu, features, plugin_name);
 
   if (!disasm_sp)
     return {};
 
-  const size_t bytes_disassembled = disasm_sp->ParseInstructions(
-      target, range.GetBaseAddress(), {Limit::Bytes, range.GetByteSize()},
-      nullptr, force_live_memory);
+  size_t bytes_disassembled = 0;
+  for (const AddressRange &range : disasm_ranges) {
+    bytes_disassembled += disasm_sp->AppendInstructions(
+        target, range.GetBaseAddress(), {Limit::Bytes, range.GetByteSize()},
+        nullptr, force_live_memory);
+  }
   if (bytes_disassembled == 0)
     return {};
 
@@ -1092,11 +1089,9 @@ InstructionList::GetIndexOfInstructionAtLoadAddress(lldb::addr_t load_addr,
   return GetIndexOfInstructionAtAddress(address);
 }
 
-size_t Disassembler::ParseInstructions(Target &target, Address start,
-                                       Limit limit, Stream *error_strm_ptr,
-                                       bool force_live_memory) {
-  m_instruction_list.Clear();
-
+size_t Disassembler::AppendInstructions(Target &target, Address start,
+                                        Limit limit, Stream *error_strm_ptr,
+                                        bool force_live_memory) {
   if (!start.IsValid())
     return 0;
 
@@ -1129,7 +1124,7 @@ size_t Disassembler::ParseInstructions(Target &target, Address start,
   return DecodeInstructions(start, data, 0,
                             limit.kind == Limit::Instructions ? limit.value
                                                               : UINT32_MAX,
-                            false, data_from_file);
+                            /*append=*/true, data_from_file);
 }
 
 // Disassembler copy constructor
diff --git a/lldb/source/Symbol/Function.cpp b/lldb/source/Symbol/Function.cpp
index c9523281dc5659..15879f05a0ff0c 100644
--- a/lldb/source/Symbol/Function.cpp
+++ b/lldb/source/Symbol/Function.cpp
@@ -488,7 +488,7 @@ lldb::DisassemblerSP Function::GetInstructions(const ExecutionContext &exe_ctx,
   if (module_sp && exe_ctx.HasTargetScope()) {
     return Disassembler::DisassembleRange(
         module_sp->GetArchitecture(), nullptr, nullptr, nullptr, flavor,
-        exe_ctx.GetTargetRef(), GetAddressRange(), !prefer_file_cache);
+        exe_ctx.GetTargetRef(), GetAddressRanges(), !prefer_file_cache);
   }
   return lldb::DisassemblerSP();
 }
diff --git a/lldb/test/Shell/ScriptInterpreter/Python/sb_function_ranges.s b/lldb/test/Shell/ScriptInterpreter/Python/sb_function_ranges.s
index a9e4104f2aaf76..2e2bc52cd3ff99 100644
--- a/lldb/test/Shell/ScriptInterpreter/Python/sb_function_ranges.s
+++ b/lldb/test/Shell/ScriptInterpreter/Python/sb_function_ranges.s
@@ -6,6 +6,16 @@
 
 # CHECK: Found 1 function(s).
 # CHECK: foo: [input.o[0x0-0xe), input.o[0x14-0x1c)]
+# CHECK-NEXT: input.o[0x0]: cmpl   $0x0, %edi
+# CHECK-NEXT: input.o[0x3]: je     0x14
+# CHECK-NEXT: input.o[0x5]: jmp    0x7
+# CHECK-NEXT: input.o[0x7]: callq  0xe
+# CHECK-NEXT: input.o[0xc]: jmp    0x1b
+# CHECK-EMPTY:
+# CHECK-NEXT: input.o[0x14]: callq  0x19
+# CHECK-NEXT: input.o[0x19]: jmp    0x1b
+# CHECK-NEXT: input.o[0x1b]: retq
+
 
 #--- script.py
 import lldb
@@ -17,6 +27,7 @@ def __lldb_init_module(debugger, internal_dict):
   for ctx in sym_ctxs:
     fn = ctx.function
     print(f"{fn.name}: {fn.GetRanges()}")
+    print(fn.GetInstructions(target))
 
 #--- input.s
 # An example of a function which has been split into two parts. Roughly

@@ -165,10 +169,14 @@ bool SBInstructionList::GetDescription(Stream &sref) {
addr, eSymbolContextEverything, sc);
}

if (next_addr && addr != next_addr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the most experienced with optionals, doesn't the second next_addr need to be *next_addr?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't need to be because std::optional<T> defines an operator== for comparison with plain Ts (which treats an empty optional as different from all values of T), but yes, it would be better to compare the T directly since that's not the behavior I want (I basically want to treat the empty optional as equal to any address).

Copy link
Collaborator

@jasonmolenda jasonmolenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@labath labath merged commit eb96c8c into llvm:main Jan 15, 2025
7 checks passed
@labath labath deleted the disassembler branch January 15, 2025 09:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants