Skip to content

[lldb][Minidump Parser] Implement a range data vector for minidump memory ranges #136040

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 18, 2025

Conversation

Jlalond
Copy link
Contributor

@Jlalond Jlalond commented Apr 16, 2025

Recently I was debugging a Minidump with a few thousand ranges, and came across the (now deleted) comment:

  // I don't have a sense of how frequently this is called or how many memory
  // ranges a Minidump typically has, so I'm not sure if searching for the
  // appropriate range linearly each time is stupid.  Perhaps we should build
  // an index for faster lookups.

blaming this comment, it's 9 years old! Much overdue for this simple fix with a range data vector.

I had to add a default constructor to Range in order to implement the RangeDataVector, but otherwise this just a replacement of look up logic.

@llvmbot
Copy link
Member

llvmbot commented Apr 16, 2025

@llvm/pr-subscribers-lldb

Author: Jacob Lalonde (Jlalond)

Changes

Recently I was debugging a Minidump with a few thousand ranges, and came across the (now deleted) comment:

  // I don't have a sense of how frequently this is called or how many memory
  // ranges a Minidump typically has, so I'm not sure if searching for the
  // appropriate range linearly each time is stupid.  Perhaps we should build
  // an index for faster lookups.

blaming this comment, it's 9 years old! Much overdue for this simple fix with a range data vector.

I had to add a default constructor to Range in order to implement the RangeDataVector, but otherwise this just a replacement of look up logic.


Full diff: https://github.com/llvm/llvm-project/pull/136040.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/Process/minidump/MinidumpParser.cpp (+33-30)
  • (modified) lldb/source/Plugins/Process/minidump/MinidumpParser.h (+16-3)
diff --git a/lldb/source/Plugins/Process/minidump/MinidumpParser.cpp b/lldb/source/Plugins/Process/minidump/MinidumpParser.cpp
index 94c0a5f11e435..24c89a173944c 100644
--- a/lldb/source/Plugins/Process/minidump/MinidumpParser.cpp
+++ b/lldb/source/Plugins/Process/minidump/MinidumpParser.cpp
@@ -429,62 +429,64 @@ MinidumpParser::GetExceptionStreams() {
 
 std::optional<minidump::Range>
 MinidumpParser::FindMemoryRange(lldb::addr_t addr) {
-  Log *log = GetLog(LLDBLog::Modules);
+  if (m_memory_ranges.IsEmpty())
+    PopulateMemoryRanges();
+
+  MemoryRangeVector::Entry *entry = m_memory_ranges.FindEntryThatContains(addr);
+  if (!entry)
+    return std::nullopt;
+
+  return entry->data;
+}
 
+void MinidumpParser::PopulateMemoryRanges() {
+  Log *log = GetLog(LLDBLog::Modules);
   auto ExpectedMemory = GetMinidumpFile().getMemoryList();
-  if (!ExpectedMemory) {
-    LLDB_LOG_ERROR(log, ExpectedMemory.takeError(),
-                   "Failed to read memory list: {0}");
-  } else {
+  if (ExpectedMemory) {
     for (const auto &memory_desc : *ExpectedMemory) {
       const LocationDescriptor &loc_desc = memory_desc.Memory;
       const lldb::addr_t range_start = memory_desc.StartOfMemoryRange;
       const size_t range_size = loc_desc.DataSize;
-
-      if (loc_desc.RVA + loc_desc.DataSize > GetData().size())
-        return std::nullopt;
-
-      if (range_start <= addr && addr < range_start + range_size) {
-        auto ExpectedSlice = GetMinidumpFile().getRawData(loc_desc);
-        if (!ExpectedSlice) {
-          LLDB_LOG_ERROR(log, ExpectedSlice.takeError(),
-                         "Failed to get memory slice: {0}");
-          return std::nullopt;
-        }
-        return minidump::Range(range_start, *ExpectedSlice);
+      auto ExpectedSlice = GetMinidumpFile().getRawData(loc_desc);
+      if (!ExpectedSlice) {
+        LLDB_LOG_ERROR(log, ExpectedSlice.takeError(),
+                       "Failed to get memory slice: {0}");
+        continue;
       }
+      m_memory_ranges.Append(MemoryRangeVector::Entry(
+          range_start, range_size,
+          minidump::Range(range_start, *ExpectedSlice)));
     }
+  } else {
+    LLDB_LOG_ERROR(log, ExpectedMemory.takeError(),
+                   "Failed to read memory list: {0}");
   }
 
   if (!GetStream(StreamType::Memory64List).empty()) {
     llvm::Error err = llvm::Error::success();
-    for (const auto &memory_desc :  GetMinidumpFile().getMemory64List(err)) {
-      if (memory_desc.first.StartOfMemoryRange <= addr 
-          && addr < memory_desc.first.StartOfMemoryRange + memory_desc.first.DataSize) {
-        return minidump::Range(memory_desc.first.StartOfMemoryRange, memory_desc.second);
-      }
+    for (const auto &memory_desc : GetMinidumpFile().getMemory64List(err)) {
+      m_memory_ranges.Append(MemoryRangeVector::Entry(
+          memory_desc.first.StartOfMemoryRange, memory_desc.first.DataSize,
+          minidump::Range(memory_desc.first.StartOfMemoryRange,
+                          memory_desc.second)));
     }
 
     if (err)
       LLDB_LOG_ERROR(log, std::move(err), "Failed to read memory64 list: {0}");
   }
 
-  return std::nullopt;
+  m_memory_ranges.Sort();
 }
 
 llvm::ArrayRef<uint8_t> MinidumpParser::GetMemory(lldb::addr_t addr,
                                                   size_t size) {
-  // I don't have a sense of how frequently this is called or how many memory
-  // ranges a Minidump typically has, so I'm not sure if searching for the
-  // appropriate range linearly each time is stupid.  Perhaps we should build
-  // an index for faster lookups.
   std::optional<minidump::Range> range = FindMemoryRange(addr);
   if (!range)
     return {};
 
   // There's at least some overlap between the beginning of the desired range
-  // (addr) and the current range.  Figure out where the overlap begins and how
-  // much overlap there is.
+  // (addr) and the current range.  Figure out where the overlap begins and
+  // how much overlap there is.
 
   const size_t offset = addr - range->start;
 
@@ -495,7 +497,8 @@ llvm::ArrayRef<uint8_t> MinidumpParser::GetMemory(lldb::addr_t addr,
   return range->range_ref.slice(offset, overlap);
 }
 
-llvm::iterator_range<FallibleMemory64Iterator> MinidumpParser::GetMemory64Iterator(llvm::Error &err) {
+llvm::iterator_range<FallibleMemory64Iterator>
+MinidumpParser::GetMemory64Iterator(llvm::Error &err) {
   llvm::ErrorAsOutParameter ErrAsOutParam(&err);
   return m_file->getMemory64List(err);
 }
diff --git a/lldb/source/Plugins/Process/minidump/MinidumpParser.h b/lldb/source/Plugins/Process/minidump/MinidumpParser.h
index 2c5e6f19ff9a1..d9d537e7ab222 100644
--- a/lldb/source/Plugins/Process/minidump/MinidumpParser.h
+++ b/lldb/source/Plugins/Process/minidump/MinidumpParser.h
@@ -17,6 +17,7 @@
 #include "lldb/Utility/Status.h"
 #include "lldb/Utility/UUID.h"
 
+#include "lldb/Utility/RangeMap.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/StringRef.h"
@@ -35,6 +36,9 @@ namespace minidump {
 
 // Describes a range of memory captured in the Minidump
 struct Range {
+  // Default constructor required for range data vector
+  // but unusued.
+  Range() {}
   lldb::addr_t start; // virtual address of the beginning of the range
   // range_ref - absolute pointer to the first byte of the range and size
   llvm::ArrayRef<uint8_t> range_ref;
@@ -45,9 +49,16 @@ struct Range {
   friend bool operator==(const Range &lhs, const Range &rhs) {
     return lhs.start == rhs.start && lhs.range_ref == rhs.range_ref;
   }
+
+  friend bool operator<(const Range &lhs, const Range &rhs) {
+    return lhs.start < rhs.start;
+  }
 };
 
-using FallibleMemory64Iterator = llvm::object::MinidumpFile::FallibleMemory64Iterator;
+using MemoryRangeVector =
+    lldb_private::RangeDataVector<lldb::addr_t, lldb::addr_t, minidump::Range>;
+using FallibleMemory64Iterator =
+    llvm::object::MinidumpFile::FallibleMemory64Iterator;
 using ExceptionStreamsIterator =
     llvm::object::MinidumpFile::ExceptionStreamsIterator;
 
@@ -97,7 +108,8 @@ class MinidumpParser {
   /// complete (includes all regions mapped into the process memory).
   std::pair<MemoryRegionInfos, bool> BuildMemoryRegions();
 
-  llvm::iterator_range<FallibleMemory64Iterator> GetMemory64Iterator(llvm::Error &err);
+  llvm::iterator_range<FallibleMemory64Iterator>
+  GetMemory64Iterator(llvm::Error &err);
 
   static llvm::StringRef GetStreamTypeAsString(StreamType stream_type);
 
@@ -109,10 +121,11 @@ class MinidumpParser {
 private:
   MinidumpParser(lldb::DataBufferSP data_sp,
                  std::unique_ptr<llvm::object::MinidumpFile> file);
-
+  void PopulateMemoryRanges();
   lldb::DataBufferSP m_data_sp;
   std::unique_ptr<llvm::object::MinidumpFile> m_file;
   ArchSpec m_arch;
+  MemoryRangeVector m_memory_ranges;
 };
 
 } // end namespace minidump

@Jlalond Jlalond changed the title [lldb][Minidump parser] Implement a range data vector for minidump memory ranges [lldb][Minidump Parser] Implement a range data vector for minidump memory ranges Apr 16, 2025
@Jlalond Jlalond requested review from mbucko and removed request for JDevlieghere April 17, 2025 18:10
@Jlalond Jlalond force-pushed the minidump-parser-improved-memory-read branch from 28b6b46 to 42d5620 Compare April 22, 2025 16:32
@Jlalond Jlalond requested a review from dmpots April 22, 2025 16:49
@Jlalond
Copy link
Contributor Author

Jlalond commented Apr 22, 2025

@dmpots adding you after our conversation yesterday as an FYI


friend bool operator<(const Range &lhs, const Range &rhs) {
return lhs.start < rhs.start;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this class anymore. Just store a llvm::ArrayRef<uint8_t> in the MemoryRangeVector definition instead of a minidump::Range

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one nit in the less than operator.

@@ -45,9 +49,16 @@ struct Range {
friend bool operator==(const Range &lhs, const Range &rhs) {
return lhs.start == rhs.start && lhs.range_ref == rhs.range_ref;
}

friend bool operator<(const Range &lhs, const Range &rhs) {
return lhs.start < rhs.start;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to do something like:

friend bool operator<(const Range &lhs, const Range &rhs) {
  if (lhs.start == rhs.start)
    return lhs.range_ref.size() < rhs.range_ref.size();  
  return lhs.start < rhs.start;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clayborg is this possible?

A minidump with two ranges at the same address should be impossible correct?

Copy link

github-actions bot commented Jun 18, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Jlalond Jlalond force-pushed the minidump-parser-improved-memory-read branch from 43b8eae to 5f58ca4 Compare June 18, 2025 00:09
@Jlalond Jlalond merged commit a96a3f1 into llvm:main Jun 18, 2025
7 checks passed
fschlimb pushed a commit to fschlimb/llvm-project that referenced this pull request Jun 18, 2025
…mory ranges (llvm#136040)

Recently I was debugging a Minidump with a few thousand ranges, and came
across the (now deleted) comment:

```
  // I don't have a sense of how frequently this is called or how many memory
  // ranges a Minidump typically has, so I'm not sure if searching for the
  // appropriate range linearly each time is stupid.  Perhaps we should build
  // an index for faster lookups.
```

blaming this comment, it's 9 years old! Much overdue for this simple fix
with a range data vector.

I had to add a default constructor to Range in order to implement the
RangeDataVector, but otherwise this just a replacement of look up logic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants