[LLDB][Minidump] Add 64b support to LLDB's minidump file builder. #95312

Jlalond · 2024-06-12T21:20:18Z

Currently, LLDB does not support taking a minidump over the 4.2gb limit imposed by uint32. In fact, currently it writes the RVA's and the headers to the end of the file, which can become corrupted due to the header offset only supporting a 32b offset.

This change reorganizes how the file structure is laid out. LLDB will precalculate the number of directories required and preallocate space at the top of the file to fill in later. Additionally, thread stacks require a 32b offset, and we provision empty descriptors and keep track of them to clean up once we write the 32b memory list.

For MemoryList64, the RVA to the start of the section itself will remain in a 32b addressable space. We achieve this by predetermining the space the memory regions will take, and only writing up to 4.2 gb of data with some buffer to allow all the MemoryDescriptor64s to also still be 32b addressable.

I did not add any explicit tests to this PR because allocating 4.2gb+ to test is very expensive. However, we have 32b automation tests and I validated with in several ways, including with 5gb+ array/object and would be willing to add this as a test case.

This is a change to reorder LLDB's minidumps to support 64b memory lists/minidumps while still keeping all directories under the 32b limit. Better O(N), not N^2 cleanup code

llvmbot · 2024-06-12T21:20:48Z

@llvm/pr-subscribers-lldb
@llvm/pr-subscribers-objectyaml

@llvm/pr-subscribers-llvm-binary-utilities

Author: Jacob Lalonde (Jlalond)

Changes

Currently, LLDB does not support taking a minidump over the 4.2gb limit imposed by uint32. In fact, currently it writes the RVA's and the headers to the end of the file, which can become corrupted due to the header offset only supporting a 32b offset.

This change reorganizes how the file structure is laid out. LLDB will precalculate the number of directories required and preallocate space at the top of the file to fill in later. Additionally, thread stacks require a 32b offset, and we provision empty descriptors and keep track of them to clean up once we write the 32b memory list.

For MemoryList64, the RVA to the start of the section itself will remain in a 32b addressable space. We achieve this by predetermining the space the memory regions will take, and only writing up to 4.2 gb of data with some buffer to allow all the MemoryDescriptor64s to also still be 32b addressable.

I did not add any explicit tests to this PR because allocating 4.2gb+ to test is very expensive. However, we have 32b automation tests and I validated with in several ways, including with 5gb+ array/object and would be willing to add this as a test case.

Patch is 36.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/95312.diff

5 Files Affected:

(modified) lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp (+412-118)
(modified) lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h (+52-15)
(modified) lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp (+23-25)
(modified) llvm/include/llvm/BinaryFormat/Minidump.h (+5)
(modified) llvm/lib/ObjectYAML/MinidumpEmitter.cpp (+2)

diff --git a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
index 7231433619ffb..3c1d88652cc92 100644
--- a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
+++ b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
@@ -20,25 +20,98 @@
 #include "lldb/Target/RegisterContext.h"
 #include "lldb/Target/StopInfo.h"
 #include "lldb/Target/ThreadList.h"
+#include "lldb/Utility/DataBufferHeap.h"
 #include "lldb/Utility/DataExtractor.h"
 #include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/Log.h"
+#include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/RegisterValue.h"
 
 #include "llvm/ADT/StringRef.h"
 #include "llvm/BinaryFormat/Minidump.h"
 #include "llvm/Support/ConvertUTF.h"
+#include "llvm/Support/Endian.h"
 #include "llvm/Support/Error.h"
 
 #include "Plugins/Process/minidump/MinidumpTypes.h"
+#include "lldb/lldb-enumerations.h"
+#include "lldb/lldb-forward.h"
+#include "lldb/lldb-types.h"
 
+#include <algorithm>
 #include <cinttypes>
+#include <climits>
+#include <cstddef>
+#include <cstdint>
+#include <iostream>
+#include <set>
+#include <utility>
+#include <vector>
 
 using namespace lldb;
 using namespace lldb_private;
 using namespace llvm::minidump;
 
-void MinidumpFileBuilder::AddDirectory(StreamType type, size_t stream_size) {
+Status MinidumpFileBuilder::AddHeaderAndCalculateDirectories(
+    const lldb_private::Target &target, const lldb::ProcessSP &process_sp) {
+  // First set the offset on the file, and on the bytes saved
+  m_saved_data_size += header_size;
+  // We know we will have at least Misc, SystemInfo, Modules, and ThreadList
+  // (corresponding memory list for stacks) And an additional memory list for
+  // non-stacks.
+  m_expected_directories = 6;
+  // Check if OS is linux
+  if (target.GetArchitecture().GetTriple().getOS() ==
+      llvm::Triple::OSType::Linux)
+    m_expected_directories += 9;
+
+  // Go through all of the threads and check for exceptions.
+  lldb_private::ThreadList thread_list = process_sp->GetThreadList();
+  const uint32_t num_threads = thread_list.GetSize();
+  for (uint32_t thread_idx = 0; thread_idx < num_threads; ++thread_idx) {
+    ThreadSP thread_sp(thread_list.GetThreadAtIndex(thread_idx));
+    StopInfoSP stop_info_sp = thread_sp->GetStopInfo();
+    if (stop_info_sp &&
+        stop_info_sp->GetStopReason() == StopReason::eStopReasonException) {
+      m_expected_directories++;
+    }
+  }
+
+  // Now offset the file by the directores so we can write them in later.
+  offset_t directory_offset = m_expected_directories * directory_size;
+  m_saved_data_size += directory_offset;
+  // Replace this when we make a better way to do this.
+  Status error;
+  Header empty_header;
+  size_t bytes_written;
+  bytes_written = header_size;
+  error = m_core_file->Write(&empty_header, bytes_written);
+  if (error.Fail() || bytes_written != header_size) {
+    if (bytes_written != header_size)
+      error.SetErrorStringWithFormat(
+          "unable to write the header (written %zd/%zd)", bytes_written,
+          header_size);
+    return error;
+  }
+
+  for (uint i = 0; i < m_expected_directories; i++) {
+    size_t bytes_written;
+    bytes_written = directory_size;
+    Directory empty_directory;
+    error = m_core_file->Write(&empty_directory, bytes_written);
+    if (error.Fail() || bytes_written != directory_size) {
+      if (bytes_written != directory_size)
+        error.SetErrorStringWithFormat(
+            "unable to write the directory (written %zd/%zd)", bytes_written,
+            directory_size);
+      return error;
+    }
+  }
+
+  return error;
+}
+
+void MinidumpFileBuilder::AddDirectory(StreamType type, uint64_t stream_size) {
   LocationDescriptor loc;
   loc.DataSize = static_cast<llvm::support::ulittle32_t>(stream_size);
   // Stream will begin at the current end of data section
@@ -49,6 +122,7 @@ void MinidumpFileBuilder::AddDirectory(StreamType type, size_t stream_size) {
   dir.Location = loc;
 
   m_directories.push_back(dir);
+  assert(m_expected_directories >= m_directories.size());
 }
 
 Status MinidumpFileBuilder::AddSystemInfo(const llvm::Triple &target_triple) {
@@ -165,7 +239,6 @@ llvm::Expected<uint64_t> getModuleFileSize(Target &target,
     }
     return SizeOfImage;
   }
-
   SectionSP sect_sp = mod->GetObjectFile()->GetBaseAddress().GetSection();
 
   if (!sect_sp) {
@@ -238,12 +311,10 @@ Status MinidumpFileBuilder::AddModuleList(Target &target) {
     auto maybe_mod_size = getModuleFileSize(target, mod);
     if (!maybe_mod_size) {
       llvm::Error mod_size_err = maybe_mod_size.takeError();
-      llvm::handleAllErrors(std::move(mod_size_err),
-                            [&](const llvm::ErrorInfoBase &E) {
-                              error.SetErrorStringWithFormat(
-                                  "Unable to get the size of module %s: %s.",
-                                  module_name.c_str(), E.message().c_str());
-                            });
+      llvm::handleAllErrors(std::move(mod_size_err), [&](const llvm::ErrorInfoBase &E) {
+        error.SetErrorStringWithFormat("Unable to get the size of module %s: %s.",
+                                     module_name.c_str(), E.message().c_str());
+      });
       return error;
     }
 
@@ -513,6 +584,29 @@ findStackHelper(const lldb::ProcessSP &process_sp, uint64_t rsp) {
   return std::make_pair(addr, size);
 }
 
+Status MinidumpFileBuilder::FixThreads() {
+  Status error;
+  // If we have anything in the heap flush it.
+  FlushToDisk();
+  m_core_file->SeekFromStart(m_thread_list_start);
+  for (auto &pair : m_thread_by_range_start) {
+    // The thread objects will get a new memory descriptor added
+    // When we are emitting the memory list and then we write it here
+    llvm::minidump::Thread thread = pair.second;
+    size_t bytes_to_write = sizeof(llvm::minidump::Thread);
+    size_t bytes_written = bytes_to_write;
+    error = m_core_file->Write(&thread, bytes_written);
+    if (error.Fail() || bytes_to_write != bytes_written) {
+      error.SetErrorStringWithFormat(
+          "Wrote incorrect number of bytes to minidump file. (written %zd/%zd)",
+          bytes_written, bytes_to_write);
+      return error;
+    }
+  }
+
+  return error;
+}
+
 Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
   constexpr size_t minidump_thread_size = sizeof(llvm::minidump::Thread);
   lldb_private::ThreadList thread_list = process_sp->GetThreadList();
@@ -530,10 +624,12 @@ Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
       static_cast<llvm::support::ulittle32_t>(thread_list.GetSize());
   m_data.AppendData(&thread_count, sizeof(llvm::support::ulittle32_t));
 
+  // Take the offset after the thread count.
+  m_thread_list_start = GetCurrentDataEndOffset();
   DataBufferHeap helper_data;
 
   const uint32_t num_threads = thread_list.GetSize();
-
+  Log *log = GetLog(LLDBLog::Object);
   for (uint32_t thread_idx = 0; thread_idx < num_threads; ++thread_idx) {
     ThreadSP thread_sp(thread_list.GetThreadAtIndex(thread_idx));
     RegisterContextSP reg_ctx_sp(thread_sp->GetRegisterContext());
@@ -553,38 +649,18 @@ Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
           arch.GetTriple().getArchName().str().c_str());
       return error;
     }
-    uint64_t sp = reg_ctx->GetSP();
-    auto expected_address_range = findStackHelper(process_sp, sp);
-
-    if (!expected_address_range) {
-      consumeError(expected_address_range.takeError());
-      error.SetErrorString("Unable to get the stack address.");
-      return error;
-    }
-
-    std::pair<uint64_t, uint64_t> range = std::move(*expected_address_range);
-    uint64_t addr = range.first;
-    uint64_t size = range.second;
-
-    auto data_up = std::make_unique<DataBufferHeap>(size, 0);
-    const size_t stack_bytes_read =
-        process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
-
-    if (error.Fail())
-      return error;
 
-    LocationDescriptor stack_memory;
-    stack_memory.DataSize =
-        static_cast<llvm::support::ulittle32_t>(stack_bytes_read);
-    stack_memory.RVA = static_cast<llvm::support::ulittle32_t>(
-        size_before + thread_stream_size + helper_data.GetByteSize());
+    uint64_t sp = reg_ctx->GetSP();
+    MemoryRegionInfo sp_region;
+    process_sp->GetMemoryRegionInfo(sp, sp_region);
 
+    // Emit a blank descriptor
     MemoryDescriptor stack;
-    stack.StartOfMemoryRange = static_cast<llvm::support::ulittle64_t>(addr);
-    stack.Memory = stack_memory;
-
-    helper_data.AppendData(data_up->GetBytes(), stack_bytes_read);
-
+    LocationDescriptor empty_label;
+    empty_label.DataSize = 0;
+    empty_label.RVA = 0;
+    stack.Memory = empty_label;
+    stack.StartOfMemoryRange = 0;
     LocationDescriptor thread_context_memory_locator;
     thread_context_memory_locator.DataSize =
         static_cast<llvm::support::ulittle32_t>(thread_context.size());
@@ -593,6 +669,8 @@ Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
     // Cache thie thread context memory so we can reuse for exceptions.
     m_tid_to_reg_ctx[thread_sp->GetID()] = thread_context_memory_locator;
 
+    LLDB_LOGF(log, "AddThreadList for thread %d: thread_context %zu bytes",
+              thread_idx, thread_context.size());
     helper_data.AppendData(thread_context.data(), thread_context.size());
 
     llvm::minidump::Thread t;
@@ -604,9 +682,13 @@ Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
     t.EnvironmentBlock = static_cast<llvm::support::ulittle64_t>(0);
     t.Stack = stack, t.Context = thread_context_memory_locator;
 
+    // We save off the stack object so we can circle back and clean it up.
+    m_thread_by_range_start[sp_region.GetRange().GetRangeBase()] = t;
     m_data.AppendData(&t, sizeof(llvm::minidump::Thread));
   }
 
+  LLDB_LOGF(log, "AddThreadList(): total helper_data %zu bytes",
+            helper_data.GetByteSize());
   m_data.AppendData(helper_data.GetBytes(), helper_data.GetByteSize());
   return Status();
 }
@@ -662,64 +744,6 @@ void MinidumpFileBuilder::AddExceptions(const lldb::ProcessSP &process_sp) {
   }
 }
 
-lldb_private::Status
-MinidumpFileBuilder::AddMemoryList(const lldb::ProcessSP &process_sp,
-                                   lldb::SaveCoreStyle core_style) {
-  Status error;
-  Process::CoreFileMemoryRanges core_ranges;
-  error = process_sp->CalculateCoreFileSaveRanges(core_style, core_ranges);
-  if (error.Fail()) {
-    error.SetErrorString("Process doesn't support getting memory region info.");
-    return error;
-  }
-
-  DataBufferHeap helper_data;
-  std::vector<MemoryDescriptor> mem_descriptors;
-  for (const auto &core_range : core_ranges) {
-    // Skip empty memory regions.
-    if (core_range.range.empty())
-      continue;
-    const addr_t addr = core_range.range.start();
-    const addr_t size = core_range.range.size();
-    auto data_up = std::make_unique<DataBufferHeap>(size, 0);
-    const size_t bytes_read =
-        process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
-    if (error.Fail()) {
-      Log *log = GetLog(LLDBLog::Object);
-      LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: %s",
-                bytes_read, error.AsCString());
-      error.Clear();
-    }
-    if (bytes_read == 0)
-      continue;
-    // We have a good memory region with valid bytes to store.
-    LocationDescriptor memory_dump;
-    memory_dump.DataSize = static_cast<llvm::support::ulittle32_t>(bytes_read);
-    memory_dump.RVA =
-        static_cast<llvm::support::ulittle32_t>(GetCurrentDataEndOffset());
-    MemoryDescriptor memory_desc;
-    memory_desc.StartOfMemoryRange =
-        static_cast<llvm::support::ulittle64_t>(addr);
-    memory_desc.Memory = memory_dump;
-    mem_descriptors.push_back(memory_desc);
-    m_data.AppendData(data_up->GetBytes(), bytes_read);
-  }
-
-  AddDirectory(StreamType::MemoryList,
-               sizeof(llvm::support::ulittle32_t) +
-                   mem_descriptors.size() *
-                       sizeof(llvm::minidump::MemoryDescriptor));
-  llvm::support::ulittle32_t memory_ranges_num(mem_descriptors.size());
-
-  m_data.AppendData(&memory_ranges_num, sizeof(llvm::support::ulittle32_t));
-  for (auto memory_descriptor : mem_descriptors) {
-    m_data.AppendData(&memory_descriptor,
-                      sizeof(llvm::minidump::MemoryDescriptor));
-  }
-
-  return error;
-}
-
 void MinidumpFileBuilder::AddMiscInfo(const lldb::ProcessSP &process_sp) {
   AddDirectory(StreamType::MiscInfo,
                sizeof(lldb_private::minidump::MinidumpMiscInfo));
@@ -782,6 +806,7 @@ void MinidumpFileBuilder::AddLinuxFileStreams(
         {StreamType::LinuxProcFD, "/proc/" + pid_str + "/fd"});
   }
 
+  Status error;
   for (const auto &entry : files_with_stream_types) {
     StreamType stream = entry.first;
     std::string path = entry.second;
@@ -797,10 +822,64 @@ void MinidumpFileBuilder::AddLinuxFileStreams(
   }
 }
 
-Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
-  constexpr size_t header_size = sizeof(llvm::minidump::Header);
-  constexpr size_t directory_size = sizeof(llvm::minidump::Directory);
+Status MinidumpFileBuilder::AddMemory(const ProcessSP &process_sp,
+                                      SaveCoreStyle core_style) {
+  Status error;
+
+  Process::CoreFileMemoryRanges ranges_for_memory_list;
+  error = process_sp->CalculateCoreFileSaveRanges(
+      SaveCoreStyle::eSaveCoreStackOnly, ranges_for_memory_list);
+  if (error.Fail()) {
+    return error;
+  }
+
+  std::set<addr_t> stack_ranges;
+  for (const auto &core_range : ranges_for_memory_list) {
+    stack_ranges.insert(core_range.range.start());
+  }
+  // We leave a little padding for dictionary and any other metadata we would
+  // want. Also so that we can put the header of the memory list 64 in 32b land,
+  // because the directory requires a 32b RVA.
+  Process::CoreFileMemoryRanges ranges;
+  error = process_sp->CalculateCoreFileSaveRanges(core_style, ranges);
+  if (error.Fail()) {
+    return error;
+  }
+
+  uint64_t total_size =
+      256 + (ranges.size() * sizeof(llvm::minidump::MemoryDescriptor_64));
+  // Take all the memory that will fit in the 32b range.
+  for (int i = ranges.size() - 1; i >= 0; i--) {
+    addr_t size_to_add =
+        ranges[i].range.size() + sizeof(llvm::minidump::MemoryDescriptor);
+    // filter out the stacks and check if it's below 32b max.
+    if (stack_ranges.count(ranges[i].range.start()) > 0) {
+      ranges.erase(ranges.begin() + i);
+    } else if (total_size + size_to_add < UINT32_MAX) {
+      ranges_for_memory_list.push_back(ranges[i]);
+      total_size += ranges[i].range.size();
+      total_size += sizeof(llvm::minidump::MemoryDescriptor);
+      ranges.erase(ranges.begin() + i);
+    } else {
+      break;
+    }
+  }
 
+  error = AddMemoryList_32(process_sp, ranges_for_memory_list);
+  if (error.Fail())
+    return error;
+
+  // Add the remaining memory as a 64b range.
+  if (ranges.size() > 0) {
+    error = AddMemoryList_64(process_sp, ranges);
+    if (error.Fail())
+      return error;
+  }
+
+  return FixThreads();
+}
+
+Status MinidumpFileBuilder::DumpHeader() const {
   // write header
   llvm::minidump::Header header;
   header.Signature = static_cast<llvm::support::ulittle32_t>(
@@ -808,9 +887,10 @@ Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
   header.Version = static_cast<llvm::support::ulittle32_t>(
       llvm::minidump::Header::MagicVersion);
   header.NumberOfStreams =
-      static_cast<llvm::support::ulittle32_t>(GetDirectoriesNum());
+      static_cast<llvm::support::ulittle32_t>(m_directories.size());
+  // We write the directories right after the header.
   header.StreamDirectoryRVA =
-      static_cast<llvm::support::ulittle32_t>(GetCurrentDataEndOffset());
+      static_cast<llvm::support::ulittle32_t>(header_size);
   header.Checksum = static_cast<llvm::support::ulittle32_t>(
       0u), // not used in most of the writers
       header.TimeDateStamp =
@@ -821,8 +901,9 @@ Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
   Status error;
   size_t bytes_written;
 
+  m_core_file->SeekFromStart(0);
   bytes_written = header_size;
-  error = core_file->Write(&header, bytes_written);
+  error = m_core_file->Write(&header, bytes_written);
   if (error.Fail() || bytes_written != header_size) {
     if (bytes_written != header_size)
       error.SetErrorStringWithFormat(
@@ -830,22 +911,20 @@ Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
           header_size);
     return error;
   }
+  return error;
+}
 
-  // write data
-  bytes_written = m_data.GetByteSize();
-  error = core_file->Write(m_data.GetBytes(), bytes_written);
-  if (error.Fail() || bytes_written != m_data.GetByteSize()) {
-    if (bytes_written != m_data.GetByteSize())
-      error.SetErrorStringWithFormat(
-          "unable to write the data (written %zd/%" PRIu64 ")", bytes_written,
-          m_data.GetByteSize());
-    return error;
-  }
+size_t MinidumpFileBuilder::GetCurrentDataEndOffset() const {
+  return m_data.GetByteSize() + m_saved_data_size;
+}
 
-  // write directories
+Status MinidumpFileBuilder::DumpDirectories() const {
+  Status error;
+  size_t bytes_written;
+  m_core_file->SeekFromStart(header_size);
   for (const Directory &dir : m_directories) {
     bytes_written = directory_size;
-    error = core_file->Write(&dir, bytes_written);
+    error = m_core_file->Write(&dir, bytes_written);
     if (error.Fail() || bytes_written != directory_size) {
       if (bytes_written != directory_size)
         error.SetErrorStringWithFormat(
@@ -858,10 +937,225 @@ Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
   return error;
 }
 
-size_t MinidumpFileBuilder::GetDirectoriesNum() const {
-  return m_directories.size();
+Status MinidumpFileBuilder::AddMemoryList_32(
+    const ProcessSP &process_sp, const Process::CoreFileMemoryRanges &ranges) {
+  std::vector<MemoryDescriptor> descriptors;
+  Status error;
+  Log *log = GetLog(LLDBLog::Object);
+  size_t region_index = 0;
+  for (const auto &core_range : ranges) {
+    // Take the offset before we write.
+    const size_t offset_for_data = GetCurrentDataEndOffset();
+    const addr_t addr = core_range.range.start();
+    const addr_t size = core_range.range.size();
+    auto data_up = std::make_unique<DataBufferHeap>(size, 0);
+
+    LLDB_LOGF(
+        log,
+        "AddMemoryList %zu/%zu reading memory for region (%zu bytes) [%zx, %zx)",
+        region_index, ranges.size(), size, addr, addr + size);
+    ++region_index;
+
+    const size_t bytes_read =
+        process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
+    if (error.Fail() || bytes_read == 0) {
+      LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: %s",
+                bytes_read, error.AsCString());
+      // Just skip sections with errors or zero bytes in 32b mode
+      continue;
+    } else if (bytes_read != size) {
+      LLDB_LOGF(log, "Memory region at: %zu failed to read %zu bytes", addr,
+                size);
+    }
+
+    MemoryDescriptor descriptor;
+    descriptor.StartOfMemoryRange =
+        static_cast<llvm::support::ulittle64_t>(addr);
+    descriptor.Memory.DataSize =
+        static_cast<llvm::support::ulittle32_t>(bytes_read);
+    descriptor.Memory.RVA =
+        static_cast<llvm::support::ulittle32_t>(offset_for_data);
+    descriptors.push_back(descriptor);
+    if (m_thread_by_range_start.count(addr) > 0) {
+      m_thread_by_range_start[addr].Stack = descriptor;
+    }
+
+    // Add the data to the buffer, flush as needed.
+    error = AddData(data_up->GetBytes(), bytes_read);
+    if (error.Fail())
+      return error;
+  }
+
+  // Add a directory that references this list
+  // With a size of the number of ranges as a 32 bit num
+  // And then the size of all the ranges
+  AddDirectory(StreamType::MemoryList,
+               sizeof(llvm::support::ulittle32_t) +
+                   descriptors.size() *
+                       sizeof(llvm::minidump::MemoryDescriptor));
+
+  llvm::support::ulittle32_t memory_ranges_num =
+      static_cast<llvm::support::ulittle32_t>(descriptors.size());
+  m_data.Ap...
[truncated]

github-actions · 2024-06-12T21:23:01Z

✅ With the latest revision this PR passed the C/C++ code formatter.

clayborg · 2024-06-13T03:39:06Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

-  lldb_private::Status AddThreadList(const lldb::ProcessSP &process_sp);
-  // Add Exception streams for any threads that stopped with exceptions.
-  void AddExceptions(const lldb::ProcessSP &process_sp);


These haven't changed at all, so I would move them back to this line. No need to make changes to this header file if things haven't changed (AddThreadList and AddExceptions). So make as much of this the same as it was before by reverting the above 3 lines and removing them from below.

I tried to lay these out in an order of operations (but had to some more), still want me to put them back?

yeah, always good to minimize code changes to make reviewing easier

clayborg · 2024-06-13T03:39:46Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

+  // Add Exception streams for any threads that stopped with exceptions.
+  void AddExceptions(const lldb::ProcessSP &process_sp);


Revert to be the same as the lines removed from above to minimize changes to this file

clayborg · 2024-06-13T03:40:48Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

+  lldb_private::Status AddThreadList(const lldb::ProcessSP &process_sp);
+


Revert these two lines and use the old ones that were removed above.

clayborg · 2024-06-13T03:45:37Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

@@ -40,7 +46,7 @@ lldb_private::Status WriteString(const std::string &to_write,
 /// the data on heap.
 class MinidumpFileBuilder {
 public:
-  MinidumpFileBuilder() = default;
+  MinidumpFileBuilder(lldb::FileUP&& core_file): m_core_file(std::move(core_file)) {};


We should a const lldb::ProcessSP &process_sp to the constructor and remove all const lldb::ProcessSP &process_sp arguemnts from the methods below. We never want to change processes while creating a minidump. We should store the process shared pointer as an instance variable and then the methods that used to take the const lldb::ProcessSP &process_sp as an argument now use the instance variable:

lldb::ProcessSP m_process_sp;

I agree with this. I also think it reduces the split responsibility with ObjectfileMinidump

clayborg · 2024-06-13T03:50:15Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

-
-  // Stores directories to later put them at the end of minidump file
+  void AddDirectory(llvm::minidump::StreamType type, uint64_t stream_size);
+  lldb::addr_t GetCurrentDataEndOffset() const;


return lldb::offset_t instead of lldb::addr_t.

clayborg · 2024-06-13T04:41:42Z

lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp

    return false;
  }

+  builder.AddModuleList(target);
  builder.AddMiscInfo(process_sp);

  error = builder.AddThreadList(process_sp);


clayborg · 2024-06-13T04:42:08Z

lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp

    return false;
  }
+  if (target.GetArchitecture().GetTriple().getOS() ==
+      llvm::Triple::OSType::Linux) {
+    builder.AddLinuxFileStreams(process_sp);


remove process_sp. If we construct the builder with a process_sp, it can use the m_process_sp instance variable that the MinidumpFileBuilder has to grab the target or process when it needs one.

clayborg · 2024-06-13T04:42:13Z

lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp

+  if (target.GetArchitecture().GetTriple().getOS() ==
+      llvm::Triple::OSType::Linux) {
+    builder.AddLinuxFileStreams(process_sp);
+  }

  // Add any exceptions but only if there are any in any threads.
  builder.AddExceptions(process_sp);


remove process_sp. If we construct the builder with a process_sp, it can use the m_process_sp instance variable that the MinidumpFileBuilder has to grab the target or process when it needs one.

clayborg · 2024-06-13T04:42:25Z

lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp

-  error = builder.AddMemoryList(process_sp, core_style);
+  // Note: add memory HAS to be the last thing we do. It can overflow into 64b
+  // land and many RVA's only support 32b
+  error = builder.AddMemory(process_sp, core_style);


remove process_sp. If we construct the builder with a process_sp, it can use the m_process_sp instance variable that the MinidumpFileBuilder has to grab the target or process when it needs one.

lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

…ush to disk loop.

clayborg · 2024-06-13T20:56:50Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

-  size_t GetDirectoriesNum() const;
+  void AddLinuxFileStreams();
+
+  lldb_private::Status AddMemory(lldb::SaveCoreStyle core_style);


I would leave this named the same as it was before as AddMemoryList and move it back up to where it used to be in this source file

clayborg · 2024-06-13T20:58:37Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

+  lldb_private::Status AddMemory(const lldb::ProcessSP &process_sp,
+                                 lldb::SaveCoreStyle core_style);
+
+  lldb_private::Status DumpToFile();


It would be nice to just have a Finalize() that does what CompleteFile and FinalizeFile and WriteAndFinish would do?

clayborg · 2024-06-13T20:59:29Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

  std::vector<llvm::minidump::Directory> m_directories;
+  // When we write off the threads for the first time, we need to clean them up
+  // and give them the correct RVA once we write the stack memory list.
+  std::map<lldb::addr_t, llvm::minidump::Thread> m_thread_by_range_start;


If it is an address then lldb::addr_t is the right choice. I must have confused this with another location that needed to use lldb::offset_t for an offset that was being stored as a lldb::addr_t

clayborg · 2024-06-13T21:02:17Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

-void MinidumpFileBuilder::AddDirectory(StreamType type, size_t stream_size) {
+Status MinidumpFileBuilder::AddHeaderAndCalculateDirectories() {
+  // First set the offset on the file, and on the bytes saved
+  m_saved_data_size += header_size;


Can we have data already in the file here? If not we should assign m_saved_data_size here:

m_saved_data_size = header_size;

Is there a better name for m_saved_data_size?

clayborg · 2024-06-13T21:03:54Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+
+  lldb_private::Target &target = m_process_sp->GetTarget();
+  m_expected_directories = 6;
+  // Check if OS is linux


// Check if OS is linux and reserve directory space for all linux specific breakpad extension directories.

clayborg · 2024-06-13T21:10:05Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+    if (stack_start_addresses.count(core_range.range.start()) > 0) {
+      // Don't double save stacks.
+      continue;
+    } else if (total_size + size_to_add < UINT32_MAX) {


No need for else if if we continue:

if (stack_start_addresses.count(core_range.range.start()) > 0) continue; // Stacks are already in ranges_32 if (total_size + size_to_add < UINT32_MAX)

clayborg · 2024-06-13T21:12:25Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  std::vector<MemoryDescriptor> descriptors;
+  Status error;
+  Log *log = GetLog(LLDBLog::Object);
+  size_t region_index = 0;


Make on DataBufferHeap outside of this loop and re-use it:

lldb::offset_t max_size = 0; for (const auto &core_range : ranges) { if (core_range.renge.size() > max_size) max_size = core_range.renge.size(); } auto data_up = std::make_unique<DataBufferHeap>(max_size, 0);

auto data_up = std::make_unique(size, 0);

Done. Albeit the m_data buffer that the class uses does get used and then reallocated when we call Clear() in flush to disk. May want to look into keeping a write index and a range...

clayborg · 2024-06-13T21:13:00Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+    const size_t offset_for_data = GetCurrentDataEndOffset();
+    const addr_t addr = core_range.range.start();
+    const addr_t size = core_range.range.size();
+    auto data_up = std::make_unique<DataBufferHeap>(size, 0);


re-use the one copy of data_up from above, just remove this line.

clayborg · 2024-06-13T21:13:29Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+    if (m_thread_by_range_start.count(addr) > 0) {
+      m_thread_by_range_start[addr].Stack = descriptor;
+    }


Still need to remove these braces from this if

clayborg · 2024-06-13T21:14:15Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  for (const auto &core_range : ranges) {
+    const addr_t addr = core_range.range.start();
+    const addr_t size = core_range.range.size();
+    auto data_up = std::make_unique<DataBufferHeap>(size, 0);


move this out of the for loop and have it contain max_size bytes just like in the 32 bit variant of this function.

…ssible buffer ahead of time

jeffreytan81 · 2024-06-14T00:03:20Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

+  lldb_private::Status AddData(const void *data, size_t size);
+  // Add MemoryList stream, containing dumps of important memory segments
+  lldb_private::Status
+  AddMemoryList_64(const lldb_private::Process::CoreFileMemoryRanges &ranges);


I would simply call AddMemoryList64, AddMemoryList32 and Is64Bit.
Let's be consistent about the member function naming convention.

jeffreytan81 · 2024-06-14T00:05:47Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

+  static constexpr size_t m_write_chunk_max = (1024 * 1024 * 128);
+
+  static constexpr size_t header_size = sizeof(llvm::minidump::Header);
+  static constexpr size_t directory_size = sizeof(llvm::minidump::Directory);


The naming convention is not consistent. For constexpr, either MAX_WRITE_CHUNK or MaxChunkSize for readability.
Similar for HEADER_SIZE and DIRECTORY_SIZE.

jeffreytan81 · 2024-06-14T00:07:25Z

lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp

    return false;
  }

-  builder.AddMiscInfo(process_sp);
+  builder.AddModuleList();


Let's handle the failure from AddModuleList(). I am pretty sure I have seen a failure once.

jeffreytan81 · 2024-06-14T00:18:11Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp


+  std::set<addr_t> stack_start_addresses;


If you do not need ordering of the elements in the set, use std::unordered_set.

jeffreytan81 · 2024-06-14T00:30:28Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

+  lldb_private::Status AddMemory(const lldb::ProcessSP &process_sp,
+                                 lldb::SaveCoreStyle core_style);
+
+  lldb_private::Status DumpToFile();


As you are describing it, you can see this method is doing two things. I do not see the benefit of grouping them into one confusing named function. Let's split it into two functions:

FixupHeaderDirectory()

FlushToFile()

jeffreytan81 · 2024-06-14T00:34:42Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  for (auto &pair : m_thread_by_range_start) {
+    // The thread objects will get a new memory descriptor added
+    // When we are emitting the memory list and then we write it here
+    llvm::minidump::Thread thread = pair.second;


I do not think you need to copy:

const llvm::minidump::Thread& thread

jeffreytan81 · 2024-06-14T00:42:36Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

+  AddMemoryList_64(const lldb_private::Process::CoreFileMemoryRanges &ranges);
+  lldb_private::Status
+  AddMemoryList_32(const lldb_private::Process::CoreFileMemoryRanges &ranges);
+  lldb_private::Status FixThreads();


You should add comment explaining what is "FixThreads()" doing and why we need it.

FixThreads is a very generic name, but you are really fixing up the stack field only. So rename to "FixupThreadStack()" to be more specific.

jeffreytan81 · 2024-06-14T00:46:05Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h

@@ -50,48 +60,75 @@ class MinidumpFileBuilder {

  ~MinidumpFileBuilder() = default;

+  lldb_private::Status AddHeaderAndCalculateDirectories();


I would add comment to clarify that this is partially done and which fields require later fixup.

lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp

jeffreytan81 · 2024-06-14T00:52:09Z

Can this be landed safely? Do we need to update consumer side to support memory list 64? For example, if this diff is landed, and a user tries to save a minidump with memory list 64, if there is no consumer side support, will lldb crash?

…ing off by 8 + descirptor count * sizeof(descriptor_64)

clayborg · 2024-06-15T00:22:54Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  while (remaining_bytes > 0) {
+    // Keep filling in zero's until we preallocate enough space for the header
+    // and directory sections.
+    size_t bytes_written = std::min(remaining_bytes, sizeof(size_t));
+    error = m_core_file->Write(&zeroes, bytes_written);
+    if (error.Fail()) {
      error.SetErrorStringWithFormat(
-          "unable to write the header (written %zd/%zd)", bytes_written,
-          header_size);
-    return error;
-  }
-
-  for (uint i = 0; i < m_expected_directories; i++) {
-    size_t bytes_written;
-    bytes_written = directory_size;
-    Directory empty_directory;
-    error = m_core_file->Write(&empty_directory, bytes_written);
-    if (error.Fail() || bytes_written != directory_size) {
-      if (bytes_written != directory_size)
-        error.SetErrorStringWithFormat(
-            "unable to write the directory (written %zd/%zd)", bytes_written,
-            directory_size);
-      return error;
+        "Unable to write header and directory padding (written %zd/%zd)",
+        remaining_bytes - m_saved_data_size, m_saved_data_size);
+      break;
    }
+
+    remaining_bytes -= bytes_written;
  }


Remove this while look and complex functionality and just call:

m_core_file->SeekFromStart(m_saved_data_size);

No need to write zeroes.

clayborg · 2024-06-15T00:25:59Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

-void MinidumpFileBuilder::AddDirectory(StreamType type, size_t stream_size) {
+Status MinidumpFileBuilder::AddHeaderAndCalculateDirectories() {
+  // First set the offset on the file, and on the bytes saved
+  m_saved_data_size += header_size;


Is there a better name for m_saved_data_size?

clayborg · 2024-06-15T00:27:37Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  size_t zeroes = 0; // 8 0's
+  size_t remaining_bytes = m_saved_data_size;
+  while (remaining_bytes > 0) {
+    // Keep filling in zero's until we preallocate enough space for the header
+    // and directory sections.
+    size_t bytes_written = std::min(remaining_bytes, sizeof(size_t));
+    error = m_core_file->Write(&zeroes, bytes_written);
+    if (error.Fail()) {
+      error.SetErrorStringWithFormat(
+        "Unable to write header and directory padding (written %zd/%zd)",
+        remaining_bytes - m_saved_data_size, m_saved_data_size);
+      break;
+    }
+
+    remaining_bytes -= bytes_written;
+  }


Just set the file position in the m_core_file:

m_core_file->SeekFromStart(m_saved_data_size);

And remove this entire while loop.

clayborg · 2024-06-15T00:32:34Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  uint64_t total_size =
+      ranges_32.size() * sizeof(llvm::minidump::MemoryDescriptor);
+  for (const auto &core_range : ranges_32)
+    total_size += core_range.range.size();


Merge this total_size calculation stuff into the loop above on line 836. No need to iterate over this twice.

clayborg · 2024-06-15T00:33:33Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+      total_size += core_range.range.size();
+      total_size += sizeof(llvm::minidump::MemoryDescriptor);


This can still be just:

total_size += size_to_add;

clayborg · 2024-06-15T00:38:14Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  llvm::support::ulittle32_t memory_ranges_num =
+      static_cast<llvm::support::ulittle32_t>(descriptors.size());


llvm::support::ulittle32_t memory_ranges_num(descriptors.size());

No need for static_cast right. Lots of places already had these incorrect extra static_cast calls..

I'm torn on the static casts. I know the little endian support types can support big endian assignment, but I think the static cast makes it very explicit that these values have to be little endian

The assignment operator can and must work for these llvm::support::ulittleXX_t values, we should take advantage of them and keep the code cleaner. Removing all static_casts makes this code easier to read and avoids people making copy/paste of similar code in the future. I know there were some static casts before this patch, but we should remove them.

clayborg · 2024-06-15T00:43:30Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+               sizeof(llvm::support::ulittle32_t) +
+                   descriptors.size() *
+                       sizeof(llvm::minidump::MemoryDescriptor));


indent ok here? Run clang-format on this

clayborg · 2024-06-15T00:44:29Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  llvm::support::ulittle64_t memory_ranges_num =
+      static_cast<llvm::support::ulittle64_t>(ranges.size());


No static cast:

llvm::support::ulittle64_t memory_ranges_num(ranges.size());

remove static casts

clayborg · 2024-06-15T00:44:54Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  llvm::support::ulittle64_t memory_ranges_base_rva =
+      static_cast<llvm::support::ulittle64_t>(base_rva);


no static cast

clayborg · 2024-06-15T00:45:07Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+    memory_desc.StartOfMemoryRange =
+        static_cast<llvm::support::ulittle64_t>(core_range.range.start());
+    memory_desc.DataSize =
+        static_cast<llvm::support::ulittle64_t>(core_range.range.size());


remove static cast

… instead keep tracrack internally. Add comments explaining serialization formating and process as requested by Jeffrey. Run Git clang format.

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

clayborg · 2024-06-17T20:50:35Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+
+  Process::CoreFileMemoryRanges ranges_32;
+  Process::CoreFileMemoryRanges ranges_64;
+  error = m_process_sp->CalculateCoreFileSaveRanges(


Add a comment here:

// We first save the thread stacks to ensure they fit in the first UINT32_MAX // bytes of the core file. Thread structures in minidump files can only use // 32 bit memory descriptiors, so we emit them first to ensure the memory is // in accessible with a 32 bit offset.

clayborg · 2024-06-17T20:56:42Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

  // write header
  llvm::minidump::Header header;
  header.Signature = static_cast<llvm::support::ulittle32_t>(
      llvm::minidump::Header::MagicSignature);
  header.Version = static_cast<llvm::support::ulittle32_t>(
      llvm::minidump::Header::MagicVersion);
  header.NumberOfStreams =
-      static_cast<llvm::support::ulittle32_t>(GetDirectoriesNum());
+      static_cast<llvm::support::ulittle32_t>(m_directories.size());


remove static_cast, not needed.

clayborg · 2024-06-17T20:57:04Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

  header.StreamDirectoryRVA =
-      static_cast<llvm::support::ulittle32_t>(GetCurrentDataEndOffset());
+      static_cast<llvm::support::ulittle32_t>(HEADER_SIZE);


remove static_cast, not needed.

clayborg · 2024-06-17T20:58:38Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+    descriptor.StartOfMemoryRange =
+        static_cast<llvm::support::ulittle64_t>(addr);
+    descriptor.Memory.DataSize =
+        static_cast<llvm::support::ulittle32_t>(bytes_read);
+    descriptor.Memory.RVA =
+        static_cast<llvm::support::ulittle32_t>(offset_for_data);


remove static_casts

clayborg · 2024-06-17T21:00:40Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  llvm::support::ulittle32_t memory_ranges_num =
+      static_cast<llvm::support::ulittle32_t>(descriptors.size());


The assignment operator can and must work for these llvm::support::ulittleXX_t values, we should take advantage of them and keep the code cleaner. Removing all static_casts makes this code easier to read and avoids people making copy/paste of similar code in the future. I know there were some static casts before this patch, but we should remove them.

clayborg · 2024-06-17T21:01:20Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  llvm::support::ulittle64_t memory_ranges_num =
+      static_cast<llvm::support::ulittle64_t>(ranges.size());


remove static casts

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

clayborg · 2024-06-17T21:03:26Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

-size_t MinidumpFileBuilder::GetDirectoriesNum() const {
-  return m_directories.size();
+static size_t GetLargestRange(const Process::CoreFileMemoryRanges &ranges) {
+  size_t max_size = 0;


use uint64_t

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

…int32_t to avoid warnings. Run git clang format.

clayborg · 2024-06-18T05:31:52Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  if (new_offset != m_saved_data_size)
+    error.SetErrorStringWithFormat("Failed to fill in header and directory "
+                                   "sections. Written / Expected (%" PRIx64
+                                   " / %zu)",


m_saved_data_size is uint64_t, so don't use " / %zu), use ` / %" PRIu64 ")"

clayborg · 2024-06-18T05:37:27Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+  total_size +=
+      256 + (all_core_memory_ranges.size() - stack_start_addresses.size()) *
+                sizeof(llvm::minidump::MemoryDescriptor_64);


this total_size will be wrong if all_core_memory_ranges is empty right? If we are saving stacks only, this will be:

256 + (0 - stack_start_addresses.size()) * sizeof(llvm::minidump::MemoryDescriptor_64);

And we will unsigned underflow but it won't matter because we won't use total_size. Best to put a:

if (!all_core_memory_ranges.empty()) {

around the total_size and for loop that iterates over all core ranges to avoid potential issues in the future.

This actually works, because we'll never enter the loop unless all_core_memory_ranges is non empty, where total_size will actually be used. Prior to code review feedback we always called to get memory ranges, even when it's just stacks, ensuring we would always have at least stacks in all_core_memory_ranges.

I have added the check for empty and an explanation comment as well.

clayborg · 2024-06-18T05:38:12Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+
+    if (total_size + range_size < UINT32_MAX) {
+      ranges_32.push_back(core_range);
+      total_size += core_range.range.size();


use range_size that you created above.

clayborg · 2024-06-18T05:38:19Z

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp

+                sizeof(llvm::minidump::MemoryDescriptor_64);
+
+  for (const auto &core_range : all_core_memory_ranges) {
+    addr_t range_size = core_range.range.size();


make const

…oop and make it const. Fix m_total_siz_size formatting. Run git-clang-format

petrhosek · 2024-06-24T21:26:49Z

This broke the Windows build with the following error:

C:\b\s\w\ir\x\w\cipd\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DLIBXML_STATIC -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_ENABLE_EXTENDED_ALIGNED_STORAGE -D_GLIBCXX_ASSERTIONS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\b\s\w\ir\x\w\llvm_build\tools\lldb\source\Plugins\ObjectFile\Minidump -IC:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source\Plugins\ObjectFile\Minidump -IC:\b\s\w\ir\x\w\llvm-llvm-project\lldb\include -IC:\b\s\w\ir\x\w\llvm_build\tools\lldb\include -IC:\b\s\w\ir\x\w\rc\tensorflow-venv\store\python_venv-51qj4upsi8nrslpsnfp48k5j14\contents\Lib\site-packages\tensorflow\include -IC:\b\s\w\ir\x\w\llvm_build\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\llvm\include -IC:\b\s\w\ir\x\w\lldb_install\python3\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\llvm\..\clang\include -IC:\b\s\w\ir\x\w\llvm_build\tools\lldb\..\clang\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source -IC:\b\s\w\ir\x\w\llvm_build\tools\lldb\source -imsvcC:\b\s\w\ir\x\w\libxml2_install_target\include\libxml2 -imsvcC:\b\s\w\ir\x\w\zlib_install_target\include -imsvcC:\b\s\w\ir\x\w\zstd_install\include /DWIN32 /D_WINDOWS   /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- -Werror=unguarded-availability-new /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw -Wno-deprecated-register -Wno-vla-extension /O2 /Ob2  -std:c++17 -MT   -wd4018 -wd4068 -wd4150 -wd4201 -wd4251 -wd4521 -wd4530 -wd4589  /EHs-c- /GR- -UNDEBUG /showIncludes /Fotools\lldb\source\Plugins\ObjectFile\Minidump\CMakeFiles\lldbPluginObjectFileMinidump.dir\MinidumpFileBuilder.cpp.obj /Fdtools\lldb\source\Plugins\ObjectFile\Minidump\CMakeFiles\lldbPluginObjectFileMinidump.dir\lldbPluginObjectFileMinidump.pdb -c -- C:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source\Plugins\ObjectFile\Minidump\MinidumpFileBuilder.cpp
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source\Plugins\ObjectFile\Minidump\MinidumpFileBuilder.cpp:9:
C:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source\Plugins\ObjectFile\Minidump\MinidumpFileBuilder.h(149,3): error: unknown type name 'uint'; did you mean 'int'?
  149 |   uint m_expected_directories = 0;
      |   ^~~~
      |   int
1 error generated.

Would it be possible to take a look and revert if necessary?

Jlalond · 2024-06-24T21:41:38Z

@petrhosek because it's just going to be a change to something like size_t, would you mind if I just did a quick commit to fix this and assign you as the reviewer?

Jlalond · 2024-06-24T22:30:45Z

Fixed uint issue in #96564

…e_t (#96564) In #95312 I incorrectly set `m_expected_directories` to uint, this broke the windows build and is the incorrect type. `size_t` is more accurate because this value only ever represents the expected upper bound of the directory vector.

vvereschaka · 2024-06-25T05:15:58Z

Hi @Jlalond ,
here is still a problem with the window builds:

FAILED: tools/lldb/source/Plugins/ObjectFile/Minidump/CMakeFiles/lldbPluginObjectFileMinidump.dir/ObjectFileMinidump.cpp.obj 
ccache C:\PROGRA~1\LLVM\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_ENABLE_EXTENDED_ALIGNED_STORAGE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\source\Plugins\ObjectFile\Minidump -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Plugins\ObjectFile\Minidump -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include -IC:\Users\tcwg\AppData\Local\Programs\Python\Python311-arm64\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\..\clang\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\..\clang\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\source /DWIN32 /D_WINDOWS   /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- -Werror=unguarded-availability-new /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw -Wno-deprecated-register -Wno-vla-extension /O2 /Ob2 /DNDEBUG -MD   -wd4018 -wd4068 -wd4150 -wd4201 -wd4251 -wd4521 -wd4530 -wd4589  /EHs-c- /GR- -std:c++17 /showIncludes /Fotools\lldb\source\Plugins\ObjectFile\Minidump\CMakeFiles\lldbPluginObjectFileMinidump.dir\ObjectFileMinidump.cpp.obj /Fdtools\lldb\source\Plugins\ObjectFile\Minidump\CMakeFiles\lldbPluginObjectFileMinidump.dir\lldbPluginObjectFileMinidump.pdb -c -- C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Plugins\ObjectFile\Minidump\ObjectFileMinidump.cpp
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Plugins\ObjectFile\Minidump\ObjectFileMinidump.cpp(21,10): fatal error: 'unistd.h' file not found
   21 | #include <unistd.h>
      |          ^~~~~~~~~~
1 error generated.

Jlalond · 2024-06-25T16:33:40Z

@vvereschaka I'll work on this today. Is it time pressing enough you would want everything reverted?

vvereschaka · 2024-06-25T16:38:23Z

@Jlalond , perfect, thank you. It is ok for me if you'll fix the problem during today.

petrhosek · 2024-06-25T18:36:00Z

We're seeing the same issue as well on our builders, will you be able to address this today?

Jlalond · 2024-06-25T22:06:41Z

@petrhosek I am working on this. I think it's only the header thankfully

![image](https://github.com/llvm/llvm-project/assets/25160653/2044cc8e-72d5-49ec-9439-256555f5fd2b) In #95312 uint and `#include <unistd.h>` were introduced. These broke the windows build. I addressed uint in #96564, but include went unfixed. So I acquired myself a windows machine to validate.

…vm#95312) Currently, LLDB does not support taking a minidump over the 4.2gb limit imposed by uint32. In fact, currently it writes the RVA's and the headers to the end of the file, which can become corrupted due to the header offset only supporting a 32b offset. This change reorganizes how the file structure is laid out. LLDB will precalculate the number of directories required and preallocate space at the top of the file to fill in later. Additionally, thread stacks require a 32b offset, and we provision empty descriptors and keep track of them to clean up once we write the 32b memory list. For [MemoryList64](https://learn.microsoft.com/en-us/windows/win32/api/minidumpapiset/ns-minidumpapiset-minidump_memory64_list), the RVA to the start of the section itself will remain in a 32b addressable space. We achieve this by predetermining the space the memory regions will take, and only writing up to 4.2 gb of data with some buffer to allow all the MemoryDescriptor64s to also still be 32b addressable. I did not add any explicit tests to this PR because allocating 4.2gb+ to test is very expensive. However, we have 32b automation tests and I validated with in several ways, including with 5gb+ array/object and would be willing to add this as a test case.

…e_t (llvm#96564) In llvm#95312 I incorrectly set `m_expected_directories` to uint, this broke the windows build and is the incorrect type. `size_t` is more accurate because this value only ever represents the expected upper bound of the directory vector.

![image](https://github.com/llvm/llvm-project/assets/25160653/2044cc8e-72d5-49ec-9439-256555f5fd2b) In llvm#95312 uint and `#include <unistd.h>` were introduced. These broke the windows build. I addressed uint in llvm#96564, but include went unfixed. So I acquired myself a windows machine to validate.

@mbucko

… regions when /proc/pid maps are missing. (#101086) This PR is in response to a bug my coworker @mbucko discovered where on MacOS Minidumps were being created where the 64b memory regions were readable, but were not being listed in `SBProcess.GetMemoryRegionList()`. This went unnoticed in #95312 due to all the linux testing including /proc/pid maps. On MacOS generated dumps (or any dump without access to /proc/pid) we would fail to properly map Memory Regions due to there being two independent methods for 32b and 64b mapping. In this PR I addressed this minor bug and merged the methods, but in order to add test coverage required additions to `obj2yaml` and `yaml2obj` which make up the bulk of this patch. Lastly, there are some non-required changes such as the addition of the `Memory64ListHeader` type, to make writing/reading the header section of the Memory64List easier.

@mbucko

… regions when /proc/pid maps are missing. (llvm#101086) This PR is in response to a bug my coworker @mbucko discovered where on MacOS Minidumps were being created where the 64b memory regions were readable, but were not being listed in `SBProcess.GetMemoryRegionList()`. This went unnoticed in llvm#95312 due to all the linux testing including /proc/pid maps. On MacOS generated dumps (or any dump without access to /proc/pid) we would fail to properly map Memory Regions due to there being two independent methods for 32b and 64b mapping. In this PR I addressed this minor bug and merged the methods, but in order to add test coverage required additions to `obj2yaml` and `yaml2obj` which make up the bulk of this patch. Lastly, there are some non-required changes such as the addition of the `Memory64ListHeader` type, to make writing/reading the header section of the Memory64List easier.

In #95312 Minidump file creation was moved from being created at the end, to the file being emitted in chunks. This causes some undesirable behavior where the file can still be present after an error has occurred. To resolve this we will now delete the file upon an error.

Jlalond added 5 commits June 12, 2024 14:05

Squash minidump-64-protoype.

8108655

This is a change to reorder LLDB's minidumps to support 64b memory lists/minidumps while still keeping all directories under the 32b limit. Better O(N), not N^2 cleanup code

Remove debug changes to testprocesssavecore

2885cb3

Add some comments and remove changes to ulittle64

762fb82

Run git clang format

5bf2dbb

Change log to zu and region index to size_t

c592936

Jlalond requested a review from JDevlieghere as a code owner June 12, 2024 21:20

llvmbot added lldb objectyaml llvm:binary-utilities labels Jun 12, 2024

kusmour requested review from clayborg and jeffreytan81 June 13, 2024 00:34

clayborg requested changes Jun 13, 2024

View reviewed changes

Jlalond commented Jun 13, 2024

View reviewed changes

lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp Show resolved Hide resolved

Jlalond added 3 commits June 13, 2024 13:17

Code review feedback and an attempt to undo unintended changes

051ebab

Try to fix MinidumpFilerbuilder.h to minimize changes, and fix the fl…

b51cdbf

…ush to disk loop.

Run git clang format

53da9f5

clayborg requested changes Jun 13, 2024

View reviewed changes

Linting, and add some sorting so that we can determine the largest po…

e3841d8

…ssible buffer ahead of time

jeffreytan81 reviewed Jun 14, 2024

View reviewed changes

Implement fut further feedback and safety checks. Fix bug with 64b be…

5e375ea

…ing off by 8 + descirptor count * sizeof(descriptor_64)

clayborg reviewed Jun 15, 2024

View reviewed changes

Implement seek logic instead of explicitly writing bytes. Fix bug and…

f5de3f1

… instead keep tracrack internally. Add comments explaining serialization formating and process as requested by Jeffrey. Run Git clang format.

clayborg reviewed Jun 17, 2024

View reviewed changes

clayborg reviewed Jun 18, 2024

View reviewed changes

Code review feedback on size_t, fix printf for uint64. Cast type to u…

95a0a6c

…int32_t to avoid warnings. Run git clang format.

clayborg reviewed Jun 18, 2024

View reviewed changes

Extra cleanup that Greg pointed out, ensure full use of variable in l…

487c7fc

…oop and make it const. Fix m_total_siz_size formatting. Run git-clang-format

keith added a commit that referenced this pull request Jun 24, 2024

[bazel] Port #95312 (#96533)

0534953

Jlalond mentioned this pull request Jun 24, 2024

[LLDB][Minidump] Change expected directories to the correct type; size_t #96564

Merged

Jlalond mentioned this pull request Jun 26, 2024

[LLDB][Windows Build]Removed header and validated on new windows machine #96724

Merged

AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024

[bazel] Port llvm#95312 (llvm#96533)

08a7ac3

Jlalond mentioned this pull request Jul 29, 2024

[LLDB][Minidump] Fix ProcessMinidump::GetMemoryRegions to include 64b regions when /proc/pid maps are missing. #101086

Merged

Jlalond deleted the minidump-64b-support branch August 6, 2024 18:46

Jlalond mentioned this pull request Sep 11, 2024

[LLDB][Minidump] Minidump erase file on failure #108259

Merged

Jlalond mentioned this pull request Jan 13, 2025

[LLDB][Minidump] Have Minidumps save off and properly read TLS data #109477

Merged

		// Add Exception streams for any threads that stopped with exceptions.
		void AddExceptions(const lldb::ProcessSP &process_sp);

		lldb_private::Status AddThreadList(const lldb::ProcessSP &process_sp);

		@@ -50,48 +60,75 @@ class MinidumpFileBuilder {

		~MinidumpFileBuilder() = default;

		lldb_private::Status AddHeaderAndCalculateDirectories();

		total_size += core_range.range.size();
		total_size += sizeof(llvm::minidump::MemoryDescriptor);

		llvm::support::ulittle32_t memory_ranges_num =
		static_cast<llvm::support::ulittle32_t>(descriptors.size());

		llvm::support::ulittle64_t memory_ranges_num =
		static_cast<llvm::support::ulittle64_t>(ranges.size());

[LLDB][Minidump] Add 64b support to LLDB's minidump file builder. #95312

[LLDB][Minidump] Add 64b support to LLDB's minidump file builder. #95312

Uh oh!

Conversation

Jlalond commented Jun 12, 2024

Uh oh!

llvmbot commented Jun 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jun 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jeffreytan81 commented Jun 14, 2024

llvmbot commented Jun 12, 2024 •

edited

Loading

github-actions bot commented Jun 12, 2024 •

edited

Loading