Skip to content

[LLDB][Minidump] Add 64b support to LLDB's minidump file builder. #95312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jun 24, 2024

Conversation

Jlalond
Copy link
Contributor

@Jlalond Jlalond commented Jun 12, 2024

Currently, LLDB does not support taking a minidump over the 4.2gb limit imposed by uint32. In fact, currently it writes the RVA's and the headers to the end of the file, which can become corrupted due to the header offset only supporting a 32b offset.

This change reorganizes how the file structure is laid out. LLDB will precalculate the number of directories required and preallocate space at the top of the file to fill in later. Additionally, thread stacks require a 32b offset, and we provision empty descriptors and keep track of them to clean up once we write the 32b memory list.

For MemoryList64, the RVA to the start of the section itself will remain in a 32b addressable space. We achieve this by predetermining the space the memory regions will take, and only writing up to 4.2 gb of data with some buffer to allow all the MemoryDescriptor64s to also still be 32b addressable.

I did not add any explicit tests to this PR because allocating 4.2gb+ to test is very expensive. However, we have 32b automation tests and I validated with in several ways, including with 5gb+ array/object and would be willing to add this as a test case.

Jlalond added 5 commits June 12, 2024 14:05
This is a change to reorder LLDB's minidumps to support 64b memory lists/minidumps while still keeping all directories under the 32b limit.

Better O(N), not N^2 cleanup code
@llvmbot
Copy link
Member

llvmbot commented Jun 12, 2024

@llvm/pr-subscribers-lldb
@llvm/pr-subscribers-objectyaml

@llvm/pr-subscribers-llvm-binary-utilities

Author: Jacob Lalonde (Jlalond)

Changes

Currently, LLDB does not support taking a minidump over the 4.2gb limit imposed by uint32. In fact, currently it writes the RVA's and the headers to the end of the file, which can become corrupted due to the header offset only supporting a 32b offset.

This change reorganizes how the file structure is laid out. LLDB will precalculate the number of directories required and preallocate space at the top of the file to fill in later. Additionally, thread stacks require a 32b offset, and we provision empty descriptors and keep track of them to clean up once we write the 32b memory list.

For MemoryList64, the RVA to the start of the section itself will remain in a 32b addressable space. We achieve this by predetermining the space the memory regions will take, and only writing up to 4.2 gb of data with some buffer to allow all the MemoryDescriptor64s to also still be 32b addressable.

I did not add any explicit tests to this PR because allocating 4.2gb+ to test is very expensive. However, we have 32b automation tests and I validated with in several ways, including with 5gb+ array/object and would be willing to add this as a test case.


Patch is 36.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/95312.diff

5 Files Affected:

  • (modified) lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp (+412-118)
  • (modified) lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h (+52-15)
  • (modified) lldb/source/Plugins/ObjectFile/Minidump/ObjectFileMinidump.cpp (+23-25)
  • (modified) llvm/include/llvm/BinaryFormat/Minidump.h (+5)
  • (modified) llvm/lib/ObjectYAML/MinidumpEmitter.cpp (+2)
diff --git a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
index 7231433619ffb..3c1d88652cc92 100644
--- a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
+++ b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
@@ -20,25 +20,98 @@
 #include "lldb/Target/RegisterContext.h"
 #include "lldb/Target/StopInfo.h"
 #include "lldb/Target/ThreadList.h"
+#include "lldb/Utility/DataBufferHeap.h"
 #include "lldb/Utility/DataExtractor.h"
 #include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/Log.h"
+#include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/RegisterValue.h"
 
 #include "llvm/ADT/StringRef.h"
 #include "llvm/BinaryFormat/Minidump.h"
 #include "llvm/Support/ConvertUTF.h"
+#include "llvm/Support/Endian.h"
 #include "llvm/Support/Error.h"
 
 #include "Plugins/Process/minidump/MinidumpTypes.h"
+#include "lldb/lldb-enumerations.h"
+#include "lldb/lldb-forward.h"
+#include "lldb/lldb-types.h"
 
+#include <algorithm>
 #include <cinttypes>
+#include <climits>
+#include <cstddef>
+#include <cstdint>
+#include <iostream>
+#include <set>
+#include <utility>
+#include <vector>
 
 using namespace lldb;
 using namespace lldb_private;
 using namespace llvm::minidump;
 
-void MinidumpFileBuilder::AddDirectory(StreamType type, size_t stream_size) {
+Status MinidumpFileBuilder::AddHeaderAndCalculateDirectories(
+    const lldb_private::Target &target, const lldb::ProcessSP &process_sp) {
+  // First set the offset on the file, and on the bytes saved
+  m_saved_data_size += header_size;
+  // We know we will have at least Misc, SystemInfo, Modules, and ThreadList
+  // (corresponding memory list for stacks) And an additional memory list for
+  // non-stacks.
+  m_expected_directories = 6;
+  // Check if OS is linux
+  if (target.GetArchitecture().GetTriple().getOS() ==
+      llvm::Triple::OSType::Linux)
+    m_expected_directories += 9;
+
+  // Go through all of the threads and check for exceptions.
+  lldb_private::ThreadList thread_list = process_sp->GetThreadList();
+  const uint32_t num_threads = thread_list.GetSize();
+  for (uint32_t thread_idx = 0; thread_idx < num_threads; ++thread_idx) {
+    ThreadSP thread_sp(thread_list.GetThreadAtIndex(thread_idx));
+    StopInfoSP stop_info_sp = thread_sp->GetStopInfo();
+    if (stop_info_sp &&
+        stop_info_sp->GetStopReason() == StopReason::eStopReasonException) {
+      m_expected_directories++;
+    }
+  }
+
+  // Now offset the file by the directores so we can write them in later.
+  offset_t directory_offset = m_expected_directories * directory_size;
+  m_saved_data_size += directory_offset;
+  // Replace this when we make a better way to do this.
+  Status error;
+  Header empty_header;
+  size_t bytes_written;
+  bytes_written = header_size;
+  error = m_core_file->Write(&empty_header, bytes_written);
+  if (error.Fail() || bytes_written != header_size) {
+    if (bytes_written != header_size)
+      error.SetErrorStringWithFormat(
+          "unable to write the header (written %zd/%zd)", bytes_written,
+          header_size);
+    return error;
+  }
+
+  for (uint i = 0; i < m_expected_directories; i++) {
+    size_t bytes_written;
+    bytes_written = directory_size;
+    Directory empty_directory;
+    error = m_core_file->Write(&empty_directory, bytes_written);
+    if (error.Fail() || bytes_written != directory_size) {
+      if (bytes_written != directory_size)
+        error.SetErrorStringWithFormat(
+            "unable to write the directory (written %zd/%zd)", bytes_written,
+            directory_size);
+      return error;
+    }
+  }
+
+  return error;
+}
+
+void MinidumpFileBuilder::AddDirectory(StreamType type, uint64_t stream_size) {
   LocationDescriptor loc;
   loc.DataSize = static_cast<llvm::support::ulittle32_t>(stream_size);
   // Stream will begin at the current end of data section
@@ -49,6 +122,7 @@ void MinidumpFileBuilder::AddDirectory(StreamType type, size_t stream_size) {
   dir.Location = loc;
 
   m_directories.push_back(dir);
+  assert(m_expected_directories >= m_directories.size());
 }
 
 Status MinidumpFileBuilder::AddSystemInfo(const llvm::Triple &target_triple) {
@@ -165,7 +239,6 @@ llvm::Expected<uint64_t> getModuleFileSize(Target &target,
     }
     return SizeOfImage;
   }
-
   SectionSP sect_sp = mod->GetObjectFile()->GetBaseAddress().GetSection();
 
   if (!sect_sp) {
@@ -238,12 +311,10 @@ Status MinidumpFileBuilder::AddModuleList(Target &target) {
     auto maybe_mod_size = getModuleFileSize(target, mod);
     if (!maybe_mod_size) {
       llvm::Error mod_size_err = maybe_mod_size.takeError();
-      llvm::handleAllErrors(std::move(mod_size_err),
-                            [&](const llvm::ErrorInfoBase &E) {
-                              error.SetErrorStringWithFormat(
-                                  "Unable to get the size of module %s: %s.",
-                                  module_name.c_str(), E.message().c_str());
-                            });
+      llvm::handleAllErrors(std::move(mod_size_err), [&](const llvm::ErrorInfoBase &E) {
+        error.SetErrorStringWithFormat("Unable to get the size of module %s: %s.",
+                                     module_name.c_str(), E.message().c_str());
+      });
       return error;
     }
 
@@ -513,6 +584,29 @@ findStackHelper(const lldb::ProcessSP &process_sp, uint64_t rsp) {
   return std::make_pair(addr, size);
 }
 
+Status MinidumpFileBuilder::FixThreads() {
+  Status error;
+  // If we have anything in the heap flush it.
+  FlushToDisk();
+  m_core_file->SeekFromStart(m_thread_list_start);
+  for (auto &pair : m_thread_by_range_start) {
+    // The thread objects will get a new memory descriptor added
+    // When we are emitting the memory list and then we write it here
+    llvm::minidump::Thread thread = pair.second;
+    size_t bytes_to_write = sizeof(llvm::minidump::Thread);
+    size_t bytes_written = bytes_to_write;
+    error = m_core_file->Write(&thread, bytes_written);
+    if (error.Fail() || bytes_to_write != bytes_written) {
+      error.SetErrorStringWithFormat(
+          "Wrote incorrect number of bytes to minidump file. (written %zd/%zd)",
+          bytes_written, bytes_to_write);
+      return error;
+    }
+  }
+
+  return error;
+}
+
 Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
   constexpr size_t minidump_thread_size = sizeof(llvm::minidump::Thread);
   lldb_private::ThreadList thread_list = process_sp->GetThreadList();
@@ -530,10 +624,12 @@ Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
       static_cast<llvm::support::ulittle32_t>(thread_list.GetSize());
   m_data.AppendData(&thread_count, sizeof(llvm::support::ulittle32_t));
 
+  // Take the offset after the thread count.
+  m_thread_list_start = GetCurrentDataEndOffset();
   DataBufferHeap helper_data;
 
   const uint32_t num_threads = thread_list.GetSize();
-
+  Log *log = GetLog(LLDBLog::Object);
   for (uint32_t thread_idx = 0; thread_idx < num_threads; ++thread_idx) {
     ThreadSP thread_sp(thread_list.GetThreadAtIndex(thread_idx));
     RegisterContextSP reg_ctx_sp(thread_sp->GetRegisterContext());
@@ -553,38 +649,18 @@ Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
           arch.GetTriple().getArchName().str().c_str());
       return error;
     }
-    uint64_t sp = reg_ctx->GetSP();
-    auto expected_address_range = findStackHelper(process_sp, sp);
-
-    if (!expected_address_range) {
-      consumeError(expected_address_range.takeError());
-      error.SetErrorString("Unable to get the stack address.");
-      return error;
-    }
-
-    std::pair<uint64_t, uint64_t> range = std::move(*expected_address_range);
-    uint64_t addr = range.first;
-    uint64_t size = range.second;
-
-    auto data_up = std::make_unique<DataBufferHeap>(size, 0);
-    const size_t stack_bytes_read =
-        process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
-
-    if (error.Fail())
-      return error;
 
-    LocationDescriptor stack_memory;
-    stack_memory.DataSize =
-        static_cast<llvm::support::ulittle32_t>(stack_bytes_read);
-    stack_memory.RVA = static_cast<llvm::support::ulittle32_t>(
-        size_before + thread_stream_size + helper_data.GetByteSize());
+    uint64_t sp = reg_ctx->GetSP();
+    MemoryRegionInfo sp_region;
+    process_sp->GetMemoryRegionInfo(sp, sp_region);
 
+    // Emit a blank descriptor
     MemoryDescriptor stack;
-    stack.StartOfMemoryRange = static_cast<llvm::support::ulittle64_t>(addr);
-    stack.Memory = stack_memory;
-
-    helper_data.AppendData(data_up->GetBytes(), stack_bytes_read);
-
+    LocationDescriptor empty_label;
+    empty_label.DataSize = 0;
+    empty_label.RVA = 0;
+    stack.Memory = empty_label;
+    stack.StartOfMemoryRange = 0;
     LocationDescriptor thread_context_memory_locator;
     thread_context_memory_locator.DataSize =
         static_cast<llvm::support::ulittle32_t>(thread_context.size());
@@ -593,6 +669,8 @@ Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
     // Cache thie thread context memory so we can reuse for exceptions.
     m_tid_to_reg_ctx[thread_sp->GetID()] = thread_context_memory_locator;
 
+    LLDB_LOGF(log, "AddThreadList for thread %d: thread_context %zu bytes",
+              thread_idx, thread_context.size());
     helper_data.AppendData(thread_context.data(), thread_context.size());
 
     llvm::minidump::Thread t;
@@ -604,9 +682,13 @@ Status MinidumpFileBuilder::AddThreadList(const lldb::ProcessSP &process_sp) {
     t.EnvironmentBlock = static_cast<llvm::support::ulittle64_t>(0);
     t.Stack = stack, t.Context = thread_context_memory_locator;
 
+    // We save off the stack object so we can circle back and clean it up.
+    m_thread_by_range_start[sp_region.GetRange().GetRangeBase()] = t;
     m_data.AppendData(&t, sizeof(llvm::minidump::Thread));
   }
 
+  LLDB_LOGF(log, "AddThreadList(): total helper_data %zu bytes",
+            helper_data.GetByteSize());
   m_data.AppendData(helper_data.GetBytes(), helper_data.GetByteSize());
   return Status();
 }
@@ -662,64 +744,6 @@ void MinidumpFileBuilder::AddExceptions(const lldb::ProcessSP &process_sp) {
   }
 }
 
-lldb_private::Status
-MinidumpFileBuilder::AddMemoryList(const lldb::ProcessSP &process_sp,
-                                   lldb::SaveCoreStyle core_style) {
-  Status error;
-  Process::CoreFileMemoryRanges core_ranges;
-  error = process_sp->CalculateCoreFileSaveRanges(core_style, core_ranges);
-  if (error.Fail()) {
-    error.SetErrorString("Process doesn't support getting memory region info.");
-    return error;
-  }
-
-  DataBufferHeap helper_data;
-  std::vector<MemoryDescriptor> mem_descriptors;
-  for (const auto &core_range : core_ranges) {
-    // Skip empty memory regions.
-    if (core_range.range.empty())
-      continue;
-    const addr_t addr = core_range.range.start();
-    const addr_t size = core_range.range.size();
-    auto data_up = std::make_unique<DataBufferHeap>(size, 0);
-    const size_t bytes_read =
-        process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
-    if (error.Fail()) {
-      Log *log = GetLog(LLDBLog::Object);
-      LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: %s",
-                bytes_read, error.AsCString());
-      error.Clear();
-    }
-    if (bytes_read == 0)
-      continue;
-    // We have a good memory region with valid bytes to store.
-    LocationDescriptor memory_dump;
-    memory_dump.DataSize = static_cast<llvm::support::ulittle32_t>(bytes_read);
-    memory_dump.RVA =
-        static_cast<llvm::support::ulittle32_t>(GetCurrentDataEndOffset());
-    MemoryDescriptor memory_desc;
-    memory_desc.StartOfMemoryRange =
-        static_cast<llvm::support::ulittle64_t>(addr);
-    memory_desc.Memory = memory_dump;
-    mem_descriptors.push_back(memory_desc);
-    m_data.AppendData(data_up->GetBytes(), bytes_read);
-  }
-
-  AddDirectory(StreamType::MemoryList,
-               sizeof(llvm::support::ulittle32_t) +
-                   mem_descriptors.size() *
-                       sizeof(llvm::minidump::MemoryDescriptor));
-  llvm::support::ulittle32_t memory_ranges_num(mem_descriptors.size());
-
-  m_data.AppendData(&memory_ranges_num, sizeof(llvm::support::ulittle32_t));
-  for (auto memory_descriptor : mem_descriptors) {
-    m_data.AppendData(&memory_descriptor,
-                      sizeof(llvm::minidump::MemoryDescriptor));
-  }
-
-  return error;
-}
-
 void MinidumpFileBuilder::AddMiscInfo(const lldb::ProcessSP &process_sp) {
   AddDirectory(StreamType::MiscInfo,
                sizeof(lldb_private::minidump::MinidumpMiscInfo));
@@ -782,6 +806,7 @@ void MinidumpFileBuilder::AddLinuxFileStreams(
         {StreamType::LinuxProcFD, "/proc/" + pid_str + "/fd"});
   }
 
+  Status error;
   for (const auto &entry : files_with_stream_types) {
     StreamType stream = entry.first;
     std::string path = entry.second;
@@ -797,10 +822,64 @@ void MinidumpFileBuilder::AddLinuxFileStreams(
   }
 }
 
-Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
-  constexpr size_t header_size = sizeof(llvm::minidump::Header);
-  constexpr size_t directory_size = sizeof(llvm::minidump::Directory);
+Status MinidumpFileBuilder::AddMemory(const ProcessSP &process_sp,
+                                      SaveCoreStyle core_style) {
+  Status error;
+
+  Process::CoreFileMemoryRanges ranges_for_memory_list;
+  error = process_sp->CalculateCoreFileSaveRanges(
+      SaveCoreStyle::eSaveCoreStackOnly, ranges_for_memory_list);
+  if (error.Fail()) {
+    return error;
+  }
+
+  std::set<addr_t> stack_ranges;
+  for (const auto &core_range : ranges_for_memory_list) {
+    stack_ranges.insert(core_range.range.start());
+  }
+  // We leave a little padding for dictionary and any other metadata we would
+  // want. Also so that we can put the header of the memory list 64 in 32b land,
+  // because the directory requires a 32b RVA.
+  Process::CoreFileMemoryRanges ranges;
+  error = process_sp->CalculateCoreFileSaveRanges(core_style, ranges);
+  if (error.Fail()) {
+    return error;
+  }
+
+  uint64_t total_size =
+      256 + (ranges.size() * sizeof(llvm::minidump::MemoryDescriptor_64));
+  // Take all the memory that will fit in the 32b range.
+  for (int i = ranges.size() - 1; i >= 0; i--) {
+    addr_t size_to_add =
+        ranges[i].range.size() + sizeof(llvm::minidump::MemoryDescriptor);
+    // filter out the stacks and check if it's below 32b max.
+    if (stack_ranges.count(ranges[i].range.start()) > 0) {
+      ranges.erase(ranges.begin() + i);
+    } else if (total_size + size_to_add < UINT32_MAX) {
+      ranges_for_memory_list.push_back(ranges[i]);
+      total_size += ranges[i].range.size();
+      total_size += sizeof(llvm::minidump::MemoryDescriptor);
+      ranges.erase(ranges.begin() + i);
+    } else {
+      break;
+    }
+  }
 
+  error = AddMemoryList_32(process_sp, ranges_for_memory_list);
+  if (error.Fail())
+    return error;
+
+  // Add the remaining memory as a 64b range.
+  if (ranges.size() > 0) {
+    error = AddMemoryList_64(process_sp, ranges);
+    if (error.Fail())
+      return error;
+  }
+
+  return FixThreads();
+}
+
+Status MinidumpFileBuilder::DumpHeader() const {
   // write header
   llvm::minidump::Header header;
   header.Signature = static_cast<llvm::support::ulittle32_t>(
@@ -808,9 +887,10 @@ Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
   header.Version = static_cast<llvm::support::ulittle32_t>(
       llvm::minidump::Header::MagicVersion);
   header.NumberOfStreams =
-      static_cast<llvm::support::ulittle32_t>(GetDirectoriesNum());
+      static_cast<llvm::support::ulittle32_t>(m_directories.size());
+  // We write the directories right after the header.
   header.StreamDirectoryRVA =
-      static_cast<llvm::support::ulittle32_t>(GetCurrentDataEndOffset());
+      static_cast<llvm::support::ulittle32_t>(header_size);
   header.Checksum = static_cast<llvm::support::ulittle32_t>(
       0u), // not used in most of the writers
       header.TimeDateStamp =
@@ -821,8 +901,9 @@ Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
   Status error;
   size_t bytes_written;
 
+  m_core_file->SeekFromStart(0);
   bytes_written = header_size;
-  error = core_file->Write(&header, bytes_written);
+  error = m_core_file->Write(&header, bytes_written);
   if (error.Fail() || bytes_written != header_size) {
     if (bytes_written != header_size)
       error.SetErrorStringWithFormat(
@@ -830,22 +911,20 @@ Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
           header_size);
     return error;
   }
+  return error;
+}
 
-  // write data
-  bytes_written = m_data.GetByteSize();
-  error = core_file->Write(m_data.GetBytes(), bytes_written);
-  if (error.Fail() || bytes_written != m_data.GetByteSize()) {
-    if (bytes_written != m_data.GetByteSize())
-      error.SetErrorStringWithFormat(
-          "unable to write the data (written %zd/%" PRIu64 ")", bytes_written,
-          m_data.GetByteSize());
-    return error;
-  }
+size_t MinidumpFileBuilder::GetCurrentDataEndOffset() const {
+  return m_data.GetByteSize() + m_saved_data_size;
+}
 
-  // write directories
+Status MinidumpFileBuilder::DumpDirectories() const {
+  Status error;
+  size_t bytes_written;
+  m_core_file->SeekFromStart(header_size);
   for (const Directory &dir : m_directories) {
     bytes_written = directory_size;
-    error = core_file->Write(&dir, bytes_written);
+    error = m_core_file->Write(&dir, bytes_written);
     if (error.Fail() || bytes_written != directory_size) {
       if (bytes_written != directory_size)
         error.SetErrorStringWithFormat(
@@ -858,10 +937,225 @@ Status MinidumpFileBuilder::Dump(lldb::FileUP &core_file) const {
   return error;
 }
 
-size_t MinidumpFileBuilder::GetDirectoriesNum() const {
-  return m_directories.size();
+Status MinidumpFileBuilder::AddMemoryList_32(
+    const ProcessSP &process_sp, const Process::CoreFileMemoryRanges &ranges) {
+  std::vector<MemoryDescriptor> descriptors;
+  Status error;
+  Log *log = GetLog(LLDBLog::Object);
+  size_t region_index = 0;
+  for (const auto &core_range : ranges) {
+    // Take the offset before we write.
+    const size_t offset_for_data = GetCurrentDataEndOffset();
+    const addr_t addr = core_range.range.start();
+    const addr_t size = core_range.range.size();
+    auto data_up = std::make_unique<DataBufferHeap>(size, 0);
+
+    LLDB_LOGF(
+        log,
+        "AddMemoryList %zu/%zu reading memory for region (%zu bytes) [%zx, %zx)",
+        region_index, ranges.size(), size, addr, addr + size);
+    ++region_index;
+
+    const size_t bytes_read =
+        process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
+    if (error.Fail() || bytes_read == 0) {
+      LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: %s",
+                bytes_read, error.AsCString());
+      // Just skip sections with errors or zero bytes in 32b mode
+      continue;
+    } else if (bytes_read != size) {
+      LLDB_LOGF(log, "Memory region at: %zu failed to read %zu bytes", addr,
+                size);
+    }
+
+    MemoryDescriptor descriptor;
+    descriptor.StartOfMemoryRange =
+        static_cast<llvm::support::ulittle64_t>(addr);
+    descriptor.Memory.DataSize =
+        static_cast<llvm::support::ulittle32_t>(bytes_read);
+    descriptor.Memory.RVA =
+        static_cast<llvm::support::ulittle32_t>(offset_for_data);
+    descriptors.push_back(descriptor);
+    if (m_thread_by_range_start.count(addr) > 0) {
+      m_thread_by_range_start[addr].Stack = descriptor;
+    }
+
+    // Add the data to the buffer, flush as needed.
+    error = AddData(data_up->GetBytes(), bytes_read);
+    if (error.Fail())
+      return error;
+  }
+
+  // Add a directory that references this list
+  // With a size of the number of ranges as a 32 bit num
+  // And then the size of all the ranges
+  AddDirectory(StreamType::MemoryList,
+               sizeof(llvm::support::ulittle32_t) +
+                   descriptors.size() *
+                       sizeof(llvm::minidump::MemoryDescriptor));
+
+  llvm::support::ulittle32_t memory_ranges_num =
+      static_cast<llvm::support::ulittle32_t>(descriptors.size());
+  m_data.Ap...
[truncated]

Copy link

github-actions bot commented Jun 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@kusmour kusmour requested review from clayborg and jeffreytan81 June 13, 2024 00:34
Comment on lines 62 to 64
lldb_private::Status AddThreadList(const lldb::ProcessSP &process_sp);
// Add Exception streams for any threads that stopped with exceptions.
void AddExceptions(const lldb::ProcessSP &process_sp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These haven't changed at all, so I would move them back to this line. No need to make changes to this header file if things haven't changed (AddThreadList and AddExceptions). So make as much of this the same as it was before by reverting the above 3 lines and removing them from below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to lay these out in an order of operations (but had to some more), still want me to put them back?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, always good to minimize code changes to make reviewing easier

Comment on lines 75 to 76
// Add Exception streams for any threads that stopped with exceptions.
void AddExceptions(const lldb::ProcessSP &process_sp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert to be the same as the lines removed from above to minimize changes to this file

Comment on lines 79 to 80
lldb_private::Status AddThreadList(const lldb::ProcessSP &process_sp);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert these two lines and use the old ones that were removed above.

@@ -40,7 +46,7 @@ lldb_private::Status WriteString(const std::string &to_write,
/// the data on heap.
class MinidumpFileBuilder {
public:
MinidumpFileBuilder() = default;
MinidumpFileBuilder(lldb::FileUP&& core_file): m_core_file(std::move(core_file)) {};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should a const lldb::ProcessSP &process_sp to the constructor and remove all const lldb::ProcessSP &process_sp arguemnts from the methods below. We never want to change processes while creating a minidump. We should store the process shared pointer as an instance variable and then the methods that used to take the const lldb::ProcessSP &process_sp as an argument now use the instance variable:

  lldb::ProcessSP m_process_sp;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this. I also think it reduces the split responsibility with ObjectfileMinidump


// Stores directories to later put them at the end of minidump file
void AddDirectory(llvm::minidump::StreamType type, uint64_t stream_size);
lldb::addr_t GetCurrentDataEndOffset() const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return lldb::offset_t instead of lldb::addr_t.

return false;
}

builder.AddModuleList(target);
builder.AddMiscInfo(process_sp);

error = builder.AddThreadList(process_sp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

return false;
}
if (target.GetArchitecture().GetTriple().getOS() ==
llvm::Triple::OSType::Linux) {
builder.AddLinuxFileStreams(process_sp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove process_sp. If we construct the builder with a process_sp, it can use the m_process_sp instance variable that the MinidumpFileBuilder has to grab the target or process when it needs one.

if (target.GetArchitecture().GetTriple().getOS() ==
llvm::Triple::OSType::Linux) {
builder.AddLinuxFileStreams(process_sp);
}

// Add any exceptions but only if there are any in any threads.
builder.AddExceptions(process_sp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove process_sp. If we construct the builder with a process_sp, it can use the m_process_sp instance variable that the MinidumpFileBuilder has to grab the target or process when it needs one.

error = builder.AddMemoryList(process_sp, core_style);
// Note: add memory HAS to be the last thing we do. It can overflow into 64b
// land and many RVA's only support 32b
error = builder.AddMemory(process_sp, core_style);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove process_sp. If we construct the builder with a process_sp, it can use the m_process_sp instance variable that the MinidumpFileBuilder has to grab the target or process when it needs one.

size_t GetDirectoriesNum() const;
void AddLinuxFileStreams();

lldb_private::Status AddMemory(lldb::SaveCoreStyle core_style);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave this named the same as it was before as AddMemoryList and move it back up to where it used to be in this source file

lldb_private::Status AddMemory(const lldb::ProcessSP &process_sp,
lldb::SaveCoreStyle core_style);

lldb_private::Status DumpToFile();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to just have a Finalize() that does what CompleteFile and FinalizeFile and WriteAndFinish would do?

std::vector<llvm::minidump::Directory> m_directories;
// When we write off the threads for the first time, we need to clean them up
// and give them the correct RVA once we write the stack memory list.
std::map<lldb::addr_t, llvm::minidump::Thread> m_thread_by_range_start;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is an address then lldb::addr_t is the right choice. I must have confused this with another location that needed to use lldb::offset_t for an offset that was being stored as a lldb::addr_t

void MinidumpFileBuilder::AddDirectory(StreamType type, size_t stream_size) {
Status MinidumpFileBuilder::AddHeaderAndCalculateDirectories() {
// First set the offset on the file, and on the bytes saved
m_saved_data_size += header_size;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have data already in the file here? If not we should assign m_saved_data_size here:

m_saved_data_size = header_size;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better name for m_saved_data_size?


lldb_private::Target &target = m_process_sp->GetTarget();
m_expected_directories = 6;
// Check if OS is linux
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Check if OS is linux and reserve directory space for all linux specific breakpad extension directories.

Comment on lines 842 to 845
if (stack_start_addresses.count(core_range.range.start()) > 0) {
// Don't double save stacks.
continue;
} else if (total_size + size_to_add < UINT32_MAX) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for else if if we continue:

if (stack_start_addresses.count(core_range.range.start()) > 0)
  continue;  // Stacks are already in ranges_32

if (total_size + size_to_add < UINT32_MAX)

std::vector<MemoryDescriptor> descriptors;
Status error;
Log *log = GetLog(LLDBLog::Object);
size_t region_index = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make on DataBufferHeap outside of this loop and re-use it:

lldb::offset_t max_size = 0;
for (const auto &core_range : ranges) {
  if (core_range.renge.size() > max_size)
    max_size = core_range.renge.size();
}
auto data_up = std::make_unique<DataBufferHeap>(max_size, 0);

auto data_up = std::make_unique(size, 0);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Albeit the m_data buffer that the class uses does get used and then reallocated when we call Clear() in flush to disk. May want to look into keeping a write index and a range...

const size_t offset_for_data = GetCurrentDataEndOffset();
const addr_t addr = core_range.range.start();
const addr_t size = core_range.range.size();
auto data_up = std::make_unique<DataBufferHeap>(size, 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-use the one copy of data_up from above, just remove this line.

Comment on lines 979 to 981
if (m_thread_by_range_start.count(addr) > 0) {
m_thread_by_range_start[addr].Stack = descriptor;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to remove these braces from this if

for (const auto &core_range : ranges) {
const addr_t addr = core_range.range.start();
const addr_t size = core_range.range.size();
auto data_up = std::make_unique<DataBufferHeap>(size, 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this out of the for loop and have it contain max_size bytes just like in the 32 bit variant of this function.

lldb_private::Status AddData(const void *data, size_t size);
// Add MemoryList stream, containing dumps of important memory segments
lldb_private::Status
AddMemoryList_64(const lldb_private::Process::CoreFileMemoryRanges &ranges);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would simply call AddMemoryList64, AddMemoryList32 and Is64Bit.
Let's be consistent about the member function naming convention.

Comment on lines 122 to 125
static constexpr size_t m_write_chunk_max = (1024 * 1024 * 128);

static constexpr size_t header_size = sizeof(llvm::minidump::Header);
static constexpr size_t directory_size = sizeof(llvm::minidump::Directory);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming convention is not consistent. For constexpr, either MAX_WRITE_CHUNK or MaxChunkSize for readability.
Similar for HEADER_SIZE and DIRECTORY_SIZE.

return false;
}

builder.AddMiscInfo(process_sp);
builder.AddModuleList();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's handle the failure from AddModuleList(). I am pretty sure I have seen a failure once.


std::set<addr_t> stack_start_addresses;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do not need ordering of the elements in the set, use std::unordered_set.

lldb_private::Status AddMemory(const lldb::ProcessSP &process_sp,
lldb::SaveCoreStyle core_style);

lldb_private::Status DumpToFile();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you are describing it, you can see this method is doing two things. I do not see the benefit of grouping them into one confusing named function. Let's split it into two functions:

  1. FixupHeaderDirectory()
  2. FlushToFile()

for (auto &pair : m_thread_by_range_start) {
// The thread objects will get a new memory descriptor added
// When we are emitting the memory list and then we write it here
llvm::minidump::Thread thread = pair.second;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think you need to copy:

const llvm::minidump::Thread& thread

AddMemoryList_64(const lldb_private::Process::CoreFileMemoryRanges &ranges);
lldb_private::Status
AddMemoryList_32(const lldb_private::Process::CoreFileMemoryRanges &ranges);
lldb_private::Status FixThreads();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You should add comment explaining what is "FixThreads()" doing and why we need it.
  2. FixThreads is a very generic name, but you are really fixing up the stack field only. So rename to "FixupThreadStack()" to be more specific.

@@ -50,48 +60,75 @@ class MinidumpFileBuilder {

~MinidumpFileBuilder() = default;

lldb_private::Status AddHeaderAndCalculateDirectories();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add comment to clarify that this is partially done and which fields require later fixup.

@jeffreytan81
Copy link
Contributor

Can this be landed safely? Do we need to update consumer side to support memory list 64? For example, if this diff is landed, and a user tries to save a minidump with memory list 64, if there is no consumer side support, will lldb crash?

…ing off by 8 + descirptor count * sizeof(descriptor_64)
Comment on lines 88 to 101
while (remaining_bytes > 0) {
// Keep filling in zero's until we preallocate enough space for the header
// and directory sections.
size_t bytes_written = std::min(remaining_bytes, sizeof(size_t));
error = m_core_file->Write(&zeroes, bytes_written);
if (error.Fail()) {
error.SetErrorStringWithFormat(
"unable to write the header (written %zd/%zd)", bytes_written,
header_size);
return error;
}

for (uint i = 0; i < m_expected_directories; i++) {
size_t bytes_written;
bytes_written = directory_size;
Directory empty_directory;
error = m_core_file->Write(&empty_directory, bytes_written);
if (error.Fail() || bytes_written != directory_size) {
if (bytes_written != directory_size)
error.SetErrorStringWithFormat(
"unable to write the directory (written %zd/%zd)", bytes_written,
directory_size);
return error;
"Unable to write header and directory padding (written %zd/%zd)",
remaining_bytes - m_saved_data_size, m_saved_data_size);
break;
}

remaining_bytes -= bytes_written;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this while look and complex functionality and just call:

m_core_file->SeekFromStart(m_saved_data_size);

No need to write zeroes.

void MinidumpFileBuilder::AddDirectory(StreamType type, size_t stream_size) {
Status MinidumpFileBuilder::AddHeaderAndCalculateDirectories() {
// First set the offset on the file, and on the bytes saved
m_saved_data_size += header_size;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better name for m_saved_data_size?

Comment on lines 86 to 101
size_t zeroes = 0; // 8 0's
size_t remaining_bytes = m_saved_data_size;
while (remaining_bytes > 0) {
// Keep filling in zero's until we preallocate enough space for the header
// and directory sections.
size_t bytes_written = std::min(remaining_bytes, sizeof(size_t));
error = m_core_file->Write(&zeroes, bytes_written);
if (error.Fail()) {
error.SetErrorStringWithFormat(
"Unable to write header and directory padding (written %zd/%zd)",
remaining_bytes - m_saved_data_size, m_saved_data_size);
break;
}

remaining_bytes -= bytes_written;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just set the file position in the m_core_file:

m_core_file->SeekFromStart(m_saved_data_size);

And remove this entire while loop.

Comment on lines 839 to 842
uint64_t total_size =
ranges_32.size() * sizeof(llvm::minidump::MemoryDescriptor);
for (const auto &core_range : ranges_32)
total_size += core_range.range.size();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge this total_size calculation stuff into the loop above on line 836. No need to iterate over this twice.

Comment on lines 847 to 848
total_size += core_range.range.size();
total_size += sizeof(llvm::minidump::MemoryDescriptor);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can still be just:

total_size += size_to_add;

Comment on lines +1026 to +1027
llvm::support::ulittle32_t memory_ranges_num =
static_cast<llvm::support::ulittle32_t>(descriptors.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

llvm::support::ulittle32_t memory_ranges_num(descriptors.size());

No need for static_cast right. Lots of places already had these incorrect extra static_cast calls..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm torn on the static casts. I know the little endian support types can support big endian assignment, but I think the static cast makes it very explicit that these values have to be little endian

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assignment operator can and must work for these llvm::support::ulittleXX_t values, we should take advantage of them and keep the code cleaner. Removing all static_casts makes this code easier to read and avoids people making copy/paste of similar code in the future. I know there were some static casts before this patch, but we should remove them.

Comment on lines 1020 to 1022
sizeof(llvm::support::ulittle32_t) +
descriptors.size() *
sizeof(llvm::minidump::MemoryDescriptor));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent ok here? Run clang-format on this

Comment on lines +1049 to +1050
llvm::support::ulittle64_t memory_ranges_num =
static_cast<llvm::support::ulittle64_t>(ranges.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No static cast:

llvm::support::ulittle64_t memory_ranges_num(ranges.size());

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove static casts

Comment on lines +1058 to +1059
llvm::support::ulittle64_t memory_ranges_base_rva =
static_cast<llvm::support::ulittle64_t>(base_rva);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no static cast

Comment on lines +1070 to +1073
memory_desc.StartOfMemoryRange =
static_cast<llvm::support::ulittle64_t>(core_range.range.start());
memory_desc.DataSize =
static_cast<llvm::support::ulittle64_t>(core_range.range.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove static cast

… instead keep tracrack internally. Add comments explaining serialization formating and process as requested by Jeffrey.

Run Git clang format.

Process::CoreFileMemoryRanges ranges_32;
Process::CoreFileMemoryRanges ranges_64;
error = m_process_sp->CalculateCoreFileSaveRanges(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here:

// We first save the thread stacks to ensure they fit in the first UINT32_MAX 
// bytes of the core file. Thread structures in minidump files can only use
// 32 bit memory descriptiors, so we emit them first to ensure the memory is
// in accessible with a 32 bit offset.

// write header
llvm::minidump::Header header;
header.Signature = static_cast<llvm::support::ulittle32_t>(
llvm::minidump::Header::MagicSignature);
header.Version = static_cast<llvm::support::ulittle32_t>(
llvm::minidump::Header::MagicVersion);
header.NumberOfStreams =
static_cast<llvm::support::ulittle32_t>(GetDirectoriesNum());
static_cast<llvm::support::ulittle32_t>(m_directories.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove static_cast, not needed.

header.StreamDirectoryRVA =
static_cast<llvm::support::ulittle32_t>(GetCurrentDataEndOffset());
static_cast<llvm::support::ulittle32_t>(HEADER_SIZE);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove static_cast, not needed.

Comment on lines +1000 to +1005
descriptor.StartOfMemoryRange =
static_cast<llvm::support::ulittle64_t>(addr);
descriptor.Memory.DataSize =
static_cast<llvm::support::ulittle32_t>(bytes_read);
descriptor.Memory.RVA =
static_cast<llvm::support::ulittle32_t>(offset_for_data);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove static_casts

Comment on lines +1026 to +1027
llvm::support::ulittle32_t memory_ranges_num =
static_cast<llvm::support::ulittle32_t>(descriptors.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assignment operator can and must work for these llvm::support::ulittleXX_t values, we should take advantage of them and keep the code cleaner. Removing all static_casts makes this code easier to read and avoids people making copy/paste of similar code in the future. I know there were some static casts before this patch, but we should remove them.

Comment on lines +1049 to +1050
llvm::support::ulittle64_t memory_ranges_num =
static_cast<llvm::support::ulittle64_t>(ranges.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove static casts

size_t MinidumpFileBuilder::GetDirectoriesNum() const {
return m_directories.size();
static size_t GetLargestRange(const Process::CoreFileMemoryRanges &ranges) {
size_t max_size = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use uint64_t

…int32_t to avoid warnings. Run git clang format.
if (new_offset != m_saved_data_size)
error.SetErrorStringWithFormat("Failed to fill in header and directory "
"sections. Written / Expected (%" PRIx64
" / %zu)",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m_saved_data_size is uint64_t, so don't use " / %zu), use ` / %" PRIu64 ")"

Comment on lines 861 to 863
total_size +=
256 + (all_core_memory_ranges.size() - stack_start_addresses.size()) *
sizeof(llvm::minidump::MemoryDescriptor_64);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this total_size will be wrong if all_core_memory_ranges is empty right? If we are saving stacks only, this will be:

256 + (0 - stack_start_addresses.size()) * sizeof(llvm::minidump::MemoryDescriptor_64);

And we will unsigned underflow but it won't matter because we won't use total_size. Best to put a:

if (!all_core_memory_ranges.empty()) {

around the total_size and for loop that iterates over all core ranges to avoid potential issues in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually works, because we'll never enter the loop unless all_core_memory_ranges is non empty, where total_size will actually be used. Prior to code review feedback we always called to get memory ranges, even when it's just stacks, ensuring we would always have at least stacks in all_core_memory_ranges.

I have added the check for empty and an explanation comment as well.


if (total_size + range_size < UINT32_MAX) {
ranges_32.push_back(core_range);
total_size += core_range.range.size();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use range_size that you created above.

sizeof(llvm::minidump::MemoryDescriptor_64);

for (const auto &core_range : all_core_memory_ranges) {
addr_t range_size = core_range.range.size();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make const

…oop and make it const. Fix m_total_siz_size formatting. Run git-clang-format
keith added a commit that referenced this pull request Jun 24, 2024
@petrhosek
Copy link
Member

This broke the Windows build with the following error:

C:\b\s\w\ir\x\w\cipd\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DLIBXML_STATIC -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_ENABLE_EXTENDED_ALIGNED_STORAGE -D_GLIBCXX_ASSERTIONS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\b\s\w\ir\x\w\llvm_build\tools\lldb\source\Plugins\ObjectFile\Minidump -IC:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source\Plugins\ObjectFile\Minidump -IC:\b\s\w\ir\x\w\llvm-llvm-project\lldb\include -IC:\b\s\w\ir\x\w\llvm_build\tools\lldb\include -IC:\b\s\w\ir\x\w\rc\tensorflow-venv\store\python_venv-51qj4upsi8nrslpsnfp48k5j14\contents\Lib\site-packages\tensorflow\include -IC:\b\s\w\ir\x\w\llvm_build\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\llvm\include -IC:\b\s\w\ir\x\w\lldb_install\python3\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\llvm\..\clang\include -IC:\b\s\w\ir\x\w\llvm_build\tools\lldb\..\clang\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source -IC:\b\s\w\ir\x\w\llvm_build\tools\lldb\source -imsvcC:\b\s\w\ir\x\w\libxml2_install_target\include\libxml2 -imsvcC:\b\s\w\ir\x\w\zlib_install_target\include -imsvcC:\b\s\w\ir\x\w\zstd_install\include /DWIN32 /D_WINDOWS   /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- -Werror=unguarded-availability-new /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw -Wno-deprecated-register -Wno-vla-extension /O2 /Ob2  -std:c++17 -MT   -wd4018 -wd4068 -wd4150 -wd4201 -wd4251 -wd4521 -wd4530 -wd4589  /EHs-c- /GR- -UNDEBUG /showIncludes /Fotools\lldb\source\Plugins\ObjectFile\Minidump\CMakeFiles\lldbPluginObjectFileMinidump.dir\MinidumpFileBuilder.cpp.obj /Fdtools\lldb\source\Plugins\ObjectFile\Minidump\CMakeFiles\lldbPluginObjectFileMinidump.dir\lldbPluginObjectFileMinidump.pdb -c -- C:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source\Plugins\ObjectFile\Minidump\MinidumpFileBuilder.cpp
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source\Plugins\ObjectFile\Minidump\MinidumpFileBuilder.cpp:9:
C:\b\s\w\ir\x\w\llvm-llvm-project\lldb\source\Plugins\ObjectFile\Minidump\MinidumpFileBuilder.h(149,3): error: unknown type name 'uint'; did you mean 'int'?
  149 |   uint m_expected_directories = 0;
      |   ^~~~
      |   int
1 error generated.

Would it be possible to take a look and revert if necessary?

@Jlalond
Copy link
Contributor Author

Jlalond commented Jun 24, 2024

@petrhosek because it's just going to be a change to something like size_t, would you mind if I just did a quick commit to fix this and assign you as the reviewer?

@Jlalond
Copy link
Contributor Author

Jlalond commented Jun 24, 2024

Fixed uint issue in #96564

Jlalond added a commit that referenced this pull request Jun 24, 2024
…e_t (#96564)

In #95312 I incorrectly set `m_expected_directories` to uint, this broke
the windows build and is the incorrect type.

`size_t` is more accurate because this value only ever represents the
expected upper bound of the directory vector.
@vvereschaka
Copy link
Contributor

Hi @Jlalond ,
here is still a problem with the window builds:

FAILED: tools/lldb/source/Plugins/ObjectFile/Minidump/CMakeFiles/lldbPluginObjectFileMinidump.dir/ObjectFileMinidump.cpp.obj 
ccache C:\PROGRA~1\LLVM\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_ENABLE_EXTENDED_ALIGNED_STORAGE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\source\Plugins\ObjectFile\Minidump -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Plugins\ObjectFile\Minidump -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include -IC:\Users\tcwg\AppData\Local\Programs\Python\Python311-arm64\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\..\clang\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\..\clang\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\source /DWIN32 /D_WINDOWS   /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- -Werror=unguarded-availability-new /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw -Wno-deprecated-register -Wno-vla-extension /O2 /Ob2 /DNDEBUG -MD   -wd4018 -wd4068 -wd4150 -wd4201 -wd4251 -wd4521 -wd4530 -wd4589  /EHs-c- /GR- -std:c++17 /showIncludes /Fotools\lldb\source\Plugins\ObjectFile\Minidump\CMakeFiles\lldbPluginObjectFileMinidump.dir\ObjectFileMinidump.cpp.obj /Fdtools\lldb\source\Plugins\ObjectFile\Minidump\CMakeFiles\lldbPluginObjectFileMinidump.dir\lldbPluginObjectFileMinidump.pdb -c -- C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Plugins\ObjectFile\Minidump\ObjectFileMinidump.cpp
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Plugins\ObjectFile\Minidump\ObjectFileMinidump.cpp(21,10): fatal error: 'unistd.h' file not found
   21 | #include <unistd.h>
      |          ^~~~~~~~~~
1 error generated.

@Jlalond
Copy link
Contributor Author

Jlalond commented Jun 25, 2024

@vvereschaka I'll work on this today. Is it time pressing enough you would want everything reverted?

@vvereschaka
Copy link
Contributor

@Jlalond , perfect, thank you. It is ok for me if you'll fix the problem during today.

@petrhosek
Copy link
Member

We're seeing the same issue as well on our builders, will you be able to address this today?

@Jlalond
Copy link
Contributor Author

Jlalond commented Jun 25, 2024

@petrhosek I am working on this. I think it's only the header thankfully

Jlalond added a commit that referenced this pull request Jun 26, 2024
![image](https://github.com/llvm/llvm-project/assets/25160653/2044cc8e-72d5-49ec-9439-256555f5fd2b)

In #95312 uint and `#include <unistd.h>` were introduced. These broke
the windows build. I addressed uint in #96564, but include went unfixed.
So I acquired myself a windows machine to validate.
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
…vm#95312)

Currently, LLDB does not support taking a minidump over the 4.2gb limit imposed by uint32. In fact, currently it writes the RVA's and the headers to the end of the file, which can become corrupted due to the header offset only supporting a 32b offset.

This change reorganizes how the file structure is laid out. LLDB will precalculate the number of directories required and preallocate space at the top of the file to fill in later. Additionally, thread stacks require a 32b offset, and we provision empty descriptors and keep track of them to clean up once we write the 32b memory list.

For
[MemoryList64](https://learn.microsoft.com/en-us/windows/win32/api/minidumpapiset/ns-minidumpapiset-minidump_memory64_list),
the RVA to the start of the section itself will remain in a 32b addressable space. We achieve this by predetermining the space the memory regions will take, and only writing up to 4.2 gb of data with some buffer to allow all the MemoryDescriptor64s to also still be 32b addressable.

I did not add any explicit tests to this PR because allocating 4.2gb+ to test is very expensive. However, we have 32b automation tests and I validated with in several ways, including with 5gb+ array/object and would be willing to add this as a test case.
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
…e_t (llvm#96564)

In llvm#95312 I incorrectly set `m_expected_directories` to uint, this broke
the windows build and is the incorrect type.

`size_t` is more accurate because this value only ever represents the
expected upper bound of the directory vector.
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
![image](https://github.com/llvm/llvm-project/assets/25160653/2044cc8e-72d5-49ec-9439-256555f5fd2b)

In llvm#95312 uint and `#include <unistd.h>` were introduced. These broke
the windows build. I addressed uint in llvm#96564, but include went unfixed.
So I acquired myself a windows machine to validate.
@Jlalond Jlalond deleted the minidump-64b-support branch August 6, 2024 18:46
Jlalond added a commit that referenced this pull request Aug 21, 2024
… regions when /proc/pid maps are missing. (#101086)

This PR is in response to a bug my coworker @mbucko discovered where on
MacOS Minidumps were being created where the 64b memory regions were
readable, but were not being listed in
`SBProcess.GetMemoryRegionList()`. This went unnoticed in #95312 due to
all the linux testing including /proc/pid maps. On MacOS generated dumps
(or any dump without access to /proc/pid) we would fail to properly map
Memory Regions due to there being two independent methods for 32b and
64b mapping.

In this PR I addressed this minor bug and merged the methods, but in
order to add test coverage required additions to `obj2yaml` and
`yaml2obj` which make up the bulk of this patch.

Lastly, there are some non-required changes such as the addition of the
`Memory64ListHeader` type, to make writing/reading the header section of
the Memory64List easier.
cjdb pushed a commit to cjdb/llvm-project that referenced this pull request Aug 23, 2024
… regions when /proc/pid maps are missing. (llvm#101086)

This PR is in response to a bug my coworker @mbucko discovered where on
MacOS Minidumps were being created where the 64b memory regions were
readable, but were not being listed in
`SBProcess.GetMemoryRegionList()`. This went unnoticed in llvm#95312 due to
all the linux testing including /proc/pid maps. On MacOS generated dumps
(or any dump without access to /proc/pid) we would fail to properly map
Memory Regions due to there being two independent methods for 32b and
64b mapping.

In this PR I addressed this minor bug and merged the methods, but in
order to add test coverage required additions to `obj2yaml` and
`yaml2obj` which make up the bulk of this patch.

Lastly, there are some non-required changes such as the addition of the
`Memory64ListHeader` type, to make writing/reading the header section of
the Memory64List easier.
Jlalond added a commit that referenced this pull request Sep 13, 2024
In #95312 Minidump file creation was moved from being created at the
end, to the file being emitted in chunks. This causes some undesirable
behavior where the file can still be present after an error has
occurred. To resolve this we will now delete the file upon an error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants