Skip to content

[pdb] Provide a better error message when overflowing the public/global symbol record stream #140884

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 22, 2025

Conversation

zmodem
Copy link
Collaborator

@zmodem zmodem commented May 21, 2025

Before:

lld-link: error: Stream Error: The stream is too short to perform the requested operation.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb

After:

lld-link: error: the public (2127832912 bytes) and global (2200532960 bytes) symbols
are too large to fit in a PDB file; the maximum total is 4294967295 bytes.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb

…al symbol record stream

Before:

  lld-link: error: Stream Error: The stream is too short to perform the requested operation.
  lld-link: error: failed to write PDB file ./unit_tests.exe.pdb

After:

  lld-link: error: the public (2127832912 bytes) and global (2200532960 bytes) symbols are too large to fit in a PDB file.
  lld-link: error: failed to write PDB file ./unit_tests.exe.pdb
@llvmbot
Copy link
Member

llvmbot commented May 21, 2025

@llvm/pr-subscribers-platform-windows

Author: Hans Wennborg (zmodem)

Changes

Before:

lld-link: error: Stream Error: The stream is too short to perform the requested operation.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb

After:

lld-link: error: the public (2127832912 bytes) and global (2200532960 bytes) symbols are too large to fit in a PDB file.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb


Full diff: https://github.com/llvm/llvm-project/pull/140884.diff

1 Files Affected:

  • (modified) llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp (+7-2)
diff --git a/llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp b/llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp
index c195754c0c679..0289a7543205a 100644
--- a/llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp
+++ b/llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp
@@ -25,6 +25,7 @@
 #include "llvm/DebugInfo/PDB/Native/RawTypes.h"
 #include "llvm/Support/BinaryItemStream.h"
 #include "llvm/Support/BinaryStreamWriter.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/Parallel.h"
 #include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/xxhash.h"
@@ -39,7 +40,7 @@ using namespace llvm::codeview;
 // Helper class for building the public and global PDB hash table buckets.
 struct llvm::pdb::GSIHashStreamBuilder {
   // Sum of the size of all public or global records.
-  uint32_t RecordByteSize = 0;
+  uint64_t RecordByteSize = 0;
 
   std::vector<PSHashRecord> HashRecords;
 
@@ -319,7 +320,11 @@ Error GSIStreamBuilder::finalizeMsfLayout() {
     return Idx.takeError();
   PublicsStreamIndex = *Idx;
 
-  uint32_t RecordBytes = PSH->RecordByteSize + GSH->RecordByteSize;
+  uint64_t RecordBytes = PSH->RecordByteSize + GSH->RecordByteSize;
+  if (RecordBytes > UINT32_MAX)
+    return make_error<StringError>(formatv("the public ({0} bytes) and global "
+          "({1} bytes) symbols are too large to fit in a PDB file.",
+          PSH->RecordByteSize, GSH->RecordByteSize), inconvertibleErrorCode());
 
   Idx = Msf.addStream(RecordBytes);
   if (!Idx)

@llvmbot
Copy link
Member

llvmbot commented May 21, 2025

@llvm/pr-subscribers-debuginfo

Author: Hans Wennborg (zmodem)

Changes

Before:

lld-link: error: Stream Error: The stream is too short to perform the requested operation.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb

After:

lld-link: error: the public (2127832912 bytes) and global (2200532960 bytes) symbols are too large to fit in a PDB file.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb


Full diff: https://github.com/llvm/llvm-project/pull/140884.diff

1 Files Affected:

  • (modified) llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp (+7-2)
diff --git a/llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp b/llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp
index c195754c0c679..0289a7543205a 100644
--- a/llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp
+++ b/llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp
@@ -25,6 +25,7 @@
 #include "llvm/DebugInfo/PDB/Native/RawTypes.h"
 #include "llvm/Support/BinaryItemStream.h"
 #include "llvm/Support/BinaryStreamWriter.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/Parallel.h"
 #include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/xxhash.h"
@@ -39,7 +40,7 @@ using namespace llvm::codeview;
 // Helper class for building the public and global PDB hash table buckets.
 struct llvm::pdb::GSIHashStreamBuilder {
   // Sum of the size of all public or global records.
-  uint32_t RecordByteSize = 0;
+  uint64_t RecordByteSize = 0;
 
   std::vector<PSHashRecord> HashRecords;
 
@@ -319,7 +320,11 @@ Error GSIStreamBuilder::finalizeMsfLayout() {
     return Idx.takeError();
   PublicsStreamIndex = *Idx;
 
-  uint32_t RecordBytes = PSH->RecordByteSize + GSH->RecordByteSize;
+  uint64_t RecordBytes = PSH->RecordByteSize + GSH->RecordByteSize;
+  if (RecordBytes > UINT32_MAX)
+    return make_error<StringError>(formatv("the public ({0} bytes) and global "
+          "({1} bytes) symbols are too large to fit in a PDB file.",
+          PSH->RecordByteSize, GSH->RecordByteSize), inconvertibleErrorCode());
 
   Idx = Msf.addStream(RecordBytes);
   if (!Idx)

Copy link

github-actions bot commented May 21, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@tru
Copy link
Collaborator

tru commented May 21, 2025

Looks good, but I wonder if we should hint at any way to resolve the problem? I am not sure I would know what to do when this happen and end up on this Pull Request after some research :)

uint64_t RecordBytes = PSH->RecordByteSize + GSH->RecordByteSize;
if (RecordBytes > UINT32_MAX)
return make_error<StringError>(
formatv("the public ({0} bytes) and global ({1} bytes) "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also display what the maximum size is? It’s obvious for us, but not for regular users.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github shows some kind of syntax error, maybe you have local fixup changes that didn't get pushed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Done.

@zmodem
Copy link
Collaborator Author

zmodem commented May 21, 2025

I wonder if we should hint at any way to resolve the problem? I am not sure I would know what to do when this happen and end up on this Pull Request after some research :)

I don't think there's room for much info into the error message, but we can certainly put some notes here.

  • This kind of overflow is likely due to a huge amount of function template instantiations which would go away if inlined. So if compiling with -O1 or higher is an option, that would be the best solution. If that's not an option, but the user controls the source code, applying force_inline attributes to common templates may be an alternative.
  • There are flags to reduce the amount of debug info, such as -gline-tables-only, but that will give a much more limited debugging experience.
  • If the above doesn't help, it may be necessary to split the binary into multiple .exe/.dll files.

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

uint64_t RecordBytes = PSH->RecordByteSize + GSH->RecordByteSize;
if (RecordBytes > UINT32_MAX)
return make_error<StringError>(
formatv("the public ({0} bytes) and global ({1} bytes) "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github shows some kind of syntax error, maybe you have local fixup changes that didn't get pushed.

return make_error<StringError>(
formatv("the public ({0} bytes) and global ({1} bytes) "
"symbols are too large to fit in a PDB file; ",
"the maximum total is 4294967295 bytes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe pass UINT32_MAX as a parameter to the formatter so we don't have to hardcode the decimal value.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@zmodem zmodem merged commit 213d0d2 into llvm:main May 22, 2025
9 of 11 checks passed
return make_error<StringError>(
formatv("the public ({0} bytes) and global ({1} bytes) "
"symbols are too large to fit in a PDB file; "
"the maximum total is {2} bytes.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "The public and global symbols are too large" can be read as if this is about two symbols, the public symbol and the global symbol. Maybe just "The public symbols and global symbols"? "The public and global symbol tables?" (I think that suggestion, while not being ambiguous, is incorrect, but I'm not sure. If it is, we obviously shouldn't use it :P)

Sorry for the bikeshed; feel free to ignore if you're happy with this as-is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, going with "the public symbols and global symbols" in 6bb05ea

zmodem added a commit that referenced this pull request May 23, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 23, 2025
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
…al symbol record stream (llvm#140884)

Before:

lld-link: error: Stream Error: The stream is too short to perform the requested operation.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb

After:

lld-link: error: the public (2127832912 bytes) and global (2200532960 bytes) symbols
are too large to fit in a PDB file; the maximum total is 4294967295 bytes.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025
…al symbol record stream (llvm#140884)

Before:

lld-link: error: Stream Error: The stream is too short to perform the requested operation.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb

After:

lld-link: error: the public (2127832912 bytes) and global (2200532960 bytes) symbols
are too large to fit in a PDB file; the maximum total is 4294967295 bytes.
lld-link: error: failed to write PDB file ./unit_tests.exe.pdb
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants