
[clang] Make source locations space usage diagnostics numbers easier to read #114999

Merged: bricknerb merged 4 commits into llvm:main on Nov 6, 2024

Conversation

bricknerb

Instead of writing "12345678B", write "12345678B (12.34MB)".

@llvmbot added the clang (Clang issues not falling into any other category) and clang:frontend (Language frontend issues, e.g. anything involving "Sema") labels on Nov 5, 2024
llvmbot commented Nov 5, 2024

@llvm/pr-subscribers-clang

Author: Boaz Brickner (bricknerb)

Changes

Instead of writing "12345678B", write "12345678B (12.34MB)".


Full diff: https://github.com/llvm/llvm-project/pull/114999.diff

4 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticCommonKinds.td (+6-5)
  • (modified) clang/lib/Basic/SourceManager.cpp (+30-3)
  • (modified) clang/test/Lexer/SourceLocationsOverflow.c (+5-5)
  • (modified) clang/test/Misc/sloc-usage.cpp (+2-2)
diff --git a/clang/include/clang/Basic/DiagnosticCommonKinds.td b/clang/include/clang/Basic/DiagnosticCommonKinds.td
index ae709e45a700a1..457abea0b81471 100644
--- a/clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ b/clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -389,13 +389,14 @@ def remark_sloc_usage : Remark<
   "source manager location address space usage:">,
   InGroup<DiagGroup<"sloc-usage">>, DefaultRemark, ShowInSystemHeader;
 def note_total_sloc_usage : Note<
-  "%0B in local locations, %1B in locations loaded from AST files, for a total "
-  "of %2B (%3%% of available space)">;
+  "%0B (%1B) in local locations, %2B (%3B) "
+  "in locations loaded from AST files, for a total of %4B (%5B) "
+  "(%6%% of available space)">;
 def note_file_sloc_usage : Note<
-  "file entered %0 time%s0 using %1B of space"
-  "%plural{0:|: plus %2B for macro expansions}2">;
+  "file entered %0 time%s0 using %1B (%2B) of space"
+  "%plural{0:|: plus %3B (%4B) for macro expansions}3">;
 def note_file_misc_sloc_usage : Note<
-  "%0 additional files entered using a total of %1B of space">;
+  "%0 additional files entered using a total of %1B (%2B) of space">;
 
 // Modules
 def err_module_format_unhandled : Error<
diff --git a/clang/lib/Basic/SourceManager.cpp b/clang/lib/Basic/SourceManager.cpp
index 65a8a7253e054f..cbc2b840150321 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2227,6 +2227,28 @@ LLVM_DUMP_METHOD void SourceManager::dump() const {
   }
 }
 
+static std::string NumberToHumanString(uint64_t number) {
+  static constexpr std::array<std::pair<uint64_t, char>, 4> Units = {
+      {{1'000'000'000'000UL, 'T'},
+       {1'000'000'000UL, 'G'},
+       {1'000'000UL, 'M'},
+       {1'000UL, 'k'}}};
+
+  std::string human_string;
+  llvm::raw_string_ostream human_string_stream(human_string);
+  for (const auto &[UnitSize, UnitSign] : Units) {
+    if (number >= UnitSize) {
+      human_string_stream << llvm::format(
+          "%.2f%c", number / static_cast<double>(UnitSize), UnitSign);
+      break;
+    }
+  }
+  if (human_string.empty()) {
+    human_string_stream << number;
+  }
+  return human_string;
+}
+
 void SourceManager::noteSLocAddressSpaceUsage(
     DiagnosticsEngine &Diag, std::optional<unsigned> MaxNotes) const {
   struct Info {
@@ -2296,7 +2318,9 @@ void SourceManager::noteSLocAddressSpaceUsage(
   int UsagePercent = static_cast<int>(100.0 * double(LocalUsage + LoadedUsage) /
                                       MaxLoadedOffset);
   Diag.Report(SourceLocation(), diag::note_total_sloc_usage)
-    << LocalUsage << LoadedUsage << (LocalUsage + LoadedUsage) << UsagePercent;
+      << LocalUsage << NumberToHumanString(LocalUsage) << LoadedUsage
+      << NumberToHumanString(LoadedUsage) << (LocalUsage + LoadedUsage)
+      << NumberToHumanString(LocalUsage + LoadedUsage) << UsagePercent;
 
   // Produce notes on sloc address space usage for each file with a high usage.
   uint64_t ReportedSize = 0;
@@ -2304,14 +2328,17 @@ void SourceManager::noteSLocAddressSpaceUsage(
        llvm::make_range(SortedUsage.begin(), SortedEnd)) {
     Diag.Report(FileInfo.Loc, diag::note_file_sloc_usage)
         << FileInfo.Inclusions << FileInfo.DirectSize
-        << (FileInfo.TotalSize - FileInfo.DirectSize);
+        << NumberToHumanString(FileInfo.DirectSize)
+        << (FileInfo.TotalSize - FileInfo.DirectSize)
+        << NumberToHumanString(FileInfo.TotalSize - FileInfo.DirectSize);
     ReportedSize += FileInfo.TotalSize;
   }
 
   // Describe any remaining usage not reported in the per-file usage.
   if (ReportedSize != CountedSize) {
     Diag.Report(SourceLocation(), diag::note_file_misc_sloc_usage)
-        << (SortedUsage.end() - SortedEnd) << CountedSize - ReportedSize;
+        << (SortedUsage.end() - SortedEnd) << CountedSize - ReportedSize
+        << NumberToHumanString(CountedSize - ReportedSize);
   }
 }
 
diff --git a/clang/test/Lexer/SourceLocationsOverflow.c b/clang/test/Lexer/SourceLocationsOverflow.c
index f058c09428e6e7..26b0d204c49ff5 100644
--- a/clang/test/Lexer/SourceLocationsOverflow.c
+++ b/clang/test/Lexer/SourceLocationsOverflow.c
@@ -3,17 +3,17 @@
 // CHECK-NEXT: inc1.h{{.*}}: fatal error: translation unit is too large for Clang to process: ran out of source locations
 // CHECK-NEXT: #include "inc2.h"
 // CHECK-NEXT:          ^
-// CHECK-NEXT: note: 214{{.......}}B in local locations, 0B in locations loaded from AST files, for a total of 214{{.......}}B (99% of available space)
-// CHECK-NEXT: {{.*}}inc2.h:1:1: note: file entered 214{{..}} times using 214{{.......}}B of space
+// CHECK-NEXT: note: 214{{.......}}B (2.15GB) in local locations, 0B (0B) in locations loaded from AST files, for a total of 214{{.......}}B (2.15GB) (99% of available space)
+// CHECK-NEXT: {{.*}}inc2.h:1:1: note: file entered 214{{..}} times using 214{{.......}}B (2.15GB) of space
 // CHECK-NEXT: /*.................................................................................................
 // CHECK-NEXT: ^
-// CHECK-NEXT: {{.*}}inc1.h:1:1: note: file entered 15 times using 39{{....}}B of space
+// CHECK-NEXT: {{.*}}inc1.h:1:1: note: file entered 15 times using 39{{....}}B (396.92kB) of space
 // CHECK-NEXT: #include "inc2.h"
 // CHECK-NEXT: ^
-// CHECK-NEXT: <built-in>:1:1: note: file entered {{.*}} times using {{.*}}B of space
+// CHECK-NEXT: <built-in>:1:1: note: file entered {{.*}} times using {{.*}}B ({{.*}}B) of space
 // CHECK-NEXT: # {{.*}}
 // CHECK-NEXT: ^
-// CHECK-NEXT: {{.*}}SourceLocationsOverflow.c:1:1: note: file entered 1 time using {{.*}}B of space
+// CHECK-NEXT: {{.*}}SourceLocationsOverflow.c:1:1: note: file entered 1 time using {{.*}}B ({{.*}}B) of space
 // CHECK-NEXT: // RUN: not %clang %s -S -o - 2>&1 | FileCheck %s
 // CHECK-NEXT: ^
 // CHECK-NEXT: 1 error generated.
diff --git a/clang/test/Misc/sloc-usage.cpp b/clang/test/Misc/sloc-usage.cpp
index 18bd94f8b9dc30..f2c152a268ac03 100644
--- a/clang/test/Misc/sloc-usage.cpp
+++ b/clang/test/Misc/sloc-usage.cpp
@@ -9,6 +9,6 @@ bool b = EQUALS(k, k);
 
 #pragma clang __debug sloc_usage // expected-remark {{address space usage}}
 // expected-note@* {{(0% of available space)}}
-// (this file)     expected-note-re@1 {{file entered 1 time using {{.*}}B of space plus 51B for macro expansions}}
-// (included file) expected-note-re@Inputs/include.h:1 {{file entered 3 times using {{.*}}B of space{{$}}}}
+// (this file)     expected-note-re@1 {{file entered 1 time using {{.*}}B ({{.*}}B) of space plus 51B (51B) for macro expansions}}
+// (included file) expected-note-re@Inputs/include.h:1 {{file entered 3 times using {{.*}}B ({{.*}}B) of space{{$}}}}
 // (builtins file) expected-note@* {{file entered}}

@ilya-biryukov

Do we really care about the exact byte numbers? Maybe we should only show the human-friendly version?
It's appealing to have less noise if we can.

@bricknerb

> Do we really care about the exact byte numbers? Maybe we should only show the human-friendly version? It's appealing to have less noise if we can.

Yes, I was considering both options.
It might be useful to see the full number in case you want to diff between logs, and the diff would be relatively small (people might care about zero diff vs. tiny diff), so I decided to make this change not lose any information.
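
To illustrate the point about diffing logs, here is a minimal sketch, assuming the same two-decimal rounding as the "%.2f%c" pattern in the diff. `ToMegabytes` is a hypothetical helper written only for this example (it is not part of the PR), and `std::snprintf` is used so that it builds without LLVM headers.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only: formats a byte count in
// megabytes with two decimals, like the "%.2f%c" pattern in the diff.
std::string ToMegabytes(uint64_t Bytes) {
  char Buffer[32];
  std::snprintf(Buffer, sizeof(Buffer), "%.2fM", Bytes / 1'000'000.0);
  return Buffer;
}
```

With this sketch, `ToMegabytes(12'345'678)` and `ToMegabytes(12'345'999)` both yield "12.35M", so a 321-byte difference between two logs would only be visible in the exact byte counts.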

@ilya-biryukov

> Yes, I was considering both options. It might be useful to see the full number in case you want to diff between logs, and the diff would be relatively small (people might care about zero diff vs. tiny diff), so I decided to make this change not lose any information.

Makes sense, and given that it's a relatively infrequently surfaced feature, I think it's fine to change our mind on that later.

1. llvm::formatv() instead of llvm::format().
2. Early return.
3. Use std::to_string() when no special formatting is necessary.
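
The updated commit is not shown on this page, so the following is only a rough sketch of what the early-return and std::to_string() items might look like. It is not the merged code: `std::snprintf` stands in for llvm::formatv() so that the example compiles without LLVM headers, and the CamelCase names are an assumption.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>

// Sketch only, not the merged implementation. std::snprintf is a stand-in
// for llvm::formatv() so the example has no LLVM dependency.
std::string NumberToHumanString(uint64_t Number) {
  static constexpr std::array<std::pair<uint64_t, char>, 4> Units = {
      {{1'000'000'000'000ULL, 'T'},
       {1'000'000'000ULL, 'G'},
       {1'000'000ULL, 'M'},
       {1'000ULL, 'k'}}};

  // Units are ordered largest first, so the first match is the best fit.
  for (const auto &[UnitSize, UnitSign] : Units) {
    if (Number >= UnitSize) {
      char Buffer[32];
      std::snprintf(Buffer, sizeof(Buffer), "%.2f%c",
                    Number / static_cast<double>(UnitSize), UnitSign);
      return Buffer; // Early return: no need to track an "empty" result.
    }
  }
  // Below 1000 no special formatting is needed.
  return std::to_string(Number);
}
```

Under these assumptions, `NumberToHumanString(2'147'483'647)` gives "2.15G" and `NumberToHumanString(51)` gives "51", which lines up with the "(2.15GB)" and "51B (51B)" expectations in the updated tests.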

@ilya-biryukov ilya-biryukov left a comment


LGTM, thanks!

@bricknerb bricknerb merged commit 8431494 into llvm:main Nov 6, 2024
5 of 7 checks passed
@bricknerb bricknerb deleted the number branch November 6, 2024 08:58

4 participants