[SSAUpdater] Don't use large SmallSets for IDFcalc #97823

jmorse · 2024-07-05T13:00:15Z

As per the LLVM programmers manual, SmallSets do linear scans on insertion and then turn into a hash-table if the set gets big. Here in the IDFCalculator, the SmallSets have been configured to have 32 elements in each static allocation... which means that we linearly scan for all problems with up to 32 elements, which I feel is quite a large N.

Replace these SmallSets with a plain DenseSet (and use reserve to avoid any repeated allocations). Doing this yields a 0.13% compile-time improvement for debug-info builds, as we hit IDFCalculator pretty hard in InstrRefBasedLDV. Adding @nikic as this could affect more than debug-info things, although apparently not in a visible way on CTMark.

http://llvm-compile-time-tracker.com/compare.php?from=4ce6cc488083a931e795dafe30e69a0e4a96a3e4&to=a9f1d377013b1d208ac58f6f75548f1f24d174d2&stat=instructions%3Au

(tracker link is across a few commits, with this patch being the summary).

As per the LLVM programmers manual, SmallSets do linear scans on insertion and then turn into a hash-table if the set gets big. Here in the IDFCalculator, the SmallSets have been configured to have 32 elements in each static allocation... which means that we linearly scan for all problems with up to 32 elements. Replace these SmallSets with a plain DenseSet (and use reserve to avoid any repeated allocations). Doing this yields a 0.13% compile-time improvement for debug-info builds, as we hit IDFCalculator pretty hard in InstrRefBasedLDV.

llvmbot · 2024-07-05T13:00:49Z

@llvm/pr-subscribers-llvm-adt

@llvm/pr-subscribers-llvm-support

Author: Jeremy Morse (jmorse)

Changes

As per the LLVM programmers manual, SmallSets do linear scans on insertion and then turn into a hash-table if the set gets big. Here in the IDFCalculator, the SmallSets have been configured to have 32 elements in each static allocation... which means that we linearly scan for all problems with up to 32 elements, which I feel is quite a large N.

Replace these SmallSets with a plain DenseSet (and use reserve to avoid any repeated allocations). Doing this yields a 0.13% compile-time improvement for debug-info builds, as we hit IDFCalculator pretty hard in InstrRefBasedLDV. Adding @nikic as this could affect more than debug-info things, although apparently not in a visible way on CTMark.

http://llvm-compile-time-tracker.com/compare.php?from=4ce6cc488083a931e795dafe30e69a0e4a96a3e4&to=a9f1d377013b1d208ac58f6f75548f1f24d174d2&stat=instructions%3Au

(tracker link is across a few commits, with this patch being the summary).

Full diff: https://github.com/llvm/llvm-project/pull/97823.diff

1 Files Affected:

(modified) llvm/include/llvm/Support/GenericIteratedDominanceFrontier.h (+6-2)

diff --git a/llvm/include/llvm/Support/GenericIteratedDominanceFrontier.h b/llvm/include/llvm/Support/GenericIteratedDominanceFrontier.h
index 0dc58e37c821df..c218d33c3eb303 100644
--- a/llvm/include/llvm/Support/GenericIteratedDominanceFrontier.h
+++ b/llvm/include/llvm/Support/GenericIteratedDominanceFrontier.h
@@ -144,8 +144,12 @@ void IDFCalculatorBase<NodeTy, IsPostDom>::calculate(
   DT.updateDFSNumbers();
 
   SmallVector<DomTreeNodeBase<NodeTy> *, 32> Worklist;
-  SmallPtrSet<DomTreeNodeBase<NodeTy> *, 32> VisitedPQ;
-  SmallPtrSet<DomTreeNodeBase<NodeTy> *, 32> VisitedWorklist;
+  SmallDenseSet<DomTreeNodeBase<NodeTy> *, 16> VisitedPQ;
+  SmallDenseSet<DomTreeNodeBase<NodeTy> *, 16> VisitedWorklist;
+  if (useLiveIn) {
+    VisitedPQ.reserve(LiveInBlocks->size());
+    VisitedWorklist.reserve(LiveInBlocks->size());
+  }
 
   for (NodeTy *BB : *DefBlocks)
     if (DomTreeNodeBase<NodeTy> *Node = DT.getNode(BB)) {

nikic · 2024-07-05T13:11:25Z

Does the benefit here really come from switching from SmallPtrSet to SmallDenseSet, or from the added reserve calls? I'd generally expect SmallPtrSet to still be beneficial, even if used in large mode.

jmorse · 2024-07-05T13:19:07Z

Looking back at my experiments, I originally hit a small regression (0.09% in -g builds) by using DenseSet, with the overall 0.13% improvement coming when the reserve call was added. So I guess it's the reserves.

SmallPtrSet doesn't have a reserve method right now, but if you reckon it's a better fit here then we can add one.

nikic · 2024-07-05T13:31:23Z

SmallPtrSet doesn't have a reserve method right now, but if you reckon it's a better fit here then we can add one.

That's surprising! We should definitely add one...

jmorse · 2024-07-16T14:26:36Z

Added a SmallPtrSet reserve method and switched the relevant containers back to SmallPtrSet. I stuck it on the compile-time-tracker and got the same speedup: http://llvm-compile-time-tracker.com/compare.php?from=00c622e596f918d9d83674b58097c8982ae1af95&to=ab267cc31f01dcc088c4839a4fadd3eca92b902f&stat=instructions%3Au

jmorse · 2024-07-24T10:07:17Z

le ping

llvm/include/llvm/ADT/SmallPtrSet.h

Also adjust two off-by-ones in the reserve implementation: * We should be willing to fill the small array up to 100% * The test for reallocating a big array happens before the "last" element that we reserve for is inserted, so apply the filter off-by-one.

jmorse · 2024-07-30T11:39:21Z

I've added some unit tests -- they're pretty basic as SmallPtrSet doesn't expose a means of working out if it's reallocated, and it doesn't look like existing LLVM unit tests go to great effort examining internal state. At the least they'll ensure reserve exists, and doesn't do anything silly.

I've also twiddled some off-by-ones: we fill the Small array up to 100% full, so shouldn't reallocate if we reserve the same number of elements as the small array size. I've also tried to ensure the big-array reallocation will only occur iff insert_imp_big would reallocate: it performs its size test before it completes an insertion, thus I've made the reserve logic off-by-one from the true number of elements to be inserted. Hopefully this isn't making it too complicated?

github-actions · 2024-07-30T11:43:53Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff 00c622e596f918d9d83674b58097c8982ae1af95 eb71b58690afe6f8b150214e5ccf570169732df4 --extensions h,cpp -- llvm/include/llvm/ADT/SmallPtrSet.h llvm/include/llvm/Support/GenericIteratedDominanceFrontier.h llvm/unittests/ADT/SmallPtrSetTest.cpp

View the diff from clang-format here.

diff --git a/llvm/unittests/ADT/SmallPtrSetTest.cpp b/llvm/unittests/ADT/SmallPtrSetTest.cpp
index 1d9a0d1725..da872601a6 100644
--- a/llvm/unittests/ADT/SmallPtrSetTest.cpp
+++ b/llvm/unittests/ADT/SmallPtrSetTest.cpp
@@ -455,5 +455,6 @@ TEST(SmallPtrSetTest, Reserve) {
   Set.reserve(0);
   EXPECT_EQ(Set.capacity(), 128u);
   EXPECT_EQ(Set.size(), 6u);
-  EXPECT_THAT(Set, UnorderedElementsAre(&Vals[0], &Vals[1], &Vals[2], &Vals[3], &Vals[4], &Vals[5]));
+  EXPECT_THAT(Set, UnorderedElementsAre(&Vals[0], &Vals[1], &Vals[2], &Vals[3],
+                                        &Vals[4], &Vals[5]));
 }

llvm/unittests/ADT/SmallPtrSetTest.cpp

kuhar · 2024-07-30T11:55:02Z

I've added some unit tests -- they're pretty basic as SmallPtrSet doesn't expose a means of working out if it's reallocated, and it doesn't look like existing LLVM unit tests go to great effort examining internal state. At the least they'll ensure reserve exists, and doesn't do anything silly.

Oh, I'd expect it to expose something similar to .capacity()

This involves adding a filter for zero-sized reserves, which break the off-by-one size test I'd added earlier. To maintain consistency with insert_imp_big, the first allocation always jumps to 128 elements.

jmorse · 2024-07-30T14:34:56Z

Added a capacity method as it's extremely simple and added some more test coverage. As a result, I've tweaked reserve to always allocate 128 elements whenever it grows from a small size, to match the behaviour of insert_imp_big. Also a filter for zero-sized reserves that was going to mess with some of the logic.

llvm/include/llvm/ADT/SmallPtrSet.h

Co-authored-by: Jakub Kuderski <[email protected]>

jmorse · 2024-08-08T10:35:52Z

Thanks for the reviews!

llvm-ci · 2024-08-08T10:38:57Z

LLVM Buildbot has detected a new failure on builder flang-aarch64-libcxx running on linaro-flang-aarch64-libcxx while building llvm at step 5 "build-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/89/builds/3664

Here is the relevant piece of the build log for the reference:

Step 5 (build-unified-tree) failure: build (failure)
...
5.762 [6751/67/295] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/benchmark_api_internal.cc.o
5.808 [6751/66/296] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/benchmark_runner.cc.o
5.826 [6748/68/297] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/benchmark.cc.o
5.828 [6748/67/298] Linking CXX shared library lib/libFortranDecimal.so.20.0git
5.832 [6748/66/299] Linking CXX executable bin/llvm-PerfectShuffle
5.841 [6748/65/300] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/benchmark_register.cc.o
5.891 [6748/64/301] Linking CXX static library lib/libDynamicLibraryLib.a
5.986 [6748/63/302] Linking CXX shared module unittests/Support/DynamicLibrary/SecondLib.so
5.993 [6748/62/303] Linking CXX shared module unittests/Support/DynamicLibrary/PipSqueak.so
6.183 [6748/61/304] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/SmallPtrSet.cpp.o
FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/SmallPtrSet.cpp.o 
/usr/local/bin/c++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/lib/Support -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/lib/Support -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/include -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/include -stdlib=libc++ -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections  -O3 -DNDEBUG -fPIC -UNDEBUG -std=c++17  -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/SmallPtrSet.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/SmallPtrSet.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/SmallPtrSet.cpp.o -c /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/lib/Support/SmallPtrSet.cpp
In file included from ../llvm-project/llvm/lib/Support/SmallPtrSet.cpp:14:
../llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:130:20: error: no member named 'max' in namespace 'std'; did you mean 'fmax'?
  130 |     NewSize = std::max(128u, NewSize);
      |               ~~~~~^~~
      |                    fmax
/usr/local/clang+llvm-17.0.6-aarch64-linux-gnu/bin/../include/c++/v1/cmath:435:9: note: 'fmax' declared here
  435 | using ::fmax _LIBCPP_USING_IF_EXISTS;
      |         ^
1 error generated.
6.365 [6748/60/305] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/counter.cc.o
6.376 [6748/59/306] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/perf_counters.cc.o
6.379 [6748/58/307] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/json_reporter.cc.o
6.381 [6748/57/308] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark_main.dir/benchmark_main.cc.o
6.396 [6748/56/309] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/statistics.cc.o
6.397 [6748/55/310] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/reporter.cc.o
6.399 [6748/54/311] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/string_util.cc.o
6.429 [6748/53/312] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/csv_reporter.cc.o
6.431 [6748/52/313] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/timers.cc.o
6.433 [6748/51/314] Building CXX object third-party/benchmark/src/CMakeFiles/benchmark.dir/sysinfo.cc.o
7.085 [6748/50/315] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Debug.cpp.o
8.396 [6748/49/316] Building CXX object tools/mlir/lib/TableGen/CMakeFiles/MLIRTableGen.dir/GenInfo.cpp.o
8.478 [6748/48/317] Building CXX object lib/TableGen/CMakeFiles/LLVMTableGen.dir/TableGenBackendSkeleton.cpp.o
8.587 [6748/47/318] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Statistic.cpp.o
8.641 [6748/46/319] Building CXX object lib/TableGen/CMakeFiles/LLVMTableGen.dir/TableGenBackend.cpp.o
8.716 [6748/45/320] Building CXX object tools/clang/utils/TableGen/CMakeFiles/clang-tblgen.dir/ClangDataCollectorsEmitter.cpp.o
8.815 [6748/44/321] Building CXX object utils/TableGen/CMakeFiles/llvm-min-tblgen.dir/Attributes.cpp.o
8.879 [6748/43/322] Building CXX object tools/clang/utils/TableGen/CMakeFiles/clang-tblgen.dir/ClangCommentHTMLTagsEmitter.cpp.o
8.958 [6748/42/323] Building CXX object utils/TableGen/CMakeFiles/llvm-min-tblgen.dir/VTEmitter.cpp.o
8.997 [6748/41/324] Building CXX object utils/TableGen/CMakeFiles/llvm-min-tblgen.dir/TableGen.cpp.o
9.087 [6748/40/325] Building CXX object tools/clang/utils/TableGen/CMakeFiles/clang-tblgen.dir/ClangTypeNodesEmitter.cpp.o
9.205 [6748/39/326] Building CXX object tools/clang/utils/TableGen/CMakeFiles/clang-tblgen.dir/ClangCommentHTMLNamedCharacterReferenceEmitter.cpp.o
9.291 [6748/38/327] Building CXX object tools/reduce-chunk-list/CMakeFiles/reduce-chunk-list.dir/reduce-chunk-list.cpp.o
9.293 [6748/37/328] Building CXX object tools/mlir/lib/TableGen/CMakeFiles/MLIRTableGen.dir/AttrOrTypeDef.cpp.o
9.438 [6748/36/329] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/TypeSize.cpp.o
9.502 [6748/35/330] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Timer.cpp.o
9.513 [6748/34/331] Building CXX object tools/mlir/lib/TableGen/CMakeFiles/MLIRTableGen.dir/Predicate.cpp.o
9.633 [6748/33/332] Building CXX object utils/TableGen/CMakeFiles/llvm-min-tblgen.dir/ARMTargetDefEmitter.cpp.o

jmorse requested review from nikic, SLTozer and OCHyams July 5, 2024 13:00

llvmbot added the llvm:support label Jul 5, 2024

jmorse added 2 commits July 16, 2024 14:23

Implement a reserve call for SmallPtrSet

3ab1e7d

Switch back to using SmallPtrSets in IDF calculator

ab267cc

llvmbot added the llvm:adt label Jul 16, 2024

kuhar reviewed Jul 24, 2024

View reviewed changes

llvm/include/llvm/ADT/SmallPtrSet.h Show resolved Hide resolved

kuhar reviewed Jul 30, 2024

View reviewed changes

llvm/unittests/ADT/SmallPtrSetTest.cpp Outdated Show resolved Hide resolved

Add a capacity method to SmallPtrSet, re-wrangle tests

e28791a

This involves adding a filter for zero-sized reserves, which break the off-by-one size test I'd added earlier. To maintain consistency with insert_imp_big, the first allocation always jumps to 128 elements.

clang-format

f47a9b9

kuhar approved these changes Jul 30, 2024

View reviewed changes

llvm/include/llvm/ADT/SmallPtrSet.h Outdated Show resolved Hide resolved

More explicit condition-check

eb71b58

Co-authored-by: Jakub Kuderski <[email protected]>

jmorse merged commit 7195572 into llvm:main Aug 8, 2024
6 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[SSAUpdater] Don't use large SmallSets for IDFcalc #97823

[SSAUpdater] Don't use large SmallSets for IDFcalc #97823

Uh oh!

jmorse commented Jul 5, 2024

Uh oh!

llvmbot commented Jul 5, 2024 •

edited

Loading

Uh oh!

nikic commented Jul 5, 2024

Uh oh!

jmorse commented Jul 5, 2024

Uh oh!

nikic commented Jul 5, 2024

Uh oh!

jmorse commented Jul 16, 2024

Uh oh!

jmorse commented Jul 24, 2024

Uh oh!

Uh oh!

jmorse commented Jul 30, 2024

Uh oh!

github-actions bot commented Jul 30, 2024 •

edited

Loading

Uh oh!

Uh oh!

kuhar commented Jul 30, 2024

Uh oh!

jmorse commented Jul 30, 2024

Uh oh!

Uh oh!

Uh oh!

jmorse commented Aug 8, 2024

Uh oh!

llvm-ci commented Aug 8, 2024

Uh oh!

Uh oh!

[SSAUpdater] Don't use large SmallSets for IDFcalc #97823

[SSAUpdater] Don't use large SmallSets for IDFcalc #97823

Uh oh!

Conversation

jmorse commented Jul 5, 2024

Uh oh!

llvmbot commented Jul 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikic commented Jul 5, 2024

Uh oh!

jmorse commented Jul 5, 2024

Uh oh!

nikic commented Jul 5, 2024

Uh oh!

jmorse commented Jul 16, 2024

Uh oh!

jmorse commented Jul 24, 2024

Uh oh!

Uh oh!

jmorse commented Jul 30, 2024

Uh oh!

github-actions bot commented Jul 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

kuhar commented Jul 30, 2024

Uh oh!

jmorse commented Jul 30, 2024

Uh oh!

Uh oh!

Uh oh!

jmorse commented Aug 8, 2024

Uh oh!

llvm-ci commented Aug 8, 2024

Uh oh!

Uh oh!

llvmbot commented Jul 5, 2024 •

edited

Loading

github-actions bot commented Jul 30, 2024 •

edited

Loading