-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Add a pass to convert jump tables to switches. #77709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a pass to convert jump tables to switches. #77709
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Alexander Shaposhnikov (alexander-shaposhnikov) ChangesAdd a pass to convert jump tables to switches.
For Chrome, this results in a binary size improvement of 1.6 MB (measured on Linux) for some build configurations. The improvement may be smaller for other configurations. For several internal binaries, the size decrease was approximately 0.9%, depending on how frequently the optimized patterns occur in the codebase. The impact on the compilation time is small (see https://llvm-compile-time-tracker.com/?config=Overview&stat=instructions%3Au&remote=alexander-shaposhnikov](https://www.google.com/url?sa=D&q=https%3A%2F%2Fllvm-compile-time-tracker.com%2F%3Fconfig%3DOverview%26stat%3Dinstructions%253Au%26remote%3Dalexander-shaposhnikov Rationale for a separate pass: the existing passes (e.g. CallSiteSplitting) implement different transformations with minimal overlap with the proposed one, and the new pass can be further extended in the future. Patch is 24.50 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77709.diff 16 Files Affected:
diff --git a/llvm/include/llvm/Transforms/Scalar/JumpTableToSwitch.h b/llvm/include/llvm/Transforms/Scalar/JumpTableToSwitch.h
new file mode 100644
index 00000000000000..61786227d7a335
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Scalar/JumpTableToSwitch.h
@@ -0,0 +1,24 @@
+//===- JumpTableToSwitch.h - ------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_SCALAR_JUMP_TABLE_TO_SWITCH_H
+#define LLVM_TRANSFORMS_SCALAR_JUMP_TABLE_TO_SWITCH_H
+
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+class Function;
+
+struct JumpTableToSwitchPass : PassInfoMixin<JumpTableToSwitchPass> {
+ /// Run the pass over the function.
+ PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+};
+} // end namespace llvm
+
+#endif // LLVM_TRANSFORMS_SCALAR_JUMP_TABLE_TO_SWITCH_H
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 649451edc0e2c6..e1da44316c8666 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -193,6 +193,7 @@
#include "llvm/Transforms/Scalar/InferAlignment.h"
#include "llvm/Transforms/Scalar/InstSimplifyPass.h"
#include "llvm/Transforms/Scalar/JumpThreading.h"
+#include "llvm/Transforms/Scalar/JumpTableToSwitch.h"
#include "llvm/Transforms/Scalar/LICM.h"
#include "llvm/Transforms/Scalar/LoopAccessAnalysisPrinter.h"
#include "llvm/Transforms/Scalar/LoopBoundSplit.h"
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 5c6c391049a7b2..40fdef3e0f0724 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -92,6 +92,7 @@
#include "llvm/Transforms/Scalar/InferAlignment.h"
#include "llvm/Transforms/Scalar/InstSimplifyPass.h"
#include "llvm/Transforms/Scalar/JumpThreading.h"
+#include "llvm/Transforms/Scalar/JumpTableToSwitch.h"
#include "llvm/Transforms/Scalar/LICM.h"
#include "llvm/Transforms/Scalar/LoopDeletion.h"
#include "llvm/Transforms/Scalar/LoopDistribute.h"
@@ -557,6 +558,7 @@ PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,
// Optimize based on known information about branches, and cleanup afterward.
FPM.addPass(JumpThreadingPass());
+ FPM.addPass(JumpTableToSwitchPass());
FPM.addPass(CorrelatedValuePropagationPass());
FPM.addPass(
@@ -695,6 +697,7 @@ PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,
FPM.addPass(DFAJumpThreadingPass());
FPM.addPass(JumpThreadingPass());
+
FPM.addPass(CorrelatedValuePropagationPass());
// Finally, do an expensive DCE pass to catch all the dead code exposed by
@@ -1926,6 +1929,7 @@ PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level,
invokePeepholeEPCallbacks(MainFPM, Level);
MainFPM.addPass(JumpThreadingPass());
+
MPM.addPass(createModuleToFunctionPassAdaptor(std::move(MainFPM),
PTO.EagerlyInvalidateAnalyses));
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index eaa7c3fc89241b..7417d639a9ecd7 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -330,6 +330,7 @@ FUNCTION_PASS("interleaved-load-combine", InterleavedLoadCombinePass(TM))
FUNCTION_PASS("invalidate<all>", InvalidateAllAnalysesPass())
FUNCTION_PASS("irce", IRCEPass())
FUNCTION_PASS("jump-threading", JumpThreadingPass())
+FUNCTION_PASS("jump-table-to-switch", JumpTableToSwitchPass());
FUNCTION_PASS("kcfi", KCFIPass())
FUNCTION_PASS("lcssa", LCSSAPass())
FUNCTION_PASS("libcalls-shrinkwrap", LibCallsShrinkWrapPass())
diff --git a/llvm/lib/Transforms/Scalar/CMakeLists.txt b/llvm/lib/Transforms/Scalar/CMakeLists.txt
index 2dd27037a17de7..7b49c8010c874a 100644
--- a/llvm/lib/Transforms/Scalar/CMakeLists.txt
+++ b/llvm/lib/Transforms/Scalar/CMakeLists.txt
@@ -25,6 +25,7 @@ add_llvm_component_library(LLVMScalarOpts
InferAlignment.cpp
InstSimplifyPass.cpp
JumpThreading.cpp
+ JumpTableToSwitch.cpp
LICM.cpp
LoopAccessAnalysisPrinter.cpp
LoopBoundSplit.cpp
diff --git a/llvm/lib/Transforms/Scalar/JumpTableToSwitch.cpp b/llvm/lib/Transforms/Scalar/JumpTableToSwitch.cpp
new file mode 100644
index 00000000000000..60b451ff9d3db2
--- /dev/null
+++ b/llvm/lib/Transforms/Scalar/JumpTableToSwitch.cpp
@@ -0,0 +1,206 @@
+//===- JumpTableToSwitch.cpp ----------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Scalar/JumpTableToSwitch.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/Analysis/DomTreeUpdater.h"
+#include "llvm/Analysis/PostDominators.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/Analysis/TargetTransformInfo.h"
+#include "llvm/Analysis/ValueTracking.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/PatternMatch.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Transforms/Utils/BasicBlockUtils.h"
+#include "llvm/Transforms/Utils/Cloning.h"
+#include "llvm/Transforms/Utils/Local.h"
+
+using namespace llvm;
+using namespace PatternMatch;
+
+static cl::opt<unsigned>
+ JumpTableSizeThreshold("jump-table-to-switch-size-threshold", cl::Hidden,
+ cl::desc("Only split jump tables with size less or "
+ "equal than JumpTableSizeThreshold."),
+ cl::init(10));
+
+#define DEBUG_TYPE "jump-table-to-switch"
+
+static Constant *getElementWithGivenTypeAtOffset(Constant *C, const Type *Ty,
+ uint64_t Offset, Module &M) {
+ if (Offset == 0 && C->getType() == Ty)
+ return C;
+ if (auto *CS = dyn_cast<ConstantStruct>(C)) {
+ const DataLayout &DL = M.getDataLayout();
+ const StructLayout *SL = DL.getStructLayout(CS->getType());
+ if (Offset >= SL->getSizeInBytes())
+ return nullptr;
+ const unsigned Op = SL->getElementContainingOffset(Offset);
+ const uint64_t AdjustedOffset = Offset - SL->getElementOffset(Op);
+ Constant *Element = cast<Constant>(CS->getOperand(Op));
+ return getElementWithGivenTypeAtOffset(Element, Ty, AdjustedOffset, M);
+ }
+ // TODO: add support for arrays.
+ return nullptr;
+}
+
+namespace {
+struct JumpTableTy {
+ Value *Index;
+ SmallVector<Function *, 5> Funcs;
+};
+} // anonymous namespace
+
+static std::optional<JumpTableTy> parseJumpTable(GetElementPtrInst *GEP) {
+ if (!GEP || !GEP->isInBounds())
+ return std::nullopt;
+ ArrayType *ArrayTy = dyn_cast<ArrayType>(GEP->getSourceElementType());
+ if (!ArrayTy || ArrayTy->getArrayNumElements() > JumpTableSizeThreshold)
+ return std::nullopt;
+ Constant *Ptr = dyn_cast<Constant>(GEP->getPointerOperand());
+ if (!Ptr)
+ return std::nullopt;
+
+ Function &F = *GEP->getParent()->getParent();
+ const DataLayout &DL = F.getParent()->getDataLayout();
+ const unsigned BitWidth =
+ DL.getIndexSizeInBits(GEP->getPointerAddressSpace());
+ MapVector<Value *, APInt> VariableOffsets;
+ APInt ConstantOffset(BitWidth, 0);
+ if (!GEP->collectOffset(DL, BitWidth, VariableOffsets, ConstantOffset))
+ return std::nullopt;
+ if (VariableOffsets.empty() || VariableOffsets.size() > 1)
+ return std::nullopt;
+
+ Module &M = *F.getParent();
+ unsigned Offset = ConstantOffset.getZExtValue();
+ // TODO: support more general patterns
+ // (see also TODO in getElementWithGivenTypeAtOffset).
+ if (Offset != 0)
+ return std::nullopt;
+ if (!Ptr->getNumOperands())
+ return std::nullopt;
+ Constant *ConstArray = getElementWithGivenTypeAtOffset(
+ cast<Constant>(Ptr->getOperand(0)), ArrayTy, Offset, M);
+ if (!ConstArray)
+ return std::nullopt;
+
+ JumpTableTy JumpTable;
+ JumpTable.Index = VariableOffsets.front().first;
+
+ const uint64_t N = ArrayTy->getArrayNumElements();
+ JumpTable.Funcs.assign(N, nullptr);
+ for (uint64_t Index = 0; Index < N; ++Index) {
+ auto *Func =
+ dyn_cast_or_null<Function>(ConstArray->getAggregateElement(Index));
+ if (!Func || Func->isDeclaration())
+ return std::nullopt;
+ JumpTable.Funcs[Index] = Func;
+ }
+ return JumpTable;
+}
+
+static BasicBlock *split(CallBase *CB, const JumpTableTy &JT,
+ DomTreeUpdater *DTU) {
+ const bool IsVoid = CB->getType() == Type::getVoidTy(CB->getContext());
+
+ SmallVector<DominatorTree::UpdateType, 8> DTUpdates;
+ BasicBlock *BB = CB->getParent();
+ BasicBlock *Tail =
+ SplitBlock(BB, CB, DTU, nullptr, nullptr, BB->getName() + Twine(".tail"));
+ DTUpdates.push_back({DominatorTree::Delete, BB, Tail});
+ BB->getTerminator()->eraseFromParent();
+
+ Function &F = *BB->getParent();
+ BasicBlock *BBUnreachable = BasicBlock::Create(
+ F.getContext(), "default.switch.case.unreachable", &F, Tail);
+ IRBuilder<> BuilderUnreachable(BBUnreachable);
+ BuilderUnreachable.CreateUnreachable();
+
+ IRBuilder<> Builder(BB);
+ SwitchInst *Switch = Builder.CreateSwitch(JT.Index, BBUnreachable);
+ DTUpdates.push_back({DominatorTree::Insert, BB, BBUnreachable});
+
+ IRBuilder<> BuilderTail(CB);
+ PHINode *PHI =
+ IsVoid ? nullptr : BuilderTail.CreatePHI(CB->getType(), JT.Funcs.size());
+
+ SmallVector<BasicBlock *, 10> Blocks;
+ for (auto [Index, Func] : llvm::enumerate(JT.Funcs)) {
+ BasicBlock *B = BasicBlock::Create(Func->getContext(),
+ "call." + Twine(Index), &F, Tail);
+ DTUpdates.push_back({DominatorTree::Insert, BB, B});
+ DTUpdates.push_back({DominatorTree::Insert, B, Tail});
+
+ CallBase *Call = cast<CallBase>(CB->clone());
+ Call->setCalledFunction(Func);
+ Call->insertInto(B, B->end());
+ Switch->addCase(
+ cast<ConstantInt>(ConstantInt::get(JT.Index->getType(), Index)), B);
+ BranchInst::Create(Tail, B);
+ if (PHI)
+ PHI->addIncoming(Call, B);
+ }
+ if (DTU)
+ DTU->applyUpdates(DTUpdates);
+ if (PHI)
+ CB->replaceAllUsesWith(PHI);
+ CB->eraseFromParent();
+ return Tail;
+}
+
+PreservedAnalyses JumpTableToSwitchPass::run(Function &F,
+ FunctionAnalysisManager &AM) {
+ DominatorTree *DT = AM.getCachedResult<DominatorTreeAnalysis>(F);
+ PostDominatorTree *PDT = AM.getCachedResult<PostDominatorTreeAnalysis>(F);
+ std::unique_ptr<DomTreeUpdater> DTU;
+ bool Changed = false;
+ for (BasicBlock &BB : make_early_inc_range(F)) {
+ BasicBlock *CurrentBB = &BB;
+ while (CurrentBB) {
+ BasicBlock *SplittedOutTail = nullptr;
+ for (Instruction &I : make_early_inc_range(*CurrentBB)) {
+ CallBase *CB = dyn_cast<CallBase>(&I);
+ if (!CB || isa<IntrinsicInst>(CB) || CB->getCalledFunction() ||
+ isa<InvokeInst>(CB) || CB->isMustTailCall())
+ continue;
+
+ Value *V;
+ if (!match(CB->getCalledOperand(), m_Load(m_Value(V))))
+ continue;
+ auto *GEP = dyn_cast<GetElementPtrInst>(V);
+ if (!GEP)
+ continue;
+
+ std::optional<JumpTableTy> JumpTable = parseJumpTable(GEP);
+ if (!JumpTable)
+ continue;
+ if ((DT || PDT) && !DTU)
+ DTU = std::make_unique<DomTreeUpdater>(
+ DT, PDT, DomTreeUpdater::UpdateStrategy::Lazy);
+ SplittedOutTail = split(CB, *JumpTable, DTU.get());
+ Changed = true;
+ break;
+ }
+ CurrentBB = SplittedOutTail ? SplittedOutTail : nullptr;
+ }
+ }
+
+ if (!Changed)
+ return PreservedAnalyses::all();
+
+ PreservedAnalyses PA;
+ if (DT)
+ PA.preserve<DominatorTreeAnalysis>();
+ if (PDT)
+ PA.preserve<PostDominatorTreeAnalysis>();
+ return PA;
+}
diff --git a/llvm/test/Other/new-pm-defaults.ll b/llvm/test/Other/new-pm-defaults.ll
index ecdb5a5e010d92..fb4254363c3eaa 100644
--- a/llvm/test/Other/new-pm-defaults.ll
+++ b/llvm/test/Other/new-pm-defaults.ll
@@ -149,6 +149,7 @@
; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O23SZ-NEXT: Running pass: JumpTableToSwitchPass
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
index 064362eabbf839..4f54becf5871f5 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
@@ -88,6 +88,7 @@
; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O23SZ-NEXT: Running pass: JumpTableToSwitchPass
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
index 19a44867e434ac..8b869b1871a815 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
@@ -76,6 +76,7 @@
; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O23SZ-NEXT: Running pass: JumpTableToSwitchPass
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
index ac80a31d8fd4bc..da5d8466711017 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
@@ -84,6 +84,7 @@
; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O23SZ-NEXT: Running pass: JumpTableToSwitchPass
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
index 6486639e07b49c..809b8abb206040 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
@@ -119,6 +119,7 @@
; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O23SZ-NEXT: Running pass: JumpTableToSwitchPass
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
index 09f9f0f48baddb..438c4ba364be3a 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
@@ -116,6 +116,7 @@
; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O23SZ-NEXT: Running pass: JumpTableToSwitchPass
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
index 47bdbfd2d357d4..1b7ce7bbf9680f 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
@@ -88,6 +88,7 @@
; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O23SZ-NEXT: Running pass: JumpTableToSwitchPass
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
diff --git a/llvm/test/Transforms/JumpTableToSwitch/basic.ll b/llvm/test/Transforms/JumpTableToSwitch/basic.ll
new file mode 100644
index 00000000000000..3feb760909d08a
--- /dev/null
+++ b/llvm/test/Transforms/JumpTableToSwitch/basic.ll
@@ -0,0 +1,94 @@
+; RUN: opt < %s -passes=jump-table-to-switch -verify-dom-info -S | FileCheck %s
+; RUN: opt < %s -passes=jump-table-to-switch -jump-table-to-switch-size-threshold=0 -verify-dom-info -S | FileCheck %s --check-prefix=THRESHOLD-0
+
+@func_array = global [2 x ptr] [ptr @func0, ptr @func1]
+
+define i32 @func0() {
+ ret i32 1
+}
+
+define i32 @func1() {
+ ret i32 2
+}
+
+define i32 @function_with_jump_table(i32 %index) {
+ %gep = getelementptr inbounds [2 x ptr], ptr @func_array, i32 0, i32 %index
+ %func_ptr = load ptr, ptr %gep
+ %result = call i32 %func_ptr()
+ ret i32 %result
+}
+
+; CHECK-LABEL: define i32 @function_with_jump_table
+; CHECK: [[GEP:%.*]] = getelementptr inbounds [2 x ptr], ptr @func_array, i32 0, i32 %index
+; CHECK: switch i32 %index, label %[[DEFAULT_LABEL:.*]] [
+; CHECK-NEXT: i32 0, label %[[CALL_0_LABEL:.*]]
+; CHECK-NEXT: i32 1, label %[[CALL_1_LABEL:.*]]
+; CHECK-NEXT: ]
+
+; CHECK: [[DEFAULT_LABEL]]
+; CHECK-NEXT: unreachable
+
+; CHECK: [[CALL_0_LABEL]]:
+; CHECK-NEXT: %1 = call i32 @func0()
+; CHECK-NEXT: br label %[[TAIL_LABEL:.*]]
+
+; CHECK: [[CALL_1_LABEL]]:
+; CHECK-NEXT: %2 = call i32 @func1()
+; CHECK-NEXT: br label %[[TAIL_LABEL]]
+
+; CHECK: [[TAIL_LABEL]]:
+; CHECK-NEXT: %3 = phi i32 [ %1, %[[CALL_0_LABEL]] ], [ %2, %[[CALL_1_LABEL]] ]
+; CHECK-NEXT: ret i32 %3
+
+; THRESHOLD-0-LABEL: define i32 @function_with_jump_table(i32 %index) {
+; THRESHOLD-0: %gep = getelementptr inbounds [2 x ptr], ptr @func_array, i32 0, i32 %index
+; THRESHOLD-0-NEXT: %func_ptr = load ptr, ptr %gep
+; THRESHOLD-0-NEXT: %result = call i32 %func_ptr()
+; THRESHOLD-0-NEXT: ret i32 %result
+; THRESHOLD-0-NEXT: }
+
+define void @void_func0() {
+ ret void
+}
+
+define void @void_func1() {
+ ret void
+}
+
+@void_func_array = global [2 x ptr] [ptr @void_func0, ptr @void_func1]
+
+define void @void_function_with_jump_table(i32 %index) {
+ %gep = getelementptr inbounds [2 x ptr], ptr @void_func_array, i32 0, i32 %index
+ %func_ptr = load ptr, ptr %gep
+ call void %func_ptr()
+ ret void
+}
+
+; CHECK-LABEL: define void @void_function_with_jump_table
+; CHECK: [[GEP:%.*]] = getelementptr inbounds [2 x ptr], ptr @void_func_array, i32 0, i32 %index
+; CHECK: switch i32 %index, label %[[VOID_DEFAULT_LABEL:.*]] [
+; CHECK-NEXT: i32 0, label %[[VOID_CALL_0_LABEL:.*]]
+; CHECK-NEXT: i32 1, label %[[VOID_CALL_1_LABEL:.*]]
+; CHECK-NEXT: ]
+
+; CHECK: [[VOID_DEFAULT_LABEL]]
+; CHECK-NEXT: unreachable
+
+; CHECK: [[VOID_CALL_0_LABEL]]:
+; CHECK-NEXT: call void @void_func0()
+; CHECK-NEXT: br label %[[VOID_TAIL_LABEL:.*]]
+
+; CHECK: [[VOID_CALL_1_LABEL]]:
+; CHECK-NEXT: call void @void_func1()
+; CHECK-NEXT: br label %[[VOID_TAIL_LABEL]]
+
+; CHECK: [[VOID_TAIL_LABEL]]:
+; CHECK-NEXT: ret void
+
+; THRESHOLD-0-LABEL: define void @void_function_with_jump_table(i32 %index) {
+; THRESHOLD-0: %gep = getelementptr inbounds [2 ...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
a936548
to
cfb6758
Compare
For the small example you have, I noticed that using libstdc++ produces clean code. Is the root cause of the size increase with std::variant in libc++'s implementation? |
I believe there are other patterns which will benefit from this pass.
|
If the patch is intended as a size reduction optimization, it should be guarded under Os/Oz. Otherwise more performance data is needed. In particular performance with FDO (FDO does similar transformation but more selectively with value profile data. |
@david-xl - correct, to the best of my knowledge libc++ tried to change the implementation in the past but then moved back. |
12896fd
to
d684cc4
Compare
Isn't this an example where we do the inverse transform? We start from a switch and SimplifyCFG converts it into a vtable? |
Regarding FDO - we've done more performance measurements (for builds with FDO), in several cases, the top-level metrics (e.g., throughput) have slightly improved (stat-sig). The concrete ranges depend on the particular service and workload; the observed ones were around ~0.2% to ~1%. |
Regarding libc++'s std::variant as compared to libstdc++, the context for that is in #62648. |
d684cc4
to
e031383
Compare
Addressed comments, expanded test coverage. |
Those are all valid concerns. It seems it needs some cost model for profitability analysis (also depend on it is optimize for size or speed). Things to consider when optimize for speed:
For now, if looks fine to me to go in if a) have a flag that is off by default nikic@, WDTY? |
ae5fd1b
to
6530281
Compare
Fwiw, I've logged more data (from quite a large build - the optimization kicked in ~10K times), in 80% of the cases the max size of the functions in the jump table is less than 25 instructions, and in 90% - less than 37 instructions. UPD. |
19b7d6a
to
7474a6b
Compare
Could you add a cl::opt to disable the pass entirely and use that in the pipeline? First iteration should be to commit it with flag disabled and enable it in a follow up PR/merge. Before turning on by default, consider adding checks using PSI/BPI and also check how/if it affects your testing. |
7474a6b
to
3a455b3
Compare
Added -enable-jump-table-to-switch, now the new pass is disabled by default. |
@dtcxzyw Could you please rerun tests? |
Done. |
3a455b3
to
ff11d0e
Compare
@nikic - many thanks for the review |
ff11d0e
to
fbeec88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
c2b7996
to
741f3cf
Compare
741f3cf
to
ede9263
Compare
Reminder that the flag still needs flipping. |
#83229 flips the flag, but it's held up because of build time issues. |
Add a pass to convert jump tables to switches.
This replaces an indirect call with a switch + direct calls and enables the compiler to apply other optimizations.
Such IR patterns come e.g. from C/C++ codebases that use arrays of function pointers.
E.g. this optimization kicks in for
For Chrome, this results in a binary size improvement of ~0.25MB - 1.6MB (measured on Linux) depending on the build configuration / implementation of "variant" (but the optimization kicks in for the code unrelated to std::visit / std::variant too).
For several internal binaries, the size decrease was approximately 0.9%, depending on how frequently the optimized patterns occur in the codebase.
The impact on the compilation time is relatively small (see https://llvm-compile-time-tracker.com/?config=Overview&stat=instructions%3Au&remote=alexander-shaposhnikov](https://www.google.com/url?sa=D&q=https%3A%2F%2Fllvm-compile-time-tracker.com%2F%3Fconfig%3DOverview%26stat%3Dinstructions%253Au%26remote%3Dalexander-shaposhnikov
perf/convert_jt_do_not_submit_4). The pass always finishes after a single scan of instructions.
Rationale for a separate pass: the existing passes (e.g. CallSiteSplitting) implement different transformations with minimal overlap with the proposed one, and the new pass can be further extended in the future.