[LLVM] Add 'ExpandVariadicsPass' to LTO default pipeline #100479

jhuber6 · 2024-07-24T22:18:32Z

Summary:
This pass expands variadic functions into non-variadic function calls
according to the target ABI. Currently, this is used as the lowering for
the NVPTX and AMDGPU targets.

This pass is currently only run late in the target's backend. However,
during LTO we want to run it before the inliner pass so that the
expanded functions can be inlined using standard heuristics. This pass
is a no-op for unsupported targets, so this won't apply to any code that
isn't already using it.

arsenm

Can we get a phase ordering test to show the inlining effect

jhuber6 · 2024-07-25T11:56:17Z

Can we get a phase ordering test to show the inlining effect

Good point, done.

llvmbot · 2024-07-25T11:56:26Z

@llvm/pr-subscribers-llvm-transforms

Author: Joseph Huber (jhuber6)

Changes

Summary:
This pass expands variadic functions into non-variadic function calls
according to the target ABI. Currently, this is used as the lowering for
the NVPTX, AMDGPU, and WASM targets.

This pass is currently only run late in the target's backend. However,
during LTO we want to run it before the inliner pass so that the
expanded functions can be inlined using standard heuristics. This pass
is a no-op for unsupported targets, so this won't apply to any code that
isn't already using it.

Full diff: https://github.com/llvm/llvm-project/pull/100479.diff

3 Files Affected:

(modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+4)
(modified) llvm/test/Other/new-pm-lto-defaults.ll (+1)
(added) llvm/test/Transforms/PhaseOrdering/varargs.ll (+48)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 6f36bdad780ae3..757b20dcd6693a 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -48,6 +48,7 @@
 #include "llvm/Transforms/IPO/DeadArgumentElimination.h"
 #include "llvm/Transforms/IPO/ElimAvailExtern.h"
 #include "llvm/Transforms/IPO/EmbedBitcodePass.h"
+#include "llvm/Transforms/IPO/ExpandVariadics.h"
 #include "llvm/Transforms/IPO/ForceFunctionAttrs.h"
 #include "llvm/Transforms/IPO/FunctionAttrs.h"
 #include "llvm/Transforms/IPO/GlobalDCE.h"
@@ -1874,6 +1875,9 @@ PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level,
   MPM.addPass(createModuleToFunctionPassAdaptor(std::move(PeepholeFPM),
                                                 PTO.EagerlyInvalidateAnalyses));
 
+  // Lower variadic functions for supported targets prior to inlining.
+  MPM.addPass(ExpandVariadicsPass(ExpandVariadicsMode::Optimize));
+
   // Note: historically, the PruneEH pass was run first to deduce nounwind and
   // generally clean up exception handling overhead. It isn't clear this is
   // valuable as the inliner doesn't currently care whether it is inlining an
diff --git a/llvm/test/Other/new-pm-lto-defaults.ll b/llvm/test/Other/new-pm-lto-defaults.ll
index d451d2897f673c..1b345780476b71 100644
--- a/llvm/test/Other/new-pm-lto-defaults.ll
+++ b/llvm/test/Other/new-pm-lto-defaults.ll
@@ -29,6 +29,7 @@
 
 ; CHECK-EP: Running pass: NoOpModulePass
 ; CHECK-O: Running pass: CrossDSOCFIPass
+; CHECK-O-NEXT: Running pass: ExpandVariadicsPass
 ; CHECK-O-NEXT: Running pass: OpenMPOptPass
 ; CHECK-O-NEXT: Running pass: GlobalDCEPass
 ; CHECK-O-NEXT: Running pass: InferFunctionAttrsPass
diff --git a/llvm/test/Transforms/PhaseOrdering/varargs.ll b/llvm/test/Transforms/PhaseOrdering/varargs.ll
new file mode 100644
index 00000000000000..dd6a41fcc96012
--- /dev/null
+++ b/llvm/test/Transforms/PhaseOrdering/varargs.ll
@@ -0,0 +1,48 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -mtriple=amdgcn-- -S -passes='lto<O2>' < %s | FileCheck %s
+target triple = "amdgcn-amd-amdhsa"
+
+; We use the ExpandVariadics pass to lower variadic functions so they can be
+; inlined.
+
+define i32 @foo() {
+; CHECK-LABEL: define i32 @foo(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    ret i32 6
+;
+entry:
+  %call = tail call i32 (i32, ...) @vararg(i32 poison, i32 noundef 1, i32 noundef 2, i32 noundef 3)
+  ret i32 %call
+}
+
+define internal i32 @vararg(i32 %first, ...) {
+entry:
+  %vlist = alloca ptr, align 8, addrspace(5)
+  %vlist.ascast = addrspacecast ptr addrspace(5) %vlist to ptr
+  call void @llvm.lifetime.start.p5(i64 8, ptr addrspace(5) %vlist)
+  call void @llvm.va_start.p0(ptr %vlist.ascast)
+  %vlist.promoted = load ptr, ptr addrspace(5) %vlist, align 8
+  %argp.next = getelementptr inbounds i8, ptr %vlist.promoted, i64 4
+  store ptr %argp.next, ptr addrspace(5) %vlist, align 8
+  %0 = load i32, ptr %vlist.promoted, align 4
+  %argp.next.1 = getelementptr inbounds i8, ptr %vlist.promoted, i64 8
+  store ptr %argp.next.1, ptr addrspace(5) %vlist, align 8
+  %1 = load i32, ptr %argp.next, align 4
+  %add.1 = add nsw i32 %1, %0
+  %argp.next.2 = getelementptr inbounds i8, ptr %vlist.promoted, i64 12
+  store ptr %argp.next.2, ptr addrspace(5) %vlist, align 8
+  %2 = load i32, ptr %argp.next.1, align 4
+  %add.2 = add nsw i32 %2, %add.1
+  call void @llvm.va_end.p0(ptr %vlist.ascast)
+  call void @llvm.lifetime.end.p5(i64 8, ptr addrspace(5) %vlist)
+  ret i32 %add.2
+}
+
+declare void @llvm.lifetime.start.p5(i64 immarg, ptr addrspace(5) nocapture)
+
+declare void @llvm.va_start.p0(ptr)
+
+declare void @llvm.lifetime.end.p5(i64 immarg, ptr addrspace(5) nocapture)
+
+declare void @llvm.va_end.p0(ptr)

llvm/test/Transforms/PhaseOrdering/varargs.ll

Summary: This pass expands variadic functions into non-variadic function calls according to the target ABI. Currently, this is used as the lowering for the NVPTX, AMDGPU, and WASM targets. This pass is currently only run late in the target's backend. However, during LTO we want to run it before the inliner pass so that the expanded functions can be inlined using standard heuristics. This pass is a no-op for unsupported targets, so this won't apply to any code that isn't already using it.

JonChesterfield · 2024-07-25T13:55:27Z

llvm/lib/Passes/PassBuilderPipelines.cpp

@@ -1874,6 +1875,9 @@ PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level,
  MPM.addPass(createModuleToFunctionPassAdaptor(std::move(PeepholeFPM),
                                                PTO.EagerlyInvalidateAnalyses));

+  // Lower variadic functions for supported targets prior to inlining.
+  MPM.addPass(ExpandVariadicsPass(ExpandVariadicsMode::Optimize));


The optimise version can run however often we like, on any subsets of the IR module. All good.

JonChesterfield · 2024-07-25T13:57:02Z

The original idea was to run the optimise per-tu instead of on the whole module at lto time, but until there's a non-gpu target using it in production the distinction doesn't matter too much. It's presumably at lto time here in order to combine application code with libc which seems legitimate

Summary: This pass expands variadic functions into non-variadic function calls according to the target ABI. Currently, this is used as the lowering for the NVPTX and AMDGPU targets. This pass is currently only run late in the target's backend. However, during LTO we want to run it before the inliner pass so that the expanded functions can be inlined using standard heuristics. This pass is a no-op for unsupported targets, so this won't apply to any code that isn't already using it. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60250576

jhuber6 requested review from arsenm, efriedma-quic, jdoerfert, JonChesterfield, Pierre-vh and shiltian July 24, 2024 22:18

arsenm reviewed Jul 25, 2024

View reviewed changes

jhuber6 force-pushed the ExpandVariadicsLTO branch from 182181e to 2a6590e Compare July 25, 2024 11:55

llvmbot added the llvm:transforms label Jul 25, 2024

jhuber6 force-pushed the ExpandVariadicsLTO branch from 2a6590e to 56af884 Compare July 25, 2024 11:58

arsenm reviewed Jul 25, 2024

View reviewed changes

llvm/test/Transforms/PhaseOrdering/varargs.ll Outdated Show resolved Hide resolved

llvm/test/Transforms/PhaseOrdering/varargs.ll Outdated Show resolved Hide resolved

jhuber6 force-pushed the ExpandVariadicsLTO branch from 56af884 to 802badf Compare July 25, 2024 12:08

JonChesterfield approved these changes Jul 25, 2024

View reviewed changes

jhuber6 merged commit 8758091 into llvm:main Jul 25, 2024
4 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[LLVM] Add 'ExpandVariadicsPass' to LTO default pipeline #100479

[LLVM] Add 'ExpandVariadicsPass' to LTO default pipeline #100479

Uh oh!

jhuber6 commented Jul 24, 2024 •

edited by JonChesterfield

Loading

Uh oh!

arsenm left a comment

Uh oh!

jhuber6 commented Jul 25, 2024

Uh oh!

llvmbot commented Jul 25, 2024

Uh oh!

Uh oh!

Uh oh!

JonChesterfield Jul 25, 2024

Uh oh!

JonChesterfield commented Jul 25, 2024

Uh oh!

Uh oh!

Uh oh!

[LLVM] Add 'ExpandVariadicsPass' to LTO default pipeline #100479

[LLVM] Add 'ExpandVariadicsPass' to LTO default pipeline #100479

Uh oh!

Conversation

jhuber6 commented Jul 24, 2024 • edited by JonChesterfield Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arsenm left a comment

Choose a reason for hiding this comment

Uh oh!

jhuber6 commented Jul 25, 2024

Uh oh!

llvmbot commented Jul 25, 2024

Uh oh!

Uh oh!

Uh oh!

JonChesterfield Jul 25, 2024

Choose a reason for hiding this comment

Uh oh!

JonChesterfield commented Jul 25, 2024

Uh oh!

Uh oh!

Uh oh!

jhuber6 commented Jul 24, 2024 •

edited by JonChesterfield

Loading