Skip to content

[Aarch64] [ISel] Don't save vaargs registers if vaargs are unused #126780

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 5, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8189,7 +8189,7 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
}

// varargs
if (isVarArg) {
if (isVarArg && DAG.getMachineFunction().getFrameInfo().hasVAStart()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this path is already handled in globalisel that should also be updated to match

Copy link
Contributor Author

@pskrgag pskrgag Feb 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't think I get what you mean, sorry.

IIUC, GlobalIsel is quite new approach for generating machine code, which is supposed to be a successor for SelectionDAG. It can be turned on using --global-isel=1.

So, I didn't find any hasVAStart() references in GISel code and tried simple examples to see if such optimization exists.

After some tests I've observed that llc actually crashes on vaargs, when GISel is turned on. Here is godbolt link for x86 https://godbolt.org/z/q66feEb65. And locally (after addressing @ostannard review) I got similar failure on CodeGen/AArch64/win64_vararg2.ll test.

+ /home/paskripkin/Documents/git/llvm-project/build/bin/llc -global-isel -mtriple=aarch64-pc-win32
+ /home/paskripkin/Documents/git/llvm-project/build/bin/FileCheck /home/paskripkin/Documents/git/llvm-project/llvm/test/CodeGen/AArch64/win64_vararg2.ll --check-prefix=GISEL
LLVM ERROR: cannot select: G_VASTART %17:gpr(p0) :: (store (s64) into %ir.valist) (in function: va_func)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/paskripkin/Documents/git/llvm-project/build/bin/llc -global-isel -mtriple=aarch64-pc-win32
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'InstructionSelect' on function '@va_func'

Is it known? If not, I will try to look deeper

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as it works with -global-isel-abort=0, you're fine; you don't need to implement G_VASTART on any target where it isn't already implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Added -global-isel-abort=0 to failing test.

if (!Subtarget->isTargetDarwin() || IsWin64) {
// The AAPCS variadic function ABI is identical to the non-variadic
// one. As a result there may be more arguments in registers and we should
Expand Down
64 changes: 54 additions & 10 deletions llvm/test/CodeGen/AArch64/GlobalISel/aapcs_vararg_frame.ll
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,11 @@
; RUN: llc < %s --global-isel=0 -mtriple=aarch64-linux-gnu -mattr=+fp-armv8 | FileCheck %s
; RUN: llc < %s --global-isel=1 -mtriple=aarch64-linux-gnu -mattr=+fp-armv8 | FileCheck %s --check-prefix=GISEL

%struct.__va_list = type { ptr, ptr, ptr, i32, i32 }

define void @va(i32 %count, half %f, ...) nounwind {
; CHECK-LABEL: va:
; CHECK: // %bb.0: // %entry
; CHECK-NEXT: sub sp, sp, #176
; CHECK-NEXT: stp x2, x3, [sp, #128]
; CHECK-NEXT: str x1, [sp, #120]
; CHECK-NEXT: stp x4, x5, [sp, #144]
; CHECK-NEXT: stp x6, x7, [sp, #160]
; CHECK-NEXT: stp q1, q2, [sp]
; CHECK-NEXT: stp q3, q4, [sp, #32]
; CHECK-NEXT: stp q5, q6, [sp, #64]
; CHECK-NEXT: str q7, [sp, #96]
; CHECK-NEXT: add sp, sp, #176
; CHECK-NEXT: ret
;
; GISEL-LABEL: va:
Expand All @@ -33,3 +25,55 @@ define void @va(i32 %count, half %f, ...) nounwind {
entry:
ret void
}

define void @va_used(i32 %count, half %f, ...) nounwind {
; CHECK-LABEL: va_used:
; CHECK: // %bb.0:
; CHECK-NEXT: sub sp, sp, #208
; CHECK-NEXT: mov x8, #-56 // =0xffffffffffffffc8
; CHECK-NEXT: mov x9, sp
; CHECK-NEXT: add x10, sp, #120
; CHECK-NEXT: movk x8, #65424, lsl #32
; CHECK-NEXT: add x9, x9, #112
; CHECK-NEXT: stp x2, x3, [sp, #128]
; CHECK-NEXT: stp x9, x8, [sp, #192]
; CHECK-NEXT: add x8, x10, #56
; CHECK-NEXT: add x9, sp, #208
; CHECK-NEXT: str x1, [sp, #120]
; CHECK-NEXT: stp x4, x5, [sp, #144]
; CHECK-NEXT: stp x6, x7, [sp, #160]
; CHECK-NEXT: stp q1, q2, [sp]
; CHECK-NEXT: stp q3, q4, [sp, #32]
; CHECK-NEXT: stp q5, q6, [sp, #64]
; CHECK-NEXT: str q7, [sp, #96]
; CHECK-NEXT: stp x9, x8, [sp, #176]
; CHECK-NEXT: add sp, sp, #208
; CHECK-NEXT: ret
;
; GISEL-LABEL: va_used:
; GISEL: // %bb.0:
; GISEL-NEXT: sub sp, sp, #208
; GISEL-NEXT: mov x8, sp
; GISEL-NEXT: add x9, sp, #208
; GISEL-NEXT: add x10, sp, #208
; GISEL-NEXT: stp x9, x10, [x8]
; GISEL-NEXT: add x9, sp, #144
; GISEL-NEXT: mov w10, #-112 // =0xffffff90
; GISEL-NEXT: str x9, [x8, #16]
; GISEL-NEXT: mov w9, #-56 // =0xffffffc8
; GISEL-NEXT: stp x1, x2, [sp, #152]
; GISEL-NEXT: stp x3, x4, [sp, #168]
; GISEL-NEXT: stp x5, x6, [sp, #184]
; GISEL-NEXT: str x7, [sp, #200]
; GISEL-NEXT: stp q1, q2, [sp, #32]
; GISEL-NEXT: stp q3, q4, [sp, #64]
; GISEL-NEXT: stp q5, q6, [sp, #96]
; GISEL-NEXT: str q7, [sp, #128]
; GISEL-NEXT: stp w9, w10, [x8, #24]
; GISEL-NEXT: add sp, sp, #208
; GISEL-NEXT: ret
%valist = alloca %struct.__va_list
call void @llvm.va_start(ptr %valist)
call void @llvm.va_end(ptr %valist)
ret void
}
24 changes: 14 additions & 10 deletions llvm/test/CodeGen/AArch64/alloca.ll
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,8 @@ define i64 @test_alloca_with_local(i64 %n) {
; CHECK: ret
}

%struct.__va_list = type { ptr, ptr, ptr, i32, i32 }

define void @test_variadic_alloca(i64 %n, ...) {
; CHECK-LABEL: test_variadic_alloca:

Expand All @@ -77,31 +79,33 @@ define void @test_variadic_alloca(i64 %n, ...) {

; CHECK: stp x29, x30, [sp, #-16]!
; CHECK: mov x29, sp
; CHECK: sub sp, sp, #192
; CHECK-DAG: stp q6, q7, [x29, #-96]
; CHECK: sub sp, sp, #224
; CHECK-DAG: stp q6, q7, [x29, #-128]
; [...]
; CHECK-DAG: stp q0, q1, [x29, #-192]
; CHECK-DAG: stp q2, q3, [x29, #-192]

; CHECK-DAG: stp x5, x6, [x29, #-24]
; CHECK-DAG: stp x5, x6, [x29, #-56]
; [...]
; CHECK-DAG: stp x1, x2, [x29, #-56]
; CHECK-DAG: stp x1, x2, [x29, #-88]

; CHECK-NOFP-ARM64: stp x29, x30, [sp, #-16]!
; CHECK-NOFP-ARM64: mov x29, sp
; CHECK-NOFP-ARM64: sub sp, sp, #64
; CHECK-NOFP-ARM64-DAG: stp x5, x6, [x29, #-24]
; CHECK-NOFP-ARM64: sub sp, sp, #16
; CHECK-NOFP-ARM64-DAG: stp x5, x6, [x29, #-56]
; [...]
; CHECK-NOFP-ARM64-DAG: stp x3, x4, [x29, #-40]
; CHECK-NOFP-ARM64-DAG: stp x3, x4, [x29, #-72]
; [...]
; CHECK-NOFP-ARM64-DAG: stp x1, x2, [x29, #-56]
; CHECK-NOFP-ARM64-DAG: stp x1, x2, [x29, #-88]
; [...]
; CHECK-NOFP-ARM64: mov x8, sp

%valist = alloca %struct.__va_list
call void @llvm.va_start(ptr %valist)
%addr = alloca i8, i64 %n

call void @use_addr(ptr %addr)
; CHECK: bl use_addr

call void @llvm.va_end(ptr %valist)
ret void

; CHECK-NOFP-AARCH64: sub sp, x29, #64
Expand Down
13 changes: 9 additions & 4 deletions llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,8 @@ define dso_local ptr @func() hybrid_patchable nounwind {
ret ptr @func
}

%struct.__va_list = type { ptr, ptr, ptr, i32, i32 }

define void @has_varargs(...) hybrid_patchable nounwind {
; SYM: [11](sec 5)(fl 0x00)(ty 20)(scl 2) (nx 0) 0x00000000 #has_varargs$hp_target
; CHECK-LABEL: .def "#has_varargs$hp_target";
Expand All @@ -24,11 +26,14 @@ define void @has_varargs(...) hybrid_patchable nounwind {
; CHECK-NEXT: .p2align 2
; CHECK-NEXT: "#has_varargs$hp_target": // @"#has_varargs$hp_target"
; CHECK-NEXT: // %bb.0:
; CHECK-NEXT: sub sp, sp, #32
; CHECK-NEXT: stp x0, x1, [x4, #-32]
; CHECK-NEXT: stp x2, x3, [x4, #-16]
; CHECK-NEXT: add sp, sp, #32
; CHECK-NEXT: sub sp, sp, #64
; CHECK-NEXT: stp x0, x1, [x4, #-32]!
; CHECK-NEXT: stp x2, x3, [x4, #16]
; CHECK-NEXT: str x4, [sp], #64
; CHECK-NEXT: ret
%valist = alloca %struct.__va_list
call void @llvm.va_start(ptr %valist)
call void @llvm.va_end(ptr %valist)
ret void
}

Expand Down
4 changes: 2 additions & 2 deletions llvm/test/CodeGen/AArch64/darwinpcs-tail.ll
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,8 @@
; CHECK-LABEL: _tailTest:
; CHECK: b __ZN1C3addEPKcz
; CHECK-LABEL: __ZThn8_N1C1fEiiiiiiiiiz:
; CHECK: ldr w9, [sp, #4]
; CHECK: str w9, [sp, #4]
; CHECK: ldr w8, [sp, #4]
; CHECK: str w8, [sp, #4]
; CHECK: b __ZN1C1fEiiiiiiiiiz

%class.C = type { %class.A.base, [4 x i8], %class.B.base, [4 x i8] }
Expand Down
10 changes: 8 additions & 2 deletions llvm/test/CodeGen/AArch64/vararg-tallcall.ll
Original file line number Diff line number Diff line change
Expand Up @@ -8,18 +8,22 @@ target datalayout = "e-m:w-p:64:64-i32:32-i64:64-i128:128-n32:64-S128"

%class.X = type { i8 }
%struct.B = type { ptr }
%struct.__va_list = type { ptr, ptr, ptr, i32, i32 }

$"??_9B@@$BA@AA" = comdat any

; Function Attrs: noinline optnone
define linkonce_odr void @"??_9B@@$BA@AA"(ptr %this, ...) #1 comdat align 2 {
entry:
%valist = alloca %struct.__va_list
call void @llvm.va_start.p0(ptr %valist)
%this.addr = alloca ptr, align 8
store ptr %this, ptr %this.addr, align 8
%this1 = load ptr, ptr %this.addr, align 8
call void asm sideeffect "", "~{d0}"()
%vtable = load ptr, ptr %this1, align 8
%0 = load ptr, ptr %vtable, align 8
call void @llvm.va_end.p0(ptr %valist)
musttail call void (ptr, ...) %0(ptr %this1, ...)
ret void
; No predecessors!
Expand All @@ -36,7 +40,9 @@ attributes #1 = { noinline optnone "thunk" }
; CHECK-EC: mov v7.16b, v0.16b
; CHECK-EC: ldr x9, [x0]
; CHECK-EC: ldr x11, [x9]
; CHECH-EC: add x4, sp, #96
; CHECK-EC: mov v0.16b, v7.16b
; CHECK-EC: add x4, sp, #64
; CHECK-EC: add sp, sp, #64
; CHECK-EC: add x4, sp, #96
; CHECK-EC: ldr x30, [sp, #48]
; CHECK-EC: add sp, sp, #96
; CHECK-EC: br x11
71 changes: 40 additions & 31 deletions llvm/test/CodeGen/AArch64/win64_vararg2.ll
Original file line number Diff line number Diff line change
@@ -1,36 +1,40 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=aarch64-pc-win32 | FileCheck %s
; RUN: llc < %s -global-isel -mtriple=aarch64-pc-win32 | FileCheck %s --check-prefix=GISEL
; RUN: llc < %s -global-isel -mtriple=aarch64-pc-win32 -global-isel-abort=0 | FileCheck %s --check-prefix=GISEL

%struct.__va_list = type { ptr, ptr, ptr, i32, i32 }

; Function Attrs: mustprogress noinline nounwind optnone uwtable
define i1 @va_func(i32 %a, i8 %b, i8 %c, ...) {
; CHECK-LABEL: va_func:
; CHECK: .seh_proc va_func
; CHECK-NEXT: // %bb.0:
; CHECK-NEXT: sub sp, sp, #80
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another test which is trying to check vararg behaviour, so should have a va_start call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So looks like GISel does not support vastart for win calling convention

bool AArch64InstructionSelector::selectVaStartAAPCS(
    MachineInstr &I, MachineFunction &MF, MachineRegisterInfo &MRI) const {

  if (STI.isCallingConvWin64(MF.getFunction().getCallingConv(),
                             MF.getFunction().isVarArg()))
    return false;

; CHECK-NEXT: .seh_stackalloc 80
; CHECK-NEXT: str x19, [sp, #16] // 8-byte Folded Spill
; CHECK-NEXT: .seh_save_reg x19, 16
; CHECK-NEXT: str x30, [sp, #24] // 8-byte Folded Spill
; CHECK-NEXT: .seh_save_reg x30, 24
; CHECK-NEXT: sub sp, sp, #112
; CHECK-NEXT: .seh_stackalloc 112
; CHECK-NEXT: str x19, [sp, #48] // 8-byte Folded Spill
; CHECK-NEXT: .seh_save_reg x19, 48
; CHECK-NEXT: str x30, [sp, #56] // 8-byte Folded Spill
; CHECK-NEXT: .seh_save_reg x30, 56
; CHECK-NEXT: .seh_endprologue
; CHECK-NEXT: add x8, sp, #72
; CHECK-NEXT: mov w19, w0
; CHECK-NEXT: stp x3, x4, [sp, #40]
; CHECK-NEXT: stp x5, x6, [sp, #56]
; CHECK-NEXT: str x7, [sp, #72]
; CHECK-NEXT: stp x3, x4, [sp, #72]
; CHECK-NEXT: stp x5, x6, [sp, #88]
; CHECK-NEXT: str x7, [sp, #104]
; CHECK-NEXT: str x8, [sp, #16]
; CHECK-NEXT: str w0, [sp, #12]
; CHECK-NEXT: strb w1, [sp, #11]
; CHECK-NEXT: strb w2, [sp, #10]
; CHECK-NEXT: bl other
; CHECK-NEXT: cmp w19, w0
; CHECK-NEXT: cset w0, ls
; CHECK-NEXT: .seh_startepilogue
; CHECK-NEXT: ldr x30, [sp, #24] // 8-byte Folded Reload
; CHECK-NEXT: .seh_save_reg x30, 24
; CHECK-NEXT: ldr x19, [sp, #16] // 8-byte Folded Reload
; CHECK-NEXT: .seh_save_reg x19, 16
; CHECK-NEXT: add sp, sp, #80
; CHECK-NEXT: .seh_stackalloc 80
; CHECK-NEXT: ldr x30, [sp, #56] // 8-byte Folded Reload
; CHECK-NEXT: .seh_save_reg x30, 56
; CHECK-NEXT: ldr x19, [sp, #48] // 8-byte Folded Reload
; CHECK-NEXT: .seh_save_reg x19, 48
; CHECK-NEXT: add sp, sp, #112
; CHECK-NEXT: .seh_stackalloc 112
; CHECK-NEXT: .seh_endepilogue
; CHECK-NEXT: ret
; CHECK-NEXT: .seh_endfunclet
Expand All @@ -39,34 +43,38 @@ define i1 @va_func(i32 %a, i8 %b, i8 %c, ...) {
; GISEL-LABEL: va_func:
; GISEL: .seh_proc va_func
; GISEL-NEXT: // %bb.0:
; GISEL-NEXT: sub sp, sp, #80
; GISEL-NEXT: .seh_stackalloc 80
; GISEL-NEXT: str x19, [sp, #16] // 8-byte Folded Spill
; GISEL-NEXT: .seh_save_reg x19, 16
; GISEL-NEXT: str x30, [sp, #24] // 8-byte Folded Spill
; GISEL-NEXT: .seh_save_reg x30, 24
; GISEL-NEXT: sub sp, sp, #112
; GISEL-NEXT: .seh_stackalloc 112
; GISEL-NEXT: str x19, [sp, #48] // 8-byte Folded Spill
; GISEL-NEXT: .seh_save_reg x19, 48
; GISEL-NEXT: str x30, [sp, #56] // 8-byte Folded Spill
; GISEL-NEXT: .seh_save_reg x30, 56
; GISEL-NEXT: .seh_endprologue
; GISEL-NEXT: stp x3, x4, [sp, #40]
; GISEL-NEXT: add x8, sp, #72
; GISEL-NEXT: mov w19, w0
; GISEL-NEXT: stp x5, x6, [sp, #56]
; GISEL-NEXT: str x7, [sp, #72]
; GISEL-NEXT: stp x3, x4, [sp, #72]
; GISEL-NEXT: stp x5, x6, [sp, #88]
; GISEL-NEXT: str x7, [sp, #104]
; GISEL-NEXT: str x8, [sp, #16]
; GISEL-NEXT: str w0, [sp, #12]
; GISEL-NEXT: strb w1, [sp, #11]
; GISEL-NEXT: strb w2, [sp, #10]
; GISEL-NEXT: bl other
; GISEL-NEXT: cmp w19, w0
; GISEL-NEXT: cset w0, ls
; GISEL-NEXT: .seh_startepilogue
; GISEL-NEXT: ldr x30, [sp, #24] // 8-byte Folded Reload
; GISEL-NEXT: .seh_save_reg x30, 24
; GISEL-NEXT: ldr x19, [sp, #16] // 8-byte Folded Reload
; GISEL-NEXT: .seh_save_reg x19, 16
; GISEL-NEXT: add sp, sp, #80
; GISEL-NEXT: .seh_stackalloc 80
; GISEL-NEXT: ldr x30, [sp, #56] // 8-byte Folded Reload
; GISEL-NEXT: .seh_save_reg x30, 56
; GISEL-NEXT: ldr x19, [sp, #48] // 8-byte Folded Reload
; GISEL-NEXT: .seh_save_reg x19, 48
; GISEL-NEXT: add sp, sp, #112
; GISEL-NEXT: .seh_stackalloc 112
; GISEL-NEXT: .seh_endepilogue
; GISEL-NEXT: ret
; GISEL-NEXT: .seh_endfunclet
; GISEL-NEXT: .seh_endproc
%valist = alloca %struct.__va_list
call void @llvm.va_start(ptr %valist)
%a_alloc = alloca i32, align 4
%b_alloc = alloca i8, align 1
%c_alloc = alloca i8, align 1
Expand All @@ -76,6 +84,7 @@ define i1 @va_func(i32 %a, i8 %b, i8 %c, ...) {
%a_load = load i32, ptr %a_alloc, align 4
%ret = call noundef i32 @other()
%cmp = icmp ule i32 %a_load, %ret
call void @llvm.va_end(ptr %valist)
ret i1 %cmp
}

Expand Down