[X86][GlobalISel] Support StructRet arguments #96629

Merged: 3 commits, Jul 2, 2024

e-kud (Contributor) commented Jun 25, 2024

We follow the SelectionDAG and FastISel approach: record the register that holds the StructRet argument during formal argument lowering, then use it during return lowering to insert a copy of the StructRet argument into RAX (EAX on 32-bit targets).

Also add RAX as an operand of the RET instruction. This fixes a difference between GlobalISel and SelectionDAG: without the operand RAX was unused, so the copy instruction could be deleted.
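
The two-step scheme described above can be sketched as a small standalone model. This is not the LLVM code: `ToyArg`, `ToyFunctionInfo` and the string-based output are hypothetical stand-ins, and the value 0 is assumed to mean "no register", loosely mirroring an invalid register.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration; none of these are LLVM types.
struct ToyArg {
  unsigned VReg; // virtual register assigned to the argument (0 = invalid)
  bool IsSRet;   // true if the argument carries the sret attribute
};

struct ToyFunctionInfo {
  unsigned SRetReturnReg = 0; // 0 means "no sret argument seen"
};

// Step 1: formal argument lowering remembers the sret virtual register.
void lowerFormalArguments(const std::vector<ToyArg> &Args,
                          ToyFunctionInfo &FI) {
  for (const ToyArg &A : Args)
    if (A.IsSRet)
      FI.SRetReturnReg = A.VReg;
}

// Step 2: return lowering copies that register into RAX/EAX and lists the
// physical register as an operand of RET.
std::vector<std::string> lowerReturn(const ToyFunctionInfo &FI, bool Is64Bit) {
  const std::string RetReg = Is64Bit ? "$rax" : "$eax";
  std::vector<std::string> Out;
  if (FI.SRetReturnReg != 0) {
    Out.push_back(RetReg + " = COPY %" + std::to_string(FI.SRetReturnReg));
    Out.push_back("RET 0, " + RetReg);
  } else {
    Out.push_back("RET 0");
  }
  return Out;
}
```

With one sret argument in `%1` on a 64-bit target, the sketch yields `$rax = COPY %1` followed by `RET 0, $rax`, the same shape as the X64 check lines in the new `test_sret` test.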

We follow SelectionDAG and FastISel manner: set a register during formal
arguments lowering. And use this register to insert a copy of StructRet
argument to RAX register during return lowering.

Fix a minor difference between GlobalISel and SelectionDAG: the RAX
register wasn't used, so the copy instruction could be deleted.
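
To illustrate why an unused RAX lets the copy disappear, here is a toy backward liveness sweep. It is only a sketch under simplifying assumptions: `ToyInstr` and `removeDeadCopies` are hypothetical names, and real LLVM passes treat physical registers and function exits with more care than this.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction: an opcode, at most one defined register, and its uses.
struct ToyInstr {
  std::string Opcode;
  std::string Def;               // register written, empty if none
  std::vector<std::string> Uses; // registers read
};

// Walk the block backwards and drop every COPY whose result is not read by
// any later instruction.
std::vector<ToyInstr> removeDeadCopies(const std::vector<ToyInstr> &Block) {
  std::set<std::string> Live;
  std::vector<ToyInstr> Kept;
  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    if (It->Opcode == "COPY" && Live.count(It->Def) == 0)
      continue; // nothing reads the result, so the copy is dead
    Live.erase(It->Def);
    Live.insert(It->Uses.begin(), It->Uses.end());
    Kept.insert(Kept.begin(), *It);
  }
  return Kept;
}
```

In this model a block of `$rax = COPY %0` followed by a bare `RET` loses the copy, while the same block with `$rax` listed as a use of `RET` keeps it.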
llvmbot (Member) commented Jun 25, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-globalisel

Author: Evgenii Kudriashov (e-kud)

Changes

We follow SelectionDAG and FastISel manner: set a register during formal arguments lowering and use this register to insert a copy of StructRet argument to RAX register during return lowering.

Also add RAX register to RET instruction to fix a difference between GlobalISel and SelectionDAG,
when the copy instruction could be deleted.


Full diff: https://github.com/llvm/llvm-project/pull/96629.diff

4 Files Affected:

  • (modified) llvm/lib/Target/X86/GISel/X86CallLowering.cpp (+16-4)
  • (modified) llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll (+30-2)
  • (modified) llvm/test/CodeGen/X86/isel-buildvector-sse.ll (+14-12)
  • (modified) llvm/test/CodeGen/X86/isel-buildvector-sse2.ll (+12-11)
diff --git a/llvm/lib/Target/X86/GISel/X86CallLowering.cpp b/llvm/lib/Target/X86/GISel/X86CallLowering.cpp
index 48830769fdf6c..4975582e89e31 100644
--- a/llvm/lib/Target/X86/GISel/X86CallLowering.cpp
+++ b/llvm/lib/Target/X86/GISel/X86CallLowering.cpp
@@ -16,6 +16,7 @@
 #include "X86CallingConv.h"
 #include "X86ISelLowering.h"
 #include "X86InstrInfo.h"
+#include "X86MachineFunctionInfo.h"
 #include "X86RegisterInfo.h"
 #include "X86Subtarget.h"
 #include "llvm/ADT/ArrayRef.h"
@@ -147,12 +148,17 @@ bool X86CallLowering::lowerReturn(MachineIRBuilder &MIRBuilder,
          "Return value without a vreg");
   MachineFunction &MF = MIRBuilder.getMF();
   auto MIB = MIRBuilder.buildInstrNoInsert(X86::RET).addImm(0);
-  const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();
-  bool Is64Bit = STI.is64Bit();
+  auto FuncInfo = MF.getInfo<X86MachineFunctionInfo>();
+  const auto &STI = MF.getSubtarget<X86Subtarget>();
+  Register RetReg = STI.is64Bit() ? X86::RAX : X86::EAX;
 
   if (!FLI.CanLowerReturn) {
     insertSRetStores(MIRBuilder, Val->getType(), VRegs, FLI.DemoteRegister);
-    MIRBuilder.buildCopy(Is64Bit ? X86::RAX : X86::EAX, FLI.DemoteRegister);
+    MIRBuilder.buildCopy(RetReg, FLI.DemoteRegister);
+    MIB.addReg(RetReg);
+  } else if (Register Reg = FuncInfo->getSRetReturnReg()) {
+    MIRBuilder.buildCopy(RetReg, Reg);
+    MIB.addReg(RetReg);
   } else if (!VRegs.empty()) {
     const Function &F = MF.getFunction();
     MachineRegisterInfo &MRI = MF.getRegInfo();
@@ -258,6 +264,7 @@ bool X86CallLowering::lowerFormalArguments(MachineIRBuilder &MIRBuilder,
   MachineFunction &MF = MIRBuilder.getMF();
   MachineRegisterInfo &MRI = MF.getRegInfo();
   auto DL = MF.getDataLayout();
+  auto FuncInfo = MF.getInfo<X86MachineFunctionInfo>();
 
   SmallVector<ArgInfo, 8> SplitArgs;
 
@@ -273,12 +280,17 @@ bool X86CallLowering::lowerFormalArguments(MachineIRBuilder &MIRBuilder,
     // TODO: handle not simple cases.
     if (Arg.hasAttribute(Attribute::ByVal) ||
         Arg.hasAttribute(Attribute::InReg) ||
-        Arg.hasAttribute(Attribute::StructRet) ||
         Arg.hasAttribute(Attribute::SwiftSelf) ||
         Arg.hasAttribute(Attribute::SwiftError) ||
         Arg.hasAttribute(Attribute::Nest) || VRegs[Idx].size() > 1)
       return false;
 
+    if (Arg.hasAttribute(Attribute::StructRet)) {
+      assert(VRegs[Idx].size() == 1 &&
+             "Unexpected amount of registers for sret argument.");
+      FuncInfo->setSRetReturnReg(VRegs[Idx][0]);
+    }
+
     ArgInfo OrigArg(VRegs[Idx], Arg.getType(), Idx);
     setArgFlags(OrigArg, Idx + AttributeList::FirstArgIndex, DL, F);
     splitToValueTypes(OrigArg, SplitArgs, DL, F.getCallingConv());
diff --git a/llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll b/llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll
index 55e73dc5d29ec..a797c235c46f4 100644
--- a/llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll
+++ b/llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll
@@ -5,6 +5,7 @@
 @a1_8bit = external global i8
 @a7_8bit = external global i8
 @a8_8bit = external global i8
+%struct.all = type { i8, i16, i32, i8, i16, i32, i64, float, double }
 
 define i8 @test_i8_args_8(i8 %arg1, i8 %arg2, i8 %arg3, i8 %arg4, i8 %arg5, i8 %arg6, i8 %arg7, i8 %arg8) {
   ; X86-LABEL: name: test_i8_args_8
@@ -745,7 +746,7 @@ define <32 x float> @test_return_v32f32() {
   ; X86-NEXT:   [[BUILD_VECTOR:%[0-9]+]]:_(<32 x s32>) = G_BUILD_VECTOR [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32)
   ; X86-NEXT:   G_STORE [[BUILD_VECTOR]](<32 x s32>), [[LOAD]](p0) :: (store (<32 x s32>))
   ; X86-NEXT:   $eax = COPY [[LOAD]](p0)
-  ; X86-NEXT:   RET 0
+  ; X86-NEXT:   RET 0, $eax
   ;
   ; X64-LABEL: name: test_return_v32f32
   ; X64: bb.1 (%ir-block.0):
@@ -756,7 +757,7 @@ define <32 x float> @test_return_v32f32() {
   ; X64-NEXT:   [[BUILD_VECTOR:%[0-9]+]]:_(<32 x s32>) = G_BUILD_VECTOR [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32)
   ; X64-NEXT:   G_STORE [[BUILD_VECTOR]](<32 x s32>), [[COPY]](p0) :: (store (<32 x s32>))
   ; X64-NEXT:   $rax = COPY [[COPY]](p0)
-  ; X64-NEXT:   RET 0
+  ; X64-NEXT:   RET 0, $rax
   ret <32 x float> zeroinitializer
 }
 
@@ -793,3 +794,30 @@ define float @test_call_v32f32() {
   %elt = extractelement <32 x float> %vect, i32 7
   ret float %elt
 }
+
+define void @test_sret(ptr sret(%struct.all) align 8 %result) #0 {
+  ; X86-LABEL: name: test_sret
+  ; X86: bb.1.entry:
+  ; X86-NEXT:   [[FRAME_INDEX:%[0-9]+]]:_(p0) = G_FRAME_INDEX %fixed-stack.0
+  ; X86-NEXT:   [[LOAD:%[0-9]+]]:_(p0) = G_LOAD [[FRAME_INDEX]](p0) :: (invariant load (p0) from %fixed-stack.0, align 16)
+  ; X86-NEXT:   [[C:%[0-9]+]]:_(s8) = G_CONSTANT i8 104
+  ; X86-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY [[LOAD]](p0)
+  ; X86-NEXT:   G_STORE [[C]](s8), [[COPY]](p0) :: (store (s8) into %ir.c, align 8)
+  ; X86-NEXT:   $eax = COPY [[LOAD]](p0)
+  ; X86-NEXT:   RET 0, $eax
+  ;
+  ; X64-LABEL: name: test_sret
+  ; X64: bb.1.entry:
+  ; X64-NEXT:   liveins: $rdi
+  ; X64-NEXT: {{  $}}
+  ; X64-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $rdi
+  ; X64-NEXT:   [[C:%[0-9]+]]:_(s8) = G_CONSTANT i8 104
+  ; X64-NEXT:   [[COPY1:%[0-9]+]]:_(p0) = COPY [[COPY]](p0)
+  ; X64-NEXT:   G_STORE [[C]](s8), [[COPY1]](p0) :: (store (s8) into %ir.c, align 8)
+  ; X64-NEXT:   $rax = COPY [[COPY]](p0)
+  ; X64-NEXT:   RET 0, $rax
+entry:
+  %c = getelementptr inbounds %struct.all, ptr %result, i32 0, i32 0
+  store i8 104, ptr %c, align 8
+  ret void
+}
diff --git a/llvm/test/CodeGen/X86/isel-buildvector-sse.ll b/llvm/test/CodeGen/X86/isel-buildvector-sse.ll
index 5b96d57cf019b..7f580aad78764 100644
--- a/llvm/test/CodeGen/X86/isel-buildvector-sse.ll
+++ b/llvm/test/CodeGen/X86/isel-buildvector-sse.ll
@@ -22,22 +22,23 @@ define <8 x i32> @test_vector_v8i32() {
 ;
 ; SSE-X64-GISEL-LABEL: test_vector_v8i32:
 ; SSE-X64-GISEL:       # %bb.0:
-; SSE-X64-GISEL-NEXT:    movl $128100944, %eax # imm = 0x7A2AA50
-; SSE-X64-GISEL-NEXT:    movl $-632258670, %ecx # imm = 0xDA507F92
-; SSE-X64-GISEL-NEXT:    movl $-408980432, %edx # imm = 0xE79F7430
-; SSE-X64-GISEL-NEXT:    movl $708630551, %esi # imm = 0x2A3CD817
+; SSE-X64-GISEL-NEXT:    movq %rdi, %rax
+; SSE-X64-GISEL-NEXT:    movl $128100944, %ecx # imm = 0x7A2AA50
+; SSE-X64-GISEL-NEXT:    movl $-632258670, %edx # imm = 0xDA507F92
+; SSE-X64-GISEL-NEXT:    movl $-408980432, %esi # imm = 0xE79F7430
+; SSE-X64-GISEL-NEXT:    movl $708630551, %edi # imm = 0x2A3CD817
 ; SSE-X64-GISEL-NEXT:    movl $-871899055, %r8d # imm = 0xCC07E051
 ; SSE-X64-GISEL-NEXT:    movl $-633489957, %r9d # imm = 0xDA3DB5DB
 ; SSE-X64-GISEL-NEXT:    movl $591019567, %r10d # imm = 0x233A3E2F
 ; SSE-X64-GISEL-NEXT:    movl $708632899, %r11d # imm = 0x2A3CE143
-; SSE-X64-GISEL-NEXT:    movl %eax, (%rdi)
-; SSE-X64-GISEL-NEXT:    movl %ecx, 4(%rdi)
-; SSE-X64-GISEL-NEXT:    movl %edx, 8(%rdi)
-; SSE-X64-GISEL-NEXT:    movl %esi, 12(%rdi)
-; SSE-X64-GISEL-NEXT:    movl %r8d, 16(%rdi)
-; SSE-X64-GISEL-NEXT:    movl %r9d, 20(%rdi)
-; SSE-X64-GISEL-NEXT:    movl %r10d, 24(%rdi)
-; SSE-X64-GISEL-NEXT:    movl %r11d, 28(%rdi)
+; SSE-X64-GISEL-NEXT:    movl %ecx, (%rax)
+; SSE-X64-GISEL-NEXT:    movl %edx, 4(%rax)
+; SSE-X64-GISEL-NEXT:    movl %esi, 8(%rax)
+; SSE-X64-GISEL-NEXT:    movl %edi, 12(%rax)
+; SSE-X64-GISEL-NEXT:    movl %r8d, 16(%rax)
+; SSE-X64-GISEL-NEXT:    movl %r9d, 20(%rax)
+; SSE-X64-GISEL-NEXT:    movl %r10d, 24(%rax)
+; SSE-X64-GISEL-NEXT:    movl %r11d, 28(%rax)
 ; SSE-X64-GISEL-NEXT:    retq
 ;
 ; SSE-X86-LABEL: test_vector_v8i32:
@@ -88,6 +89,7 @@ define <4 x i32> @test_vector_v4i32() {
 ;
 ; SSE-X64-GISEL-LABEL: test_vector_v4i32:
 ; SSE-X64-GISEL:       # %bb.0:
+; SSE-X64-GISEL-NEXT:    movq %rdi, %rax
 ; SSE-X64-GISEL-NEXT:    movaps {{.*#+}} xmm0 = [128100944,3662708626,3885986864,708630551]
 ; SSE-X64-GISEL-NEXT:    movaps %xmm0, (%rdi)
 ; SSE-X64-GISEL-NEXT:    retq
diff --git a/llvm/test/CodeGen/X86/isel-buildvector-sse2.ll b/llvm/test/CodeGen/X86/isel-buildvector-sse2.ll
index 88e0ede0d4b6f..da089dda6d03d 100644
--- a/llvm/test/CodeGen/X86/isel-buildvector-sse2.ll
+++ b/llvm/test/CodeGen/X86/isel-buildvector-sse2.ll
@@ -19,20 +19,21 @@ define <7 x i8> @test_vector_v7i8() {
 ;
 ; SSE2-GISEL-LABEL: test_vector_v7i8:
 ; SSE2-GISEL:       # %bb.0:
-; SSE2-GISEL-NEXT:    movb $4, %al
-; SSE2-GISEL-NEXT:    movb $8, %cl
-; SSE2-GISEL-NEXT:    movb $15, %dl
-; SSE2-GISEL-NEXT:    movb $16, %sil
+; SSE2-GISEL-NEXT:    movq %rdi, %rax
+; SSE2-GISEL-NEXT:    movb $4, %cl
+; SSE2-GISEL-NEXT:    movb $8, %dl
+; SSE2-GISEL-NEXT:    movb $15, %sil
+; SSE2-GISEL-NEXT:    movb $16, %dil
 ; SSE2-GISEL-NEXT:    movb $23, %r8b
 ; SSE2-GISEL-NEXT:    movb $42, %r9b
 ; SSE2-GISEL-NEXT:    movb $63, %r10b
-; SSE2-GISEL-NEXT:    movb %al, (%rdi)
-; SSE2-GISEL-NEXT:    movb %cl, 1(%rdi)
-; SSE2-GISEL-NEXT:    movb %dl, 2(%rdi)
-; SSE2-GISEL-NEXT:    movb %sil, 3(%rdi)
-; SSE2-GISEL-NEXT:    movb %r8b, 4(%rdi)
-; SSE2-GISEL-NEXT:    movb %r9b, 5(%rdi)
-; SSE2-GISEL-NEXT:    movb %r10b, 6(%rdi)
+; SSE2-GISEL-NEXT:    movb %cl, (%rax)
+; SSE2-GISEL-NEXT:    movb %dl, 1(%rax)
+; SSE2-GISEL-NEXT:    movb %sil, 2(%rax)
+; SSE2-GISEL-NEXT:    movb %dil, 3(%rax)
+; SSE2-GISEL-NEXT:    movb %r8b, 4(%rax)
+; SSE2-GISEL-NEXT:    movb %r9b, 5(%rax)
+; SSE2-GISEL-NEXT:    movb %r10b, 6(%rax)
 ; SSE2-GISEL-NEXT:    retq
   ret <7 x i8> <i8 4, i8 8, i8 15, i8 16, i8 23, i8 42, i8 63>
 }

arsenm (Contributor) left a comment

Does this cover the implicit sret case? Can you add a test that hits that condition?

tschuett commented

Is this really X86 specific or does it belong into CallLowering.cpp?

e-kud (Contributor, Author) commented Jun 25, 2024

> Does this cover the implicit sret case? Can you add a test that hits that condition?

@arsenm, the implicit case was supported earlier. There is already a test for it with ret <32 x float> zeroinitializer. Do we want more tests with structures?
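
The implicit and explicit cases correspond to separate branches of the patched `lowerReturn`. A minimal sketch of that branch order follows; `ReturnPath` and `classifyReturn` are hypothetical names used only for illustration, not part of the patch.

```cpp
#include <cassert>

// Which branch of return lowering applies (illustrative names only).
enum class ReturnPath {
  DemotedReturn, // implicit sret: the return value could not be lowered
  ExplicitSRet,  // the function has an argument marked sret
  Regular        // ordinary register return
};

// Mirrors the order of the checks in the diff: the demoted-return case is
// tested first, then the recorded sret register, then everything else.
ReturnPath classifyReturn(bool CanLowerReturn, bool HasSRetReturnReg) {
  if (!CanLowerReturn)
    return ReturnPath::DemotedReturn;
  if (HasSRetReturnReg)
    return ReturnPath::ExplicitSRet;
  return ReturnPath::Regular;
}
```

Under this reading, `test_return_v32f32` exercises the first branch and the new `test_sret` exercises the second.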

> Is this really X86 specific or does it belong into CallLowering.cpp?

@tschuett hmm, interesting. This is ABI-dependent. According to the SelectionDAG comments for each target:

| Target | ABI comment |
|---|---|
| X86 | All x86 ABIs require that for returning structs by value we copy the sret argument into %rax/%eax (depending on ABI) for the return |
| Sparc | If the function returns a struct, copy the SRetReturnReg to I0 |
| MSP430 | No comment. Looks like we must return a reference |
| Mips | The mips ABIs for returning structs by value requires that we copy the sret argument into $v0 for the return |
| M68k | ABI require that for returning structs by value we copy the sret argument into %D0 for the return |
| Lanai | The Lanai ABI for returning structs by value requires that we copy the sret argument into rv for the return |
| AArch64 | Windows AArch64 ABIs require that for returning structs by value we copy the sret argument into X0 for the return |

Each of these targets also implements its own getSRetReturnReg, which is not part of the MachineFunctionInfo base class.

So to generalize this approach:

  1. We need to move {get,set}SRetReturnReg into MachineFunctionInfo.
  2. We need a target hook, something like getSRetReturnReg(CC), so that IRTranslator can generate the copy instruction.
  3. We need to be sure that adding a register to the RET instruction is common among these targets. Otherwise we need another hook to update the RET instruction according to the calling convention.
  4. We need some code alongside lowerFormalArguments to call setSRetReturnReg when a StructRet argument is present.

Keeping it in the target code looks simpler.
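
For comparison, the four generalization steps could look roughly like the sketch below. Everything here is hypothetical: the `Toy*` classes, the `addsSRetRegToReturn` hook, and the string-based registers are assumptions made for illustration and do not exist in LLVM.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class ToyCC { C }; // hypothetical calling-convention tag

// Item 1: sret bookkeeping in the common function-info class (0 = none).
class ToyMachineFunctionInfo {
  unsigned SRetReturnReg = 0;

public:
  void setSRetReturnReg(unsigned VReg) { SRetReturnReg = VReg; }
  unsigned getSRetReturnReg() const { return SRetReturnReg; }
};

// Items 2 and 3: per-target hooks consulted by generic code.
class ToyCallLowering {
public:
  virtual ~ToyCallLowering() = default;
  // Physical register that must hold the sret pointer on return,
  // or an empty string if the ABI has no such requirement.
  virtual std::string getSRetReturnReg(ToyCC) const { return ""; }
  // Whether the return instruction should list that register.
  virtual bool addsSRetRegToReturn() const { return true; }
};

class ToyX86CallLowering : public ToyCallLowering {
  bool Is64Bit;

public:
  explicit ToyX86CallLowering(bool Is64Bit) : Is64Bit(Is64Bit) {}
  std::string getSRetReturnReg(ToyCC) const override {
    return Is64Bit ? "$rax" : "$eax";
  }
};

// Item 4: generic argument lowering records the sret virtual register.
void recordSRetArgument(ToyMachineFunctionInfo &FI, unsigned VReg,
                        bool IsSRet) {
  if (IsSRet)
    FI.setSRetReturnReg(VReg);
}

// Generic return lowering built on the hooks above.
std::vector<std::string> lowerReturn(const ToyCallLowering &TLI,
                                     const ToyMachineFunctionInfo &FI,
                                     ToyCC CC) {
  std::vector<std::string> Out;
  std::string Ret = "RET";
  const std::string Phys = TLI.getSRetReturnReg(CC);
  if (FI.getSRetReturnReg() != 0 && !Phys.empty()) {
    Out.push_back(Phys + " = COPY %" + std::to_string(FI.getSRetReturnReg()));
    if (TLI.addsSRetRegToReturn())
      Ret += " " + Phys;
  }
  Out.push_back(Ret);
  return Out;
}
```

Even in this reduced form the generic path needs two virtual hooks plus shared state, which is consistent with the conclusion that the target-local change is the simpler option.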

e-kud merged commit b5b0a22 into llvm:main on Jul 2, 2024 (7 checks passed)
e-kud deleted the global-sret branch on July 2, 2024 22:56
llvm-ci (Collaborator) commented Jul 2, 2024

LLVM Buildbot has detected a new failure on builder clang-hip-vega20 running on hip-vega20-0 while building llvm at step 3 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/1090

Here is the relevant piece of the build log for the reference:

Step 3 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/hip-build.sh --jobs=' (failure)
...
[36/38] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 -DNDEBUG  External/HIP/CMakeFiles/InOneWeekend-hip-6.0.2.dir/workload/ray-tracing/InOneWeekend/main.cc.o -o External/HIP/InOneWeekend-hip-6.0.2  --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt -unwindlib=libgcc -frtlib-add-rpath && cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /usr/local/bin/cmake -E create_symlink /buildbot/llvm-test-suite/External/HIP/InOneWeekend.reference_output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend.reference_output-hip-6.0.2
[37/38] /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -DNDEBUG  -O3 -DNDEBUG   -w -Werror=date-time --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 --offload-arch=gfx1100 -xhip -mfma -MD -MT External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -MF External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o.d -o External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -c /buildbot/llvm-test-suite/External/HIP/workload/ray-tracing/TheNextWeek/main.cc
[38/38] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 -DNDEBUG  External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -o External/HIP/TheNextWeek-hip-6.0.2  --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt -unwindlib=libgcc -frtlib-add-rpath && cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /usr/local/bin/cmake -E create_symlink /buildbot/llvm-test-suite/External/HIP/TheNextWeek.reference_output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/TheNextWeek.reference_output-hip-6.0.2
+ build_step 'Testing HIP test-suite'
+ echo '@@@BUILD_STEP Testing HIP test-suite@@@'
@@@BUILD_STEP Testing HIP test-suite@@@
+ ninja -v check-hip-simple
[0/1] cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test
-- Testing: 6 tests, 6 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 
FAIL: test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test (4 of 6)
******************** TEST 'test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test' FAILED ********************

/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out --redirect-input /dev/null --summary /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.time /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend-hip-6.0.2
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP ; /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2

+ cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP
+ /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target: Comparison failed, textual difference between 'M' and 'i'

********************
/usr/bin/strip: /bin/bash.stripped: Bad file descriptor
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
********************
Failed Tests (1):
  test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test


Testing Time: 379.08s

Total Discovered Tests: 6
  Passed: 5 (83.33%)
  Failed: 1 (16.67%)
FAILED: External/HIP/CMakeFiles/check-hip-simple-hip-6.0.2 
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test
ninja: build stopped: subcommand failed.
Step 12 (Testing HIP test-suite) failure: Testing HIP test-suite (failure)
@@@BUILD_STEP Testing HIP test-suite@@@
+ ninja -v check-hip-simple
[0/1] cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test
-- Testing: 6 tests, 6 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 
FAIL: test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test (4 of 6)
******************** TEST 'test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test' FAILED ********************

/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out --redirect-input /dev/null --summary /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.time /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend-hip-6.0.2
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP ; /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2

+ cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP
+ /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target: Comparison failed, textual difference between 'M' and 'i'

********************
/usr/bin/strip: /bin/bash.stripped: Bad file descriptor
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
********************
Failed Tests (1):
  test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test


Testing Time: 379.08s

Total Discovered Tests: 6
  Passed: 5 (83.33%)
  Failed: 1 (16.67%)
FAILED: External/HIP/CMakeFiles/check-hip-simple-hip-6.0.2 
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test
ninja: build stopped: subcommand failed.
program finished with exit code 1
elapsedTime=486.677162

lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
We follow SelectionDAG and FastISel manner: set a register during formal
arguments lowering and use this register to insert a copy of StructRet
argument to RAX register during return lowering.

Also add RAX register to RET instruction to fix a difference between
GlobalISel and SelectionDAG, when the copy instruction could be
deleted.
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024