Skip to content

[GlobalISel] Allow more illegal vector types in params/returns. #95514

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 18, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 17 additions & 4 deletions llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -471,13 +471,15 @@ static void buildCopyFromRegs(MachineIRBuilder &B, ArrayRef<Register> OrigRegs,
// Deal with vector with 64-bit elements decomposed to 32-bit
// registers. Need to create intermediate 64-bit elements.
SmallVector<Register, 8> EltMerges;
int PartsPerElt = DstEltTy.getSizeInBits() / PartLLT.getSizeInBits();

assert(DstEltTy.getSizeInBits() % PartLLT.getSizeInBits() == 0);
int PartsPerElt =
divideCeil(DstEltTy.getSizeInBits(), PartLLT.getSizeInBits());
LLT ExtendedPartTy = LLT::scalar(PartLLT.getSizeInBits() * PartsPerElt);

for (int I = 0, NumElts = LLTy.getNumElements(); I != NumElts; ++I) {
auto Merge =
B.buildMergeLikeInstr(RealDstEltTy, Regs.take_front(PartsPerElt));
B.buildMergeLikeInstr(ExtendedPartTy, Regs.take_front(PartsPerElt));
if (ExtendedPartTy.getSizeInBits() > RealDstEltTy.getSizeInBits())
Merge = B.buildTrunc(RealDstEltTy, Merge);
// Fix the type in case this is really a vector of pointers.
MRI.setType(Merge.getReg(0), RealDstEltTy);
EltMerges.push_back(Merge.getReg(0));
Expand Down Expand Up @@ -574,6 +576,17 @@ static void buildCopyToRegs(MachineIRBuilder &B, ArrayRef<Register> DstRegs,
return;
}

if (SrcTy.isVector() && !PartTy.isVector() &&
SrcTy.getScalarSizeInBits() > PartTy.getSizeInBits()) {
LLT ExtTy =
LLT::vector(SrcTy.getElementCount(),
LLT::scalar(PartTy.getScalarSizeInBits() * DstRegs.size() /
SrcTy.getNumElements()));
auto Ext = B.buildAnyExt(ExtTy, SrcReg);
B.buildUnmerge(DstRegs, Ext);
return;
}

MachineRegisterInfo &MRI = *B.getMRI();
LLT DstTy = MRI.getType(DstRegs[0]);
LLT LCMTy = getCoverTy(SrcTy, PartTy);
Expand Down
3 changes: 1 addition & 2 deletions llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -404,14 +404,13 @@ bool AArch64CallLowering::lowerReturn(MachineIRBuilder &MIRBuilder,
ExtendOp = TargetOpcode::G_ZEXT;

LLT NewLLT(NewVT);
LLT OldLLT(MVT::getVT(CurArgInfo.Ty));
LLT OldLLT = getLLTForType(*CurArgInfo.Ty, DL);
CurArgInfo.Ty = EVT(NewVT).getTypeForEVT(Ctx);
// Instead of an extend, we might have a vector type which needs
// padding with more elements, e.g. <2 x half> -> <4 x half>.
if (NewVT.isVector()) {
if (OldLLT.isVector()) {
if (NewLLT.getNumElements() > OldLLT.getNumElements()) {

CurVReg =
MIRBuilder.buildPadVectorWithUndefElements(NewLLT, CurVReg)
.getReg(0);
Expand Down
464 changes: 458 additions & 6 deletions llvm/test/CodeGen/AArch64/GlobalISel/ret-vec-promote.ll

Large diffs are not rendered by default.

344 changes: 344 additions & 0 deletions llvm/test/CodeGen/AArch64/GlobalISel/vec-param.ll

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion llvm/test/CodeGen/AArch64/sadd_sat_vec.ll
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
; RUN: llc < %s -mtriple=aarch64-- -global-isel -global-isel-abort=2 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI

; CHECK-GI: warning: Instruction selection used fallback path for v2i8
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v12i8
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v16i4
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v16i1
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v2i128
Expand Down
1 change: 0 additions & 1 deletion llvm/test/CodeGen/AArch64/ssub_sat_vec.ll
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
; RUN: llc < %s -mtriple=aarch64-- -global-isel -global-isel-abort=2 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI

; CHECK-GI: warning: Instruction selection used fallback path for v2i8
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v12i8
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v16i4
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v16i1
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v2i128
Expand Down
1 change: 0 additions & 1 deletion llvm/test/CodeGen/AArch64/uadd_sat_vec.ll
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
; RUN: llc < %s -mtriple=aarch64-- -global-isel -global-isel-abort=2 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI

; CHECK-GI: warning: Instruction selection used fallback path for v2i8
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v12i8
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v16i4
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v16i1
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v2i128
Expand Down
1 change: 0 additions & 1 deletion llvm/test/CodeGen/AArch64/usub_sat_vec.ll
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
; RUN: llc < %s -mtriple=aarch64-- -global-isel -global-isel-abort=2 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI

; CHECK-GI: warning: Instruction selection used fallback path for v2i8
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v12i8
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v16i4
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v16i1
; CHECK-GI-NEXT: warning: Instruction selection used fallback path for v2i128
Expand Down
17 changes: 15 additions & 2 deletions llvm/test/CodeGen/AMDGPU/GlobalISel/function-returns.v2i65.ll
Original file line number Diff line number Diff line change
@@ -1,7 +1,20 @@
; XFAIL: *
; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -stop-after=irtranslator -verify-machineinstrs -o - %s
; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -stop-after=irtranslator -o - %s | FileCheck %s

define <2 x i65> @v2i65_func_void() #0 {
; CHECK-LABEL: name: v2i65_func_void
; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(<2 x s65>) = G_LOAD [[DEF]](p1) :: (load (<2 x s65>) from `ptr addrspace(1) undef`, align 32, addrspace 1)
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(<2 x s96>) = G_ANYEXT [[LOAD]](<2 x s65>)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ANYEXT]](<2 x s96>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5
%val = load <2 x i65>, ptr addrspace(1) undef
ret <2 x i65> %val
}
Original file line number Diff line number Diff line change
@@ -1,7 +1,25 @@
; XFAIL: *
; RUN: llc -global-isel -mtriple=amdgcn -mcpu=fiji -stop-after=irtranslator -verify-machineinstrs -o - %s
; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
; RUN: llc -global-isel -mtriple=amdgcn -mcpu=fiji -stop-after=irtranslator -o - %s | FileCheck %s

define void @void_func_v2i65(<2 x i65> %arg0) #0 {
; CHECK-LABEL: name: void_func_v2i65
; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1
; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2
; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3
; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $vgpr4
; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY $vgpr5
; CHECK-NEXT: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[COPY]](s32), [[COPY1]](s32), [[COPY2]](s32)
; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s65) = G_TRUNC [[MV]](s96)
; CHECK-NEXT: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32)
; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s65) = G_TRUNC [[MV1]](s96)
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s65>) = G_BUILD_VECTOR [[TRUNC]](s65), [[TRUNC1]](s65)
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF
; CHECK-NEXT: G_STORE [[BUILD_VECTOR]](<2 x s65>), [[DEF]](p1) :: (store (<2 x s65>) into `ptr addrspace(1) undef`, align 32, addrspace 1)
; CHECK-NEXT: SI_RETURN
Comment on lines +15 to +21
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks right. I checked the DAG for reference, but I actually think the DAG is wrong.

store <2 x i65> %arg0, ptr addrspace(1) undef
ret void
}
Loading