[GlobalISel] Import llvm.stepvector #115721

Merged
merged 1 commit into from
Nov 11, 2024
4 changes: 4 additions & 0 deletions llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -2597,6 +2597,10 @@ bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,
     return translateExtractVector(CI, MIRBuilder);
   case Intrinsic::vector_insert:
     return translateInsertVector(CI, MIRBuilder);
+  case Intrinsic::stepvector: {
+    MIRBuilder.buildStepVector(getOrCreateVReg(CI), 1);
Contributor

Not sure why this has an operand that the intrinsic doesn't, but it's already that way. I'm confused by the DAG API: it's one level removed but does the same thing.

Author

@tschuett Nov 11, 2024

The LLVM-IR intrinsic has an implicit factor of one:
https://llvm.org/docs/LangRef.html#llvm-stepvector-intrinsic
For GlobalISel, we have a real factor:
#115598

G_STEP_VECTOR is a cheap name for the AArch64 INDEX (immediates) instruction.

We and the DAG did the same for the vscale intrinsic:
https://llvm.org/docs/GlobalISel/GenericOpcode.html#g-vscale
The standard vscale pattern takes a scale, while the LLVM-IR intrinsic doesn't:
def : Pat<(vscale GPR64:$scale), (MADDXrrr (UBFMXri (RDVLI_XI 1), 4, 63), $scale, XZR)>;
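The relationship described in this comment can be sketched in plain C++. This is only an illustrative model, not LLVM code: the function names are hypothetical, the lane count is passed in as a fixed number (for scalable vectors it is only known at run time), and overflow of the element type is ignored. It assumes the generic opcodes multiply by their immediate operand, so that the LLVM-IR intrinsics correspond to the case where that operand is one.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a step vector with an explicit step:
// lane i holds i * step. Element-width overflow is not modeled.
std::vector<uint64_t> modelStepVector(unsigned numLanes, uint64_t step) {
  std::vector<uint64_t> lanes(numLanes);
  for (unsigned i = 0; i < numLanes; ++i)
    lanes[i] = static_cast<uint64_t>(i) * step;
  return lanes;
}

// The LLVM-IR intrinsic has no step operand; it behaves like step == 1.
std::vector<uint64_t> modelIRStepVector(unsigned numLanes) {
  return modelStepVector(numLanes, 1);
}

// Hypothetical model of a vscale with an explicit multiplier:
// the runtime vscale multiplied by an immediate scale.
uint64_t modelVScale(uint64_t runtimeVScale, uint64_t scale) {
  return runtimeVScale * scale;
}
```

Under this model, translating the intrinsic with a constant 1 loses nothing, while later passes remain free to fold a multiplication into the step.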

Author

The difference is probably between LLVM-IR and hardware instructions.

Author

Slightly above:

  case Intrinsic::vscale: {
    MIRBuilder.buildVScale(getOrCreateVReg(CI), 1);
    return true;
  }

+    return true;
+  }
   case Intrinsic::prefetch: {
     Value *Addr = CI.getOperand(0);
     unsigned RW = cast<ConstantInt>(CI.getOperand(1))->getZExtValue();
5 changes: 3 additions & 2 deletions llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
@@ -811,8 +811,9 @@ MachineInstrBuilder MachineIRBuilder::buildInsert(const DstOp &Res,

 MachineInstrBuilder MachineIRBuilder::buildStepVector(const DstOp &Res,
                                                       unsigned Step) {
-  ConstantInt *CI =
-      ConstantInt::get(getMF().getFunction().getContext(), APInt(64, Step));
+  unsigned Bitwidth = Res.getLLTTy(*getMRI()).getElementType().getSizeInBits();
+  ConstantInt *CI = ConstantInt::get(getMF().getFunction().getContext(),
+                                     APInt(Bitwidth, Step));
   auto StepVector = buildInstr(TargetOpcode::G_STEP_VECTOR);
   StepVector->setDebugLoc(DebugLoc());
   Res.addDefToMIB(*getMRI(), StepVector);
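The effect of this hunk on the immediate operand can be illustrated with a small standalone sketch. It does not use the LLVM API: `TypedImm`, `makeStepImmBefore`, `makeStepImmAfter`, and `render` are hypothetical stand-ins for the `ConstantInt`/`APInt` handling above, assuming the only behavioural difference is the bit width chosen for the constant.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a typed integer immediate
// (the real code uses ConstantInt built from an APInt).
struct TypedImm {
  unsigned bits;   // width of the immediate in bits
  uint64_t value;  // the step value
};

// Old behaviour: the immediate is always 64 bits wide,
// regardless of the vector's element type.
TypedImm makeStepImmBefore(unsigned /*elementBits*/, uint64_t step) {
  return {64, step};
}

// New behaviour: the immediate's width follows the element type
// of the destination vector.
TypedImm makeStepImmAfter(unsigned elementBits, uint64_t step) {
  return {elementBits, step};
}

// Renders the immediate in the "i<bits> <value>" form seen in MIR output.
std::string render(const TypedImm &imm) {
  return "i" + std::to_string(imm.bits) + " " + std::to_string(imm.value);
}
```

For a vector of 32-bit elements, the old variant renders as `i64 1` and the new one as `i32 1`, which matches the operand type checked by the `call_step_vector_i32` test in this PR.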
46 changes: 46 additions & 0 deletions llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-stepvector.ll
@@ -0,0 +1,46 @@
; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
; RUN: llc -O0 -mtriple=aarch64-linux-gnu -mattr=+sve -global-isel -aarch64-enable-gisel-sve=1 -stop-after=irtranslator %s -o - | FileCheck %s

define <vscale x 2 x i64> @call_step_vector_i64() {
; CHECK-LABEL: name: call_step_vector_i64
; CHECK: bb.1.entry:
; CHECK-NEXT: [[STEP_VECTOR:%[0-9]+]]:_(<vscale x 2 x s64>) = G_STEP_VECTOR i64 1
; CHECK-NEXT: $z0 = COPY [[STEP_VECTOR]](<vscale x 2 x s64>)
; CHECK-NEXT: RET_ReallyLR implicit $z0
entry:
%steps = call <vscale x 2 x i64> @llvm.stepvector.nxv2i64()
ret <vscale x 2 x i64> %steps
}

define <vscale x 4 x i32> @call_step_vector_i32() {
; CHECK-LABEL: name: call_step_vector_i32
; CHECK: bb.1.entry:
; CHECK-NEXT: [[STEP_VECTOR:%[0-9]+]]:_(<vscale x 4 x s32>) = G_STEP_VECTOR i32 1
; CHECK-NEXT: $z0 = COPY [[STEP_VECTOR]](<vscale x 4 x s32>)
; CHECK-NEXT: RET_ReallyLR implicit $z0
entry:
%steps = call <vscale x 4 x i32> @llvm.stepvector.nxv4i32()
ret <vscale x 4 x i32> %steps
}

define <vscale x 8 x i16> @call_step_vector_i16() {
; CHECK-LABEL: name: call_step_vector_i16
; CHECK: bb.1.entry:
; CHECK-NEXT: [[STEP_VECTOR:%[0-9]+]]:_(<vscale x 8 x s16>) = G_STEP_VECTOR i16 1
; CHECK-NEXT: $z0 = COPY [[STEP_VECTOR]](<vscale x 8 x s16>)
; CHECK-NEXT: RET_ReallyLR implicit $z0
entry:
%steps = call <vscale x 8 x i16> @llvm.stepvector.nxv8i16()
ret <vscale x 8 x i16> %steps
}

define <vscale x 16 x i8> @call_step_vector_i8() {
; CHECK-LABEL: name: call_step_vector_i8
; CHECK: bb.1.entry:
; CHECK-NEXT: [[STEP_VECTOR:%[0-9]+]]:_(<vscale x 16 x s8>) = G_STEP_VECTOR i8 1
; CHECK-NEXT: $z0 = COPY [[STEP_VECTOR]](<vscale x 16 x s8>)
; CHECK-NEXT: RET_ReallyLR implicit $z0
entry:
%steps = call <vscale x 16 x i8> @llvm.stepvector.nxv16i8()
ret <vscale x 16 x i8> %steps
}