Skip to content

[GISel][RISCV] Legalize G_{U|S}DIVREM #93067

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 22, 2024
Merged

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented May 22, 2024

This patch expands G_{U|S}DIVREM into G_{U|S}DIV + G_{U|S}REM. G_{U|S}DIVREM is generated by the following fold:

bool CombinerHelper::matchCombineDivRem(MachineInstr &MI,
MachineInstr *&OtherMI) {
unsigned Opcode = MI.getOpcode();
bool IsDiv, IsSigned;
switch (Opcode) {
default:
llvm_unreachable("Unexpected opcode!");
case TargetOpcode::G_SDIV:
case TargetOpcode::G_UDIV: {
IsDiv = true;
IsSigned = Opcode == TargetOpcode::G_SDIV;
break;
}
case TargetOpcode::G_SREM:
case TargetOpcode::G_UREM: {
IsDiv = false;
IsSigned = Opcode == TargetOpcode::G_SREM;
break;
}
}
Register Src1 = MI.getOperand(1).getReg();
unsigned DivOpcode, RemOpcode, DivremOpcode;
if (IsSigned) {
DivOpcode = TargetOpcode::G_SDIV;
RemOpcode = TargetOpcode::G_SREM;
DivremOpcode = TargetOpcode::G_SDIVREM;
} else {
DivOpcode = TargetOpcode::G_UDIV;
RemOpcode = TargetOpcode::G_UREM;
DivremOpcode = TargetOpcode::G_UDIVREM;
}
if (!isLegalOrBeforeLegalizer({DivremOpcode, {MRI.getType(Src1)}}))
return false;
// Combine:
// %div:_ = G_[SU]DIV %src1:_, %src2:_
// %rem:_ = G_[SU]REM %src1:_, %src2:_
// into:
// %div:_, %rem:_ = G_[SU]DIVREM %src1:_, %src2:_
// Combine:
// %rem:_ = G_[SU]REM %src1:_, %src2:_
// %div:_ = G_[SU]DIV %src1:_, %src2:_
// into:
// %div:_, %rem:_ = G_[SU]DIVREM %src1:_, %src2:_
for (auto &UseMI : MRI.use_nodbg_instructions(Src1)) {
if (MI.getParent() == UseMI.getParent() &&
((IsDiv && UseMI.getOpcode() == RemOpcode) ||
(!IsDiv && UseMI.getOpcode() == DivOpcode)) &&
matchEqualDefs(MI.getOperand(2), UseMI.getOperand(2)) &&
matchEqualDefs(MI.getOperand(1), UseMI.getOperand(1))) {
OtherMI = &UseMI;
return true;
}
}
return false;
}

It always folds div + rem pairs into divrem during pre-legalization. I tried to change isLegalOrBeforeLegalizer to isLegal but it produced worse codegen on AArch64.

I am not sure whether this patch is useful since DivRemPairsPass always converts div + rem pairs into div + mul on RISCV.

@llvmbot
Copy link
Member

llvmbot commented May 22, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch expands G_{U|S}DIVREM into G_{U|S}DIV + G_{U|S}REM. G_{U|S}DIVREM is generated by the following fold:

bool CombinerHelper::matchCombineDivRem(MachineInstr &MI,
MachineInstr *&OtherMI) {
unsigned Opcode = MI.getOpcode();
bool IsDiv, IsSigned;
switch (Opcode) {
default:
llvm_unreachable("Unexpected opcode!");
case TargetOpcode::G_SDIV:
case TargetOpcode::G_UDIV: {
IsDiv = true;
IsSigned = Opcode == TargetOpcode::G_SDIV;
break;
}
case TargetOpcode::G_SREM:
case TargetOpcode::G_UREM: {
IsDiv = false;
IsSigned = Opcode == TargetOpcode::G_SREM;
break;
}
}
Register Src1 = MI.getOperand(1).getReg();
unsigned DivOpcode, RemOpcode, DivremOpcode;
if (IsSigned) {
DivOpcode = TargetOpcode::G_SDIV;
RemOpcode = TargetOpcode::G_SREM;
DivremOpcode = TargetOpcode::G_SDIVREM;
} else {
DivOpcode = TargetOpcode::G_UDIV;
RemOpcode = TargetOpcode::G_UREM;
DivremOpcode = TargetOpcode::G_UDIVREM;
}
if (!isLegalOrBeforeLegalizer({DivremOpcode, {MRI.getType(Src1)}}))
return false;
// Combine:
// %div:_ = G_[SU]DIV %src1:_, %src2:_
// %rem:_ = G_[SU]REM %src1:_, %src2:_
// into:
// %div:_, %rem:_ = G_[SU]DIVREM %src1:_, %src2:_
// Combine:
// %rem:_ = G_[SU]REM %src1:_, %src2:_
// %div:_ = G_[SU]DIV %src1:_, %src2:_
// into:
// %div:_, %rem:_ = G_[SU]DIVREM %src1:_, %src2:_
for (auto &UseMI : MRI.use_nodbg_instructions(Src1)) {
if (MI.getParent() == UseMI.getParent() &&
((IsDiv && UseMI.getOpcode() == RemOpcode) ||
(!IsDiv && UseMI.getOpcode() == DivOpcode)) &&
matchEqualDefs(MI.getOperand(2), UseMI.getOperand(2)) &&
matchEqualDefs(MI.getOperand(1), UseMI.getOperand(1))) {
OtherMI = &UseMI;
return true;
}
}
return false;
}

It always folds div + rem pairs into divrem during pre-legalization. I tried to change isLegalOrBeforeLegalizer to isLegal but it produced worse codegen on AArch64.

I am not sure whether this patch is useful since DivRemPairsPass always converts div + rem pairs into div + mul on RISCV.


Full diff: https://github.com/llvm/llvm-project/pull/93067.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp (+3)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir (+90)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir (+90)
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index a1d3aadb816ab..c6d11b8a8bd7e 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -349,6 +349,9 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
         .widenScalarToNextPow2(0);
   }
 
+  // TODO: Use libcall for sDoubleXLen.
+  getActionDefinitionsBuilder({G_UDIVREM, G_SDIVREM}).lower();
+
   auto &AbsActions = getActionDefinitionsBuilder(G_ABS);
   if (ST.hasStdExtZbb())
     AbsActions.customFor({s32, sXLen}).minScalar(0, sXLen);
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir
index 4177a40e3826c..26d8785afb470 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir
@@ -555,3 +555,93 @@ body:             |
     PseudoRET implicit $x10, implicit $x11
 
 ...
+---
+name:            udivrem_i32
+body:             |
+  bb.1.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-I-LABEL: name: udivrem_i32
+    ; CHECK-I: liveins: $x10, $x11
+    ; CHECK-I-NEXT: {{  $}}
+    ; CHECK-I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s32)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__udivsi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s32)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__umodsi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY2]], [[COPY3]]
+    ; CHECK-I-NEXT: $x10 = COPY [[ADD]](s32)
+    ; CHECK-I-NEXT: PseudoRET implicit $x10
+    ;
+    ; CHECK-M-LABEL: name: udivrem_i32
+    ; CHECK-M: liveins: $x10, $x11
+    ; CHECK-M-NEXT: {{  $}}
+    ; CHECK-M-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-M-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+    ; CHECK-M-NEXT: [[UDIV:%[0-9]+]]:_(s32) = G_UDIV [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[UREM:%[0-9]+]]:_(s32) = G_UREM [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[UDIV]], [[UREM]]
+    ; CHECK-M-NEXT: $x10 = COPY [[ADD]](s32)
+    ; CHECK-M-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $x10
+    %1:_(s32) = COPY $x11
+    %2:_(s32), %3:_(s32) = G_UDIVREM %0, %1
+    %4:_(s32) = G_ADD %2, %3
+    $x10 = COPY %4(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            sdivrem_i32
+body:             |
+  bb.1.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-I-LABEL: name: sdivrem_i32
+    ; CHECK-I: liveins: $x10, $x11
+    ; CHECK-I-NEXT: {{  $}}
+    ; CHECK-I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s32)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__divsi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s32)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__modsi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY2]], [[COPY3]]
+    ; CHECK-I-NEXT: $x10 = COPY [[ADD]](s32)
+    ; CHECK-I-NEXT: PseudoRET implicit $x10
+    ;
+    ; CHECK-M-LABEL: name: sdivrem_i32
+    ; CHECK-M: liveins: $x10, $x11
+    ; CHECK-M-NEXT: {{  $}}
+    ; CHECK-M-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-M-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+    ; CHECK-M-NEXT: [[SDIV:%[0-9]+]]:_(s32) = G_SDIV [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[SREM:%[0-9]+]]:_(s32) = G_SREM [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[SDIV]], [[SREM]]
+    ; CHECK-M-NEXT: $x10 = COPY [[ADD]](s32)
+    ; CHECK-M-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $x10
+    %1:_(s32) = COPY $x11
+    %2:_(s32), %3:_(s32) = G_SDIVREM %0, %1
+    %4:_(s32) = G_ADD %2, %3
+    $x10 = COPY %4(s32)
+    PseudoRET implicit $x10
+
+...
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir
index 492f9530997c9..bbbe38f695d2e 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir
@@ -655,3 +655,93 @@ body:             |
     PseudoRET implicit $x10, implicit $x11
 
 ...
+---
+name:            udivrem_i64
+body:             |
+  bb.1.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-I-LABEL: name: udivrem_i64
+    ; CHECK-I: liveins: $x10, $x11
+    ; CHECK-I-NEXT: {{  $}}
+    ; CHECK-I-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x11
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s64)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s64)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__udivdi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY2:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s64)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s64)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__umoddi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY3:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[COPY2]], [[COPY3]]
+    ; CHECK-I-NEXT: $x10 = COPY [[ADD]](s64)
+    ; CHECK-I-NEXT: PseudoRET implicit $x10
+    ;
+    ; CHECK-M-LABEL: name: udivrem_i64
+    ; CHECK-M: liveins: $x10, $x11
+    ; CHECK-M-NEXT: {{  $}}
+    ; CHECK-M-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-M-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x11
+    ; CHECK-M-NEXT: [[UDIV:%[0-9]+]]:_(s64) = G_UDIV [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[UREM:%[0-9]+]]:_(s64) = G_UREM [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[UDIV]], [[UREM]]
+    ; CHECK-M-NEXT: $x10 = COPY [[ADD]](s64)
+    ; CHECK-M-NEXT: PseudoRET implicit $x10
+    %0:_(s64) = COPY $x10
+    %1:_(s64) = COPY $x11
+    %2:_(s64), %3:_(s64) = G_UDIVREM %0, %1
+    %4:_(s64) = G_ADD %2, %3
+    $x10 = COPY %4(s64)
+    PseudoRET implicit $x10
+
+...
+---
+name:            sdivrem_i64
+body:             |
+  bb.1.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-I-LABEL: name: sdivrem_i64
+    ; CHECK-I: liveins: $x10, $x11
+    ; CHECK-I-NEXT: {{  $}}
+    ; CHECK-I-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x11
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s64)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s64)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__divdi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY2:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s64)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s64)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__moddi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY3:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[COPY2]], [[COPY3]]
+    ; CHECK-I-NEXT: $x10 = COPY [[ADD]](s64)
+    ; CHECK-I-NEXT: PseudoRET implicit $x10
+    ;
+    ; CHECK-M-LABEL: name: sdivrem_i64
+    ; CHECK-M: liveins: $x10, $x11
+    ; CHECK-M-NEXT: {{  $}}
+    ; CHECK-M-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-M-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x11
+    ; CHECK-M-NEXT: [[SDIV:%[0-9]+]]:_(s64) = G_SDIV [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[SREM:%[0-9]+]]:_(s64) = G_SREM [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[SDIV]], [[SREM]]
+    ; CHECK-M-NEXT: $x10 = COPY [[ADD]](s64)
+    ; CHECK-M-NEXT: PseudoRET implicit $x10
+    %0:_(s64) = COPY $x10
+    %1:_(s64) = COPY $x11
+    %2:_(s64), %3:_(s64) = G_SDIVREM %0, %1
+    %4:_(s64) = G_ADD %2, %3
+    $x10 = COPY %4(s64)
+    PseudoRET implicit $x10
+
+...

@llvmbot
Copy link
Member

llvmbot commented May 22, 2024

@llvm/pr-subscribers-llvm-globalisel

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch expands G_{U|S}DIVREM into G_{U|S}DIV + G_{U|S}REM. G_{U|S}DIVREM is generated by the following fold:

bool CombinerHelper::matchCombineDivRem(MachineInstr &MI,
MachineInstr *&OtherMI) {
unsigned Opcode = MI.getOpcode();
bool IsDiv, IsSigned;
switch (Opcode) {
default:
llvm_unreachable("Unexpected opcode!");
case TargetOpcode::G_SDIV:
case TargetOpcode::G_UDIV: {
IsDiv = true;
IsSigned = Opcode == TargetOpcode::G_SDIV;
break;
}
case TargetOpcode::G_SREM:
case TargetOpcode::G_UREM: {
IsDiv = false;
IsSigned = Opcode == TargetOpcode::G_SREM;
break;
}
}
Register Src1 = MI.getOperand(1).getReg();
unsigned DivOpcode, RemOpcode, DivremOpcode;
if (IsSigned) {
DivOpcode = TargetOpcode::G_SDIV;
RemOpcode = TargetOpcode::G_SREM;
DivremOpcode = TargetOpcode::G_SDIVREM;
} else {
DivOpcode = TargetOpcode::G_UDIV;
RemOpcode = TargetOpcode::G_UREM;
DivremOpcode = TargetOpcode::G_UDIVREM;
}
if (!isLegalOrBeforeLegalizer({DivremOpcode, {MRI.getType(Src1)}}))
return false;
// Combine:
// %div:_ = G_[SU]DIV %src1:_, %src2:_
// %rem:_ = G_[SU]REM %src1:_, %src2:_
// into:
// %div:_, %rem:_ = G_[SU]DIVREM %src1:_, %src2:_
// Combine:
// %rem:_ = G_[SU]REM %src1:_, %src2:_
// %div:_ = G_[SU]DIV %src1:_, %src2:_
// into:
// %div:_, %rem:_ = G_[SU]DIVREM %src1:_, %src2:_
for (auto &UseMI : MRI.use_nodbg_instructions(Src1)) {
if (MI.getParent() == UseMI.getParent() &&
((IsDiv && UseMI.getOpcode() == RemOpcode) ||
(!IsDiv && UseMI.getOpcode() == DivOpcode)) &&
matchEqualDefs(MI.getOperand(2), UseMI.getOperand(2)) &&
matchEqualDefs(MI.getOperand(1), UseMI.getOperand(1))) {
OtherMI = &UseMI;
return true;
}
}
return false;
}

It always folds div + rem pairs into divrem during pre-legalization. I tried to change isLegalOrBeforeLegalizer to isLegal but it produced worse codegen on AArch64.

I am not sure whether this patch is useful since DivRemPairsPass always converts div + rem pairs into div + mul on RISCV.


Full diff: https://github.com/llvm/llvm-project/pull/93067.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp (+3)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir (+90)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir (+90)
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index a1d3aadb816ab..c6d11b8a8bd7e 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -349,6 +349,9 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
         .widenScalarToNextPow2(0);
   }
 
+  // TODO: Use libcall for sDoubleXLen.
+  getActionDefinitionsBuilder({G_UDIVREM, G_SDIVREM}).lower();
+
   auto &AbsActions = getActionDefinitionsBuilder(G_ABS);
   if (ST.hasStdExtZbb())
     AbsActions.customFor({s32, sXLen}).minScalar(0, sXLen);
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir
index 4177a40e3826c..26d8785afb470 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv32.mir
@@ -555,3 +555,93 @@ body:             |
     PseudoRET implicit $x10, implicit $x11
 
 ...
+---
+name:            udivrem_i32
+body:             |
+  bb.1.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-I-LABEL: name: udivrem_i32
+    ; CHECK-I: liveins: $x10, $x11
+    ; CHECK-I-NEXT: {{  $}}
+    ; CHECK-I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s32)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__udivsi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s32)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__umodsi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY2]], [[COPY3]]
+    ; CHECK-I-NEXT: $x10 = COPY [[ADD]](s32)
+    ; CHECK-I-NEXT: PseudoRET implicit $x10
+    ;
+    ; CHECK-M-LABEL: name: udivrem_i32
+    ; CHECK-M: liveins: $x10, $x11
+    ; CHECK-M-NEXT: {{  $}}
+    ; CHECK-M-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-M-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+    ; CHECK-M-NEXT: [[UDIV:%[0-9]+]]:_(s32) = G_UDIV [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[UREM:%[0-9]+]]:_(s32) = G_UREM [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[UDIV]], [[UREM]]
+    ; CHECK-M-NEXT: $x10 = COPY [[ADD]](s32)
+    ; CHECK-M-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $x10
+    %1:_(s32) = COPY $x11
+    %2:_(s32), %3:_(s32) = G_UDIVREM %0, %1
+    %4:_(s32) = G_ADD %2, %3
+    $x10 = COPY %4(s32)
+    PseudoRET implicit $x10
+
+...
+---
+name:            sdivrem_i32
+body:             |
+  bb.1.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-I-LABEL: name: sdivrem_i32
+    ; CHECK-I: liveins: $x10, $x11
+    ; CHECK-I-NEXT: {{  $}}
+    ; CHECK-I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s32)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__divsi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s32)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s32)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__modsi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-I-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY2]], [[COPY3]]
+    ; CHECK-I-NEXT: $x10 = COPY [[ADD]](s32)
+    ; CHECK-I-NEXT: PseudoRET implicit $x10
+    ;
+    ; CHECK-M-LABEL: name: sdivrem_i32
+    ; CHECK-M: liveins: $x10, $x11
+    ; CHECK-M-NEXT: {{  $}}
+    ; CHECK-M-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+    ; CHECK-M-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+    ; CHECK-M-NEXT: [[SDIV:%[0-9]+]]:_(s32) = G_SDIV [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[SREM:%[0-9]+]]:_(s32) = G_SREM [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[SDIV]], [[SREM]]
+    ; CHECK-M-NEXT: $x10 = COPY [[ADD]](s32)
+    ; CHECK-M-NEXT: PseudoRET implicit $x10
+    %0:_(s32) = COPY $x10
+    %1:_(s32) = COPY $x11
+    %2:_(s32), %3:_(s32) = G_SDIVREM %0, %1
+    %4:_(s32) = G_ADD %2, %3
+    $x10 = COPY %4(s32)
+    PseudoRET implicit $x10
+
+...
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir
index 492f9530997c9..bbbe38f695d2e 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-div-rv64.mir
@@ -655,3 +655,93 @@ body:             |
     PseudoRET implicit $x10, implicit $x11
 
 ...
+---
+name:            udivrem_i64
+body:             |
+  bb.1.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-I-LABEL: name: udivrem_i64
+    ; CHECK-I: liveins: $x10, $x11
+    ; CHECK-I-NEXT: {{  $}}
+    ; CHECK-I-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x11
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s64)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s64)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__udivdi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY2:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s64)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s64)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__umoddi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY3:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[COPY2]], [[COPY3]]
+    ; CHECK-I-NEXT: $x10 = COPY [[ADD]](s64)
+    ; CHECK-I-NEXT: PseudoRET implicit $x10
+    ;
+    ; CHECK-M-LABEL: name: udivrem_i64
+    ; CHECK-M: liveins: $x10, $x11
+    ; CHECK-M-NEXT: {{  $}}
+    ; CHECK-M-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-M-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x11
+    ; CHECK-M-NEXT: [[UDIV:%[0-9]+]]:_(s64) = G_UDIV [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[UREM:%[0-9]+]]:_(s64) = G_UREM [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[UDIV]], [[UREM]]
+    ; CHECK-M-NEXT: $x10 = COPY [[ADD]](s64)
+    ; CHECK-M-NEXT: PseudoRET implicit $x10
+    %0:_(s64) = COPY $x10
+    %1:_(s64) = COPY $x11
+    %2:_(s64), %3:_(s64) = G_UDIVREM %0, %1
+    %4:_(s64) = G_ADD %2, %3
+    $x10 = COPY %4(s64)
+    PseudoRET implicit $x10
+
+...
+---
+name:            sdivrem_i64
+body:             |
+  bb.1.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-I-LABEL: name: sdivrem_i64
+    ; CHECK-I: liveins: $x10, $x11
+    ; CHECK-I-NEXT: {{  $}}
+    ; CHECK-I-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x11
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s64)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s64)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__divdi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY2:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: $x10 = COPY [[COPY]](s64)
+    ; CHECK-I-NEXT: $x11 = COPY [[COPY1]](s64)
+    ; CHECK-I-NEXT: PseudoCALL target-flags(riscv-call) &__moddi3, csr_ilp32_lp64, implicit-def $x1, implicit $x10, implicit $x11, implicit-def $x10
+    ; CHECK-I-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
+    ; CHECK-I-NEXT: [[COPY3:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-I-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[COPY2]], [[COPY3]]
+    ; CHECK-I-NEXT: $x10 = COPY [[ADD]](s64)
+    ; CHECK-I-NEXT: PseudoRET implicit $x10
+    ;
+    ; CHECK-M-LABEL: name: sdivrem_i64
+    ; CHECK-M: liveins: $x10, $x11
+    ; CHECK-M-NEXT: {{  $}}
+    ; CHECK-M-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x10
+    ; CHECK-M-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x11
+    ; CHECK-M-NEXT: [[SDIV:%[0-9]+]]:_(s64) = G_SDIV [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[SREM:%[0-9]+]]:_(s64) = G_SREM [[COPY]], [[COPY1]]
+    ; CHECK-M-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[SDIV]], [[SREM]]
+    ; CHECK-M-NEXT: $x10 = COPY [[ADD]](s64)
+    ; CHECK-M-NEXT: PseudoRET implicit $x10
+    %0:_(s64) = COPY $x10
+    %1:_(s64) = COPY $x11
+    %2:_(s64), %3:_(s64) = G_SDIVREM %0, %1
+    %4:_(s64) = G_ADD %2, %3
+    $x10 = COPY %4(s64)
+    PseudoRET implicit $x10
+
+...

Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dtcxzyw dtcxzyw merged commit f2bbb4c into llvm:main May 22, 2024
6 of 7 checks passed
@dtcxzyw dtcxzyw deleted the gisel-divrem branch May 22, 2024 17:26
@boomshroom
Copy link

Is there any good reason why it always generates separate lib calls for div and rem rather than a single line call for divrem?

@dtcxzyw
Copy link
Member Author

dtcxzyw commented May 22, 2024

Is there any good reason why it always generates separate lib calls for div and rem rather than a single line call for divrem?

I just leave it as a to-do. It is unclear whether DivRemPairsPass always converts div + rem into div + mul.

@topperc
Copy link
Collaborator

topperc commented May 22, 2024

Is there any good reason why it always generates separate lib calls for div and rem rather than a single line call for divrem?

The divrem expansion in GISel probably splits its into div and rem and then each get legalized separate.

It's very weird to me that GISel combiner creates DIVREM when it needs to be expanded. That's not what SelectionDAG does.

@boomshroom
Copy link

From what I can tell, the primary reason is just because LLVM doesn't know divrem library calls exist because they've been explicitly set to null. This is in spite of the fact that they're defined in compiler-rt and should be available. Not only that, but the divrem functions are directly called by div and rem.

The only reasons I can tell for them not to be enabled are to avoid auto-generated 2-output libcalls (which would explain why sincos is also set to null), and because the 128-bit versions aren't built on all targets (which doesn't make as much sense since the 128-bit libcalls for div and rem are unconditionally declared anyways).

If the libcalls were declared in RuntimeLibcalls.def, then LLVM does seem fully capable of generating calls to them in the right circumstances, and even seems to pass the pointer parameter without issue.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented May 23, 2024

Is there any good reason why it always generates separate lib calls for div and rem rather than a single line call for divrem?

The divrem expansion in GISel probably splits its into div and rem and then each get legalized separate.

It's very weird to me that GISel combiner creates DIVREM when it needs to be expanded. That's not what SelectionDAG does.

It always folds div + rem pairs into divrem during pre-legalization. I tried to change isLegalOrBeforeLegalizer to isLegal but it produced worse codegen on AArch64.

@topperc
Copy link
Collaborator

topperc commented May 23, 2024

From what I can tell, the primary reason is just because LLVM doesn't know divrem library calls exist because they've been explicitly set to null. This is in spite of the fact that they're defined in compiler-rt and should be available. Not only that, but the divrem functions are directly called by div and rem.

The only reasons I can tell for them not to be enabled are to avoid auto-generated 2-output libcalls (which would explain why sincos is also set to null), and because the 128-bit versions aren't built on all targets (which doesn't make as much sense since the 128-bit libcalls for div and rem are unconditionally declared anyways).

If the libcalls were declared in RuntimeLibcalls.def, then LLVM does seem fully capable of generating calls to them in the right circumstances, and even seems to pass the pointer parameter without issue.

There's some other messy issues here. Just a couple example
-libgcc hasn't always exported them and we can't detect in the backend whether we're using compiler-rt or libgcc or the version. Enough time may have passed by now that we can ignore the really old versions of libgcc. And it probably isn't an issue for a newer target like RISC-V.
-32-bit ARM has custom versions in compiler-rt that return 2 registers I think.

I don't think setting them to non-null globally is a good idea. We could set it to non-null in RISCVISelLowering.cpp specifically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants