Skip to content

[GISel][RISCV] Fix several boundary cases in narrow G_SEXT_INREG. #72719

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 24, 2023

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented Nov 17, 2023

This fixes cases when SizeInBits is a multiple of the narrow size.

If SizeBits is equal to NarrowTy size, the first block would create an illegal G_SEXT_INREG where the the extension size is equal to the type. I tried to turn it into G_TRUNC+G_SEXT, but that just turned back into G_SEXT_INREG causing an infinite loop. So punt to the splitting case.

In the for loop we should copy when the part ends on SizeInBits. In that case there is no G_SEXT_INREG needed for partial. But we should note that register in PartialExtensionReg for the first full part to use.

If the part starts on SizeInBits then we should do an AShr of PartialExtensionReg.

We should only get to the G_SEXT_INREG case if the SizeInBits is in the middle of the part.

@llvmbot
Copy link
Member

llvmbot commented Nov 17, 2023

@llvm/pr-subscribers-llvm-globalisel

Author: Craig Topper (topperc)

Changes

This fixes cases when SizeInBits is a multiple of the narrow size.

If SizeBits is equal to NarrowTy size, the first block would create an illegal G_SEXT_INREG where the the extension size is equal to the type. I tried to turn it into G_TRUNC+G_SEXT, but that just turned back into G_SEXT_INREG causing an infinite loop. So punt to the splitting case.

In the for loop we should copy when the part ends on SizeInBits. In that case there is no G_SEXT_INREG needed for partial. But we should note that register in PartialExtensionReg for the first full part to use.

If the part starts on SizeInBits then we should do an AShr of PartialExtensionReg.

We should only get to the G_SEXT_INREG case if the SizeInBits is in the middle of the part.


Full diff: https://github.com/llvm/llvm-project/pull/72719.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+6-5)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv32/legalize-mulo.mir (+6-9)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv64/legalize-mulo.mir (+6-9)
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index dd5577d47f97764..195f0d60ca10ba2 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -1445,7 +1445,7 @@ LegalizerHelper::LegalizeResult LegalizerHelper::narrowScalar(MachineInstr &MI,
 
     // So long as the new type has more bits than the bits we're extending we
     // don't need to break it apart.
-    if (NarrowTy.getScalarSizeInBits() >= SizeInBits) {
+    if (NarrowTy.getScalarSizeInBits() > SizeInBits) {
       Observer.changingInstr(MI);
       // We don't lose any non-extension bits by truncating the src and
       // sign-extending the dst.
@@ -1488,14 +1488,15 @@ LegalizerHelper::LegalizeResult LegalizerHelper::narrowScalar(MachineInstr &MI,
     Register AshrCstReg =
         MIRBuilder.buildConstant(NarrowTy, NarrowTy.getScalarSizeInBits() - 1)
             .getReg(0);
-    Register FullExtensionReg = 0;
-    Register PartialExtensionReg = 0;
+    Register FullExtensionReg;
+    Register PartialExtensionReg;
 
     // Do the operation on each small part.
     for (int i = 0; i < NumParts; ++i) {
-      if ((i + 1) * NarrowTy.getScalarSizeInBits() < SizeInBits)
+      if ((i + 1) * NarrowTy.getScalarSizeInBits() <= SizeInBits) {
         DstRegs.push_back(SrcRegs[i]);
-      else if (i * NarrowTy.getScalarSizeInBits() > SizeInBits) {
+        PartialExtensionReg = DstRegs.back();
+      } else if (i * NarrowTy.getScalarSizeInBits() >= SizeInBits) {
         assert(PartialExtensionReg &&
                "Expected to visit partial extension before full");
         if (FullExtensionReg) {
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv32/legalize-mulo.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv32/legalize-mulo.mir
index 1b584e0561c77f8..d0929fd19eb92ab 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv32/legalize-mulo.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv32/legalize-mulo.mir
@@ -168,16 +168,13 @@ body:             |
     ; LIBCALL-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
     ; LIBCALL-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x10
     ; LIBCALL-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x11
-    ; LIBCALL-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
-    ; LIBCALL-NEXT: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY2]], [[C2]](s32)
-    ; LIBCALL-NEXT: [[ASHR2:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C2]](s32)
-    ; LIBCALL-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 31
-    ; LIBCALL-NEXT: [[ASHR3:%[0-9]+]]:_(s32) = G_ASHR [[ASHR2]], [[C3]](s32)
-    ; LIBCALL-NEXT: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
-    ; LIBCALL-NEXT: [[XOR:%[0-9]+]]:_(s32) = G_XOR [[COPY2]], [[ASHR2]]
-    ; LIBCALL-NEXT: [[XOR1:%[0-9]+]]:_(s32) = G_XOR [[COPY3]], [[ASHR3]]
+    ; LIBCALL-NEXT: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 31
+    ; LIBCALL-NEXT: [[ASHR2:%[0-9]+]]:_(s32) = G_ASHR [[COPY2]], [[C2]](s32)
+    ; LIBCALL-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+    ; LIBCALL-NEXT: [[XOR:%[0-9]+]]:_(s32) = G_XOR [[COPY2]], [[COPY2]]
+    ; LIBCALL-NEXT: [[XOR1:%[0-9]+]]:_(s32) = G_XOR [[COPY3]], [[ASHR2]]
     ; LIBCALL-NEXT: [[OR:%[0-9]+]]:_(s32) = G_OR [[XOR]], [[XOR1]]
-    ; LIBCALL-NEXT: [[ICMP:%[0-9]+]]:_(s32) = G_ICMP intpred(ne), [[OR]](s32), [[C4]]
+    ; LIBCALL-NEXT: [[ICMP:%[0-9]+]]:_(s32) = G_ICMP intpred(ne), [[OR]](s32), [[C3]]
     ; LIBCALL-NEXT: $x10 = COPY [[COPY2]](s32)
     ; LIBCALL-NEXT: $x11 = COPY [[ICMP]](s32)
     ; LIBCALL-NEXT: PseudoRET implicit $x10, implicit $x11
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv64/legalize-mulo.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv64/legalize-mulo.mir
index 433e1b8e6d577e3..c2bf9ff2a61e9a7 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv64/legalize-mulo.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rv64/legalize-mulo.mir
@@ -218,16 +218,13 @@ body:             |
     ; LIBCALL-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $x2, implicit $x2
     ; LIBCALL-NEXT: [[COPY2:%[0-9]+]]:_(s64) = COPY $x10
     ; LIBCALL-NEXT: [[COPY3:%[0-9]+]]:_(s64) = COPY $x11
-    ; LIBCALL-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
-    ; LIBCALL-NEXT: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[COPY2]], [[C2]](s64)
-    ; LIBCALL-NEXT: [[ASHR2:%[0-9]+]]:_(s64) = G_ASHR [[SHL]], [[C2]](s64)
-    ; LIBCALL-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 63
-    ; LIBCALL-NEXT: [[ASHR3:%[0-9]+]]:_(s64) = G_ASHR [[ASHR2]], [[C3]](s64)
-    ; LIBCALL-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
-    ; LIBCALL-NEXT: [[XOR:%[0-9]+]]:_(s64) = G_XOR [[COPY2]], [[ASHR2]]
-    ; LIBCALL-NEXT: [[XOR1:%[0-9]+]]:_(s64) = G_XOR [[COPY3]], [[ASHR3]]
+    ; LIBCALL-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 63
+    ; LIBCALL-NEXT: [[ASHR2:%[0-9]+]]:_(s64) = G_ASHR [[COPY2]], [[C2]](s64)
+    ; LIBCALL-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+    ; LIBCALL-NEXT: [[XOR:%[0-9]+]]:_(s64) = G_XOR [[COPY2]], [[COPY2]]
+    ; LIBCALL-NEXT: [[XOR1:%[0-9]+]]:_(s64) = G_XOR [[COPY3]], [[ASHR2]]
     ; LIBCALL-NEXT: [[OR:%[0-9]+]]:_(s64) = G_OR [[XOR]], [[XOR1]]
-    ; LIBCALL-NEXT: [[ICMP:%[0-9]+]]:_(s64) = G_ICMP intpred(ne), [[OR]](s64), [[C4]]
+    ; LIBCALL-NEXT: [[ICMP:%[0-9]+]]:_(s64) = G_ICMP intpred(ne), [[OR]](s64), [[C3]]
     ; LIBCALL-NEXT: $x10 = COPY [[COPY2]](s64)
     ; LIBCALL-NEXT: $x11 = COPY [[ICMP]](s64)
     ; LIBCALL-NEXT: PseudoRET implicit $x10, implicit $x11

Copy link
Contributor

@aemerson aemerson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense?

This fixes cases when SizeInBits is a multiple of the narrow size.

If SizeBits is equal to NarrowTy size, the first block would create
an illegal G_SEXT_INREG where the the extension size is equal to the type.
I tried to turn it into G_TRUNC+G_SEXT, but that just turned back
into G_SEXT_INREG causing an infinite loop. So punt to the splitting case.

In the for loop we should copy when the part ends on SizeInBits. In
that case there is no G_SEXT_INREG needed for partial. But we should
note that register in PartialExtensionReg for the first full part to
use.

If the part starts on SizeInBits then we should do an AShr of
PartialExtensionReg.

We should only get to the G_SEXT_INREG case if the SizeInBits is in
the middle of the part.
@topperc topperc force-pushed the pr/gisel-sext-inreg branch from 4490e22 to 1b867fc Compare November 24, 2023 07:18
@topperc topperc merged commit 5d501b1 into llvm:main Nov 24, 2023
@topperc topperc deleted the pr/gisel-sext-inreg branch November 24, 2023 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants