[RISCV][ISel] Add codegen support for the experimental zabha extension #80192

Merged (5 commits) on Feb 16, 2024

Conversation

dtcxzyw (Member) commented Jan 31, 2024

This patch implements codegen support for the Zabha (Byte and Halfword Atomic Memory Operations) v1.0-rc1 extension.
See also https://github.com/riscv/riscv-zabha/blob/v1.0-rc1/zabha.adoc.
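
For orientation, here is a minimal sketch of the kind of source code this affects. It is an illustration, not part of the patch: the names `Counters`, `bump`, `setFlag`, and `claim` are hypothetical. Byte and halfword `std::atomic` operations like these are typically what a front end turns into `atomicrmw` and `cmpxchg` on `i8`/`i16`; whether Zabha instructions are actually selected depends on the target features enabled for the build.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical example type with narrow (8- and 16-bit) atomic fields.
struct Counters {
  std::atomic<std::uint8_t> hits{0};
  std::atomic<std::uint16_t> flags{0};
};

// 8-bit atomic add; returns the value held before the add.
inline std::uint8_t bump(Counters &c) {
  return c.hits.fetch_add(1, std::memory_order_relaxed);
}

// 16-bit atomic or; returns the flags held before the update.
inline std::uint16_t setFlag(Counters &c, std::uint16_t bit) {
  return c.flags.fetch_or(bit, std::memory_order_acq_rel);
}

// 8-bit compare-and-swap; returns true if `expected` matched and
// `desired` was stored.
inline bool claim(Counters &c, std::uint8_t expected, std::uint8_t desired) {
  return c.hits.compare_exchange_strong(expected, desired,
                                        std::memory_order_acq_rel,
                                        std::memory_order_acquire);
}
```

Without byte and halfword AMOs, operations like these are expanded into the masked LR/SC loops that appear in the existing `RV64IA` test checks.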

llvmbot (Member) commented Jan 31, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Yingwei Zheng (dtcxzyw)

Patch is 437.87 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80192.diff

7 Files Affected:

  • (modified) llvm/docs/RISCVUsage.rst (+1-1)
  • (modified) llvm/docs/ReleaseNotes.rst (+1-1)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+19-4)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoZa.td (+27)
  • (modified) llvm/test/CodeGen/RISCV/atomic-cmpxchg-branch-on-result.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll (+942-180)
  • (modified) llvm/test/CodeGen/RISCV/atomic-rmw.ll (+6451-850)
diff --git a/llvm/docs/RISCVUsage.rst b/llvm/docs/RISCVUsage.rst
index 5caf2fee197f2..156db36638984 100644
--- a/llvm/docs/RISCVUsage.rst
+++ b/llvm/docs/RISCVUsage.rst
@@ -227,7 +227,7 @@ LLVM supports (to various degrees) a number of experimental extensions.  All exp
 The primary goal of experimental support is to assist in the process of ratification by providing an existence proof of an implementation, and simplifying efforts to validate the value of a proposed extension against large code bases.  Experimental extensions are expected to either transition to ratified status, or be eventually removed.  The decision on whether to accept an experimental extension is currently done on an entirely case by case basis; if you want to propose one, attending the bi-weekly RISC-V sync-up call is strongly advised.
 
 ``experimental-zabha``
-  LLVM implements assembler support for the `v1.0-rc1 draft specification <https://github.com/riscv/riscv-zabha/tree/v1.0-rc1>`_.
+  LLVM implements the `v1.0-rc1 draft specification <https://github.com/riscv/riscv-zabha/tree/v1.0-rc1>`_.
 
 ``experimental-zacas``
   LLVM implements the `1.0-rc1 draft specification <https://github.com/riscv/riscv-zacas/releases/tag/v1.0-rc1>`_.
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index ad5b2ec1219e0..9810c02e78fd3 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -93,7 +93,7 @@ Changes to the RISC-V Backend
 -----------------------------
 
 * Support for the Zicond extension is no longer experimental.
-* Added assembler/disassembler support for the experimental Zabha (Byte and Halfword Atomic Memory Operations) extension.
+* Added full support for the experimental Zabha (Byte and Halfword Atomic Memory Operations) extension.
 
 Changes to the WebAssembly Backend
 ----------------------------------
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index b8994e7b7bdb2..789acfce56810 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -629,7 +629,10 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
 
   if (Subtarget.hasStdExtA()) {
     setMaxAtomicSizeInBitsSupported(Subtarget.getXLen());
-    setMinCmpXchgSizeInBits(32);
+    if (Subtarget.hasStdExtZabha())
+      setMinCmpXchgSizeInBits(8);
+    else
+      setMinCmpXchgSizeInBits(32);
   } else if (Subtarget.hasForcedAtomics()) {
     setMaxAtomicSizeInBitsSupported(Subtarget.getXLen());
   } else {
@@ -19519,8 +19522,14 @@ RISCVTargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const {
     return AtomicExpansionKind::None;
 
   unsigned Size = AI->getType()->getPrimitiveSizeInBits();
-  if (Size == 8 || Size == 16)
-    return AtomicExpansionKind::MaskedIntrinsic;
+  if (Size == 8 || Size == 16) {
+    if (!Subtarget.hasStdExtZabha())
+      return AtomicExpansionKind::MaskedIntrinsic;
+    else if (AI->getOperation() == AtomicRMWInst::Nand)
+      return Subtarget.hasStdExtZacas() ? AtomicExpansionKind::CmpXChg
+                                        : AtomicExpansionKind::MaskedIntrinsic;
+  }
+
   return AtomicExpansionKind::None;
 }
 
@@ -19629,6 +19638,8 @@ Value *RISCVTargetLowering::emitMaskedAtomicRMWIntrinsic(
         Builder.CreateCall(LrwOpScwLoop, {AlignedAddr, Incr, Mask, Ordering});
   }
 
+  if (Subtarget.hasStdExtZabha())
+    return Builder.CreateTrunc(Result, AI->getValOperand()->getType());
   if (XLen == 64)
     Result = Builder.CreateTrunc(Result, Builder.getInt32Ty());
   return Result;
@@ -19642,7 +19653,8 @@ RISCVTargetLowering::shouldExpandAtomicCmpXchgInIR(
     return AtomicExpansionKind::None;
 
   unsigned Size = CI->getCompareOperand()->getType()->getPrimitiveSizeInBits();
-  if (Size == 8 || Size == 16)
+  if (!(Subtarget.hasStdExtZabha() && Subtarget.hasStdExtZacas()) &&
+      (Size == 8 || Size == 16))
     return AtomicExpansionKind::MaskedIntrinsic;
   return AtomicExpansionKind::None;
 }
@@ -19664,6 +19676,9 @@ Value *RISCVTargetLowering::emitMaskedAtomicCmpXchgIntrinsic(
       Intrinsic::getDeclaration(CI->getModule(), CmpXchgIntrID, Tys);
   Value *Result = Builder.CreateCall(
       MaskedCmpXchg, {AlignedAddr, CmpVal, NewVal, Mask, Ordering});
+
+  if (Subtarget.hasStdExtZabha())
+    return Builder.CreateTrunc(Result, CI->getCompareOperand()->getType());
   if (XLen == 64)
     Result = Builder.CreateTrunc(Result, Builder.getInt32Ty());
   return Result;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZa.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZa.td
index fa918d90ad160..0cd41cac218f9 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZa.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZa.td
@@ -185,3 +185,30 @@ let Predicates = [HasStdExtZabha, HasStdExtZacas] in {
 defm AMOCAS_B : AMO_cas_aq_rl<0b00101, 0b000, "amocas.b", GPR>;
 defm AMOCAS_H : AMO_cas_aq_rl<0b00101, 0b001, "amocas.h", GPR>;
 }
+
+/// AMOs
+
+defm : AMOPat<"atomic_swap_8", "AMOSWAP_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_add_8", "AMOADD_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_and_8", "AMOAND_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_or_8", "AMOOR_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_xor_8", "AMOXOR_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_max_8", "AMOMAX_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_min_8", "AMOMIN_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_umax_8", "AMOMAXU_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_umin_8", "AMOMINU_B", XLenVT, [HasStdExtZabha]>;
+
+defm : AMOPat<"atomic_swap_16", "AMOSWAP_H", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_add_16", "AMOADD_H", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_and_16", "AMOAND_H", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_or_16", "AMOOR_H", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_xor_16", "AMOXOR_H", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_max_16", "AMOMAX_H", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_min_16", "AMOMIN_H", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_umax_16", "AMOMAXU_H", XLenVT, [HasStdExtZabha]>;
+defm : AMOPat<"atomic_load_umin_16", "AMOMINU_H", XLenVT, [HasStdExtZabha]>;
+
+/// AMOCAS
+
+defm : AMOCASPat<"atomic_cmp_swap_8", "AMOCAS_B", XLenVT, [HasStdExtZabha]>;
+defm : AMOCASPat<"atomic_cmp_swap_16", "AMOCAS_H", XLenVT, [HasStdExtZabha]>;
diff --git a/llvm/test/CodeGen/RISCV/atomic-cmpxchg-branch-on-result.ll b/llvm/test/CodeGen/RISCV/atomic-cmpxchg-branch-on-result.ll
index a8477cc550fe6..90d78779b764d 100644
--- a/llvm/test/CodeGen/RISCV/atomic-cmpxchg-branch-on-result.ll
+++ b/llvm/test/CodeGen/RISCV/atomic-cmpxchg-branch-on-result.ll
@@ -7,6 +7,8 @@
 ; RUN:   | FileCheck -check-prefixes=NOZACAS,RV64IA %s
 ; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zacas -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=ZACAS,RV64IA-ZACAS %s
+; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zacas,+experimental-zabha -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=ZACAS,RV64IA-ZABHA %s
 
 ; Test cmpxchg followed by a branch on the cmpxchg success value to see if the
 ; branch is folded into the cmpxchg expansion.
@@ -209,6 +211,16 @@ define void @cmpxchg_masked_and_branch1(ptr %ptr, i8 signext %cmp, i8 signext %v
 ; RV64IA-ZACAS-NEXT:  # %bb.5: # %do_cmpxchg
 ; RV64IA-ZACAS-NEXT:  # %bb.2: # %exit
 ; RV64IA-ZACAS-NEXT:    ret
+;
+; RV64IA-ZABHA-LABEL: cmpxchg_masked_and_branch1:
+; RV64IA-ZABHA:       # %bb.0: # %entry
+; RV64IA-ZABHA-NEXT:  .LBB2_1: # %do_cmpxchg
+; RV64IA-ZABHA-NEXT:    # =>This Inner Loop Header: Depth=1
+; RV64IA-ZABHA-NEXT:    mv a3, a1
+; RV64IA-ZABHA-NEXT:    amocas.b.aqrl a3, a2, (a0)
+; RV64IA-ZABHA-NEXT:    bne a3, a1, .LBB2_1
+; RV64IA-ZABHA-NEXT:  # %bb.2: # %exit
+; RV64IA-ZABHA-NEXT:    ret
 entry:
   br label %do_cmpxchg
 do_cmpxchg:
@@ -351,6 +363,16 @@ define void @cmpxchg_masked_and_branch2(ptr %ptr, i8 signext %cmp, i8 signext %v
 ; RV64IA-ZACAS-NEXT:    beq a1, a4, .LBB3_1
 ; RV64IA-ZACAS-NEXT:  # %bb.2: # %exit
 ; RV64IA-ZACAS-NEXT:    ret
+;
+; RV64IA-ZABHA-LABEL: cmpxchg_masked_and_branch2:
+; RV64IA-ZABHA:       # %bb.0: # %entry
+; RV64IA-ZABHA-NEXT:  .LBB3_1: # %do_cmpxchg
+; RV64IA-ZABHA-NEXT:    # =>This Inner Loop Header: Depth=1
+; RV64IA-ZABHA-NEXT:    mv a3, a1
+; RV64IA-ZABHA-NEXT:    amocas.b.aqrl a3, a2, (a0)
+; RV64IA-ZABHA-NEXT:    beq a3, a1, .LBB3_1
+; RV64IA-ZABHA-NEXT:  # %bb.2: # %exit
+; RV64IA-ZABHA-NEXT:    ret
 entry:
   br label %do_cmpxchg
 do_cmpxchg:
diff --git a/llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll b/llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll
index 5b3e5789e8d91..8df37bf40975c 100644
--- a/llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll
+++ b/llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll
@@ -15,10 +15,14 @@
 ; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-WMO %s
 ; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zacas -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZACAS,RV64IA-WMO-ZACAS %s
+; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zacas,+experimental-zabha -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZABHA,RV64IA-WMO-ZABHA %s
 ; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-ztso -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-TSO %s
 ; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-ztso,+experimental-zacas -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZACAS,RV64IA-TSO-ZACAS %s
+; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-ztso,+experimental-zacas,+experimental-zabha -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZABHA,RV64IA-TSO-ZABHA %s
 
 define void @cmpxchg_i8_monotonic_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV32I-LABEL: cmpxchg_i8_monotonic_monotonic:
@@ -70,28 +74,79 @@ define void @cmpxchg_i8_monotonic_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind
 ; RV64I-NEXT:    addi sp, sp, 16
 ; RV64I-NEXT:    ret
 ;
-; RV64IA-LABEL: cmpxchg_i8_monotonic_monotonic:
-; RV64IA:       # %bb.0:
-; RV64IA-NEXT:    andi a3, a0, -4
-; RV64IA-NEXT:    slli a0, a0, 3
-; RV64IA-NEXT:    li a4, 255
-; RV64IA-NEXT:    sllw a4, a4, a0
-; RV64IA-NEXT:    andi a1, a1, 255
-; RV64IA-NEXT:    sllw a1, a1, a0
-; RV64IA-NEXT:    andi a2, a2, 255
-; RV64IA-NEXT:    sllw a0, a2, a0
-; RV64IA-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
-; RV64IA-NEXT:    lr.w a2, (a3)
-; RV64IA-NEXT:    and a5, a2, a4
-; RV64IA-NEXT:    bne a5, a1, .LBB0_3
-; RV64IA-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
-; RV64IA-NEXT:    xor a5, a2, a0
-; RV64IA-NEXT:    and a5, a5, a4
-; RV64IA-NEXT:    xor a5, a2, a5
-; RV64IA-NEXT:    sc.w a5, a5, (a3)
-; RV64IA-NEXT:    bnez a5, .LBB0_1
-; RV64IA-NEXT:  .LBB0_3:
-; RV64IA-NEXT:    ret
+; RV64IA-WMO-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-WMO:       # %bb.0:
+; RV64IA-WMO-NEXT:    andi a3, a0, -4
+; RV64IA-WMO-NEXT:    slli a0, a0, 3
+; RV64IA-WMO-NEXT:    li a4, 255
+; RV64IA-WMO-NEXT:    sllw a4, a4, a0
+; RV64IA-WMO-NEXT:    andi a1, a1, 255
+; RV64IA-WMO-NEXT:    sllw a1, a1, a0
+; RV64IA-WMO-NEXT:    andi a2, a2, 255
+; RV64IA-WMO-NEXT:    sllw a0, a2, a0
+; RV64IA-WMO-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-WMO-NEXT:    lr.w a2, (a3)
+; RV64IA-WMO-NEXT:    and a5, a2, a4
+; RV64IA-WMO-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-WMO-NEXT:    xor a5, a2, a0
+; RV64IA-WMO-NEXT:    and a5, a5, a4
+; RV64IA-WMO-NEXT:    xor a5, a2, a5
+; RV64IA-WMO-NEXT:    sc.w a5, a5, (a3)
+; RV64IA-WMO-NEXT:    bnez a5, .LBB0_1
+; RV64IA-WMO-NEXT:  .LBB0_3:
+; RV64IA-WMO-NEXT:    ret
+;
+; RV64IA-ZACAS-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-ZACAS:       # %bb.0:
+; RV64IA-ZACAS-NEXT:    andi a3, a0, -4
+; RV64IA-ZACAS-NEXT:    slli a0, a0, 3
+; RV64IA-ZACAS-NEXT:    li a4, 255
+; RV64IA-ZACAS-NEXT:    sllw a4, a4, a0
+; RV64IA-ZACAS-NEXT:    andi a1, a1, 255
+; RV64IA-ZACAS-NEXT:    sllw a1, a1, a0
+; RV64IA-ZACAS-NEXT:    andi a2, a2, 255
+; RV64IA-ZACAS-NEXT:    sllw a0, a2, a0
+; RV64IA-ZACAS-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-ZACAS-NEXT:    lr.w a2, (a3)
+; RV64IA-ZACAS-NEXT:    and a5, a2, a4
+; RV64IA-ZACAS-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-ZACAS-NEXT:    xor a5, a2, a0
+; RV64IA-ZACAS-NEXT:    and a5, a5, a4
+; RV64IA-ZACAS-NEXT:    xor a5, a2, a5
+; RV64IA-ZACAS-NEXT:    sc.w a5, a5, (a3)
+; RV64IA-ZACAS-NEXT:    bnez a5, .LBB0_1
+; RV64IA-ZACAS-NEXT:  .LBB0_3:
+; RV64IA-ZACAS-NEXT:    ret
+;
+; RV64IA-ZABHA-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-ZABHA:       # %bb.0:
+; RV64IA-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-ZABHA-NEXT:    ret
+;
+; RV64IA-TSO-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-TSO:       # %bb.0:
+; RV64IA-TSO-NEXT:    andi a3, a0, -4
+; RV64IA-TSO-NEXT:    slli a0, a0, 3
+; RV64IA-TSO-NEXT:    li a4, 255
+; RV64IA-TSO-NEXT:    sllw a4, a4, a0
+; RV64IA-TSO-NEXT:    andi a1, a1, 255
+; RV64IA-TSO-NEXT:    sllw a1, a1, a0
+; RV64IA-TSO-NEXT:    andi a2, a2, 255
+; RV64IA-TSO-NEXT:    sllw a0, a2, a0
+; RV64IA-TSO-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-TSO-NEXT:    lr.w a2, (a3)
+; RV64IA-TSO-NEXT:    and a5, a2, a4
+; RV64IA-TSO-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-TSO-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-TSO-NEXT:    xor a5, a2, a0
+; RV64IA-TSO-NEXT:    and a5, a5, a4
+; RV64IA-TSO-NEXT:    xor a5, a2, a5
+; RV64IA-TSO-NEXT:    sc.w a5, a5, (a3)
+; RV64IA-TSO-NEXT:    bnez a5, .LBB0_1
+; RV64IA-TSO-NEXT:  .LBB0_3:
+; RV64IA-TSO-NEXT:    ret
   %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val monotonic monotonic
   ret void
 }
@@ -261,6 +316,11 @@ define void @cmpxchg_i8_acquire_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-WMO-ZACAS-NEXT:  .LBB1_3:
 ; RV64IA-WMO-ZACAS-NEXT:    ret
 ;
+; RV64IA-WMO-ZABHA-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-WMO-ZABHA:       # %bb.0:
+; RV64IA-WMO-ZABHA-NEXT:    amocas.b.aq a1, a2, (a0)
+; RV64IA-WMO-ZABHA-NEXT:    ret
+;
 ; RV64IA-TSO-LABEL: cmpxchg_i8_acquire_monotonic:
 ; RV64IA-TSO:       # %bb.0:
 ; RV64IA-TSO-NEXT:    andi a3, a0, -4
@@ -306,6 +366,11 @@ define void @cmpxchg_i8_acquire_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-TSO-ZACAS-NEXT:    bnez a5, .LBB1_1
 ; RV64IA-TSO-ZACAS-NEXT:  .LBB1_3:
 ; RV64IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64IA-TSO-ZABHA-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-TSO-ZABHA:       # %bb.0:
+; RV64IA-TSO-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-TSO-ZABHA-NEXT:    ret
   %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val acquire monotonic
   ret void
 }
@@ -475,6 +540,11 @@ define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-WMO-ZACAS-NEXT:  .LBB2_3:
 ; RV64IA-WMO-ZACAS-NEXT:    ret
 ;
+; RV64IA-WMO-ZABHA-LABEL: cmpxchg_i8_acquire_acquire:
+; RV64IA-WMO-ZABHA:       # %bb.0:
+; RV64IA-WMO-ZABHA-NEXT:    amocas.b.aq a1, a2, (a0)
+; RV64IA-WMO-ZABHA-NEXT:    ret
+;
 ; RV64IA-TSO-LABEL: cmpxchg_i8_acquire_acquire:
 ; RV64IA-TSO:       # %bb.0:
 ; RV64IA-TSO-NEXT:    andi a3, a0, -4
@@ -520,6 +590,11 @@ define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-TSO-ZACAS-NEXT:    bnez a5, .LBB2_1
 ; RV64IA-TSO-ZACAS-NEXT:  .LBB2_3:
 ; RV64IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64IA-TSO-ZABHA-LABEL: cmpxchg_i8_acquire_acquire:
+; RV64IA-TSO-ZABHA:       # %bb.0:
+; RV64IA-TSO-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-TSO-ZABHA-NEXT:    ret
   %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val acquire acquire
   ret void
 }
@@ -689,6 +764,11 @@ define void @cmpxchg_i8_release_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-WMO-ZACAS-NEXT:  .LBB3_3:
 ; RV64IA-WMO-ZACAS-NEXT:    ret
 ;
+; RV64IA-WMO-ZABHA-LABEL: cmpxchg_i8_release_monotonic:
+; RV64IA-WMO-ZABHA:       # %bb.0:
+; RV64IA-WMO-ZABHA-NEXT:    amocas.b.rl a1, a2, (a0)
+; RV64IA-WMO-ZABHA-NEXT:    ret
+;
 ; RV64IA-TSO-LABEL: cmpxchg_i8_release_monotonic:
 ; RV64IA-TSO:       # %bb.0:
 ; RV64IA-TSO-NEXT:    andi a3, a0, -4
@@ -734,6 +814,11 @@ define void @cmpxchg_i8_release_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-TSO-ZACAS-NEXT:    bnez a5, .LBB3_1
 ; RV64IA-TSO-ZACAS-NEXT:  .LBB3_3:
 ; RV64IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64IA-TSO-ZABHA-LABEL: cmpxchg_i8_release_monotonic:
+; RV64IA-TSO-ZABHA:       # %bb.0:
+; RV64IA-TSO-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-TSO-ZABHA-NEXT:    ret
   %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val release monotonic
   ret void
 }
@@ -903,6 +988,11 @@ define void @cmpxchg_i8_release_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-WMO-ZACAS-NEXT:  .LBB4_3:
 ; RV64IA-WMO-ZACAS-NEXT:    ret
 ;
+; RV64IA-WMO-ZABHA-LABEL: cmpxchg_i8_release_acquire:
+; RV64IA-WMO-ZABHA:       # %bb.0:
+; RV64IA-WMO-ZABHA-NEXT:    amocas.b.aqrl a1, a2, (a0)
+; RV64IA-WMO-ZABHA-NEXT:    ret
+;
 ; RV64IA-TSO-LABEL: cmpxchg_i8_release_acquire:
 ; RV64IA-TSO:       # %bb.0:
 ; RV64IA-TSO-NEXT:    andi a3, a0, -4
@@ -948,6 +1038,11 @@ define void @cmpxchg_i8_release_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-TSO-ZACAS-NEXT:    bnez a5, .LBB4_1
 ; RV64IA-TSO-ZACAS-NEXT:  .LBB4_3:
 ; RV64IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64IA-TSO-ZABHA-LABEL: cmpxchg_i8_release_acquire:
+; RV64IA-TSO-ZABHA:       # %bb.0:
+; RV64IA-TSO-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-TSO-ZABHA-NEXT:    ret
   %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val release acquire
   ret void
 }
@@ -1117,6 +1212,11 @@ define void @cmpxchg_i8_acq_rel_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-WMO-ZACAS-NEXT:  .LBB5_3:
 ; RV64IA-WMO-ZACAS-NEXT:    ret
 ;
+; RV64IA-WMO-ZABHA-LABEL: cmpxchg_i8_acq_rel_monotonic:
+; RV64IA-WMO-ZABHA:       # %bb.0:
+; RV64IA-WMO-ZABHA-NEXT:    amocas.b.aqrl a1, a2, (a0)
+; RV64IA-WMO-ZABHA-NEXT:    ret
+;
 ; RV64IA-TSO-LABEL: cmpxchg_i8_acq_rel_monotonic:
 ; RV64IA-TSO:       # %bb.0:
 ; RV64IA-TSO-NEXT:    andi a3, a0, -4
@@ -1162,6 +1262,11 @@ define void @cmpxchg_i8_acq_rel_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-TSO-ZACAS-NEXT:    bnez a5, .LBB5_1
 ; RV64IA-TSO-ZACAS-NEXT:  .LBB5_3:
 ; RV64IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64IA-TSO-ZABHA-LABEL: cmpxchg_i8_acq_rel_monotonic:
+; RV64IA-TSO-ZABHA:       # %bb.0:
+; RV64IA-TSO-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-TSO-ZABHA-NEXT:    ret
   %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val acq_rel monotonic
   ret void
 }
@@ -1331,6 +1436,11 @@ define void @cmpxchg_i8_acq_rel_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-WMO-ZACAS-NEXT:  .LBB6_3:
 ; RV64IA-WMO-ZACAS-NEXT:    ret
 ;
+; RV64IA-WMO-ZABHA-LABEL: cmpxchg_i8_acq_rel_acquire:
+; RV64IA-WMO-ZABHA:       # %bb.0:
+; RV64IA-WMO-ZABHA-NEXT:    amocas.b.aqrl a1, a2, (a0)
+; RV64IA-WMO-ZABHA-NEXT:    ret
+;
 ; RV64IA-TSO-LABEL: cmpxchg_i8_acq_rel_acquire:
 ; RV64IA-TSO:       # %bb.0:
 ; RV64IA-TSO-NEXT:    andi a3, a0, -4
@@ -1376,6 +1486,11 @@ define void @cmpxchg_i8_acq_rel_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64IA-TSO-ZACAS-NEXT:    bnez a5, .LBB6_1
 ; RV64IA-TSO-ZACAS-NEXT:  .LBB6_3:
 ; RV64IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64IA-TSO-ZABHA-LABEL: cmpxchg_i8_acq_rel_acquire:
+; RV64IA-TSO-ZABHA:       # %bb.0:
+; RV64IA-TSO-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-TSO-ZABHA-NEXT:    ret
   %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val acq_rel acquire
   ret void
 }
@@ -1430,28 +1545,84 @@ define void @cmpxchg_i8_seq_cst_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; RV64I-NEXT:    addi sp, sp, 16
 ; RV64I-NEXT:    ret
 ;
-; RV64IA-LABEL: cmpxchg_i8_seq_cst_monotonic:
-; RV64IA:       # %bb.0:
-; RV64IA-NEXT:    andi a3, a0, -4
-; RV64IA-NEXT:    slli a0, a0, 3
-; RV64IA-NEXT:    li a4, 255
-; RV64IA-NEXT:    sllw a4, a4, a0
-; RV64IA-NEXT:    andi a1, a1, 255
-; RV64IA-NEXT:    sllw a1, a1, a0
-; RV64IA-NEXT:    andi a2, a2, 255
-; RV64IA-NEXT:    sllw a0, a2, a0
-; RV64IA-NEXT:  .LBB7_1: # =>This Inner Loop Header: Depth=1
-; RV64IA-NEXT:    lr.w.aqrl a2, (a3)
-; RV64IA-NEXT:    and a5, a2, a4
-; RV64IA-NEXT:    bne a5, a1, .LBB7_3
-; RV64IA-NEXT:  # %bb.2: # in Loop: Header=BB7_1 Depth=1
-; RV64IA-NEXT:    xor a5, a2, a0
-; RV64IA-NEXT:    and a5, a5, a4
-; RV64IA-NEXT:    xor a5, a2, a5
-; RV64IA-NEXT:    sc.w.rl a5, a5, (a3)
-; ...
[truncated]
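
The expansion decisions made in the `RISCVISelLowering.cpp` hunks above can be summarized as a small standalone model. This is a sketch under stated assumptions, not the LLVM code: `Expansion`, `RMWOp`, `Features`, `expandRMW`, and `expandCmpXchg` are hypothetical stand-ins for `AtomicExpansionKind`, `AtomicRMWInst::BinOp`, the subtarget queries, and the two `shouldExpand...InIR` hooks. It models only the size checks visible in the diff; any earlier early-return conditions in the real functions are not shown in the hunks and are left out.

```cpp
// Hypothetical stand-ins; only the values that appear in the diff are modeled.
enum class Expansion { None, MaskedIntrinsic, CmpXChg };
enum class RMWOp { Xchg, Add, And, Or, Xor, Nand, Max, Min, UMax, UMin };

struct Features {
  bool zabha;  // models Subtarget.hasStdExtZabha()
  bool zacas;  // models Subtarget.hasStdExtZacas()
};

// Models the size check in shouldExpandAtomicRMWInIR after the patch.
constexpr Expansion expandRMW(unsigned sizeInBits, RMWOp op, Features f) {
  if (sizeInBits == 8 || sizeInBits == 16) {
    if (!f.zabha)
      return Expansion::MaskedIntrinsic;  // no byte/halfword AMOs available
    if (op == RMWOp::Nand)                // the pattern list has no Nand entry
      return f.zacas ? Expansion::CmpXChg : Expansion::MaskedIntrinsic;
  }
  return Expansion::None;  // select an instruction directly
}

// Models the size check in shouldExpandAtomicCmpXchgInIR after the patch:
// a narrow cmpxchg is kept only when both Zabha and Zacas are present.
constexpr Expansion expandCmpXchg(unsigned sizeInBits, Features f) {
  if (!(f.zabha && f.zacas) && (sizeInBits == 8 || sizeInBits == 16))
    return Expansion::MaskedIntrinsic;
  return Expansion::None;
}
```

For example, `expandRMW(8, RMWOp::Add, Features{true, false})` yields `Expansion::None`, while `expandCmpXchg(8, Features{true, false})` still yields `Expansion::MaskedIntrinsic`, because Zabha alone supplies no byte compare-and-swap.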

dtcxzyw force-pushed the zabha-codegen-support branch from 272f110 to 0f25140 on February 1, 2024 05:53
dtcxzyw requested a review from jyknight on February 2, 2024 07:17
topperc added a commit to topperc/llvm-project that referenced this pull request Feb 7, 2024
…wordAtomicRMW.

This gives the target a chance to keep an atomicrmw op that is
smaller than the minimum cmpxchg size. This is needed to support
the Zabha extension for RISC-V, which provides i8/i16 atomicrmw
operations but does not provide i8/i16 cmpxchg or LR/SC instructions.

This defers the widening until after the target requests
LLSC/CmpXChg/MaskedIntrinsic expansion. Once we widen, we call
shouldExpandAtomicRMWInIR again to give the target another chance
to make a decision about the widened operation.

I considered making the targets return AtomicExpansionKind::Expand
or a new expansion kind for And/Or/Xor, but that would have required the
targets to special-case And/Or/Xor, which they don't currently do.

This should make it easier to implement llvm#80192.
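
As background on why And/Or/Xor are singled out in the commit message above: a part-word bitwise operation can be carried out as a single full-word operation, with no retry loop, by padding the operand with that operation's identity value in the untouched byte lanes. The sketch below illustrates that general technique with `std::atomic`. It is not code from the referenced commit: the helper names are hypothetical, and the lane numbering (lane 0 is the least significant byte) is an assumption of this example.

```cpp
#include <atomic>
#include <cstdint>

// Bit offset of byte lane `byteIndex` (0..3) within a 32-bit word.
// Assumption: lane 0 is the least significant byte.
inline unsigned laneShift(unsigned byteIndex) { return byteIndex * 8; }

// 8-bit OR done as a 32-bit OR: the other lanes are OR'd with 0 (identity).
inline std::uint8_t fetchOrByte(std::atomic<std::uint32_t> &word,
                                unsigned byteIndex, std::uint8_t v) {
  unsigned sh = laneShift(byteIndex);
  std::uint32_t old =
      word.fetch_or(std::uint32_t(v) << sh, std::memory_order_relaxed);
  return std::uint8_t(old >> sh);
}

// 8-bit XOR done as a 32-bit XOR: the other lanes are XOR'd with 0.
inline std::uint8_t fetchXorByte(std::atomic<std::uint32_t> &word,
                                 unsigned byteIndex, std::uint8_t v) {
  unsigned sh = laneShift(byteIndex);
  std::uint32_t old =
      word.fetch_xor(std::uint32_t(v) << sh, std::memory_order_relaxed);
  return std::uint8_t(old >> sh);
}

// 8-bit AND done as a 32-bit AND: the other lanes are AND'd with all-ones.
inline std::uint8_t fetchAndByte(std::atomic<std::uint32_t> &word,
                                 unsigned byteIndex, std::uint8_t v) {
  unsigned sh = laneShift(byteIndex);
  std::uint32_t operand =
      (std::uint32_t(v) << sh) | ~(std::uint32_t(0xFF) << sh);
  std::uint32_t old = word.fetch_and(operand, std::memory_order_relaxed);
  return std::uint8_t(old >> sh);
}
```

The same trick does not work for operations such as add, where a carry out of the target lane would corrupt the neighbouring byte.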
dtcxzyw (Member, Author) commented Feb 10, 2024

@topperc Thank you for the update!

dtcxzyw requested a review from topperc on February 15, 2024 04:18

topperc (Collaborator) left a comment:

LGTM

dtcxzyw merged commit a300a1a into llvm:main on Feb 16, 2024
dtcxzyw deleted the zabha-codegen-support branch on February 16, 2024 07:35
6 participants