[Targets] Migrate from atomic_load_8/16/32/64 to atomic_load_nonext_8/16/32/64. NFC #137428

Merged
1 commit merged on Apr 28, 2025

Conversation

topperc
Collaborator

@topperc commented Apr 26, 2025

atomic_load_8/16/32/64 will be removed in a separate patch, since removing them will affect out-of-tree targets.
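
As a minimal sketch of what the migration looks like in a target's .td file (the MYTGT_LD32 instruction and GPR register class below are hypothetical stand-ins; only the PatFrag names come from this patch), a typical pattern changes like this:

  // Before: atomic_load_32 constrains only the memory type (i32) and
  // says nothing about how the loaded value is extended.
  def : Pat<(atomic_load_32 GPR:$ptr), (MYTGT_LD32 GPR:$ptr)>;

  // After: atomic_load_nonext_32 is the equivalent fragment with
  // IsNonExtLoad set, so it matches only atomic loads whose result is
  // not implicitly extended.
  def : Pat<(atomic_load_nonext_32 GPR:$ptr), (MYTGT_LD32 GPR:$ptr)>;

The produced instruction is unchanged; only the source fragment is renamed, which is what keeps this change NFC for in-tree targets.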

…/16/32/64. NFC

atomic_load_8/16/32/64 will be removed in a separate patch, since removing
them will affect out-of-tree targets.
@llvmbot
Member

llvmbot commented Apr 26, 2025

@llvm/pr-subscribers-backend-loongarch
@llvm/pr-subscribers-backend-hexagon
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-amdgpu

Author: Craig Topper (topperc)

Changes

atomic_load_8/16/32/64 will be removed in a separate patch, since removing them will affect out-of-tree targets.


Patch is 48.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/137428.diff

25 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrAtomics.td (+44-40)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructions.td (+3-3)
  • (modified) llvm/lib/Target/AMDGPU/BUFInstructions.td (+3-3)
  • (modified) llvm/lib/Target/AMDGPU/DSInstructions.td (+3-3)
  • (modified) llvm/lib/Target/AMDGPU/FLATInstructions.td (+6-6)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+18-12)
  • (modified) llvm/lib/Target/ARM/ARMInstrInfo.td (+4-4)
  • (modified) llvm/lib/Target/ARM/ARMInstrThumb.td (+2-2)
  • (modified) llvm/lib/Target/ARM/ARMInstrThumb2.td (+4-4)
  • (modified) llvm/lib/Target/AVR/AVRInstrInfo.td (+2-2)
  • (modified) llvm/lib/Target/BPF/BPFInstrInfo.td (+4-4)
  • (modified) llvm/lib/Target/Hexagon/HexagonPatterns.td (+6-6)
  • (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.td (+2-2)
  • (modified) llvm/lib/Target/Mips/MicroMipsInstrInfo.td (+1-1)
  • (modified) llvm/lib/Target/Mips/Mips64InstrInfo.td (+1-1)
  • (modified) llvm/lib/Target/Mips/MipsInstrInfo.td (+1-1)
  • (modified) llvm/lib/Target/PowerPC/PPCInstr64Bit.td (+2-2)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.td (+2-2)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrP10.td (+4-4)
  • (modified) llvm/lib/Target/RISCV/RISCVGISel.td (+2-2)
  • (modified) llvm/lib/Target/Sparc/SparcInstr64Bit.td (+3-3)
  • (modified) llvm/lib/Target/Sparc/SparcInstrInfo.td (+2-2)
  • (modified) llvm/lib/Target/VE/VEInstrInfo.td (+4-4)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td (+5-5)
  • (modified) llvm/lib/Target/X86/X86InstrCompiler.td (+40-40)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrAtomics.td b/llvm/lib/Target/AArch64/AArch64InstrAtomics.td
index 28d45fe25d30c..f3734e05ae667 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrAtomics.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrAtomics.td
@@ -55,9 +55,9 @@ let Predicates = [HasRCPC] in {
   // 16-bit loads
   def : Pat<(acquiring_load<atomic_load_azext_16> GPR64sp:$ptr), (LDAPRH GPR64sp:$ptr)>;
   // 32-bit loads
-  def : Pat<(acquiring_load<atomic_load_32> GPR64sp:$ptr), (LDAPRW GPR64sp:$ptr)>;
+  def : Pat<(acquiring_load<atomic_load_nonext_32> GPR64sp:$ptr), (LDAPRW GPR64sp:$ptr)>;
   // 64-bit loads
-  def : Pat<(acquiring_load<atomic_load_64> GPR64sp:$ptr), (LDAPRX GPR64sp:$ptr)>;
+  def : Pat<(acquiring_load<atomic_load_nonext_64> GPR64sp:$ptr), (LDAPRX GPR64sp:$ptr)>;
 }
 
 // 8-bit loads
@@ -93,62 +93,66 @@ def : Pat<(relaxed_load<atomic_load_azext_16>
           (LDURHHi GPR64sp:$Rn, simm9:$offset)>;
 
 // 32-bit loads
-def : Pat<(seq_cst_load<atomic_load_32> GPR64sp:$ptr), (LDARW GPR64sp:$ptr)>;
-def : Pat<(acquiring_load<atomic_load_32> GPR64sp:$ptr), (LDARW GPR64sp:$ptr)>;
-def : Pat<(relaxed_load<atomic_load_32> (ro_Windexed32 GPR64sp:$Rn, GPR32:$Rm,
-                                                       ro_Wextend32:$extend)),
+def : Pat<(seq_cst_load<atomic_load_nonext_32> GPR64sp:$ptr),
+          (LDARW GPR64sp:$ptr)>;
+def : Pat<(acquiring_load<atomic_load_nonext_32> GPR64sp:$ptr),
+          (LDARW GPR64sp:$ptr)>;
+def : Pat<(relaxed_load<atomic_load_nonext_32>
+               (ro_Windexed32 GPR64sp:$Rn, GPR32:$Rm, ro_Wextend32:$extend)),
           (LDRWroW GPR64sp:$Rn, GPR32:$Rm, ro_Wextend32:$extend)>;
-def : Pat<(relaxed_load<atomic_load_32> (ro_Xindexed32 GPR64sp:$Rn, GPR64:$Rm,
-                                                       ro_Xextend32:$extend)),
+def : Pat<(relaxed_load<atomic_load_nonext_32>
+               (ro_Xindexed32 GPR64sp:$Rn, GPR64:$Rm, ro_Xextend32:$extend)),
           (LDRWroX GPR64sp:$Rn, GPR64:$Rm, ro_Xextend32:$extend)>;
-def : Pat<(relaxed_load<atomic_load_32> (am_indexed32 GPR64sp:$Rn,
-                                                      uimm12s4:$offset)),
+def : Pat<(relaxed_load<atomic_load_nonext_32>
+               (am_indexed32 GPR64sp:$Rn, uimm12s4:$offset)),
           (LDRWui GPR64sp:$Rn, uimm12s4:$offset)>;
-def : Pat<(relaxed_load<atomic_load_32>
+def : Pat<(relaxed_load<atomic_load_nonext_32>
                (am_unscaled32 GPR64sp:$Rn, simm9:$offset)),
           (LDURWi GPR64sp:$Rn, simm9:$offset)>;
 
 // 64-bit loads
-def : Pat<(seq_cst_load<atomic_load_64> GPR64sp:$ptr), (LDARX GPR64sp:$ptr)>;
-def : Pat<(acquiring_load<atomic_load_64> GPR64sp:$ptr), (LDARX GPR64sp:$ptr)>;
-def : Pat<(relaxed_load<atomic_load_64> (ro_Windexed64 GPR64sp:$Rn, GPR32:$Rm,
-                                                       ro_Wextend64:$extend)),
+def : Pat<(seq_cst_load<atomic_load_nonext_64> GPR64sp:$ptr),
+          (LDARX GPR64sp:$ptr)>;
+def : Pat<(acquiring_load<atomic_load_nonext_64> GPR64sp:$ptr),
+          (LDARX GPR64sp:$ptr)>;
+def : Pat<(relaxed_load<atomic_load_nonext_64>
+               (ro_Windexed64 GPR64sp:$Rn, GPR32:$Rm, ro_Wextend64:$extend)),
           (LDRXroW GPR64sp:$Rn, GPR32:$Rm, ro_Wextend64:$extend)>;
-def : Pat<(relaxed_load<atomic_load_64> (ro_Xindexed64 GPR64sp:$Rn, GPR64:$Rm,
-                                                       ro_Xextend64:$extend)),
+def : Pat<(relaxed_load<atomic_load_nonext_64>
+               (ro_Xindexed64 GPR64sp:$Rn, GPR64:$Rm, ro_Xextend64:$extend)),
           (LDRXroX GPR64sp:$Rn, GPR64:$Rm, ro_Xextend64:$extend)>;
-def : Pat<(relaxed_load<atomic_load_64> (am_indexed64 GPR64sp:$Rn,
-                                                      uimm12s8:$offset)),
+def : Pat<(relaxed_load<atomic_load_nonext_64>
+               (am_indexed64 GPR64sp:$Rn, uimm12s8:$offset)),
           (LDRXui GPR64sp:$Rn, uimm12s8:$offset)>;
-def : Pat<(relaxed_load<atomic_load_64>
+def : Pat<(relaxed_load<atomic_load_nonext_64>
                (am_unscaled64 GPR64sp:$Rn, simm9:$offset)),
           (LDURXi GPR64sp:$Rn, simm9:$offset)>;
 
 // FP 32-bit loads
-def : Pat<(f32 (bitconvert (i32 (relaxed_load<atomic_load_32> (ro_Windexed32 GPR64sp:$Rn, GPR32:$Rm,
-                                                       ro_Wextend32:$extend))))),
+def : Pat<(f32 (bitconvert (i32 (relaxed_load<atomic_load_nonext_32>
+               (ro_Windexed32 GPR64sp:$Rn, GPR32:$Rm, ro_Wextend32:$extend))))),
           (LDRSroW GPR64sp:$Rn, GPR32:$Rm, ro_Wextend32:$extend)>;
-def : Pat<(f32 (bitconvert (i32 (relaxed_load<atomic_load_32> (ro_Xindexed32 GPR64sp:$Rn, GPR64:$Rm,
-                                                       ro_Xextend32:$extend))))),
+def : Pat<(f32 (bitconvert (i32 (relaxed_load<atomic_load_nonext_32>
+               (ro_Xindexed32 GPR64sp:$Rn, GPR64:$Rm, ro_Xextend32:$extend))))),
           (LDRSroX GPR64sp:$Rn, GPR64:$Rm, ro_Xextend32:$extend)>;
-def : Pat<(f32 (bitconvert (i32 (relaxed_load<atomic_load_32> (am_indexed32 GPR64sp:$Rn,
-                                                      uimm12s8:$offset))))),
+def : Pat<(f32 (bitconvert (i32 (relaxed_load<atomic_load_nonext_32>
+               (am_indexed32 GPR64sp:$Rn, uimm12s8:$offset))))),
           (LDRSui GPR64sp:$Rn, uimm12s8:$offset)>;
-def : Pat<(f32 (bitconvert (i32 (relaxed_load<atomic_load_32>
+def : Pat<(f32 (bitconvert (i32 (relaxed_load<atomic_load_nonext_32>
                (am_unscaled32 GPR64sp:$Rn, simm9:$offset))))),
           (LDURSi GPR64sp:$Rn, simm9:$offset)>;
 
 // FP 64-bit loads
-def : Pat<(f64 (bitconvert (i64 (relaxed_load<atomic_load_64> (ro_Windexed64 GPR64sp:$Rn, GPR32:$Rm,
-                                                       ro_Wextend64:$extend))))),
+def : Pat<(f64 (bitconvert (i64 (relaxed_load<atomic_load_nonext_64>
+               (ro_Windexed64 GPR64sp:$Rn, GPR32:$Rm, ro_Wextend64:$extend))))),
           (LDRDroW GPR64sp:$Rn, GPR32:$Rm, ro_Wextend64:$extend)>;
-def : Pat<(f64 (bitconvert (i64 (relaxed_load<atomic_load_64> (ro_Xindexed64 GPR64sp:$Rn, GPR64:$Rm,
-                                                       ro_Xextend64:$extend))))),
+def : Pat<(f64 (bitconvert (i64 (relaxed_load<atomic_load_nonext_64>
+               (ro_Xindexed64 GPR64sp:$Rn, GPR64:$Rm, ro_Xextend64:$extend))))),
           (LDRDroX GPR64sp:$Rn, GPR64:$Rm, ro_Xextend64:$extend)>;
-def : Pat<(f64 (bitconvert (i64 (relaxed_load<atomic_load_64> (am_indexed64 GPR64sp:$Rn,
-                                                      uimm12s8:$offset))))),
+def : Pat<(f64 (bitconvert (i64 (relaxed_load<atomic_load_nonext_64>
+               (am_indexed64 GPR64sp:$Rn, uimm12s8:$offset))))),
           (LDRDui GPR64sp:$Rn, uimm12s8:$offset)>;
-def : Pat<(f64 (bitconvert (i64 (relaxed_load<atomic_load_64>
+def : Pat<(f64 (bitconvert (i64 (relaxed_load<atomic_load_nonext_64>
                (am_unscaled64 GPR64sp:$Rn, simm9:$offset))))),
           (LDURDi GPR64sp:$Rn, simm9:$offset)>;
 
@@ -561,16 +565,16 @@ let Predicates = [HasLSFE] in {
 let Predicates = [HasRCPC3, HasNEON] in {
   // LDAP1 loads
   def : Pat<(vector_insert (v2i64 VecListOne128:$Rd),
-                (i64 (acquiring_load<atomic_load_64> GPR64sp:$Rn)), (i64 VectorIndexD:$idx)),
+                (i64 (acquiring_load<atomic_load_nonext_64> GPR64sp:$Rn)), (i64 VectorIndexD:$idx)),
             (LDAP1 VecListOne128:$Rd, VectorIndexD:$idx, GPR64sp:$Rn)>;
   def : Pat<(vector_insert (v2f64 VecListOne128:$Rd),
-                (f64 (bitconvert (i64 (acquiring_load<atomic_load_64> GPR64sp:$Rn)))), (i64 VectorIndexD:$idx)),
+                (f64 (bitconvert (i64 (acquiring_load<atomic_load_nonext_64> GPR64sp:$Rn)))), (i64 VectorIndexD:$idx)),
             (LDAP1 VecListOne128:$Rd, VectorIndexD:$idx, GPR64sp:$Rn)>;
   def : Pat<(v1i64 (scalar_to_vector
-                (i64 (acquiring_load<atomic_load_64> GPR64sp:$Rn)))),
+                (i64 (acquiring_load<atomic_load_nonext_64> GPR64sp:$Rn)))),
             (EXTRACT_SUBREG (LDAP1 (v2i64 (IMPLICIT_DEF)), (i64 0), GPR64sp:$Rn), dsub)>;
   def : Pat<(v1f64 (scalar_to_vector
-                (f64 (bitconvert (i64 (acquiring_load<atomic_load_64> GPR64sp:$Rn)))))),
+                (f64 (bitconvert (i64 (acquiring_load<atomic_load_nonext_64> GPR64sp:$Rn)))))),
             (EXTRACT_SUBREG (LDAP1 (v2f64 (IMPLICIT_DEF)), (i64 0), GPR64sp:$Rn), dsub)>;
 
   // STL1 stores
@@ -597,10 +601,10 @@ let Predicates = [HasRCPC_IMMO, UseLDAPUR] in {
   def : Pat<(acquiring_load<atomic_load_azext_16>
                (am_unscaled16 GPR64sp:$Rn, simm9:$offset)),
           (LDAPURHi GPR64sp:$Rn, simm9:$offset)>;
-  def : Pat<(acquiring_load<atomic_load_32>
+  def : Pat<(acquiring_load<atomic_load_nonext_32>
                (am_unscaled32 GPR64sp:$Rn, simm9:$offset)),
           (LDAPURi GPR64sp:$Rn, simm9:$offset)>;
-  def : Pat<(acquiring_load<atomic_load_64>
+  def : Pat<(acquiring_load<atomic_load_nonext_64>
                (am_unscaled64 GPR64sp:$Rn, simm9:$offset)),
           (LDAPURXi GPR64sp:$Rn, simm9:$offset)>;
 }
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td b/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
index 6cc76b44f1e14..78a92d85cfd8e 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
@@ -502,15 +502,15 @@ def zextloadi16_#as : PatFrag<(ops node:$ptr), (zextloadi16 node:$ptr)> {
   let IsLoad = 1;
 }
 
-def atomic_load_16_#as : PatFrag<(ops node:$ptr), (atomic_load_16 node:$ptr)> {
+def atomic_load_nonext_16_#as : PatFrag<(ops node:$ptr), (atomic_load_nonext_16 node:$ptr)> {
   let IsAtomic = 1;
 }
 
-def atomic_load_32_#as : PatFrag<(ops node:$ptr), (atomic_load_32 node:$ptr)> {
+def atomic_load_nonext_32_#as : PatFrag<(ops node:$ptr), (atomic_load_nonext_32 node:$ptr)> {
   let IsAtomic = 1;
 }
 
-def atomic_load_64_#as : PatFrag<(ops node:$ptr), (atomic_load_64 node:$ptr)> {
+def atomic_load_nonext_64_#as : PatFrag<(ops node:$ptr), (atomic_load_nonext_64 node:$ptr)> {
   let IsAtomic = 1;
 }
 
diff --git a/llvm/lib/Target/AMDGPU/BUFInstructions.td b/llvm/lib/Target/AMDGPU/BUFInstructions.td
index 7d64a3dd240c8..efcc81716a0f1 100644
--- a/llvm/lib/Target/AMDGPU/BUFInstructions.td
+++ b/llvm/lib/Target/AMDGPU/BUFInstructions.td
@@ -959,7 +959,7 @@ defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_USHORT", i32, atomic_load_aext_16_glo
 defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_USHORT", i32, atomic_load_zext_16_global>;
 defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_UBYTE", i16, atomic_load_aext_8_global>;
 defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_UBYTE", i16, atomic_load_zext_8_global>;
-defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_USHORT", i16, atomic_load_16_global>;
+defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_USHORT", i16, atomic_load_nonext_16_global>;
 defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_UBYTE", i32, extloadi8_global>;
 defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_UBYTE", i32, zextloadi8_global>;
 defm : MUBUF_Pseudo_Load_Pats<"BUFFER_LOAD_SBYTE", i32, sextloadi8_global>;
@@ -1933,8 +1933,8 @@ def : MUBUFLoad_PatternADDR64 <BUFFER_LOAD_SSHORT_ADDR64, i32, sextloadi16_const
 def : MUBUFLoad_PatternADDR64 <BUFFER_LOAD_USHORT_ADDR64, i32, extloadi16_constant>;
 def : MUBUFLoad_PatternADDR64 <BUFFER_LOAD_USHORT_ADDR64, i32, zextloadi16_constant>;
 
-defm : MUBUFLoad_Atomic_Pattern <BUFFER_LOAD_DWORD_ADDR64, BUFFER_LOAD_DWORD_OFFSET, i32, atomic_load_32_global>;
-defm : MUBUFLoad_Atomic_Pattern <BUFFER_LOAD_DWORDX2_ADDR64, BUFFER_LOAD_DWORDX2_OFFSET, i64, atomic_load_64_global>;
+defm : MUBUFLoad_Atomic_Pattern <BUFFER_LOAD_DWORD_ADDR64, BUFFER_LOAD_DWORD_OFFSET, i32, atomic_load_nonext_32_global>;
+defm : MUBUFLoad_Atomic_Pattern <BUFFER_LOAD_DWORDX2_ADDR64, BUFFER_LOAD_DWORDX2_OFFSET, i64, atomic_load_nonext_64_global>;
 } // End SubtargetPredicate = isGFX6GFX7
 
 multiclass MUBUFLoad_PatternOffset_Common <string Instr, ValueType vt,
diff --git a/llvm/lib/Target/AMDGPU/DSInstructions.td b/llvm/lib/Target/AMDGPU/DSInstructions.td
index 74884a2207079..604eb7f2c3878 100644
--- a/llvm/lib/Target/AMDGPU/DSInstructions.td
+++ b/llvm/lib/Target/AMDGPU/DSInstructions.td
@@ -859,12 +859,12 @@ defm : DSReadPat_t16 <DS_READ_U8, i16, "atomic_load_zext_8_local">;
 defm : DSReadPat_mc <DS_READ_U8, i32, "atomic_load_zext_8_local">;
 defm : DSReadPat_t16 <DS_READ_I8, i16, "atomic_load_sext_8_local">;
 defm : DSReadPat_mc <DS_READ_I8, i32, "atomic_load_sext_8_local">;
-defm : DSReadPat_t16 <DS_READ_U16, i16, "atomic_load_16_local">;
+defm : DSReadPat_t16 <DS_READ_U16, i16, "atomic_load_nonext_16_local">;
 defm : DSReadPat_mc <DS_READ_U16, i32, "atomic_load_aext_16_local">;
 defm : DSReadPat_mc <DS_READ_U16, i32, "atomic_load_zext_16_local">;
 defm : DSReadPat_mc <DS_READ_I16, i32, "atomic_load_sext_16_local">;
-defm : DSReadPat_mc <DS_READ_B32, i32, "atomic_load_32_local">;
-defm : DSReadPat_mc <DS_READ_B64, i64, "atomic_load_64_local">;
+defm : DSReadPat_mc <DS_READ_B32, i32, "atomic_load_nonext_32_local">;
+defm : DSReadPat_mc <DS_READ_B64, i64, "atomic_load_nonext_64_local">;
 
 let OtherPredicates = [D16PreservesUnusedBits] in {
 // TODO: Atomic loads
diff --git a/llvm/lib/Target/AMDGPU/FLATInstructions.td b/llvm/lib/Target/AMDGPU/FLATInstructions.td
index d8bb6e4378924..c17fda1346115 100644
--- a/llvm/lib/Target/AMDGPU/FLATInstructions.td
+++ b/llvm/lib/Target/AMDGPU/FLATInstructions.td
@@ -1541,7 +1541,7 @@ def : FlatLoadPat <FLAT_LOAD_UBYTE, atomic_load_aext_8_flat, i16>;
 def : FlatLoadPat <FLAT_LOAD_UBYTE, atomic_load_zext_8_flat, i32>;
 def : FlatLoadPat <FLAT_LOAD_UBYTE, atomic_load_zext_8_flat, i16>;
 def : FlatLoadPat <FLAT_LOAD_USHORT, atomic_load_aext_16_flat, i32>;
-def : FlatLoadPat <FLAT_LOAD_USHORT, atomic_load_16_flat, i16>;
+def : FlatLoadPat <FLAT_LOAD_USHORT, atomic_load_nonext_16_flat, i16>;
 def : FlatLoadPat <FLAT_LOAD_USHORT, atomic_load_zext_16_flat, i32>;
 def : FlatLoadPat <FLAT_LOAD_UBYTE, extloadi8_flat, i32>;
 def : FlatLoadPat <FLAT_LOAD_UBYTE, zextloadi8_flat, i32>;
@@ -1573,8 +1573,8 @@ let OtherPredicates = [D16PreservesUnusedBits, HasFlatAddressSpace], True16Predi
   def : FlatStorePat <FLAT_STORE_SHORT_t16, store_flat, i16>;
 } // End let OtherPredicates = [D16PreservesUnusedBits, HasFlatAddressSpace], True16Predicate = UseRealTrue16Insts
 
-def : FlatLoadPat <FLAT_LOAD_DWORD, atomic_load_32_flat, i32>;
-def : FlatLoadPat <FLAT_LOAD_DWORDX2, atomic_load_64_flat, i64>;
+def : FlatLoadPat <FLAT_LOAD_DWORD, atomic_load_nonext_32_flat, i32>;
+def : FlatLoadPat <FLAT_LOAD_DWORDX2, atomic_load_nonext_64_flat, i64>;
 
 def : FlatStorePat <FLAT_STORE_BYTE, truncstorei8_flat, i32>;
 def : FlatStorePat <FLAT_STORE_SHORT, truncstorei16_flat, i32>;
@@ -1682,7 +1682,7 @@ defm : GlobalFLATLoadPats <GLOBAL_LOAD_UBYTE, atomic_load_aext_8_global, i16>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_UBYTE, atomic_load_zext_8_global, i32>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_UBYTE, atomic_load_zext_8_global, i16>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_aext_16_global, i32>;
-defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_16_global, i16>;
+defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_nonext_16_global, i16>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_zext_16_global, i32>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_zext_16_global, i16>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_SBYTE, atomic_load_sext_8_global, i32>;
@@ -1733,8 +1733,8 @@ defm : GlobalFLATStorePats <GLOBAL_STORE_DWORDX4, store_global, vt>;
 // There is no distinction for atomic load lowering during selection;
 // the memory legalizer will set the cache bits and insert the
 // appropriate waits.
-defm : GlobalFLATLoadPats <GLOBAL_LOAD_DWORD, atomic_load_32_global, i32>;
-defm : GlobalFLATLoadPats <GLOBAL_LOAD_DWORDX2, atomic_load_64_global, i64>;
+defm : GlobalFLATLoadPats <GLOBAL_LOAD_DWORD, atomic_load_nonext_32_global, i32>;
+defm : GlobalFLATLoadPats <GLOBAL_LOAD_DWORDX2, atomic_load_nonext_64_global, i64>;
 
 defm : GlobalFLATStorePats <GLOBAL_STORE_BYTE, truncstorei8_global, i32>;
 defm : GlobalFLATStorePats <GLOBAL_STORE_SHORT, truncstorei16_global, i32>;
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.td b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
index ec1fd6fb60d57..5d837d853ac98 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
@@ -361,6 +361,12 @@ def load_glue : PatFrag <(ops node:$ptr), (unindexedload_glue node:$ptr)> {
   let IsNonExtLoad = 1;
 }
 
+def atomic_load_nonext_glue :
+  PatFrag<(ops node:$ptr), (AMDGPUatomic_ld_glue node:$ptr)> {
+  let IsAtomic = true; // FIXME: Should be IsLoad and/or IsAtomic?
+  let IsNonExtLoad = true;
+}
+
 def atomic_load_zext_glue :
   PatFrag<(ops node:$ptr), (AMDGPUatomic_ld_glue node:$ptr)> {
   let IsAtomic = true; // FIXME: Should be IsLoad and/or IsAtomic?
@@ -379,20 +385,20 @@ def atomic_load_aext_glue :
   let IsAnyExtLoad = true;
 }
 
-def atomic_load_16_glue : PatFrag<(ops node:$ptr),
-  (AMDGPUatomic_ld_glue node:$ptr)> {
+def atomic_load_nonext_16_glue : PatFrag<(ops node:$ptr),
+  (atomic_load_nonext_glue node:$ptr)> {
   let IsAtomic = 1;
   let MemoryVT = i16;
 }
 
-def atomic_load_32_glue : PatFrag<(ops node:$ptr),
-  (AMDGPUatomic_ld_glue node:$ptr)> {
+def atomic_load_nonext_32_glue : PatFrag<(ops node:$ptr),
+  (atomic_load_nonext_glue node:$ptr)> {
   let IsAtomic = 1;
   let MemoryVT = i32;
 }
 
-def atomic_load_64_glue : PatFrag<(ops node:$ptr),
-  (AMDGPUatomic_ld_glue node:$ptr)> {
+def atomic_load_nonext_64_glue : PatFrag<(ops node:$ptr),
+  (atomic_load_nonext_glue node:$ptr)> {
   let IsAtomic = 1;
   let MemoryVT = i64;
 }
@@ -506,12 +512,12 @@ def load_align16_local_m0 : PatFrag<(ops node:$ptr),
 }
 
 let IsAtomic = 1, AddressSpaces = LoadAddress_local.AddrSpaces in {
-def atomic_load_16_local_m0 : PatFrag<(ops node:$ptr),
-                                      (atomic_load_16_glue node:$ptr)>;
-def atomic_load_32_local_m0 : PatFrag<(ops node:$ptr),
-                                      (atomic_load_32_glue node:$ptr)>;
-def atomic_load_64_local_m0 : PatFrag<(ops node:$ptr),
-                                       (atomic_load_64_glue node:$ptr)>;
+def atomic_load_nonext_16_local_m0 : PatFrag<(ops node:$ptr),
+                                      (atomic_load_nonext_16_glue node:$ptr)>;
+def atomic_load_nonext_32_local_m0 : PatFrag<(ops node:$ptr),
+                                      (atomic_load_nonext_32_glue node:$ptr)>;
+def atomic_load_nonext_64_local_m0 : PatFrag<(ops node:$ptr),
+                                       (atomic_load_nonext_64_glue node:$ptr)>;
 
 def atomic_load_zext_8_local_m0 : PatFrag<(ops node:$ptr),
                                       (atomic_load_zext_8_glue node:$ptr)>;
diff --git a/llvm/lib/Target/ARM/ARMInstrInfo.td b/llvm/lib/Target/ARM/ARMInstrInfo.td
index 1ce9190a68f3c..c682f597401ec 100644
--- a/llvm/lib/Target/ARM/ARMInstrInfo.td
+++ b/llvm/lib/Target/ARM/ARMInstrInfo.td
@@ -5384,7 +5384,7 @@ class acquiring_load<PatFrags base>
 
 def atomic_load_azext_acquire_8  : acquiring_load<atomic_load_azext_8>;
 def atomic_load_azext_acquire_16 : acquiring_load<atomic_load_azext_16>;
-def atomic_load_acquire_32 : acquiring_load<atomic_load_32>;
+def atomic_load_nonext_acquire_32 : acquiring_load<atomic_load_nonext_32>;
 
 class releasing_store<PatFrag base>
   : PatFrag<(ops node:$ptr, node:$val), (base node:$val, node:$ptr), [{
@@ -5399,7 +5399,7 @@ def atomic_store_release_32 : releasing_store<atomic_store_32>;
 let AddedComplexity = 8 in {
   def : ARMPat<(atomic_load_azext_acquire_8 addr_offset_none:$addr),  (LDAB addr_offset_none:$addr)>;
   def : ARMPat<(atomic_load_azext_acquire_16 addr_offset_none:$addr), (LDAH addr_offset_none:$addr)>;
-  def : ARMPat<(atomic_load_acquire_32 addr_offset_none:$addr), (LDA  addr_offset_none:$addr)>;
+  def : ARMPat<(atomic_load_nonext_acquire_32 addr_offset_none:$addr), (LDA  addr_offset_none:$addr)>;
   def : ARMPat<(atomic_store_release_8 addr_offset_none:$addr, GPR:$val),  (STLB GPR:$val, addr_offset_none:$addr)>;
   def : ARMPat<(atomic_store_release_16 addr_offset_none:$addr, GPR:$val), (STLH GPR:$val, addr_offset_none:$addr)>;
   def : ARMPat<(atomic_store_release_32 addr_offset_none:$addr, GPR:$val), (STL  GPR:$val, addr_offset_none:$addr)>;
@@ -6220,9 +6220,9 @@ def : ARMPat<(atomic_load_azext_8 addrmode_imm12:$src),
              (LDRBi12 addrmode_imm12:$src)>;
 def : ARMPat<(atomic_load_azext_1...
[truncated]

@llvmbot
Member

llvmbot commented Apr 26, 2025

@llvm/pr-subscribers-backend-risc-v

@llvmbot
Member

llvmbot commented Apr 26, 2025

@llvm/pr-subscribers-backend-aarch64

-defm : DSReadPat_mc <DS_READ_B32, i32, "atomic_load_32_local">;
-defm : DSReadPat_mc <DS_READ_B64, i64, "atomic_load_64_local">;
+defm : DSReadPat_mc <DS_READ_B32, i32, "atomic_load_nonext_32_local">;
+defm : DSReadPat_mc <DS_READ_B64, i64, "atomic_load_nonext_64_local">;
 
 let OtherPredicates = [D16PreservesUnusedBits] in {
 // TODO: Atomic loads
diff --git a/llvm/lib/Target/AMDGPU/FLATInstructions.td b/llvm/lib/Target/AMDGPU/FLATInstructions.td
index d8bb6e4378924..c17fda1346115 100644
--- a/llvm/lib/Target/AMDGPU/FLATInstructions.td
+++ b/llvm/lib/Target/AMDGPU/FLATInstructions.td
@@ -1541,7 +1541,7 @@ def : FlatLoadPat <FLAT_LOAD_UBYTE, atomic_load_aext_8_flat, i16>;
 def : FlatLoadPat <FLAT_LOAD_UBYTE, atomic_load_zext_8_flat, i32>;
 def : FlatLoadPat <FLAT_LOAD_UBYTE, atomic_load_zext_8_flat, i16>;
 def : FlatLoadPat <FLAT_LOAD_USHORT, atomic_load_aext_16_flat, i32>;
-def : FlatLoadPat <FLAT_LOAD_USHORT, atomic_load_16_flat, i16>;
+def : FlatLoadPat <FLAT_LOAD_USHORT, atomic_load_nonext_16_flat, i16>;
 def : FlatLoadPat <FLAT_LOAD_USHORT, atomic_load_zext_16_flat, i32>;
 def : FlatLoadPat <FLAT_LOAD_UBYTE, extloadi8_flat, i32>;
 def : FlatLoadPat <FLAT_LOAD_UBYTE, zextloadi8_flat, i32>;
@@ -1573,8 +1573,8 @@ let OtherPredicates = [D16PreservesUnusedBits, HasFlatAddressSpace], True16Predi
   def : FlatStorePat <FLAT_STORE_SHORT_t16, store_flat, i16>;
 } // End let OtherPredicates = [D16PreservesUnusedBits, HasFlatAddressSpace], True16Predicate = UseRealTrue16Insts
 
-def : FlatLoadPat <FLAT_LOAD_DWORD, atomic_load_32_flat, i32>;
-def : FlatLoadPat <FLAT_LOAD_DWORDX2, atomic_load_64_flat, i64>;
+def : FlatLoadPat <FLAT_LOAD_DWORD, atomic_load_nonext_32_flat, i32>;
+def : FlatLoadPat <FLAT_LOAD_DWORDX2, atomic_load_nonext_64_flat, i64>;
 
 def : FlatStorePat <FLAT_STORE_BYTE, truncstorei8_flat, i32>;
 def : FlatStorePat <FLAT_STORE_SHORT, truncstorei16_flat, i32>;
@@ -1682,7 +1682,7 @@ defm : GlobalFLATLoadPats <GLOBAL_LOAD_UBYTE, atomic_load_aext_8_global, i16>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_UBYTE, atomic_load_zext_8_global, i32>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_UBYTE, atomic_load_zext_8_global, i16>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_aext_16_global, i32>;
-defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_16_global, i16>;
+defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_nonext_16_global, i16>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_zext_16_global, i32>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_USHORT, atomic_load_zext_16_global, i16>;
 defm : GlobalFLATLoadPats <GLOBAL_LOAD_SBYTE, atomic_load_sext_8_global, i32>;
@@ -1733,8 +1733,8 @@ defm : GlobalFLATStorePats <GLOBAL_STORE_DWORDX4, store_global, vt>;
 // There is no distinction for atomic load lowering during selection;
 // the memory legalizer will set the cache bits and insert the
 // appropriate waits.
-defm : GlobalFLATLoadPats <GLOBAL_LOAD_DWORD, atomic_load_32_global, i32>;
-defm : GlobalFLATLoadPats <GLOBAL_LOAD_DWORDX2, atomic_load_64_global, i64>;
+defm : GlobalFLATLoadPats <GLOBAL_LOAD_DWORD, atomic_load_nonext_32_global, i32>;
+defm : GlobalFLATLoadPats <GLOBAL_LOAD_DWORDX2, atomic_load_nonext_64_global, i64>;
 
 defm : GlobalFLATStorePats <GLOBAL_STORE_BYTE, truncstorei8_global, i32>;
 defm : GlobalFLATStorePats <GLOBAL_STORE_SHORT, truncstorei16_global, i32>;
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.td b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
index ec1fd6fb60d57..5d837d853ac98 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
@@ -361,6 +361,12 @@ def load_glue : PatFrag <(ops node:$ptr), (unindexedload_glue node:$ptr)> {
   let IsNonExtLoad = 1;
 }
 
+def atomic_load_nonext_glue :
+  PatFrag<(ops node:$ptr), (AMDGPUatomic_ld_glue node:$ptr)> {
+  let IsAtomic = true; // FIXME: Should be IsLoad and/or IsAtomic?
+  let IsNonExtLoad = true;
+}
+
 def atomic_load_zext_glue :
   PatFrag<(ops node:$ptr), (AMDGPUatomic_ld_glue node:$ptr)> {
   let IsAtomic = true; // FIXME: Should be IsLoad and/or IsAtomic?
@@ -379,20 +385,20 @@ def atomic_load_aext_glue :
   let IsAnyExtLoad = true;
 }
 
-def atomic_load_16_glue : PatFrag<(ops node:$ptr),
-  (AMDGPUatomic_ld_glue node:$ptr)> {
+def atomic_load_nonext_16_glue : PatFrag<(ops node:$ptr),
+  (atomic_load_nonext_glue node:$ptr)> {
   let IsAtomic = 1;
   let MemoryVT = i16;
 }
 
-def atomic_load_32_glue : PatFrag<(ops node:$ptr),
-  (AMDGPUatomic_ld_glue node:$ptr)> {
+def atomic_load_nonext_32_glue : PatFrag<(ops node:$ptr),
+  (atomic_load_nonext_glue node:$ptr)> {
   let IsAtomic = 1;
   let MemoryVT = i32;
 }
 
-def atomic_load_64_glue : PatFrag<(ops node:$ptr),
-  (AMDGPUatomic_ld_glue node:$ptr)> {
+def atomic_load_nonext_64_glue : PatFrag<(ops node:$ptr),
+  (atomic_load_nonext_glue node:$ptr)> {
   let IsAtomic = 1;
   let MemoryVT = i64;
 }
@@ -506,12 +512,12 @@ def load_align16_local_m0 : PatFrag<(ops node:$ptr),
 }
 
 let IsAtomic = 1, AddressSpaces = LoadAddress_local.AddrSpaces in {
-def atomic_load_16_local_m0 : PatFrag<(ops node:$ptr),
-                                      (atomic_load_16_glue node:$ptr)>;
-def atomic_load_32_local_m0 : PatFrag<(ops node:$ptr),
-                                      (atomic_load_32_glue node:$ptr)>;
-def atomic_load_64_local_m0 : PatFrag<(ops node:$ptr),
-                                       (atomic_load_64_glue node:$ptr)>;
+def atomic_load_nonext_16_local_m0 : PatFrag<(ops node:$ptr),
+                                      (atomic_load_nonext_16_glue node:$ptr)>;
+def atomic_load_nonext_32_local_m0 : PatFrag<(ops node:$ptr),
+                                      (atomic_load_nonext_32_glue node:$ptr)>;
+def atomic_load_nonext_64_local_m0 : PatFrag<(ops node:$ptr),
+                                       (atomic_load_nonext_64_glue node:$ptr)>;
 
 def atomic_load_zext_8_local_m0 : PatFrag<(ops node:$ptr),
                                       (atomic_load_zext_8_glue node:$ptr)>;
diff --git a/llvm/lib/Target/ARM/ARMInstrInfo.td b/llvm/lib/Target/ARM/ARMInstrInfo.td
index 1ce9190a68f3c..c682f597401ec 100644
--- a/llvm/lib/Target/ARM/ARMInstrInfo.td
+++ b/llvm/lib/Target/ARM/ARMInstrInfo.td
@@ -5384,7 +5384,7 @@ class acquiring_load<PatFrags base>
 
 def atomic_load_azext_acquire_8  : acquiring_load<atomic_load_azext_8>;
 def atomic_load_azext_acquire_16 : acquiring_load<atomic_load_azext_16>;
-def atomic_load_acquire_32 : acquiring_load<atomic_load_32>;
+def atomic_load_nonext_acquire_32 : acquiring_load<atomic_load_nonext_32>;
 
 class releasing_store<PatFrag base>
   : PatFrag<(ops node:$ptr, node:$val), (base node:$val, node:$ptr), [{
@@ -5399,7 +5399,7 @@ def atomic_store_release_32 : releasing_store<atomic_store_32>;
 let AddedComplexity = 8 in {
   def : ARMPat<(atomic_load_azext_acquire_8 addr_offset_none:$addr),  (LDAB addr_offset_none:$addr)>;
   def : ARMPat<(atomic_load_azext_acquire_16 addr_offset_none:$addr), (LDAH addr_offset_none:$addr)>;
-  def : ARMPat<(atomic_load_acquire_32 addr_offset_none:$addr), (LDA  addr_offset_none:$addr)>;
+  def : ARMPat<(atomic_load_nonext_acquire_32 addr_offset_none:$addr), (LDA  addr_offset_none:$addr)>;
   def : ARMPat<(atomic_store_release_8 addr_offset_none:$addr, GPR:$val),  (STLB GPR:$val, addr_offset_none:$addr)>;
   def : ARMPat<(atomic_store_release_16 addr_offset_none:$addr, GPR:$val), (STLH GPR:$val, addr_offset_none:$addr)>;
   def : ARMPat<(atomic_store_release_32 addr_offset_none:$addr, GPR:$val), (STL  GPR:$val, addr_offset_none:$addr)>;
@@ -6220,9 +6220,9 @@ def : ARMPat<(atomic_load_azext_8 addrmode_imm12:$src),
              (LDRBi12 addrmode_imm12:$src)>;
 def : ARMPat<(atomic_load_azext_1...
[truncated]

@llvmbot
Member

llvmbot commented Apr 26, 2025

@llvm/pr-subscribers-backend-powerpc


@s-barannikov
Contributor

s-barannikov commented Apr 26, 2025

Will they be reintroduced with new semantics or why remove them?

@topperc
Collaborator Author

topperc commented Apr 26, 2025

Will they be reintroduced with new semantics or why remove them?

I don't plan to reintroduce them. I guess I can leave them as dead code to avoid breaking out of tree targets?

@s-barannikov
Contributor

This is more a question about #137401, which I slept through. Why do we need to spell out _nonext? It seems to be implied by non-atomic loads. Are atomic loads somewhat different in this regard?

@topperc
Collaborator Author

topperc commented Apr 26, 2025

This is more a question about #137401, which I slept through. Why do we need to spell out _nonext? It seems to be implied by non-atomic loads. Are atomic loads somewhat different in this regard?

If I add the IsNonExtLoad flag to atomic_load_8/16/32/64 it will cause "cannot select" failures for out of tree targets that haven't implemented the equivalent of #137279. Spelling it out explicitly and deleting the old names gives them build-time failures instead. I figured a build-time failure was preferable to a test failure.
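
For illustration, a minimal sketch of the difference (MYTGT_LW and the GPR register class are hypothetical stand-ins for an out of tree target, not code from this patch):

// Out-of-tree selection pattern still written against the old fragment name.
def : Pat<(i32 (atomic_load_32 GPR:$addr)),
          (MYTGT_LW GPR:$addr)>;

Once atomic_load_32 is deleted, TableGen rejects this with an undefined-identifier error at build time. Keeping the old name but adding IsNonExtLoad to it would instead let this build and only break later, as a "cannot select" failure during instruction selection.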

@s-barannikov
Copy link
Contributor

I see, thanks.

Contributor

@s-barannikov s-barannikov left a comment

LGTM

@@ -502,15 +502,15 @@ def zextloadi16_#as : PatFrag<(ops node:$ptr), (zextloadi16 node:$ptr)> {
   let IsLoad = 1;
 }
 
-def atomic_load_16_#as : PatFrag<(ops node:$ptr), (atomic_load_16 node:$ptr)> {
+def atomic_load_nonext_16_#as : PatFrag<(ops node:$ptr), (atomic_load_nonext_16 node:$ptr)> {
Contributor

Could leave the old name (atomic_load_16_#as) if we ever change atomic_load_nonext_16 back to atomic_load_16.
Not a strong objection.
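
A sketch of what that could look like (not part of this patch): keep the address-space wrapper under its old name, inside the same foreach over address spaces as the surrounding code, and point it at the renamed base fragment.

// Old name preserved as a thin alias of the renamed fragment:
def atomic_load_16_#as : PatFrag<(ops node:$ptr),
                                 (atomic_load_nonext_16 node:$ptr)> {
  let IsAtomic = 1;
}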

Contributor

That would be better, but it requires touching more code at once.

@topperc topperc merged commit ca21508 into llvm:main Apr 28, 2025
22 checks passed
@topperc topperc deleted the pr/atomic-load-migrate branch April 28, 2025 16:26
@llvm-ci
Collaborator

llvm-ci commented Apr 28, 2025

LLVM Buildbot has detected a new failure on builder lldb-arm-ubuntu running on linaro-lldb-arm-ubuntu while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/15181

Here is the relevant piece of the build log for reference:
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: python_api/watchpoint/watchlocation/TestSetWatchlocation.py (1143 of 3001)
PASS: lldb-api :: symbol_ondemand/breakpoint_source_regex/TestSourceTextRegexBreakpoint.py (1144 of 3001)
PASS: lldb-api :: source-manager/TestSourceManager.py (1145 of 3001)
PASS: lldb-api :: symbol_ondemand/shared_library/TestSharedLibOnDemand.py (1146 of 3001)
PASS: lldb-api :: terminal/TestSTTYBeforeAndAfter.py (1147 of 3001)
PASS: lldb-api :: test_utils/TestDecorators.py (1148 of 3001)
PASS: lldb-api :: test_utils/TestInlineTest.py (1149 of 3001)
PASS: lldb-api :: test_utils/TestPExpectTest.py (1150 of 3001)
PASS: lldb-api :: test_utils/base/TestBaseTest.py (1151 of 3001)
PASS: lldb-api :: python_api/watchpoint/watchlocation/TestTargetWatchAddress.py (1152 of 3001)
FAIL: lldb-api :: tools/lldb-dap/attach/TestDAP_attachByPortNum.py (1153 of 3001)
******************** TEST 'lldb-api :: tools/lldb-dap/attach/TestDAP_attachByPortNum.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --arch armv8l --build-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/attach -p TestDAP_attachByPortNum.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision ca21508080031c3eda1c6085f1de9cc26be4c336)
  clang revision ca21508080031c3eda1c6085f1de9cc26be4c336
  llvm revision ca21508080031c3eda1c6085f1de9cc26be4c336
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
Usage:
  /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/lldb-server v[ersion]
  /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/lldb-server g[dbserver] [options]
  /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/lldb-server p[latform] [options]
Invoke subcommand for additional help
========= DEBUG ADAPTER PROTOCOL LOGS =========
1745858796.915384054 --> (stdin/stdout) {"command":"initialize","type":"request","arguments":{"adapterID":"lldb-native","clientID":"vscode","columnsStartAt1":true,"linesStartAt1":true,"locale":"en-us","pathFormat":"path","supportsRunInTerminalRequest":true,"supportsVariablePaging":true,"supportsVariableType":true,"supportsStartDebuggingRequest":true,"supportsProgressReporting":true,"$__lldb_sourceInitFile":true},"seq":1}
1745858796.920063972 <-- (stdin/stdout) {"body":{"$__lldb_version":"lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision ca21508080031c3eda1c6085f1de9cc26be4c336)\n  clang revision ca21508080031c3eda1c6085f1de9cc26be4c336\n  llvm revision ca21508080031c3eda1c6085f1de9cc26be4c336","completionTriggerCharacters":["."," ","\t"],"exceptionBreakpointFilters":[{"default":false,"filter":"cpp_catch","label":"C++ Catch"},{"default":false,"filter":"cpp_throw","label":"C++ Throw"},{"default":false,"filter":"objc_catch","label":"Objective-C Catch"},{"default":false,"filter":"objc_throw","label":"Objective-C Throw"}],"supportTerminateDebuggee":true,"supportsBreakpointLocationsRequest":true,"supportsCancelRequest":true,"supportsCompletionsRequest":true,"supportsConditionalBreakpoints":true,"supportsConfigurationDoneRequest":true,"supportsDataBreakpoints":true,"supportsDelayedStackTraceLoading":true,"supportsDisassembleRequest":true,"supportsEvaluateForHovers":true,"supportsExceptionInfoRequest":true,"supportsExceptionOptions":true,"supportsFunctionBreakpoints":true,"supportsHitConditionalBreakpoints":true,"supportsInstructionBreakpoints":true,"supportsLogPoints":true,"supportsModulesRequest":true,"supportsReadMemoryRequest":true,"supportsRestartRequest":true,"supportsSetVariable":true,"supportsStepInTargetsRequest":true,"supportsSteppingGranularity":true,"supportsValueFormattingOptions":true},"command":"initialize","request_seq":1,"seq":0,"success":true,"type":"response"}
1745858796.920762300 --> (stdin/stdout) {"command":"attach","type":"request","arguments":{"program":"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/attach/TestDAP_attachByPortNum.test_by_illegal_port/a.out","initCommands":["settings clear -all","settings set symbols.enable-external-lookup false","settings set target.inherit-tcc true","settings set target.disable-aslr false","settings set target.detach-on-error false","settings set target.auto-apply-fixits false","settings set plugin.process.gdb-remote.packet-timeout 60","settings set symbols.clang-modules-cache-path \"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api\"","settings set use-color false","settings set show-statusline false"],"gdb-remote-port":65536},"seq":2}
1745858796.921314478 <-- (stdin/stdout) {"body":{"category":"console","output":"Running initCommands:\n"},"event":"output","seq":0,"type":"event"}
1745858796.921362877 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings clear -all\n"},"event":"output","seq":0,"type":"event"}
1745858796.921376467 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set symbols.enable-external-lookup false\n"},"event":"output","seq":0,"type":"event"}
1745858796.921388865 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.inherit-tcc true\n"},"event":"output","seq":0,"type":"event"}
1745858796.921400785 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.disable-aslr false\n"},"event":"output","seq":0,"type":"event"}
1745858796.921413183 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.detach-on-error false\n"},"event":"output","seq":0,"type":"event"}
1745858796.921424389 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.auto-apply-fixits false\n"},"event":"output","seq":0,"type":"event"}
1745858796.921435595 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set plugin.process.gdb-remote.packet-timeout 60\n"},"event":"output","seq":0,"type":"event"}
1745858796.921470881 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set symbols.clang-modules-cache-path \"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api\"\n"},"event":"output","seq":0,"type":"event"}
1745858796.921483278 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set use-color false\n"},"event":"output","seq":0,"type":"event"}
1745858796.921496153 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set show-statusline false\n"},"event":"output","seq":0,"type":"event"}
1745858801.836555481 <-- (stdin/stdout) {"command":"attach","message":"invalid host:port specification: 'localhost:65536'","request_seq":2,"seq":0,"success":false,"type":"response"}

IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…/16/32/64. NFC (llvm#137428)

This makes them more consistent with the checks performed by regular loads. We can't simply add IsNonExtLoad to the existing atomic_load_8/16/32/64 as that would affect out of tree targets.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
…/16/32/64. NFC (llvm#137428)

This makes them more consistent with the checks performed by regular loads. We can't simply add IsNonExtLoad to the existing atomic_load_8/16/32/64 as that would affect out of tree targets.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
…/16/32/64. NFC (llvm#137428)

This makes them more consistent with the checks performed by regular loads. We can't simply add IsNonExtLoad to the existing atomic_load_8/16/32/64 as that would affect out of tree targets.