Skip to content

[LoongArch] Support LA V1.1 feature ld-seq-sa that don't generate dbar 0x700. #116762

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 22, 2024

Conversation

tangaac
Copy link
Contributor

@tangaac tangaac commented Nov 19, 2024

Two options for clang
-mld-seq-sa: Do not generate load-load barrier instructions (dbar 0x700)
-mno-ld-seq-sa: Generate load-load barrier instructions (dbar 0x700)
The default is -mno-ld-seq-sa

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" backend:loongarch labels Nov 19, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: None (tangaac)

Changes

Two options for clang
-mld-seq-sa: Do not generate load-load barrier instructions (dbar 0x700)
-mno-ld-seq-sa: Generate load-load barrier instructions (dbar 0x700)
The default is -mno-ld-seq-sa


Patch is 76.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116762.diff

13 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+4)
  • (modified) clang/lib/Basic/Targets/LoongArch.cpp (+6-1)
  • (modified) clang/lib/Basic/Targets/LoongArch.h (+2)
  • (modified) clang/lib/Driver/ToolChains/Arch/LoongArch.cpp (+9)
  • (modified) clang/test/Driver/loongarch-march.c (+4-4)
  • (added) clang/test/Driver/loongarch-mld-seq-sa.c (+30)
  • (modified) clang/test/Preprocessor/init-loongarch.c (+22-8)
  • (modified) llvm/include/llvm/TargetParser/LoongArchTargetParser.def (+2-1)
  • (modified) llvm/include/llvm/TargetParser/LoongArchTargetParser.h (+4)
  • (modified) llvm/lib/Target/LoongArch/LoongArch.td (+5)
  • (modified) llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp (+3-1)
  • (modified) llvm/lib/TargetParser/LoongArchTargetParser.cpp (+1)
  • (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll (+1145)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index d7230dd7272fd6..3ebb682397a0b6 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5416,6 +5416,10 @@ def mlam_bh : Flag<["-"], "mlam-bh">, Group<m_loongarch_Features_Group>,
   HelpText<"Enable amswap[_db].{b/h} and amadd[_db].{b/h}">;
 def mno_lam_bh : Flag<["-"], "mno-lam-bh">, Group<m_loongarch_Features_Group>,
   HelpText<"Disable amswap[_db].{b/h} and amadd[_db].{b/h}">;
+def mld_seq_sa : Flag<["-"], "mld-seq-sa">, Group<m_loongarch_Features_Group>,
+  HelpText<"Do not generate load-load barrier instructions (dbar 0x700)">;
+def mno_ld_seq_sa : Flag<["-"], "mno-ld-seq-sa">, Group<m_loongarch_Features_Group>,
+  HelpText<"Generate load-load barrier instructions (dbar 0x700)">;
 def mannotate_tablejump : Flag<["-"], "mannotate-tablejump">, Group<m_loongarch_Features_Group>,
   HelpText<"Enable annotate table jump instruction to correlate it with the jump table.">;
 def mno_annotate_tablejump : Flag<["-"], "mno-annotate-tablejump">, Group<m_loongarch_Features_Group>,
diff --git a/clang/lib/Basic/Targets/LoongArch.cpp b/clang/lib/Basic/Targets/LoongArch.cpp
index 07b22b35f603ce..3f2d7317532aaf 100644
--- a/clang/lib/Basic/Targets/LoongArch.cpp
+++ b/clang/lib/Basic/Targets/LoongArch.cpp
@@ -205,7 +205,7 @@ void LoongArchTargetInfo::getTargetDefines(const LangOptions &Opts,
       // TODO: As more features of the V1.1 ISA are supported, a unified "v1.1"
       // arch feature set will be used to include all sub-features belonging to
       // the V1.1 ISA version.
-      if (HasFeatureFrecipe && HasFeatureLAM_BH)
+      if (HasFeatureFrecipe && HasFeatureLAM_BH && HasFeatureLD_SEQ_SA)
         Builder.defineMacro("__loongarch_arch",
                             Twine('"') + "la64v1.1" + Twine('"'));
       else
@@ -239,6 +239,9 @@ void LoongArchTargetInfo::getTargetDefines(const LangOptions &Opts,
   if (HasFeatureLAM_BH)
     Builder.defineMacro("__loongarch_lam_bh", Twine(1));
 
+  if (HasFeatureLD_SEQ_SA)
+    Builder.defineMacro("__loongarch_ld_seq_sa", Twine(1));
+
   StringRef ABI = getABI();
   if (ABI == "lp64d" || ABI == "lp64f" || ABI == "lp64s")
     Builder.defineMacro("__loongarch_lp64");
@@ -317,6 +320,8 @@ bool LoongArchTargetInfo::handleTargetFeatures(
       HasFeatureFrecipe = true;
     else if (Feature == "+lam-bh")
       HasFeatureLAM_BH = true;
+    else if (Feature == "+ld-seq-sa")
+      HasFeatureLD_SEQ_SA = true;
   }
   return true;
 }
diff --git a/clang/lib/Basic/Targets/LoongArch.h b/clang/lib/Basic/Targets/LoongArch.h
index 3585e9f7968b4b..e5eae7a8fcf677 100644
--- a/clang/lib/Basic/Targets/LoongArch.h
+++ b/clang/lib/Basic/Targets/LoongArch.h
@@ -31,6 +31,7 @@ class LLVM_LIBRARY_VISIBILITY LoongArchTargetInfo : public TargetInfo {
   bool HasFeatureLASX;
   bool HasFeatureFrecipe;
   bool HasFeatureLAM_BH;
+  bool HasFeatureLD_SEQ_SA;
 
 public:
   LoongArchTargetInfo(const llvm::Triple &Triple, const TargetOptions &)
@@ -41,6 +42,7 @@ class LLVM_LIBRARY_VISIBILITY LoongArchTargetInfo : public TargetInfo {
     HasFeatureLASX = false;
     HasFeatureFrecipe = false;
     HasFeatureLAM_BH = false;
+    HasFeatureLD_SEQ_SA = false;
     LongDoubleWidth = 128;
     LongDoubleAlign = 128;
     LongDoubleFormat = &llvm::APFloat::IEEEquad();
diff --git a/clang/lib/Driver/ToolChains/Arch/LoongArch.cpp b/clang/lib/Driver/ToolChains/Arch/LoongArch.cpp
index 987db4638fca88..67b71a3ec623e4 100644
--- a/clang/lib/Driver/ToolChains/Arch/LoongArch.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/LoongArch.cpp
@@ -274,6 +274,15 @@ void loongarch::getLoongArchTargetFeatures(const Driver &D,
     else
       Features.push_back("-lam-bh");
   }
+
+  // Select ld-seq-sa feature determined by -m[no-]ld-seq-sa.
+  if (const Arg *A = Args.getLastArg(options::OPT_mld_seq_sa,
+                                     options::OPT_mno_ld_seq_sa)) {
+    if (A->getOption().matches(options::OPT_mld_seq_sa))
+      Features.push_back("+ld-seq-sa");
+    else
+      Features.push_back("-ld-seq-sa");
+  }
 }
 
 std::string loongarch::postProcessTargetCPUString(const std::string &CPU,
diff --git a/clang/test/Driver/loongarch-march.c b/clang/test/Driver/loongarch-march.c
index d4cd5b07ae905f..c7091336f3bc80 100644
--- a/clang/test/Driver/loongarch-march.c
+++ b/clang/test/Driver/loongarch-march.c
@@ -39,21 +39,21 @@
 
 // CC1-LA64V1P1: "-target-cpu" "loongarch64"
 // CC1-LA64V1P1-NOT: "-target-feature"
-// CC1-LA64V1P1: "-target-feature" "+64bit" "-target-feature" "+d" "-target-feature" "+lsx" "-target-feature" "+ual" "-target-feature" "+frecipe" "-target-feature" "+lam-bh"
+// CC1-LA64V1P1: "-target-feature" "+64bit" "-target-feature" "+d" "-target-feature" "+lsx" "-target-feature" "+ual" "-target-feature" "+frecipe" "-target-feature" "+lam-bh" "-target-feature" "+ld-seq-sa"
 // CC1-LA64V1P1-NOT: "-target-feature"
 // CC1-LA64V1P1: "-target-abi" "lp64d"
 
 // CC1-LA664: "-target-cpu" "la664"
 // CC1-LA664-NOT: "-target-feature"
-// CC1-LA664: "-target-feature" "+64bit" "-target-feature" "+f" "-target-feature" "+d" "-target-feature" "+lsx" "-target-feature" "+lasx" "-target-feature" "+ual" "-target-feature" "+frecipe" "-target-feature" "+lam-bh"
+// CC1-LA664: "-target-feature" "+64bit" "-target-feature" "+f" "-target-feature" "+d" "-target-feature" "+lsx" "-target-feature" "+lasx" "-target-feature" "+ual" "-target-feature" "+frecipe" "-target-feature" "+lam-bh" "-target-feature" "+ld-seq-sa"
 // CC1-LA664-NOT: "-target-feature"
 // CC1-LA664: "-target-abi" "lp64d"
 
 // IR-LOONGARCH64: attributes #[[#]] ={{.*}}"target-cpu"="loongarch64" {{.*}}"target-features"="+64bit,+d,+f,+ual"
 // IR-LA464: attributes #[[#]] ={{.*}}"target-cpu"="la464" {{.*}}"target-features"="+64bit,+d,+f,+lasx,+lsx,+ual"
 // IR-LA64V1P0: attributes #[[#]] ={{.*}}"target-cpu"="loongarch64" {{.*}}"target-features"="+64bit,+d,+lsx,+ual"
-// IR-LA64V1P1: attributes #[[#]] ={{.*}}"target-cpu"="loongarch64" {{.*}}"target-features"="+64bit,+d,+frecipe,+lam-bh,+lsx,+ual"
-// IR-LA664: attributes #[[#]] ={{.*}}"target-cpu"="la664" {{.*}}"target-features"="+64bit,+d,+f,+frecipe,+lam-bh,+lasx,+lsx,+ual"
+// IR-LA64V1P1: attributes #[[#]] ={{.*}}"target-cpu"="loongarch64" {{.*}}"target-features"="+64bit,+d,+frecipe,+lam-bh,+ld-seq-sa,+lsx,+ual"
+// IR-LA664: attributes #[[#]] ={{.*}}"target-cpu"="la664" {{.*}}"target-features"="+64bit,+d,+f,+frecipe,+lam-bh,+lasx,+ld-seq-sa,+lsx,+ual"
 
 int foo(void) {
   return 3;
diff --git a/clang/test/Driver/loongarch-mld-seq-sa.c b/clang/test/Driver/loongarch-mld-seq-sa.c
new file mode 100644
index 00000000000000..3d1d90d3f9cf72
--- /dev/null
+++ b/clang/test/Driver/loongarch-mld-seq-sa.c
@@ -0,0 +1,30 @@
+/// Test -m[no]ld-seq-sa options.
+
+// RUN: %clang --target=loongarch64 -mld-seq-sa -fsyntax-only %s -### 2>&1 | \
+// RUN:     FileCheck %s --check-prefix=CC1-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mno-ld-seq-sa -fsyntax-only %s -### 2>&1 | \
+// RUN:     FileCheck %s --check-prefix=CC1-NO-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mno-ld-seq-sa -mld-seq-sa -fsyntax-only %s -### 2>&1 | \
+// RUN:     FileCheck %s --check-prefix=CC1-ld-seq-sa
+// RUN: %clang --target=loongarch64  -mld-seq-sa -mno-ld-seq-sa -fsyntax-only %s -### 2>&1 | \
+// RUN:     FileCheck %s --check-prefix=CC1-NO-ld-seq-sa
+
+// RUN: %clang --target=loongarch64 -mld-seq-sa -S -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefix=IR-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mno-ld-seq-sa -S -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefix=IR-NO-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mno-ld-seq-sa -mld-seq-sa -S -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefix=IR-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mld-seq-sa -mno-ld-seq-sa -S -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefix=IR-NO-ld-seq-sa
+
+
+// CC1-ld-seq-sa: "-target-feature" "+ld-seq-sa"
+// CC1-NO-ld-seq-sa: "-target-feature" "-ld-seq-sa"
+
+// IR-ld-seq-sa: attributes #[[#]] ={{.*}}"target-features"="{{(.*,)?}}+ld-seq-sa{{(,.*)?}}"
+// IR-NO-ld-seq-sa: attributes #[[#]] ={{.*}}"target-features"="{{(.*,)?}}-ld-seq-sa{{(,.*)?}}"
+
+int foo(void) {
+  return 42;
+}
diff --git a/clang/test/Preprocessor/init-loongarch.c b/clang/test/Preprocessor/init-loongarch.c
index 8019292e0f10e0..f34a152a69f5c9 100644
--- a/clang/test/Preprocessor/init-loongarch.c
+++ b/clang/test/Preprocessor/init-loongarch.c
@@ -798,7 +798,7 @@
 // LA64-FPU0-LP64S-NOT: #define __loongarch_single_float
 // LA64-FPU0-LP64S: #define __loongarch_soft_float 1
 
-/// Check __loongarch_arch{_tune/_frecipe/_lam_bh}.
+/// Check __loongarch_arch{_tune/_frecipe/_lam_bh/_ld_seq_sa}.
 
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - | \
 // RUN:   FileCheck --match-full-lines --check-prefix=ARCH-TUNE -DARCH=la64v1.0 -DTUNE=loongarch64 %s
@@ -823,11 +823,11 @@
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lsx | \
 // RUN:   FileCheck --match-full-lines --check-prefix=ARCH-TUNE -DARCH=la64v1.0 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 | \
-// RUN:   FileCheck --match-full-lines  --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=la64v1.1 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines  --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH,LD-SEQ-SA -DARCH=la64v1.1 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 -Xclang -target-feature -Xclang -frecipe | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH,LD-SEQ-SA -DARCH=la64v1.0 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 -Xclang -target-feature -Xclang -lsx | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=loongarch64 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH,LD-SEQ-SA -DARCH=loongarch64 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +frecipe | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE -DARCH=loongarch64 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lsx -Xclang -target-feature -Xclang +frecipe | \
@@ -835,25 +835,39 @@
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.0 -Xclang -target-feature -Xclang +lam-bh | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH -DARCH=la64v1.0 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 -Xclang -target-feature -Xclang -lam-bh | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LD-SEQ-SA -DARCH=la64v1.0 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lam-bh | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH -DARCH=loongarch64 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lsx -Xclang -target-feature -Xclang +lam-bh | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH -DARCH=la64v1.0 -DTUNE=loongarch64 %s
-// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.0 -Xclang -target-feature -Xclang +frecipe -Xclang -target-feature -Xclang +lam-bh | \
+
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.0 -Xclang -target-feature -Xclang +ld-seq-sa | \
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LD-SEQ-SA -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 -Xclang -target-feature -Xclang -ld-seq-sa | \
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +ld-seq-sa | \
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LD-SEQ-SA -DARCH=loongarch64 -DTUNE=loongarch64 %s
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lsx -Xclang -target-feature -Xclang +ld-seq-sa | \
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LD-SEQ-SA -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+
+
+
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.0 -Xclang -target-feature -Xclang +frecipe -Xclang -target-feature -Xclang +lam-bh  -Xclang -target-feature -Xclang +ld-seq-sa | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE -DARCH=la64v1.1 -DTUNE=loongarch64 %s
+
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la664 | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=la664 -DTUNE=la664 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH,LD-SEQ-SA -DARCH=la664 -DTUNE=la664 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -mtune=la664 | \
 // RUN:   FileCheck --match-full-lines --check-prefix=ARCH-TUNE -DARCH=la64v1.0 -DTUNE=la664 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -mtune=la664 | \
 // RUN:   FileCheck --match-full-lines --check-prefix=ARCH-TUNE -DARCH=loongarch64 -DTUNE=la664 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la664 -mtune=loongarch64 | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=la664 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH,LD-SEQ-SA -DARCH=la664 -DTUNE=loongarch64 %s
 
 // ARCH-TUNE: #define __loongarch_arch "[[ARCH]]"
 // FRECIPE: #define __loongarch_frecipe 1
 // LAM-BH: #define __loongarch_lam_bh 1
+// LD-SEQ-SA: #define __loongarch_ld_seq_sa 1
 // ARCH-TUNE: #define __loongarch_tune "[[TUNE]]"
 
 // RUN: %clang --target=loongarch64 -mlsx -x c -E -dM %s -o - \
diff --git a/llvm/include/llvm/TargetParser/LoongArchTargetParser.def b/llvm/include/llvm/TargetParser/LoongArchTargetParser.def
index 6cd2018b7b59cb..324d5c18e6dea3 100644
--- a/llvm/include/llvm/TargetParser/LoongArchTargetParser.def
+++ b/llvm/include/llvm/TargetParser/LoongArchTargetParser.def
@@ -12,6 +12,7 @@ LOONGARCH_FEATURE("+lvz", FK_LVZ)
 LOONGARCH_FEATURE("+ual", FK_UAL)
 LOONGARCH_FEATURE("+frecipe", FK_FRECIPE)
 LOONGARCH_FEATURE("+lam-bh", FK_LAM_BH)
+LOONGARCH_FEATURE("+ld-seq-sa", FK_LD_SEQ_SA)
 
 #undef LOONGARCH_FEATURE
 
@@ -21,6 +22,6 @@ LOONGARCH_FEATURE("+lam-bh", FK_LAM_BH)
 
 LOONGARCH_ARCH("loongarch64", AK_LOONGARCH64, FK_64BIT | FK_FP32 | FK_FP64 | FK_UAL)
 LOONGARCH_ARCH("la464", AK_LA464, FK_64BIT | FK_FP32 | FK_FP64 | FK_LSX | FK_LASX | FK_UAL)
-LOONGARCH_ARCH("la664", AK_LA664, FK_64BIT | FK_FP32 | FK_FP64 | FK_LSX | FK_LASX | FK_UAL | FK_FRECIPE | FK_LAM_BH)
+LOONGARCH_ARCH("la664", AK_LA664, FK_64BIT | FK_FP32 | FK_FP64 | FK_LSX | FK_LASX | FK_UAL | FK_FRECIPE | FK_LAM_BH | FK_LD_SEQ_SA)
 
 #undef LOONGARCH_ARCH
diff --git a/llvm/include/llvm/TargetParser/LoongArchTargetParser.h b/llvm/include/llvm/TargetParser/LoongArchTargetParser.h
index b5be03b1b67fbb..00957b84ab576c 100644
--- a/llvm/include/llvm/TargetParser/LoongArchTargetParser.h
+++ b/llvm/include/llvm/TargetParser/LoongArchTargetParser.h
@@ -53,6 +53,10 @@ enum FeatureKind : uint32_t {
   // Atomic memory swap and add instructions for byte and half word are
   // available.
   FK_LAM_BH = 1 << 10,
+
+  // Do not generate load-load barrier instructions (dbar 0x700).
+  FK_LD_SEQ_SA = 1 << 12,
+
 };
 
 struct FeatureInfo {
diff --git a/llvm/lib/Target/LoongArch/LoongArch.td b/llvm/lib/Target/LoongArch/LoongArch.td
index ecd00cd6d5d619..100bdba36c440c 100644
--- a/llvm/lib/Target/LoongArch/LoongArch.td
+++ b/llvm/lib/Target/LoongArch/LoongArch.td
@@ -118,6 +118,11 @@ def FeatureLAM_BH
                         "Support amswap[_db].{b/h} and amadd[_db].{b/h} instructions.">;
 def HasLAM_BH : Predicate<"Subtarget->hasLAM_BH()">;
 
+def FeatureLD_SEQ_SA
+    : SubtargetFeature<"ld-seq-sa", "HasLD_SEQ_SA", "true",
+                        "Don't use load-load barrier (dbar 0x700).">;
+def HasLD_SEQ_SA : Predicate<"Subtarget->hasLD_SEQ_SA()">;
+
 def TunePreferWInst
     : SubtargetFeature<"prefer-w-inst", "PreferWInst", "true",
                        "Prefer instructions with W suffix">;
diff --git a/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp b/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
index 18a532b55ee5a9..35f84425cb0eba 100644
--- a/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
@@ -588,7 +588,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicCmpXchg(
 
   // .tail:
   //   dbar 0x700 | acquire
-  BuildMI(TailMBB, DL, TII->get(LoongArch::DBAR)).addImm(hint);
+
+  if (!(hint == 0x700 && MF->getSubtarget<LoongArchSubtarget>().hasLD_SEQ_SA()))
+    BuildMI(TailMBB, DL, TII->get(LoongArch::DBAR)).addImm(hint);
 
   NextMBBI = MBB.end();
   MI.eraseFromParent();
diff --git a/llvm/lib/TargetParser/LoongArchTargetParser.cpp b/llvm/lib/TargetParser/LoongArchTargetParser.cpp
index 27e3b5683c5a6e..9b8407a73bea3f 100644
--- a/llvm/lib/TargetParser/LoongArchTargetParser.cpp
+++ b/llvm/lib/TargetParser/LoongArchTargetParser.cpp
@@ -53,6 +53,7 @@ bool LoongArch::getArchFeatures(StringRef Arch,
     if (Arch == "la64v1.1") {
       Features.push_back("+frecipe");
       Features.push_back("+lam-bh");
+      Features.push_back("+ld-seq-sa");
     }
     return true;
   }
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
index ad98397dfe8f02..b457beb47eab99 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
+; RUN: llc --mtriple=loongarch64 -mattr=+d,+ld-seq-sa < %s | FileCheck %s --check-prefix=LA64-LD-SEQ-SA
 
 define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; LA64-LABEL: cmpxchg_i8_acquire_acquire:
@@ -26,6 +27,55 @@ define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; LA64-NEXT:    dbar 20
 ; LA64-NEXT:  .LBB0_4:
 ; LA64-NEXT:    ret
+;
+; LA64-LD-SEQ-SA-LABEL: cmpxchg_i8_acquire_acquire:
+; LA64-LD-SEQ-SA:       # %bb.0:
+; LA64-LD-SEQ-SA-NEXT:    slli.d $a3, $a0, 3
+; LA64-LD-SEQ-SA-NEXT:    bstrins.d $a0, $zero, 1, 0
+; LA64-LD-SEQ-SA-NEXT:    ori $a4, $zero, 255
+; LA64-LD-SEQ-SA-NEXT:    sll.w $a4, $a4, $a3
+; LA64-LD-SEQ-SA-NEXT:    andi $a1, $a1, 255
+; LA64-LD-SEQ-SA-NEXT:    sll.w $a1, $a1, $a3
+; LA64-LD-SEQ-SA-NEXT:    andi $a2, $a2, 255
+; LA64-LD-SEQ-SA-NEXT:    sll.w $a2, $a2, $a3
+; LA64-LD-SEQ-SA-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; LA64-LD-SEQ-SA-NEXT:    ll.w $a3, $a0, 0
+; LA64-LD-SEQ-SA-NEXT:    and $a5, $a3, $a4
+; LA64-LD-SEQ-SA-NEXT:    bne $a5, $a1, .LBB0_3
+; LA64-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; LA64-LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
+; LA64-LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
+; LA64-LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
+; LA64-LD-SEQ-SA-NEXT:    beqz $a5, .LBB0_1
+; LA64-LD-SEQ-SA-NEXT:    b .LBB0_4
+; LA64-LD-SEQ-SA-NEXT:  .LBB0_3:
+; LA64-LD-SEQ-SA-NEXT:    dbar 20
+; LA64-LD-SEQ-SA-NEXT:  .LBB0_4:
+; LA64-LD-SEQ-SA-NEXT:    ret
+; LA64-LD-SEQ_SA-LABEL: cmpxchg_i8_acquire_acquire:
+; LA64-LD-SEQ_SA:       # %bb.0:
+; LA64-LD-SEQ_SA-NEXT:    slli.d $a3, $a0, 3
+; LA64-LD-SEQ_SA-NEXT:    bstrins.d $a0, $zero, 1, 0
+; LA64-LD-SEQ_SA-NEXT:    ori $a4, $zero, 255
+; LA64-LD-SEQ_SA-NEXT:    sll.w $a4, $a4, $a3
+; LA64-LD-SEQ_SA-NEXT:    andi $a1, $a1, 255
+; LA64-LD-SEQ_SA-NEXT:    sll.w $a1, $a1, $a3
+;...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-backend-loongarch

Author: None (tangaac)

Changes

Two options for clang
-mld-seq-sa: Do not generate load-load barrier instructions (dbar 0x700)
-mno-ld-seq-sa: Generate load-load barrier instructions (dbar 0x700)
The default is -mno-ld-seq-sa


Patch is 76.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116762.diff

13 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+4)
  • (modified) clang/lib/Basic/Targets/LoongArch.cpp (+6-1)
  • (modified) clang/lib/Basic/Targets/LoongArch.h (+2)
  • (modified) clang/lib/Driver/ToolChains/Arch/LoongArch.cpp (+9)
  • (modified) clang/test/Driver/loongarch-march.c (+4-4)
  • (added) clang/test/Driver/loongarch-mld-seq-sa.c (+30)
  • (modified) clang/test/Preprocessor/init-loongarch.c (+22-8)
  • (modified) llvm/include/llvm/TargetParser/LoongArchTargetParser.def (+2-1)
  • (modified) llvm/include/llvm/TargetParser/LoongArchTargetParser.h (+4)
  • (modified) llvm/lib/Target/LoongArch/LoongArch.td (+5)
  • (modified) llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp (+3-1)
  • (modified) llvm/lib/TargetParser/LoongArchTargetParser.cpp (+1)
  • (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll (+1145)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index d7230dd7272fd6..3ebb682397a0b6 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5416,6 +5416,10 @@ def mlam_bh : Flag<["-"], "mlam-bh">, Group<m_loongarch_Features_Group>,
   HelpText<"Enable amswap[_db].{b/h} and amadd[_db].{b/h}">;
 def mno_lam_bh : Flag<["-"], "mno-lam-bh">, Group<m_loongarch_Features_Group>,
   HelpText<"Disable amswap[_db].{b/h} and amadd[_db].{b/h}">;
+def mld_seq_sa : Flag<["-"], "mld-seq-sa">, Group<m_loongarch_Features_Group>,
+  HelpText<"Do not generate load-load barrier instructions (dbar 0x700)">;
+def mno_ld_seq_sa : Flag<["-"], "mno-ld-seq-sa">, Group<m_loongarch_Features_Group>,
+  HelpText<"Generate load-load barrier instructions (dbar 0x700)">;
 def mannotate_tablejump : Flag<["-"], "mannotate-tablejump">, Group<m_loongarch_Features_Group>,
   HelpText<"Enable annotate table jump instruction to correlate it with the jump table.">;
 def mno_annotate_tablejump : Flag<["-"], "mno-annotate-tablejump">, Group<m_loongarch_Features_Group>,
diff --git a/clang/lib/Basic/Targets/LoongArch.cpp b/clang/lib/Basic/Targets/LoongArch.cpp
index 07b22b35f603ce..3f2d7317532aaf 100644
--- a/clang/lib/Basic/Targets/LoongArch.cpp
+++ b/clang/lib/Basic/Targets/LoongArch.cpp
@@ -205,7 +205,7 @@ void LoongArchTargetInfo::getTargetDefines(const LangOptions &Opts,
       // TODO: As more features of the V1.1 ISA are supported, a unified "v1.1"
       // arch feature set will be used to include all sub-features belonging to
       // the V1.1 ISA version.
-      if (HasFeatureFrecipe && HasFeatureLAM_BH)
+      if (HasFeatureFrecipe && HasFeatureLAM_BH && HasFeatureLD_SEQ_SA)
         Builder.defineMacro("__loongarch_arch",
                             Twine('"') + "la64v1.1" + Twine('"'));
       else
@@ -239,6 +239,9 @@ void LoongArchTargetInfo::getTargetDefines(const LangOptions &Opts,
   if (HasFeatureLAM_BH)
     Builder.defineMacro("__loongarch_lam_bh", Twine(1));
 
+  if (HasFeatureLD_SEQ_SA)
+    Builder.defineMacro("__loongarch_ld_seq_sa", Twine(1));
+
   StringRef ABI = getABI();
   if (ABI == "lp64d" || ABI == "lp64f" || ABI == "lp64s")
     Builder.defineMacro("__loongarch_lp64");
@@ -317,6 +320,8 @@ bool LoongArchTargetInfo::handleTargetFeatures(
       HasFeatureFrecipe = true;
     else if (Feature == "+lam-bh")
       HasFeatureLAM_BH = true;
+    else if (Feature == "+ld-seq-sa")
+      HasFeatureLD_SEQ_SA = true;
   }
   return true;
 }
diff --git a/clang/lib/Basic/Targets/LoongArch.h b/clang/lib/Basic/Targets/LoongArch.h
index 3585e9f7968b4b..e5eae7a8fcf677 100644
--- a/clang/lib/Basic/Targets/LoongArch.h
+++ b/clang/lib/Basic/Targets/LoongArch.h
@@ -31,6 +31,7 @@ class LLVM_LIBRARY_VISIBILITY LoongArchTargetInfo : public TargetInfo {
   bool HasFeatureLASX;
   bool HasFeatureFrecipe;
   bool HasFeatureLAM_BH;
+  bool HasFeatureLD_SEQ_SA;
 
 public:
   LoongArchTargetInfo(const llvm::Triple &Triple, const TargetOptions &)
@@ -41,6 +42,7 @@ class LLVM_LIBRARY_VISIBILITY LoongArchTargetInfo : public TargetInfo {
     HasFeatureLASX = false;
     HasFeatureFrecipe = false;
     HasFeatureLAM_BH = false;
+    HasFeatureLD_SEQ_SA = false;
     LongDoubleWidth = 128;
     LongDoubleAlign = 128;
     LongDoubleFormat = &llvm::APFloat::IEEEquad();
diff --git a/clang/lib/Driver/ToolChains/Arch/LoongArch.cpp b/clang/lib/Driver/ToolChains/Arch/LoongArch.cpp
index 987db4638fca88..67b71a3ec623e4 100644
--- a/clang/lib/Driver/ToolChains/Arch/LoongArch.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/LoongArch.cpp
@@ -274,6 +274,15 @@ void loongarch::getLoongArchTargetFeatures(const Driver &D,
     else
       Features.push_back("-lam-bh");
   }
+
+  // Select ld-seq-sa feature determined by -m[no-]ld-seq-sa.
+  if (const Arg *A = Args.getLastArg(options::OPT_mld_seq_sa,
+                                     options::OPT_mno_ld_seq_sa)) {
+    if (A->getOption().matches(options::OPT_mld_seq_sa))
+      Features.push_back("+ld-seq-sa");
+    else
+      Features.push_back("-ld-seq-sa");
+  }
 }
 
 std::string loongarch::postProcessTargetCPUString(const std::string &CPU,
diff --git a/clang/test/Driver/loongarch-march.c b/clang/test/Driver/loongarch-march.c
index d4cd5b07ae905f..c7091336f3bc80 100644
--- a/clang/test/Driver/loongarch-march.c
+++ b/clang/test/Driver/loongarch-march.c
@@ -39,21 +39,21 @@
 
 // CC1-LA64V1P1: "-target-cpu" "loongarch64"
 // CC1-LA64V1P1-NOT: "-target-feature"
-// CC1-LA64V1P1: "-target-feature" "+64bit" "-target-feature" "+d" "-target-feature" "+lsx" "-target-feature" "+ual" "-target-feature" "+frecipe" "-target-feature" "+lam-bh"
+// CC1-LA64V1P1: "-target-feature" "+64bit" "-target-feature" "+d" "-target-feature" "+lsx" "-target-feature" "+ual" "-target-feature" "+frecipe" "-target-feature" "+lam-bh" "-target-feature" "+ld-seq-sa"
 // CC1-LA64V1P1-NOT: "-target-feature"
 // CC1-LA64V1P1: "-target-abi" "lp64d"
 
 // CC1-LA664: "-target-cpu" "la664"
 // CC1-LA664-NOT: "-target-feature"
-// CC1-LA664: "-target-feature" "+64bit" "-target-feature" "+f" "-target-feature" "+d" "-target-feature" "+lsx" "-target-feature" "+lasx" "-target-feature" "+ual" "-target-feature" "+frecipe" "-target-feature" "+lam-bh"
+// CC1-LA664: "-target-feature" "+64bit" "-target-feature" "+f" "-target-feature" "+d" "-target-feature" "+lsx" "-target-feature" "+lasx" "-target-feature" "+ual" "-target-feature" "+frecipe" "-target-feature" "+lam-bh" "-target-feature" "+ld-seq-sa"
 // CC1-LA664-NOT: "-target-feature"
 // CC1-LA664: "-target-abi" "lp64d"
 
 // IR-LOONGARCH64: attributes #[[#]] ={{.*}}"target-cpu"="loongarch64" {{.*}}"target-features"="+64bit,+d,+f,+ual"
 // IR-LA464: attributes #[[#]] ={{.*}}"target-cpu"="la464" {{.*}}"target-features"="+64bit,+d,+f,+lasx,+lsx,+ual"
 // IR-LA64V1P0: attributes #[[#]] ={{.*}}"target-cpu"="loongarch64" {{.*}}"target-features"="+64bit,+d,+lsx,+ual"
-// IR-LA64V1P1: attributes #[[#]] ={{.*}}"target-cpu"="loongarch64" {{.*}}"target-features"="+64bit,+d,+frecipe,+lam-bh,+lsx,+ual"
-// IR-LA664: attributes #[[#]] ={{.*}}"target-cpu"="la664" {{.*}}"target-features"="+64bit,+d,+f,+frecipe,+lam-bh,+lasx,+lsx,+ual"
+// IR-LA64V1P1: attributes #[[#]] ={{.*}}"target-cpu"="loongarch64" {{.*}}"target-features"="+64bit,+d,+frecipe,+lam-bh,+ld-seq-sa,+lsx,+ual"
+// IR-LA664: attributes #[[#]] ={{.*}}"target-cpu"="la664" {{.*}}"target-features"="+64bit,+d,+f,+frecipe,+lam-bh,+lasx,+ld-seq-sa,+lsx,+ual"
 
 int foo(void) {
   return 3;
diff --git a/clang/test/Driver/loongarch-mld-seq-sa.c b/clang/test/Driver/loongarch-mld-seq-sa.c
new file mode 100644
index 00000000000000..3d1d90d3f9cf72
--- /dev/null
+++ b/clang/test/Driver/loongarch-mld-seq-sa.c
@@ -0,0 +1,30 @@
+/// Test -m[no]ld-seq-sa options.
+
+// RUN: %clang --target=loongarch64 -mld-seq-sa -fsyntax-only %s -### 2>&1 | \
+// RUN:     FileCheck %s --check-prefix=CC1-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mno-ld-seq-sa -fsyntax-only %s -### 2>&1 | \
+// RUN:     FileCheck %s --check-prefix=CC1-NO-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mno-ld-seq-sa -mld-seq-sa -fsyntax-only %s -### 2>&1 | \
+// RUN:     FileCheck %s --check-prefix=CC1-ld-seq-sa
+// RUN: %clang --target=loongarch64  -mld-seq-sa -mno-ld-seq-sa -fsyntax-only %s -### 2>&1 | \
+// RUN:     FileCheck %s --check-prefix=CC1-NO-ld-seq-sa
+
+// RUN: %clang --target=loongarch64 -mld-seq-sa -S -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefix=IR-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mno-ld-seq-sa -S -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefix=IR-NO-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mno-ld-seq-sa -mld-seq-sa -S -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefix=IR-ld-seq-sa
+// RUN: %clang --target=loongarch64 -mld-seq-sa -mno-ld-seq-sa -S -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefix=IR-NO-ld-seq-sa
+
+
+// CC1-ld-seq-sa: "-target-feature" "+ld-seq-sa"
+// CC1-NO-ld-seq-sa: "-target-feature" "-ld-seq-sa"
+
+// IR-ld-seq-sa: attributes #[[#]] ={{.*}}"target-features"="{{(.*,)?}}+ld-seq-sa{{(,.*)?}}"
+// IR-NO-ld-seq-sa: attributes #[[#]] ={{.*}}"target-features"="{{(.*,)?}}-ld-seq-sa{{(,.*)?}}"
+
+int foo(void) {
+  return 42;
+}
diff --git a/clang/test/Preprocessor/init-loongarch.c b/clang/test/Preprocessor/init-loongarch.c
index 8019292e0f10e0..f34a152a69f5c9 100644
--- a/clang/test/Preprocessor/init-loongarch.c
+++ b/clang/test/Preprocessor/init-loongarch.c
@@ -798,7 +798,7 @@
 // LA64-FPU0-LP64S-NOT: #define __loongarch_single_float
 // LA64-FPU0-LP64S: #define __loongarch_soft_float 1
 
-/// Check __loongarch_arch{_tune/_frecipe/_lam_bh}.
+/// Check __loongarch_arch{_tune/_frecipe/_lam_bh/_ld_seq_sa}.
 
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - | \
 // RUN:   FileCheck --match-full-lines --check-prefix=ARCH-TUNE -DARCH=la64v1.0 -DTUNE=loongarch64 %s
@@ -823,11 +823,11 @@
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lsx | \
 // RUN:   FileCheck --match-full-lines --check-prefix=ARCH-TUNE -DARCH=la64v1.0 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 | \
-// RUN:   FileCheck --match-full-lines  --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=la64v1.1 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines  --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH,LD-SEQ-SA -DARCH=la64v1.1 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 -Xclang -target-feature -Xclang -frecipe | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH,LD-SEQ-SA -DARCH=la64v1.0 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 -Xclang -target-feature -Xclang -lsx | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=loongarch64 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH,LD-SEQ-SA -DARCH=loongarch64 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +frecipe | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE -DARCH=loongarch64 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lsx -Xclang -target-feature -Xclang +frecipe | \
@@ -835,25 +835,39 @@
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.0 -Xclang -target-feature -Xclang +lam-bh | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH -DARCH=la64v1.0 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 -Xclang -target-feature -Xclang -lam-bh | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LD-SEQ-SA -DARCH=la64v1.0 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lam-bh | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH -DARCH=loongarch64 -DTUNE=loongarch64 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lsx -Xclang -target-feature -Xclang +lam-bh | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LAM-BH -DARCH=la64v1.0 -DTUNE=loongarch64 %s
-// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.0 -Xclang -target-feature -Xclang +frecipe -Xclang -target-feature -Xclang +lam-bh | \
+
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.0 -Xclang -target-feature -Xclang +ld-seq-sa | \
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LD-SEQ-SA -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.1 -Xclang -target-feature -Xclang -ld-seq-sa | \
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +ld-seq-sa | \
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LD-SEQ-SA -DARCH=loongarch64 -DTUNE=loongarch64 %s
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -Xclang -target-feature -Xclang +lsx -Xclang -target-feature -Xclang +ld-seq-sa | \
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,LD-SEQ-SA -DARCH=la64v1.0 -DTUNE=loongarch64 %s
+
+
+
+// RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la64v1.0 -Xclang -target-feature -Xclang +frecipe -Xclang -target-feature -Xclang +lam-bh  -Xclang -target-feature -Xclang +ld-seq-sa | \
 // RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE -DARCH=la64v1.1 -DTUNE=loongarch64 %s
+
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la664 | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=la664 -DTUNE=la664 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH,LD-SEQ-SA -DARCH=la664 -DTUNE=la664 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -mtune=la664 | \
 // RUN:   FileCheck --match-full-lines --check-prefix=ARCH-TUNE -DARCH=la64v1.0 -DTUNE=la664 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=loongarch64 -mtune=la664 | \
 // RUN:   FileCheck --match-full-lines --check-prefix=ARCH-TUNE -DARCH=loongarch64 -DTUNE=la664 %s
 // RUN: %clang --target=loongarch64 -x c -E -dM %s -o - -march=la664 -mtune=loongarch64 | \
-// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH -DARCH=la664 -DTUNE=loongarch64 %s
+// RUN:   FileCheck --match-full-lines --check-prefixes=ARCH-TUNE,FRECIPE,LAM-BH,LD-SEQ-SA -DARCH=la664 -DTUNE=loongarch64 %s
 
 // ARCH-TUNE: #define __loongarch_arch "[[ARCH]]"
 // FRECIPE: #define __loongarch_frecipe 1
 // LAM-BH: #define __loongarch_lam_bh 1
+// LD-SEQ-SA: #define __loongarch_ld_seq_sa 1
 // ARCH-TUNE: #define __loongarch_tune "[[TUNE]]"
 
 // RUN: %clang --target=loongarch64 -mlsx -x c -E -dM %s -o - \
diff --git a/llvm/include/llvm/TargetParser/LoongArchTargetParser.def b/llvm/include/llvm/TargetParser/LoongArchTargetParser.def
index 6cd2018b7b59cb..324d5c18e6dea3 100644
--- a/llvm/include/llvm/TargetParser/LoongArchTargetParser.def
+++ b/llvm/include/llvm/TargetParser/LoongArchTargetParser.def
@@ -12,6 +12,7 @@ LOONGARCH_FEATURE("+lvz", FK_LVZ)
 LOONGARCH_FEATURE("+ual", FK_UAL)
 LOONGARCH_FEATURE("+frecipe", FK_FRECIPE)
 LOONGARCH_FEATURE("+lam-bh", FK_LAM_BH)
+LOONGARCH_FEATURE("+ld-seq-sa", FK_LD_SEQ_SA)
 
 #undef LOONGARCH_FEATURE
 
@@ -21,6 +22,6 @@ LOONGARCH_FEATURE("+lam-bh", FK_LAM_BH)
 
 LOONGARCH_ARCH("loongarch64", AK_LOONGARCH64, FK_64BIT | FK_FP32 | FK_FP64 | FK_UAL)
 LOONGARCH_ARCH("la464", AK_LA464, FK_64BIT | FK_FP32 | FK_FP64 | FK_LSX | FK_LASX | FK_UAL)
-LOONGARCH_ARCH("la664", AK_LA664, FK_64BIT | FK_FP32 | FK_FP64 | FK_LSX | FK_LASX | FK_UAL | FK_FRECIPE | FK_LAM_BH)
+LOONGARCH_ARCH("la664", AK_LA664, FK_64BIT | FK_FP32 | FK_FP64 | FK_LSX | FK_LASX | FK_UAL | FK_FRECIPE | FK_LAM_BH | FK_LD_SEQ_SA)
 
 #undef LOONGARCH_ARCH
diff --git a/llvm/include/llvm/TargetParser/LoongArchTargetParser.h b/llvm/include/llvm/TargetParser/LoongArchTargetParser.h
index b5be03b1b67fbb..00957b84ab576c 100644
--- a/llvm/include/llvm/TargetParser/LoongArchTargetParser.h
+++ b/llvm/include/llvm/TargetParser/LoongArchTargetParser.h
@@ -53,6 +53,10 @@ enum FeatureKind : uint32_t {
   // Atomic memory swap and add instructions for byte and half word are
   // available.
   FK_LAM_BH = 1 << 10,
+
+  // Do not generate load-load barrier instructions (dbar 0x700).
+  FK_LD_SEQ_SA = 1 << 12,
+
 };
 
 struct FeatureInfo {
diff --git a/llvm/lib/Target/LoongArch/LoongArch.td b/llvm/lib/Target/LoongArch/LoongArch.td
index ecd00cd6d5d619..100bdba36c440c 100644
--- a/llvm/lib/Target/LoongArch/LoongArch.td
+++ b/llvm/lib/Target/LoongArch/LoongArch.td
@@ -118,6 +118,11 @@ def FeatureLAM_BH
                         "Support amswap[_db].{b/h} and amadd[_db].{b/h} instructions.">;
 def HasLAM_BH : Predicate<"Subtarget->hasLAM_BH()">;
 
+def FeatureLD_SEQ_SA
+    : SubtargetFeature<"ld-seq-sa", "HasLD_SEQ_SA", "true",
+                        "Don't use load-load barrier (dbar 0x700).">;
+def HasLD_SEQ_SA : Predicate<"Subtarget->hasLD_SEQ_SA()">;
+
 def TunePreferWInst
     : SubtargetFeature<"prefer-w-inst", "PreferWInst", "true",
                        "Prefer instructions with W suffix">;
diff --git a/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp b/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
index 18a532b55ee5a9..35f84425cb0eba 100644
--- a/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
@@ -588,7 +588,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicCmpXchg(
 
   // .tail:
   //   dbar 0x700 | acquire
-  BuildMI(TailMBB, DL, TII->get(LoongArch::DBAR)).addImm(hint);
+
+  if (!(hint == 0x700 && MF->getSubtarget<LoongArchSubtarget>().hasLD_SEQ_SA()))
+    BuildMI(TailMBB, DL, TII->get(LoongArch::DBAR)).addImm(hint);
 
   NextMBBI = MBB.end();
   MI.eraseFromParent();
diff --git a/llvm/lib/TargetParser/LoongArchTargetParser.cpp b/llvm/lib/TargetParser/LoongArchTargetParser.cpp
index 27e3b5683c5a6e..9b8407a73bea3f 100644
--- a/llvm/lib/TargetParser/LoongArchTargetParser.cpp
+++ b/llvm/lib/TargetParser/LoongArchTargetParser.cpp
@@ -53,6 +53,7 @@ bool LoongArch::getArchFeatures(StringRef Arch,
     if (Arch == "la64v1.1") {
       Features.push_back("+frecipe");
       Features.push_back("+lam-bh");
+      Features.push_back("+ld-seq-sa");
     }
     return true;
   }
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
index ad98397dfe8f02..b457beb47eab99 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
+; RUN: llc --mtriple=loongarch64 -mattr=+d,+ld-seq-sa < %s | FileCheck %s --check-prefix=LA64-LD-SEQ-SA
 
 define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; LA64-LABEL: cmpxchg_i8_acquire_acquire:
@@ -26,6 +27,55 @@ define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; LA64-NEXT:    dbar 20
 ; LA64-NEXT:  .LBB0_4:
 ; LA64-NEXT:    ret
+;
+; LA64-LD-SEQ-SA-LABEL: cmpxchg_i8_acquire_acquire:
+; LA64-LD-SEQ-SA:       # %bb.0:
+; LA64-LD-SEQ-SA-NEXT:    slli.d $a3, $a0, 3
+; LA64-LD-SEQ-SA-NEXT:    bstrins.d $a0, $zero, 1, 0
+; LA64-LD-SEQ-SA-NEXT:    ori $a4, $zero, 255
+; LA64-LD-SEQ-SA-NEXT:    sll.w $a4, $a4, $a3
+; LA64-LD-SEQ-SA-NEXT:    andi $a1, $a1, 255
+; LA64-LD-SEQ-SA-NEXT:    sll.w $a1, $a1, $a3
+; LA64-LD-SEQ-SA-NEXT:    andi $a2, $a2, 255
+; LA64-LD-SEQ-SA-NEXT:    sll.w $a2, $a2, $a3
+; LA64-LD-SEQ-SA-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; LA64-LD-SEQ-SA-NEXT:    ll.w $a3, $a0, 0
+; LA64-LD-SEQ-SA-NEXT:    and $a5, $a3, $a4
+; LA64-LD-SEQ-SA-NEXT:    bne $a5, $a1, .LBB0_3
+; LA64-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; LA64-LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
+; LA64-LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
+; LA64-LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
+; LA64-LD-SEQ-SA-NEXT:    beqz $a5, .LBB0_1
+; LA64-LD-SEQ-SA-NEXT:    b .LBB0_4
+; LA64-LD-SEQ-SA-NEXT:  .LBB0_3:
+; LA64-LD-SEQ-SA-NEXT:    dbar 20
+; LA64-LD-SEQ-SA-NEXT:  .LBB0_4:
+; LA64-LD-SEQ-SA-NEXT:    ret
+; LA64-LD-SEQ_SA-LABEL: cmpxchg_i8_acquire_acquire:
+; LA64-LD-SEQ_SA:       # %bb.0:
+; LA64-LD-SEQ_SA-NEXT:    slli.d $a3, $a0, 3
+; LA64-LD-SEQ_SA-NEXT:    bstrins.d $a0, $zero, 1, 0
+; LA64-LD-SEQ_SA-NEXT:    ori $a4, $zero, 255
+; LA64-LD-SEQ_SA-NEXT:    sll.w $a4, $a4, $a3
+; LA64-LD-SEQ_SA-NEXT:    andi $a1, $a1, 255
+; LA64-LD-SEQ_SA-NEXT:    sll.w $a1, $a1, $a3
+;...
[truncated]

Copy link
Contributor

@SixWeining SixWeining left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. But we can also add this feature in sys::getHostCPUFeatures in this PR.

Copy link
Contributor

@SixWeining SixWeining left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except a nit.

@SixWeining SixWeining merged commit 1d46020 into llvm:main Nov 22, 2024
5 of 8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Nov 22, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building clang,llvm at step 10 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/8976

Here is the relevant piece of the build log for the reference
Step 10 (Add check check-offload) failure: 1200 seconds without output running [b'ninja', b'-j 32', b'check-offload'], attempting to kill
...
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug53727.cpp (947 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug47654.cpp (948 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug49779.cpp (949 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/test_libc.cpp (950 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/wtime.c (951 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu :: offloading/bug49021.cpp (952 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu :: offloading/std_complex_arithmetic.cpp (953 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/complex_reduction.cpp (954 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug49021.cpp (955 of 960)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/std_complex_arithmetic.cpp (956 of 960)
command timed out: 1200 seconds without output running [b'ninja', b'-j 32', b'check-offload'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1238.823334

@@ -2011,8 +2011,9 @@ const StringMap<bool> sys::getHostCPUFeatures() {
const StringMap<bool> sys::getHostCPUFeatures() {
unsigned long hwcap = getauxval(AT_HWCAP);
bool HasFPU = hwcap & (1UL << 3); // HWCAP_LOONGARCH_FPU
uint32_t cpucfg2 = 0x2;
const uint32_t cpucfg2 = 0x2, cpucfg3 = 0x3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I encountered a compilation error with gcc 13.2:

[1/371] Building CXX object lib/TargetParser/CMakeFiles/LLVMTargetParser.dir/Host.cpp.o
FAILED: lib/TargetParser/CMakeFiles/LLVMTargetParser.dir/Host.cpp.o 
ccache /usr/bin/g++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/dtcxzyw/llvm-build/lib/TargetParser -I/home/dtcxzyw/llvm-project/llvm/lib/TargetParser -I/home/dtcxzyw/llvm-build/include -I/home/dtcxzyw/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -w -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fPIC  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/TargetParser/CMakeFiles/LLVMTargetParser.dir/Host.cpp.o -MF lib/TargetParser/CMakeFiles/LLVMTargetParser.dir/Host.cpp.o.d -o lib/TargetParser/CMakeFiles/LLVMTargetParser.dir/Host.cpp.o -c /home/dtcxzyw/llvm-project/llvm/lib/TargetParser/Host.cpp
/home/dtcxzyw/llvm-project/llvm/lib/TargetParser/Host.cpp: In function ‘const llvm::StringMap<bool, llvm::MallocAllocator> llvm::sys::getHostCPUFeatures()’:
/home/dtcxzyw/llvm-project/llvm/lib/TargetParser/Host.cpp:2015:3: error: read-only variable ‘cpucfg2’ used as ‘asm’ output
 2015 |   __asm__("cpucfg %[cpucfg2], %[cpucfg2]\n\t" : [cpucfg2] "+r"(cpucfg2));
      |   ^~~~~~~
/home/dtcxzyw/llvm-project/llvm/lib/TargetParser/Host.cpp:2016:3: error: read-only variable ‘cpucfg3’ used as ‘asm’ output
 2016 |   __asm__("cpucfg %[cpucfg3], %[cpucfg3]\n\t" : [cpucfg3] "+r"(cpucfg3));
      |   ^~~~~~~
ninja: build stopped: subcommand failed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my bad. I'll fix it now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should already be fixed.

SixWeining added a commit that referenced this pull request Nov 25, 2024
@tangaac tangaac deleted the feature/ld-seq-sa branch November 25, 2024 01:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:loongarch clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants