[RISCV] Mark {vl, vtype} as clobber in inline assembly #128636


Conversation

HankChang736
Contributor

This patch uses the getClobbers() hook in the RISC-V target to mark {vl, vtype} as clobbered, preventing the Post-RA scheduler from moving vsetvl across inline assembly.

Fixing #97794.
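For context, a minimal reproducer sketch in the spirit of #97794 (hypothetical code, not taken from the patch): the asm statement relies on the vl/vtype configured by the preceding vsetvl, but since the constraint string does not mention vl/vtype, nothing stops the Post-RA scheduler from moving a later vsetvli above the asm block.

#include <riscv_vector.h>

void saxpy_asm(float *y, const float *x, float a, size_t n) {
  size_t vl = __riscv_vsetvl_e32m1(n);             // configures vl/vtype
  vfloat32m1_t vx = __riscv_vle32_v_f32m1(x, vl);
  vfloat32m1_t vy = __riscv_vle32_v_f32m1(y, vl);
  // Relies on the current vl/vtype but declares no vl/vtype dependency.
  asm volatile("vfmacc.vf %0, %1, %2" : "+vr"(vy) : "f"(a), "vr"(vx));
  __riscv_vse32_v_f32m1(y, vy, vl);
}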


Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository, in which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Feb 25, 2025
@llvmbot
Member

llvmbot commented Feb 25, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-risc-v

Author: Hank Chang (HankChang736)

Changes

This patch uses the getClobbers() hook in the RISC-V target to mark {vl, vtype} as clobbered, preventing the Post-RA scheduler from moving vsetvl across inline assembly.

Fixing #97794.


Full diff: https://github.com/llvm/llvm-project/pull/128636.diff

6 Files Affected:

  • (modified) clang/lib/Basic/Targets/RISCV.h (+1-1)
  • (modified) clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c (+36-12)
  • (modified) clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c (+4-4)
  • (modified) clang/test/CodeGen/RISCV/riscv-inline-asm.c (+21-21)
  • (added) llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm-clobber.ll (+37)
  • (added) llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm.ll (+37)
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index c26aa19080162..5590aa9d03c75 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -68,7 +68,7 @@ class RISCVTargetInfo : public TargetInfo {
     return TargetInfo::VoidPtrBuiltinVaList;
   }
 
-  std::string_view getClobbers() const override { return ""; }
+  std::string_view getClobbers() const override { return "~{vl},~{vtype}"; }
 
   StringRef getConstraintRegister(StringRef Constraint,
                                   StringRef Expression) const override {
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c b/clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c
index 8aa80386f205f..65cfc081fe26d 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c
@@ -7,38 +7,62 @@
 
 // Test RISC-V specific clobbered registers in inline assembly.
 
-// CHECK-LABEL: define {{.*}} void @test_fflags
-// CHECK:    tail call void asm sideeffect "", "~{fflags}"()
+// CHECK-LABEL: define dso_local void @test_fflags(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    tail call void asm sideeffect "", "~{fflags},~{vl},~{vtype}"() #[[ATTR1:[0-9]+]], !srcloc [[META6:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
 void test_fflags(void) {
   asm volatile ("" :::"fflags");
 }
 
-// CHECK-LABEL: define {{.*}} void @test_frm
-// CHECK:    tail call void asm sideeffect "", "~{frm}"()
+// CHECK-LABEL: define dso_local void @test_frm(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    tail call void asm sideeffect "", "~{frm},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META7:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
 void test_frm(void) {
   asm volatile ("" :::"frm");
 }
 
-// CHECK-LABEL: define {{.*}} void @test_vtype
-// CHECK:    tail call void asm sideeffect "", "~{vtype}"()
+// CHECK-LABEL: define dso_local void @test_vtype(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    tail call void asm sideeffect "", "~{vtype},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META8:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
 void test_vtype(void) {
   asm volatile ("" :::"vtype");
 }
 
-// CHECK-LABEL: define {{.*}} void @test_vl
-// CHECK:    tail call void asm sideeffect "", "~{vl}"()
+// CHECK-LABEL: define dso_local void @test_vl(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    tail call void asm sideeffect "", "~{vl},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META9:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
 void test_vl(void) {
   asm volatile ("" :::"vl");
 }
 
-// CHECK-LABEL: define {{.*}} void @test_vxsat
-// CHECK:    tail call void asm sideeffect "", "~{vxsat}"()
+// CHECK-LABEL: define dso_local void @test_vxsat(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    tail call void asm sideeffect "", "~{vxsat},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META10:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
 void test_vxsat(void) {
   asm volatile ("" :::"vxsat");
 }
 
-// CHECK-LABEL: define {{.*}} void @test_vxrm
-// CHECK:    tail call void asm sideeffect "", "~{vxrm}"()
+// CHECK-LABEL: define dso_local void @test_vxrm(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    tail call void asm sideeffect "", "~{vxrm},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META11:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
 void test_vxrm(void) {
   asm volatile ("" :::"vxrm");
 }
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c b/clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c
index 879fb1238d83a..e9444c6a76d87 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c
@@ -18,12 +18,12 @@ void test_v_reg() {
       :
       : "v1", "x1");
 // CHECK-LABEL: define{{.*}} @test_v_reg
-// CHECK: "~{v1},~{x1}"
+// CHECK: "~{v1},~{x1},~{vl},~{vtype}"
 }
 
 vint32m1_t test_vr(vint32m1_t a, vint32m1_t b) {
 // CHECK-LABEL: define{{.*}} @test_vr
-// CHECK: %0 = tail call <vscale x 2 x i32> asm sideeffect "vadd.vv $0, $1, $2", "=^vr,^vr,^vr"(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b)
+// CHECK: %0 = tail call <vscale x 2 x i32> asm sideeffect "vadd.vv $0, $1, $2", "=^vr,^vr,^vr,~{vl},~{vtype}"(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b)
   vint32m1_t ret;
   asm volatile ("vadd.vv %0, %1, %2" : "=vr"(ret) : "vr"(a), "vr"(b));
   return ret;
@@ -31,7 +31,7 @@ vint32m1_t test_vr(vint32m1_t a, vint32m1_t b) {
 
 vint32m1_t test_vd(vint32m1_t a, vint32m1_t b) {
 // CHECK-LABEL: define{{.*}} @test_vd
-// CHECK: %0 = tail call <vscale x 2 x i32> asm sideeffect "vadd.vv $0, $1, $2", "=^vd,^vd,^vd"(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b)
+// CHECK: %0 = tail call <vscale x 2 x i32> asm sideeffect "vadd.vv $0, $1, $2", "=^vd,^vd,^vd,~{vl},~{vtype}"(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b)
   vint32m1_t ret;
   asm volatile ("vadd.vv %0, %1, %2" : "=vd"(ret) : "vd"(a), "vd"(b));
   return ret;
@@ -39,7 +39,7 @@ vint32m1_t test_vd(vint32m1_t a, vint32m1_t b) {
 
 vbool1_t test_vm(vbool1_t a, vbool1_t b) {
 // CHECK-LABEL: define{{.*}} @test_vm
-// CHECK: %0 = tail call <vscale x 64 x i1> asm sideeffect "vmand.mm $0, $1, $2", "=^vm,^vm,^vm"(<vscale x 64 x i1> %a, <vscale x 64 x i1> %b)
+// CHECK: %0 = tail call <vscale x 64 x i1> asm sideeffect "vmand.mm $0, $1, $2", "=^vm,^vm,^vm,~{vl},~{vtype}"(<vscale x 64 x i1> %a, <vscale x 64 x i1> %b)
   vbool1_t ret;
   asm volatile ("vmand.mm %0, %1, %2" : "=vm"(ret) : "vm"(a), "vm"(b));
   return ret;
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm.c b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
index f2031e0adcbcb..1a8ccb7ea5a68 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
@@ -7,17 +7,17 @@
 
 long test_r(long x) {
 // CHECK-LABEL: define{{.*}} {{i64|i32}} @test_r(
-// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r"({{i64|i32}} %{{.*}})
+// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r,~{vl},~{vtype}"({{i64|i32}} %{{.*}})
   long ret;
   asm volatile ("" : "=r"(ret) : "r"(x));
-// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r"({{i64|i32}} %{{.*}})
+// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r,~{vl},~{vtype}"({{i64|i32}} %{{.*}})
   asm volatile ("" : "=r"(ret) : "r"(x));
   return ret;
 }
 
 long test_cr(long x) {
 // CHECK-LABEL: define{{.*}} {{i64|i32}} @test_cr(
-// CHECK: call {{i64|i32}} asm sideeffect "", "=^cr,^cr"({{i64|i32}} %{{.*}})
+// CHECK: call {{i64|i32}} asm sideeffect "", "=^cr,^cr,~{vl},~{vtype}"({{i64|i32}} %{{.*}})
   long ret;
   asm volatile ("" : "=cr"(ret) : "cr"(x));
   return ret;
@@ -27,9 +27,9 @@ float cf;
 double cd;
 void test_cf(float f, double d) {
 // CHECK-LABEL: define{{.*}} void @test_cf(
-// CHECK: call float asm sideeffect "", "=^cf,^cf"(float %{{.*}})
+// CHECK: call float asm sideeffect "", "=^cf,^cf,~{vl},~{vtype}"(float %{{.*}})
   asm volatile("" : "=cf"(cf) : "cf"(f));
-// CHECK: call double asm sideeffect "", "=^cf,^cf"(double %{{.*}})
+// CHECK: call double asm sideeffect "", "=^cf,^cf,~{vl},~{vtype}"(double %{{.*}})
   asm volatile("" : "=cf"(cd) : "cf"(d));
 }
 
@@ -40,7 +40,7 @@ typedef __int128_t double_xlen_t;
 #endif
 double_xlen_t test_R_wide_scalar(double_xlen_t p) {
 // CHECK-LABEL: define{{.*}} {{i128|i64}} @test_R_wide_scalar(
-// CHECK: call {{i128|i64}} asm sideeffect "", "=R,R"({{i128|i64}} %{{.*}})
+// CHECK: call {{i128|i64}} asm sideeffect "", "=R,R,~{vl},~{vtype}"({{i128|i64}} %{{.*}})
   double_xlen_t ret;
   asm volatile("" : "=R"(ret) : "R"(p));
   return ret;
@@ -48,7 +48,7 @@ double_xlen_t test_R_wide_scalar(double_xlen_t p) {
 
 double_xlen_t test_cR_wide_scalar(double_xlen_t p) {
 // CHECK-LABEL: define{{.*}} {{i128|i64}} @test_cR_wide_scalar(
-// CHECK: call {{i128|i64}} asm sideeffect "", "=^cR,^cR"({{i128|i64}} %{{.*}})
+// CHECK: call {{i128|i64}} asm sideeffect "", "=^cR,^cR,~{vl},~{vtype}"({{i128|i64}} %{{.*}})
   double_xlen_t ret;
   asm volatile("" : "=cR"(ret) : "cR"(p));
   return ret;
@@ -56,23 +56,23 @@ double_xlen_t test_cR_wide_scalar(double_xlen_t p) {
 
 void test_I(void) {
 // CHECK-LABEL: define{{.*}} void @test_I()
-// CHECK: call void asm sideeffect "", "I"(i32 2047)
+// CHECK: call void asm sideeffect "", "I,~{vl},~{vtype}"(i32 2047)
   asm volatile ("" :: "I"(2047));
-// CHECK: call void asm sideeffect "", "I"(i32 -2048)
+// CHECK: call void asm sideeffect "", "I,~{vl},~{vtype}"(i32 -2048)
   asm volatile ("" :: "I"(-2048));
 }
 
 void test_J(void) {
 // CHECK-LABEL: define{{.*}} void @test_J()
-// CHECK: call void asm sideeffect "", "J"(i32 0)
+// CHECK: call void asm sideeffect "", "J,~{vl},~{vtype}"(i32 0)
   asm volatile ("" :: "J"(0));
 }
 
 void test_K(void) {
 // CHECK-LABEL: define{{.*}} void @test_K()
-// CHECK: call void asm sideeffect "", "K"(i32 31)
+// CHECK: call void asm sideeffect "", "K,~{vl},~{vtype}"(i32 31)
   asm volatile ("" :: "K"(31));
-// CHECK: call void asm sideeffect "", "K"(i32 0)
+// CHECK: call void asm sideeffect "", "K,~{vl},~{vtype}"(i32 0)
   asm volatile ("" :: "K"(0));
 }
 
@@ -81,16 +81,16 @@ double d;
 void test_f(void) {
 // CHECK-LABEL: define{{.*}} void @test_f()
 // CHECK: [[FLT_ARG:%[a-zA-Z_0-9]+]] = load float, ptr @f
-// CHECK: call void asm sideeffect "", "f"(float [[FLT_ARG]])
+// CHECK: call void asm sideeffect "", "f,~{vl},~{vtype}"(float [[FLT_ARG]])
   asm volatile ("" :: "f"(f));
 // CHECK: [[FLT_ARG:%[a-zA-Z_0-9]+]] = load double, ptr @d
-// CHECK: call void asm sideeffect "", "f"(double [[FLT_ARG]])
+// CHECK: call void asm sideeffect "", "f,~{vl},~{vtype}"(double [[FLT_ARG]])
   asm volatile ("" :: "f"(d));
 }
 
 void test_A(int *p) {
 // CHECK-LABEL: define{{.*}} void @test_A(ptr noundef %p)
-// CHECK: call void asm sideeffect "", "*A"(ptr elementtype(i32) %p)
+// CHECK: call void asm sideeffect "", "*A,~{vl},~{vtype}"(ptr elementtype(i32) %p)
   asm volatile("" :: "A"(*p));
 }
 
@@ -98,9 +98,9 @@ extern int var, arr[2][2];
 struct Pair { int a, b; } pair;
 
 // CHECK-LABEL: test_s(
-// CHECK:         call void asm sideeffect "// $0 $1 $2", "s,s,s"(ptr nonnull @var, ptr nonnull getelementptr inbounds nuw (i8, ptr @arr, {{.*}}), ptr nonnull @test_s)
-// CHECK:         call void asm sideeffect "// $0", "s"(ptr nonnull getelementptr inbounds nuw (i8, ptr @pair, {{.*}}))
-// CHECK:         call void asm sideeffect "// $0 $1 $2", "S,S,S"(ptr nonnull @var, ptr nonnull getelementptr inbounds nuw (i8, ptr @arr, {{.*}}), ptr nonnull @test_s)
+// CHECK:         call void asm sideeffect "// $0 $1 $2", "s,s,s,~{vl},~{vtype}"(ptr nonnull @var, ptr nonnull getelementptr inbounds nuw (i8, ptr @arr, {{.*}}), ptr nonnull @test_s)
+// CHECK:         call void asm sideeffect "// $0", "s,~{vl},~{vtype}"(ptr nonnull getelementptr inbounds nuw (i8, ptr @pair, {{.*}}))
+// CHECK:         call void asm sideeffect "// $0 $1 $2", "S,S,S,~{vl},~{vtype}"(ptr nonnull @var, ptr nonnull getelementptr inbounds nuw (i8, ptr @arr, {{.*}}), ptr nonnull @test_s)
 void test_s(void) {
   asm("// %0 %1 %2" :: "s"(&var), "s"(&arr[1][1]), "s"(test_s));
   asm("// %0" :: "s"(&pair.b));
@@ -109,9 +109,9 @@ void test_s(void) {
 }
 
 // CHECK-LABEL: test_modifiers(
-// CHECK: call void asm sideeffect "// ${0:i} ${1:i}", "r,r"({{i32|i64}} %val, i32 37)
-// CHECK: call void asm sideeffect "// ${0:z} ${1:z}", "i,i"(i32 0, i32 1)
-// CHECK: call void asm sideeffect "// ${0:N}", "r"({{i32|i64}} %val)
+// CHECK: call void asm sideeffect "// ${0:i} ${1:i}", "r,r,~{vl},~{vtype}"({{i32|i64}} %val, i32 37)
+// CHECK: call void asm sideeffect "// ${0:z} ${1:z}", "i,i,~{vl},~{vtype}"(i32 0, i32 1)
+// CHECK: call void asm sideeffect "// ${0:N}", "r,~{vl},~{vtype}"({{i32|i64}} %val)
 void test_modifiers(long val) {
   asm volatile("// %i0 %i1" :: "r"(val), "r"(37));
   asm volatile("// %z0 %z1" :: "i"(0), "i"(1));
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm-clobber.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm-clobber.ll
new file mode 100644
index 0000000000000..75d8b4c5437b1
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm-clobber.ll
@@ -0,0 +1,37 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv64 -mcpu=sifive-x280 -verify-machineinstrs < %s | FileCheck %s
+
+define void @foo(<vscale x 8 x half> %0, <vscale x 8 x half> %1) {
+; CHECK-LABEL: foo:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    vsetvli a0, zero, e32, m4, ta, ma
+; CHECK-NEXT:    vmv.v.i v12, 0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, -1096
+; CHECK-NEXT:    vmv.v.i v16, 0
+; CHECK-NEXT:    vsetvli zero, a0, e8, m1, ta, ma
+; CHECK-NEXT:    #APP
+; CHECK-NEXT:    vfmadd.vv v16, v12, v12
+; CHECK-NEXT:    #NO_APP
+; CHECK-NEXT:    #APP
+; CHECK-NEXT:    vfmadd.vv v16, v12, v12
+; CHECK-NEXT:    #NO_APP
+; CHECK-NEXT:    vsetvli zero, a0, e16, m2, ta, ma
+; CHECK-NEXT:    vse16.v v8, (zero)
+; CHECK-NEXT:    ret
+entry:
+  %2 = tail call i64 @llvm.riscv.vsetvli.i64(i64 3000, i64 0, i64 0)
+  %3 = tail call <vscale x 8 x float> asm sideeffect "vfmadd.vv $0, $1, $2", "=^vr,^vr,^vr,0,~{vl},~{vtype}"(<vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer)
+  %4 = tail call <vscale x 8 x float> asm sideeffect "vfmadd.vv $0, $1, $2", "=^vr,^vr,^vr,0,~{vl},~{vtype}"(<vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer, <vscale x 8 x float> %3)
+  tail call void @llvm.riscv.vse.nxv8f16.i64(<vscale x 8 x half> %0, ptr null, i64 %2)
+  ret void
+}
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
+declare i64 @llvm.riscv.vsetvli.i64(i64, i64 immarg, i64 immarg) #0
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: write)
+declare void @llvm.riscv.vse.nxv8f16.i64(<vscale x 8 x half>, ptr nocapture, i64) #1
+
+attributes #0 = { nocallback nofree nosync nounwind willreturn memory(none) }
+attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: write) }
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm.ll
new file mode 100644
index 0000000000000..e60ce90d5ed8b
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm.ll
@@ -0,0 +1,37 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv64 -mcpu=sifive-x280 -verify-machineinstrs < %s | FileCheck %s
+
+define void @foo(<vscale x 8 x half> %0, <vscale x 8 x half> %1) {
+; CHECK-LABEL: foo:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    vsetvli a0, zero, e32, m4, ta, ma
+; CHECK-NEXT:    vmv.v.i v12, 0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, -1096
+; CHECK-NEXT:    vmv.v.i v16, 0
+; CHECK-NEXT:    vsetvli zero, a0, e8, m1, ta, ma
+; CHECK-NEXT:    #APP
+; CHECK-NEXT:    vfmadd.vv v16, v12, v12
+; CHECK-NEXT:    #NO_APP
+; CHECK-NEXT:    vsetvli zero, a0, e16, m2, ta, ma
+; CHECK-NEXT:    #APP
+; CHECK-NEXT:    vfmadd.vv v16, v12, v12
+; CHECK-NEXT:    #NO_APP
+; CHECK-NEXT:    vse16.v v8, (zero)
+; CHECK-NEXT:    ret
+entry:
+  %2 = tail call i64 @llvm.riscv.vsetvli.i64(i64 3000, i64 0, i64 0)
+  %3 = tail call <vscale x 8 x float> asm sideeffect "vfmadd.vv $0, $1, $2", "=^vr,^vr,^vr,0"(<vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer)
+  %4 = tail call <vscale x 8 x float> asm sideeffect "vfmadd.vv $0, $1, $2", "=^vr,^vr,^vr,0"(<vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer, <vscale x 8 x float> %3)
+  tail call void @llvm.riscv.vse.nxv8f16.i64(<vscale x 8 x half> %0, ptr null, i64 %2)
+  ret void
+}
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
+declare i64 @llvm.riscv.vsetvli.i64(i64, i64 immarg, i64 immarg) #0
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: write)
+declare void @llvm.riscv.vse.nxv8f16.i64(<vscale x 8 x half>, ptr nocapture, i64) #1
+
+attributes #0 = { nocallback nofree nosync nounwind willreturn memory(none) }
+attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: write) }

@topperc topperc requested review from preames and wangpc-pp February 25, 2025 06:02
@wangpc-pp
Contributor

I think this may not be the right way.

  1. We should at least add vl/vtype to the clobbered registers only when the V extension is specified.
  2. The asm may not depend on vl/vtype, but adding vl/vtype dependencies unconditionally stops further scheduling.

@HankChang736
Contributor Author

I think this may not be the right way.

1. We should at least add vl/vtype to the clobbered registers only when the V extension is specified.

2. The asm may not depend on vl/vtype, but adding vl/vtype dependencies unconditionally stops further scheduling.

Yes, you're right. But the purpose of adding the vl/vtype dependencies is to prevent the Post-RA scheduler from moving the vsetvl instruction across inline assembly. I'm not sure if there's a better approach to solving this problem.

@wangpc-pp wangpc-pp left a comment
Contributor

I see. But what's the difference if we add vl/vtype to the list of clobbered registers explicitly in C/C++ asm statements?

@@ -68,7 +68,7 @@ class RISCVTargetInfo : public TargetInfo {
     return TargetInfo::VoidPtrBuiltinVaList;
   }

-  std::string_view getClobbers() const override { return ""; }
+  std::string_view getClobbers() const override { return "~{vl},~{vtype}"; }
Contributor

Suggested change
-  std::string_view getClobbers() const override { return "~{vl},~{vtype}"; }
+  std::string_view getClobbers() const override {
+    if (ISAInfo->hasExtension("zve32x"))
+      return "~{vl},~{vtype}";
+    return "";
+  }

Contributor Author

I think this implementation is better; I'll just update the patch with your implementation.

Collaborator

That doesn't work with the target attribute.

@preames
Collaborator

preames commented Feb 25, 2025

But the purpose of adding the vl/vtype dependencies is to prevent the Post-RA scheduler from moving the vsetvl instruction across inline assembly. I'm not sure if there's a better approach to solving this problem.

Maybe have RISCVInsertVSETVLI add implicit use operands to the inline assembly at that point? It's what we do for all the vector instructions. In this case, you might want implicit defs.

@mshockwave
Member

I think what @wangpc-pp advocated here (please correct me if I'm wrong) was that users should be responsible for annotating these registers as clobbered, so that we have more freedom in scheduling when the inline assembly does not use any vector instructions. The other approaches -- whether marking the registers as clobbers or adding implicit-defs of them in RISCVInsertVSETVLI -- take a more conservative path to ensure correctness.

It's a shame that we're not able to analyze the instructions within inline assembly to make a better decision. Personally, I think we should prioritize correctness, so I'm more inclined toward the latter approaches. I do agree that we should probably only do this when vector instructions are present, as pointed out in one of @wangpc-pp's review comments.

@wangpc-pp
Contributor

wangpc-pp commented Feb 26, 2025

I think what @wangpc-pp advocated here (please correct me if I'm wrong) was that users should be responsible for annotating these registers as clobbered, so that we have more freedom in scheduling when the inline assembly does not use any vector instructions. The other approaches -- whether marking the registers as clobbers or adding implicit-defs of them in RISCVInsertVSETVLI -- take a more conservative path to ensure correctness.

Yes, this is exactly what I meant!

It's a shame that we're not able to analyze the instructions within inline assembly to make a better decision. Personally, I think we should prioritize correctness, so I'm more inclined toward the latter approaches. I do agree that we should probably only do this when vector instructions are present, as pointed out in one of @wangpc-pp's review comments.

I am not opposed to this. But I have to say, the use case is not that common: we don't usually mix asm statements containing RVV instructions with RVV intrinsics in the same code. In that scenario, one should either write all the RVV instructions in asm statements or just use RVV intrinsics. Even in these rare cases, the user can achieve the same effect by putting vl/vtype into the clobbered registers explicitly (see the sketch below).
And let me quote @kito-cheng's comments in #97794 (comment); there are some other approaches.
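For illustration, the user-side idiom being referred to could look like this (a hypothetical sketch; the variable names are made up). The asm statement itself declares that it may change vl/vtype, so the compiler knows to re-establish them afterwards:

size_t avl = n, new_vl;
// Explicitly listing vl/vtype as clobbers, as supported by the clang tests
// in this PR (riscv-inline-asm-clobber.c).
asm volatile("vsetvli %0, %1, e32, m1, ta, ma"
             : "=r"(new_vl)
             : "r"(avl)
             : "vl", "vtype");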

@topperc
Collaborator

topperc commented Feb 26, 2025

Even in these rare cases, the user can achieve the same effect by putting vl/vtype into the clobbered registers explicitly.

They can, but will they? Clobbers are a relatively uncommon feature; I suspect many people don't know about them.

I think we may need documentation on how to use RVV with inline assembly. I've seen a lot of things that aren't technically correct but usually work.

For example, one asm statement per instruction, with hand-allocated vector registers and no clobbers. The vector registers aren't technically live between the statements as far as the compiler is concerned.
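Something like the following (a hypothetical sketch of the pattern described above, not code from this PR): v8 is hand-allocated and assumed to survive from one statement to the next, but neither asm declares it, so the compiler sees no dependency between them.

// Loads into v8, which is never declared as an output or clobber.
asm volatile("vle32.v v8, (%0)" :: "r"(src));
// Assumes v8 still holds the loaded data -- nothing guarantees that.
asm volatile("vse32.v v8, (%0)" :: "r"(dst));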

@@ -68,7 +68,11 @@ class RISCVTargetInfo : public TargetInfo {
     return TargetInfo::VoidPtrBuiltinVaList;
   }

-  std::string_view getClobbers() const override { return ""; }
+  std::string_view getClobbers() const override {
+    if (ISAInfo->hasExtension("zve32x"))
Collaborator

This doesn't work if zve32x is added by __attribute__((target("zve32x"))).

@HankChang736 HankChang736 Mar 7, 2025
Contributor Author

I just reverted to the original implementation.

@wangpc-pp
Contributor

I am not familiar with the target attribute implementation. Can we get the list of function features here and add the clobbers at:

// Add machine specific clobbers
std::string_view MachineClobbers = getTarget().getClobbers();
if (!MachineClobbers.empty()) {
  if (!Constraints.empty())
    Constraints += ',';
  Constraints += MachineClobbers;
}

cc @topperc @4vtomat
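A rough sketch of that idea (hypothetical, not part of this patch; it assumes the surrounding code can reach the current FunctionDecl as FD): query the per-function feature map at the point where machine clobbers are appended, so that features added via __attribute__((target(...))) are honored.

// Hypothetical sketch, not the actual patch.
llvm::StringMap<bool> FeatureMap;
CGM.getContext().getFunctionFeatureMap(FeatureMap, FD);
if (FeatureMap.lookup("zve32x")) {
  if (!Constraints.empty())
    Constraints += ',';
  Constraints += "~{vl},~{vtype}";
}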

@topperc
Collaborator

topperc commented Mar 5, 2025

I am not familiar with the target attribute implementation. Can we get the list of function features here and add the clobbers at:

// Add machine specific clobbers
std::string_view MachineClobbers = getTarget().getClobbers();
if (!MachineClobbers.empty()) {
  if (!Constraints.empty())
    Constraints += ',';
  Constraints += MachineClobbers;
}

cc @topperc @4vtomat

I'm not sure. But I don't understand why we need to check for Zve32x at all?

@wangpc-pp
Contributor

I am not familiar with the target attribute implementation. Can we get the list of function features here and add the clobbers at:

// Add machine specific clobbers
std::string_view MachineClobbers = getTarget().getClobbers();
if (!MachineClobbers.empty()) {
  if (!Constraints.empty())
    Constraints += ',';
  Constraints += MachineClobbers;
}

cc @topperc @4vtomat

I'm not sure. But I don't understand why we need to check for Zve32x at all?

See the previous discussion: it is weird to add vector CSRs to the clobbers if there is no vector support.

@topperc
Collaborator

topperc commented Mar 5, 2025

But the purpose of adding the vl/vtype dependencies is to prevent the Post-RA scheduler from moving the vsetvl instruction across inline assembly. I'm not sure if there's a better approach to solving this problem.

Maybe have RISCVInsertVSETVLI add implicit use operands to the inline assembly at that point? It's what we do for all the vector instructions. In this case, you might want implicit defs.

I think we should try this RISCVInsertVSETVLI idea.

@HankChang736
Contributor Author

I tested the following case without passing the 'v' extension in the Clang command line argument:

__attribute__((target("arch=rv32gcv_zve32x")))
void test_A(int *p) {
  asm volatile("" :: "A"(*p));
}

The generated LLVM IR result is:

; Function Attrs: nounwind
define dso_local void @test_A(ptr noundef %p) local_unnamed_addr #0 {
entry:
  tail call void asm sideeffect "", "*A"(ptr elementtype(i32) %p) #1, !srcloc !6
  ret void
}

attributes #0 = { nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic-rv32" "target-features"="+32bit,+a,+c,+d,+f,+m,+relax,+v,+zaamo,+zalrsc,+zicsr,+zifencei,+zmmul,+zve32f,+zve32x,+zve64d,+zve64f,+zve64x,+zvl128b,+zvl32b,+zvl64b ... 

From this result, it appears that the target attribute does not have the intended effect in this case. Given that, perhaps we should keep the initial implementation for now and try the RISCVInsertVSETVLI approach as a future improvement.

cc @wangpc-pp @topperc

@wangpc-pp
Contributor

Agree, we should try the RISCVInsertVSETVLI approach.

@topperc topperc self-requested a review March 6, 2025 03:56
@topperc
Collaborator

topperc commented Mar 8, 2025

The RISCVInsertVSETVLI approach is probably just a matter of doing something like the following. We may need to check whether an implicit def already exists before adding one; I don't think addOperand checks for duplicates.

diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index 7433603daff8..2247610c21ff 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -1531,6 +1531,13 @@ void RISCVInsertVSETVLI::emitVSETVLIs(MachineBasicBlock &MBB) {
                                               /*isImp*/ true));
     }
 
+    if (MI.isInlineAsm()) {
+      MI.addOperand(MachineOperand::CreateReg(RISCV::VL, /*isDef*/ true,
+                                              /*isImp*/ true));
+      MI.addOperand(MachineOperand::CreateReg(RISCV::VTYPE, /*isDef*/ true,
+                                              /*isImp*/ true));
+    }
+
     if (MI.isCall() || MI.isInlineAsm() ||
         MI.modifiesRegister(RISCV::VL, /*TRI=*/nullptr) ||
         MI.modifiesRegister(RISCV::VTYPE, /*TRI=*/nullptr))

@dzaima

dzaima commented Mar 8, 2025

Would there be some way to explicitly request not to clobber vl/vtype? An asm block is quite useful as an "optimization fence" on x86 and Arm NEON, but it would be sad if that necessarily meant extra vsetvl spam on RVV. (Currently clang doesn't do much optimization on RVV intrinsics, so there's not much chance for it to misoptimize, but presumably that may change; this is more significant for __attribute__((vector_size)) vectors, though using those with asm is non-trivial and generally not ideal on RVV; some quite contrived examples.)
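The "optimization fence" idiom in question might look like this on RVV (a hypothetical sketch): an empty asm statement that pins a value in a vector register. If every asm statement implicitly clobbers vl/vtype, the compiler would have to re-issue vsetvli after each such fence.

static inline vfloat32m1_t opt_fence(vfloat32m1_t v) {
  // Empty asm: emits no instructions, but acts as an optimization barrier.
  asm("" : "+vr"(v));
  return v;
}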

@HankChang736
Contributor Author

I created another pull request, #130733, implemented with the RISCVInsertVSETVLI approach, since it's a new approach.

HankChang736 added a commit that referenced this pull request Mar 13, 2025
#130733)

…sing inline assembly.
Fixing [#128636](#128636).

This patch has RISCVInsertVSETVLI add implicit use operands to inline
assembly; this approach was suggested by @preames and the implementation
I referenced is from @topperc. The purpose of adding the vl/vtype implicit
operands is to prevent the Post-RA scheduler from moving vsetvl across
inline assembly.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 13, 2025
…TVLI when u… (#130733)

…sing inline assembly.
Fixing [#128636](llvm/llvm-project#128636).

This patch has RISCVInsertVSETVLI add implicit use operands to inline
assembly; this approach was suggested by @preames and the implementation
I referenced is from @topperc. The purpose of adding the vl/vtype implicit
operands is to prevent the Post-RA scheduler from moving vsetvl across
inline assembly.
@HankChang736
Contributor Author

#130733 has been merged. This PR will be marked as closed.

frederik-h pushed a commit to frederik-h/llvm-project that referenced this pull request Mar 18, 2025
llvm#130733)

…sing inline assembly.
Fixing [llvm#128636](llvm#128636).

This patch has RISCVInsertVSETVLI add implicit use operands to inline
assembly; this approach was suggested by @preames and the implementation
I referenced is from @topperc. The purpose of adding the vl/vtype implicit
operands is to prevent the Post-RA scheduler from moving vsetvl across
inline assembly.