[RISCV] Mark {vl, vtype} as clobber in inline assembly #128636
Conversation
This patch uses the getClobbers() hook in the RISC-V target and marks {vl, vtype} as clobbered, to prevent the post-RA scheduler from moving vsetvli across inline assembly.
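For context, the problem being fixed looks roughly like this. A minimal sketch, assuming the standard RVV intrinsics header; the function and variable names are made up for illustration:

    #include <riscv_vector.h>

    void hazard(vfloat32m4_t a, vfloat32m4_t b, float *p) {
      // The intrinsic below makes the compiler emit a vsetvli, which
      // writes the vl and vtype CSRs.
      size_t vl = __riscv_vsetvl_e32m4(3000);
      // Without ~{vl},~{vtype} clobbers on the asm, the post-RA
      // scheduler may move that vsetvli across this statement, even
      // though vfmadd.vv reads vl and vtype.
      asm volatile("vfmadd.vv %0, %1, %2" : "+vr"(a) : "vr"(a), "vr"(b));
      __riscv_vse32_v_f32m4(p, a, vl);
    }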
@llvm/pr-subscribers-clang @llvm/pr-subscribers-backend-risc-v
Author: Hank Chang (HankChang736)
Changes: This patch uses the getClobbers() hook in the RISC-V target and marks {vl, vtype} as clobbered, to prevent the post-RA scheduler from moving vsetvli across inline assembly. Fixing #97794.
Full diff: https://github.com/llvm/llvm-project/pull/128636.diff
6 files affected:
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index c26aa19080162..5590aa9d03c75 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -68,7 +68,7 @@ class RISCVTargetInfo : public TargetInfo {
return TargetInfo::VoidPtrBuiltinVaList;
}
- std::string_view getClobbers() const override { return ""; }
+ std::string_view getClobbers() const override { return "~{vl},~{vtype}"; }
StringRef getConstraintRegister(StringRef Constraint,
StringRef Expression) const override {
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c b/clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c
index 8aa80386f205f..65cfc081fe26d 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm-clobber.c
@@ -7,38 +7,62 @@
// Test RISC-V specific clobbered registers in inline assembly.
-// CHECK-LABEL: define {{.*}} void @test_fflags
-// CHECK: tail call void asm sideeffect "", "~{fflags}"()
+// CHECK-LABEL: define dso_local void @test_fflags(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: tail call void asm sideeffect "", "~{fflags},~{vl},~{vtype}"() #[[ATTR1:[0-9]+]], !srcloc [[META6:![0-9]+]]
+// CHECK-NEXT: ret void
+//
void test_fflags(void) {
asm volatile ("" :::"fflags");
}
-// CHECK-LABEL: define {{.*}} void @test_frm
-// CHECK: tail call void asm sideeffect "", "~{frm}"()
+// CHECK-LABEL: define dso_local void @test_frm(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: tail call void asm sideeffect "", "~{frm},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META7:![0-9]+]]
+// CHECK-NEXT: ret void
+//
void test_frm(void) {
asm volatile ("" :::"frm");
}
-// CHECK-LABEL: define {{.*}} void @test_vtype
-// CHECK: tail call void asm sideeffect "", "~{vtype}"()
+// CHECK-LABEL: define dso_local void @test_vtype(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: tail call void asm sideeffect "", "~{vtype},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META8:![0-9]+]]
+// CHECK-NEXT: ret void
+//
void test_vtype(void) {
asm volatile ("" :::"vtype");
}
-// CHECK-LABEL: define {{.*}} void @test_vl
-// CHECK: tail call void asm sideeffect "", "~{vl}"()
+// CHECK-LABEL: define dso_local void @test_vl(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: tail call void asm sideeffect "", "~{vl},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META9:![0-9]+]]
+// CHECK-NEXT: ret void
+//
void test_vl(void) {
asm volatile ("" :::"vl");
}
-// CHECK-LABEL: define {{.*}} void @test_vxsat
-// CHECK: tail call void asm sideeffect "", "~{vxsat}"()
+// CHECK-LABEL: define dso_local void @test_vxsat(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: tail call void asm sideeffect "", "~{vxsat},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META10:![0-9]+]]
+// CHECK-NEXT: ret void
+//
void test_vxsat(void) {
asm volatile ("" :::"vxsat");
}
-// CHECK-LABEL: define {{.*}} void @test_vxrm
-// CHECK: tail call void asm sideeffect "", "~{vxrm}"()
+// CHECK-LABEL: define dso_local void @test_vxrm(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: tail call void asm sideeffect "", "~{vxrm},~{vl},~{vtype}"() #[[ATTR1]], !srcloc [[META11:![0-9]+]]
+// CHECK-NEXT: ret void
+//
void test_vxrm(void) {
asm volatile ("" :::"vxrm");
}
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c b/clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c
index 879fb1238d83a..e9444c6a76d87 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm-rvv.c
@@ -18,12 +18,12 @@ void test_v_reg() {
:
: "v1", "x1");
// CHECK-LABEL: define{{.*}} @test_v_reg
-// CHECK: "~{v1},~{x1}"
+// CHECK: "~{v1},~{x1},~{vl},~{vtype}"
}
vint32m1_t test_vr(vint32m1_t a, vint32m1_t b) {
// CHECK-LABEL: define{{.*}} @test_vr
-// CHECK: %0 = tail call <vscale x 2 x i32> asm sideeffect "vadd.vv $0, $1, $2", "=^vr,^vr,^vr"(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b)
+// CHECK: %0 = tail call <vscale x 2 x i32> asm sideeffect "vadd.vv $0, $1, $2", "=^vr,^vr,^vr,~{vl},~{vtype}"(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b)
vint32m1_t ret;
asm volatile ("vadd.vv %0, %1, %2" : "=vr"(ret) : "vr"(a), "vr"(b));
return ret;
@@ -31,7 +31,7 @@ vint32m1_t test_vr(vint32m1_t a, vint32m1_t b) {
vint32m1_t test_vd(vint32m1_t a, vint32m1_t b) {
// CHECK-LABEL: define{{.*}} @test_vd
-// CHECK: %0 = tail call <vscale x 2 x i32> asm sideeffect "vadd.vv $0, $1, $2", "=^vd,^vd,^vd"(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b)
+// CHECK: %0 = tail call <vscale x 2 x i32> asm sideeffect "vadd.vv $0, $1, $2", "=^vd,^vd,^vd,~{vl},~{vtype}"(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b)
vint32m1_t ret;
asm volatile ("vadd.vv %0, %1, %2" : "=vd"(ret) : "vd"(a), "vd"(b));
return ret;
@@ -39,7 +39,7 @@ vint32m1_t test_vd(vint32m1_t a, vint32m1_t b) {
vbool1_t test_vm(vbool1_t a, vbool1_t b) {
// CHECK-LABEL: define{{.*}} @test_vm
-// CHECK: %0 = tail call <vscale x 64 x i1> asm sideeffect "vmand.mm $0, $1, $2", "=^vm,^vm,^vm"(<vscale x 64 x i1> %a, <vscale x 64 x i1> %b)
+// CHECK: %0 = tail call <vscale x 64 x i1> asm sideeffect "vmand.mm $0, $1, $2", "=^vm,^vm,^vm,~{vl},~{vtype}"(<vscale x 64 x i1> %a, <vscale x 64 x i1> %b)
vbool1_t ret;
asm volatile ("vmand.mm %0, %1, %2" : "=vm"(ret) : "vm"(a), "vm"(b));
return ret;
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm.c b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
index f2031e0adcbcb..1a8ccb7ea5a68 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
@@ -7,17 +7,17 @@
long test_r(long x) {
// CHECK-LABEL: define{{.*}} {{i64|i32}} @test_r(
-// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r"({{i64|i32}} %{{.*}})
+// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r,~{vl},~{vtype}"({{i64|i32}} %{{.*}})
long ret;
asm volatile ("" : "=r"(ret) : "r"(x));
-// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r"({{i64|i32}} %{{.*}})
+// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r,~{vl},~{vtype}"({{i64|i32}} %{{.*}})
asm volatile ("" : "=r"(ret) : "r"(x));
return ret;
}
long test_cr(long x) {
// CHECK-LABEL: define{{.*}} {{i64|i32}} @test_cr(
-// CHECK: call {{i64|i32}} asm sideeffect "", "=^cr,^cr"({{i64|i32}} %{{.*}})
+// CHECK: call {{i64|i32}} asm sideeffect "", "=^cr,^cr,~{vl},~{vtype}"({{i64|i32}} %{{.*}})
long ret;
asm volatile ("" : "=cr"(ret) : "cr"(x));
return ret;
@@ -27,9 +27,9 @@ float cf;
double cd;
void test_cf(float f, double d) {
// CHECK-LABEL: define{{.*}} void @test_cf(
-// CHECK: call float asm sideeffect "", "=^cf,^cf"(float %{{.*}})
+// CHECK: call float asm sideeffect "", "=^cf,^cf,~{vl},~{vtype}"(float %{{.*}})
asm volatile("" : "=cf"(cf) : "cf"(f));
-// CHECK: call double asm sideeffect "", "=^cf,^cf"(double %{{.*}})
+// CHECK: call double asm sideeffect "", "=^cf,^cf,~{vl},~{vtype}"(double %{{.*}})
asm volatile("" : "=cf"(cd) : "cf"(d));
}
@@ -40,7 +40,7 @@ typedef __int128_t double_xlen_t;
#endif
double_xlen_t test_R_wide_scalar(double_xlen_t p) {
// CHECK-LABEL: define{{.*}} {{i128|i64}} @test_R_wide_scalar(
-// CHECK: call {{i128|i64}} asm sideeffect "", "=R,R"({{i128|i64}} %{{.*}})
+// CHECK: call {{i128|i64}} asm sideeffect "", "=R,R,~{vl},~{vtype}"({{i128|i64}} %{{.*}})
double_xlen_t ret;
asm volatile("" : "=R"(ret) : "R"(p));
return ret;
@@ -48,7 +48,7 @@ double_xlen_t test_R_wide_scalar(double_xlen_t p) {
double_xlen_t test_cR_wide_scalar(double_xlen_t p) {
// CHECK-LABEL: define{{.*}} {{i128|i64}} @test_cR_wide_scalar(
-// CHECK: call {{i128|i64}} asm sideeffect "", "=^cR,^cR"({{i128|i64}} %{{.*}})
+// CHECK: call {{i128|i64}} asm sideeffect "", "=^cR,^cR,~{vl},~{vtype}"({{i128|i64}} %{{.*}})
double_xlen_t ret;
asm volatile("" : "=cR"(ret) : "cR"(p));
return ret;
@@ -56,23 +56,23 @@ double_xlen_t test_cR_wide_scalar(double_xlen_t p) {
void test_I(void) {
// CHECK-LABEL: define{{.*}} void @test_I()
-// CHECK: call void asm sideeffect "", "I"(i32 2047)
+// CHECK: call void asm sideeffect "", "I,~{vl},~{vtype}"(i32 2047)
asm volatile ("" :: "I"(2047));
-// CHECK: call void asm sideeffect "", "I"(i32 -2048)
+// CHECK: call void asm sideeffect "", "I,~{vl},~{vtype}"(i32 -2048)
asm volatile ("" :: "I"(-2048));
}
void test_J(void) {
// CHECK-LABEL: define{{.*}} void @test_J()
-// CHECK: call void asm sideeffect "", "J"(i32 0)
+// CHECK: call void asm sideeffect "", "J,~{vl},~{vtype}"(i32 0)
asm volatile ("" :: "J"(0));
}
void test_K(void) {
// CHECK-LABEL: define{{.*}} void @test_K()
-// CHECK: call void asm sideeffect "", "K"(i32 31)
+// CHECK: call void asm sideeffect "", "K,~{vl},~{vtype}"(i32 31)
asm volatile ("" :: "K"(31));
-// CHECK: call void asm sideeffect "", "K"(i32 0)
+// CHECK: call void asm sideeffect "", "K,~{vl},~{vtype}"(i32 0)
asm volatile ("" :: "K"(0));
}
@@ -81,16 +81,16 @@ double d;
void test_f(void) {
// CHECK-LABEL: define{{.*}} void @test_f()
// CHECK: [[FLT_ARG:%[a-zA-Z_0-9]+]] = load float, ptr @f
-// CHECK: call void asm sideeffect "", "f"(float [[FLT_ARG]])
+// CHECK: call void asm sideeffect "", "f,~{vl},~{vtype}"(float [[FLT_ARG]])
asm volatile ("" :: "f"(f));
// CHECK: [[FLT_ARG:%[a-zA-Z_0-9]+]] = load double, ptr @d
-// CHECK: call void asm sideeffect "", "f"(double [[FLT_ARG]])
+// CHECK: call void asm sideeffect "", "f,~{vl},~{vtype}"(double [[FLT_ARG]])
asm volatile ("" :: "f"(d));
}
void test_A(int *p) {
// CHECK-LABEL: define{{.*}} void @test_A(ptr noundef %p)
-// CHECK: call void asm sideeffect "", "*A"(ptr elementtype(i32) %p)
+// CHECK: call void asm sideeffect "", "*A,~{vl},~{vtype}"(ptr elementtype(i32) %p)
asm volatile("" :: "A"(*p));
}
@@ -98,9 +98,9 @@ extern int var, arr[2][2];
struct Pair { int a, b; } pair;
// CHECK-LABEL: test_s(
-// CHECK: call void asm sideeffect "// $0 $1 $2", "s,s,s"(ptr nonnull @var, ptr nonnull getelementptr inbounds nuw (i8, ptr @arr, {{.*}}), ptr nonnull @test_s)
-// CHECK: call void asm sideeffect "// $0", "s"(ptr nonnull getelementptr inbounds nuw (i8, ptr @pair, {{.*}}))
-// CHECK: call void asm sideeffect "// $0 $1 $2", "S,S,S"(ptr nonnull @var, ptr nonnull getelementptr inbounds nuw (i8, ptr @arr, {{.*}}), ptr nonnull @test_s)
+// CHECK: call void asm sideeffect "// $0 $1 $2", "s,s,s,~{vl},~{vtype}"(ptr nonnull @var, ptr nonnull getelementptr inbounds nuw (i8, ptr @arr, {{.*}}), ptr nonnull @test_s)
+// CHECK: call void asm sideeffect "// $0", "s,~{vl},~{vtype}"(ptr nonnull getelementptr inbounds nuw (i8, ptr @pair, {{.*}}))
+// CHECK: call void asm sideeffect "// $0 $1 $2", "S,S,S,~{vl},~{vtype}"(ptr nonnull @var, ptr nonnull getelementptr inbounds nuw (i8, ptr @arr, {{.*}}), ptr nonnull @test_s)
void test_s(void) {
asm("// %0 %1 %2" :: "s"(&var), "s"(&arr[1][1]), "s"(test_s));
asm("// %0" :: "s"(&pair.b));
@@ -109,9 +109,9 @@ void test_s(void) {
}
// CHECK-LABEL: test_modifiers(
-// CHECK: call void asm sideeffect "// ${0:i} ${1:i}", "r,r"({{i32|i64}} %val, i32 37)
-// CHECK: call void asm sideeffect "// ${0:z} ${1:z}", "i,i"(i32 0, i32 1)
-// CHECK: call void asm sideeffect "// ${0:N}", "r"({{i32|i64}} %val)
+// CHECK: call void asm sideeffect "// ${0:i} ${1:i}", "r,r,~{vl},~{vtype}"({{i32|i64}} %val, i32 37)
+// CHECK: call void asm sideeffect "// ${0:z} ${1:z}", "i,i,~{vl},~{vtype}"(i32 0, i32 1)
+// CHECK: call void asm sideeffect "// ${0:N}", "r,~{vl},~{vtype}"({{i32|i64}} %val)
void test_modifiers(long val) {
asm volatile("// %i0 %i1" :: "r"(val), "r"(37));
asm volatile("// %z0 %z1" :: "i"(0), "i"(1));
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm-clobber.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm-clobber.ll
new file mode 100644
index 0000000000000..75d8b4c5437b1
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm-clobber.ll
@@ -0,0 +1,37 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv64 -mcpu=sifive-x280 -verify-machineinstrs < %s | FileCheck %s
+
+define void @foo(<vscale x 8 x half> %0, <vscale x 8 x half> %1) {
+; CHECK-LABEL: foo:
+; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: vsetvli a0, zero, e32, m4, ta, ma
+; CHECK-NEXT: vmv.v.i v12, 0
+; CHECK-NEXT: lui a0, 1
+; CHECK-NEXT: addiw a0, a0, -1096
+; CHECK-NEXT: vmv.v.i v16, 0
+; CHECK-NEXT: vsetvli zero, a0, e8, m1, ta, ma
+; CHECK-NEXT: #APP
+; CHECK-NEXT: vfmadd.vv v16, v12, v12
+; CHECK-NEXT: #NO_APP
+; CHECK-NEXT: #APP
+; CHECK-NEXT: vfmadd.vv v16, v12, v12
+; CHECK-NEXT: #NO_APP
+; CHECK-NEXT: vsetvli zero, a0, e16, m2, ta, ma
+; CHECK-NEXT: vse16.v v8, (zero)
+; CHECK-NEXT: ret
+entry:
+ %2 = tail call i64 @llvm.riscv.vsetvli.i64(i64 3000, i64 0, i64 0)
+ %3 = tail call <vscale x 8 x float> asm sideeffect "vfmadd.vv $0, $1, $2", "=^vr,^vr,^vr,0,~{vl},~{vtype}"(<vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer)
+ %4 = tail call <vscale x 8 x float> asm sideeffect "vfmadd.vv $0, $1, $2", "=^vr,^vr,^vr,0,~{vl},~{vtype}"(<vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer, <vscale x 8 x float> %3)
+ tail call void @llvm.riscv.vse.nxv8f16.i64(<vscale x 8 x half> %0, ptr null, i64 %2)
+ ret void
+}
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
+declare i64 @llvm.riscv.vsetvli.i64(i64, i64 immarg, i64 immarg) #0
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: write)
+declare void @llvm.riscv.vse.nxv8f16.i64(<vscale x 8 x half>, ptr nocapture, i64) #1
+
+attributes #0 = { nocallback nofree nosync nounwind willreturn memory(none) }
+attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: write) }
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm.ll
new file mode 100644
index 0000000000000..e60ce90d5ed8b
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvl-cross-inline-asm.ll
@@ -0,0 +1,37 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv64 -mcpu=sifive-x280 -verify-machineinstrs < %s | FileCheck %s
+
+define void @foo(<vscale x 8 x half> %0, <vscale x 8 x half> %1) {
+; CHECK-LABEL: foo:
+; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: vsetvli a0, zero, e32, m4, ta, ma
+; CHECK-NEXT: vmv.v.i v12, 0
+; CHECK-NEXT: lui a0, 1
+; CHECK-NEXT: addiw a0, a0, -1096
+; CHECK-NEXT: vmv.v.i v16, 0
+; CHECK-NEXT: vsetvli zero, a0, e8, m1, ta, ma
+; CHECK-NEXT: #APP
+; CHECK-NEXT: vfmadd.vv v16, v12, v12
+; CHECK-NEXT: #NO_APP
+; CHECK-NEXT: vsetvli zero, a0, e16, m2, ta, ma
+; CHECK-NEXT: #APP
+; CHECK-NEXT: vfmadd.vv v16, v12, v12
+; CHECK-NEXT: #NO_APP
+; CHECK-NEXT: vse16.v v8, (zero)
+; CHECK-NEXT: ret
+entry:
+ %2 = tail call i64 @llvm.riscv.vsetvli.i64(i64 3000, i64 0, i64 0)
+ %3 = tail call <vscale x 8 x float> asm sideeffect "vfmadd.vv $0, $1, $2", "=^vr,^vr,^vr,0"(<vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer)
+ %4 = tail call <vscale x 8 x float> asm sideeffect "vfmadd.vv $0, $1, $2", "=^vr,^vr,^vr,0"(<vscale x 8 x float> zeroinitializer, <vscale x 8 x float> zeroinitializer, <vscale x 8 x float> %3)
+ tail call void @llvm.riscv.vse.nxv8f16.i64(<vscale x 8 x half> %0, ptr null, i64 %2)
+ ret void
+}
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
+declare i64 @llvm.riscv.vsetvli.i64(i64, i64 immarg, i64 immarg) #0
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: write)
+declare void @llvm.riscv.vse.nxv8f16.i64(<vscale x 8 x half>, ptr nocapture, i64) #1
+
+attributes #0 = { nocallback nofree nosync nounwind willreturn memory(none) }
+attributes #1 = { nocallback nofree nosync nounwind willreturn memory(argmem: write) }
I think this may not be the right way.
Yes, you're right. But the purpose of adding the vl/vtype dependencies is to prevent the post-RA scheduler from moving vsetvli instructions across inline assembly. I'm not sure if there's a better approach to this problem.
I see. But what's the difference if we add vl/vtype to the list of clobbered registers explicitly in C/C++ asm statements?
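Concretely, that alternative would put the burden on the user at each asm statement, something like this (illustrative, using the same constraints as the tests above):

    vint32m1_t ret;
    asm volatile("vadd.vv %0, %1, %2"
                 : "=vr"(ret)
                 : "vr"(a), "vr"(b)
                 : "vl", "vtype");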
@@ -68,7 +68,7 @@ class RISCVTargetInfo : public TargetInfo {
     return TargetInfo::VoidPtrBuiltinVaList;
   }

-  std::string_view getClobbers() const override { return ""; }
+  std::string_view getClobbers() const override { return "~{vl},~{vtype}"; }
Suggested change:
-  std::string_view getClobbers() const override { return "~{vl},~{vtype}"; }
+  std::string_view getClobbers() const override {
+    if (ISAInfo->hasExtension("zve32x"))
+      return "~{vl},~{vtype}";
+    return "";
+  }
I think this implementation is better; I'll just update the patch with your implementation.
That doesn't work with the target attribute.
Maybe have RISCVInsertVSETVLI add implicit use operands to the inline assembly at that point? It's what we do for all the vector instructions. In this case, you might want implicit defs.
I think what @wangpc-pp advocated here (please correct me if I'm wrong) was that the user should be responsible for annotating these registers as clobbered, so that we have more freedom in scheduling when the inline assembly does not use any vector instructions. The other approaches -- regardless of whether we mark these two registers as clobbered or implicit-def them in RISCVInsertVSETVLI -- take a more conservative path to ensure correctness. It's a shame that we're not able to analyze the instructions within inline assembly to make a better decision. Personally I think we should prioritize correctness, so I'm more inclined toward the latter approaches. I do agree that we probably should only do it when vector instructions are present, as pointed out in one of the review comments by @wangpc-pp.
Yes, this is exactly what I meant!
I am not opposed to this. But I have to say, the use case is not so common. We don't mix assembly containing RVV instructions with RVV intrinsics in the same code; in that scenario, we should write all the RVV instructions in asm statements or just use RVV intrinsics. Even for these rare cases, the user can achieve the same effect if they put vl/vtype in the clobber list themselves.
They can, but will they? Clobbers are a relatively uncommon feature; I suspect many people don't know about them. I think maybe we need documentation on how to use RVV with inline assembly. I've seen a lot of things that aren't technically correct but usually work: for example, one statement per instruction with hand-allocated vector registers and no clobbers. The vector registers aren't technically live between the statements as far as the compiler is concerned.
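The kind of technically incorrect pattern meant here, sketched with made-up variables (n, src, dst): each statement is a separate asm blob with hand-allocated registers and no clobbers, so the compiler does not consider v8, or vl/vtype, live between statements:

    asm volatile("vsetvli t0, %0, e32, m1, ta, ma" :: "r"(n) : "t0");
    asm volatile("vle32.v v8, (%0)" :: "r"(src));   // writes v8, never declared
    asm volatile("vadd.vv v8, v8, v8");             // relies on v8 surviving
    asm volatile("vse32.v v8, (%0)" :: "r"(dst));   // relies on v8 and vl/vtype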
clang/lib/Basic/Targets/RISCV.h (outdated)
@@ -68,7 +68,11 @@ class RISCVTargetInfo : public TargetInfo {
     return TargetInfo::VoidPtrBuiltinVaList;
   }

-  std::string_view getClobbers() const override { return ""; }
+  std::string_view getClobbers() const override {
+    if (ISAInfo->hasExtension("zve32x"))
This doesn't work if zve32x is added by __attribute__((target("zve32x"))).
I just reverted to the original implementation.
I am not familiar with the target attribute implementation. Can we get the list of function features here and add the clobbers at clang/lib/CodeGen/CGStmt.cpp, lines 3087 to 3093 (at 6d93280)?
cc @topperc @4vtomat
I'm not sure. But I don't understand why we need to check for Zve32x at all.
See the previous discussion; it is weird to add vector CSRs to the clobbers if there is no vector support.
I think we should try this RISCVInsertVSETVLI idea.
I tested the following case without passing the 'v' extension in the Clang command-line arguments:

    __attribute__((target("arch=rv32gcv_zve32x")))
    void test_A(int *p) {
      asm volatile("" :: "A"(*p));
    }

From the generated LLVM IR, it appears that the target attribute does not have the intended effect in this case. Given this, perhaps we should keep the initial implementation temporarily and try the RISCVInsertVSETVLI approach as a future improvement.
Agreed, we should try the RISCVInsertVSETVLI approach.
This reverts commit 4258fae.
The RISCVInsertVSETVLI approach is probably just a matter of doing something like the snippet below. We may want to check whether an implicit operand already exists before adding one; I don't think addOperand checks for duplicates.
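A minimal sketch of such a change, assuming it lives in the pass's per-instruction walk (the placement is hypothetical; the actual fix landed as #130733, which adds implicit uses):

    // In RISCVInsertVSETVLI, when visiting an inline-assembly MachineInstr:
    if (MI.isInlineAsm()) {
      // Pin vl/vtype so post-RA scheduling cannot move a vsetvli across
      // the asm. Note: addOperand() does not deduplicate, so a real
      // implementation should first check the operands aren't already there.
      MI.addOperand(MachineOperand::CreateReg(RISCV::VL, /*isDef=*/false,
                                              /*isImp=*/true));
      MI.addOperand(MachineOperand::CreateReg(RISCV::VTYPE, /*isDef=*/false,
                                              /*isImp=*/true));
    }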
Would there be some way to explicitly request not to clobber vl/vtype? An asm block is quite useful as an "optimization fence" on x86 and ARM NEON, but it'd be sad if that necessarily meant extra vsetvli spam on RVV (currently clang doesn't do much optimization on RVV intrinsics, so there's not much chance for it to misoptimize, but presumably that may change; more significant for …)
I created another pull request, #130733, implemented with the RISCVInsertVSETVLI approach, since it's a new approach.
…TVLI when u…sing inline assembly (#130733). Fixing #128636. This patch has RISCVInsertVSETVLI add implicit use operands to inline assembly; this approach was suggested by @preames and the implementation I referenced is from @topperc. The purpose of adding the vl and vtype implicit operands is to prevent the post-RA scheduler from moving vsetvli across inline assembly.
#130733 is merged. This PR will be marked as closed.