[RISCV] Account for zvfhmin and zvfbfmin promotion in register usage #108370

lukel97 · 2024-09-12T11:37:59Z

A half vector with only zvfhmin, or a bfloat vector, ends up being promoted to f32 for most instructions.

Unless the loop consists only of memory ops and permutation instructions, which don't need to be promoted (is this common?), we'll end up using double the LMUL of what getRegUsageForType currently returns.

Since this is used by the loop vectorizer, it seems better to be conservative and assume that any use of a zvfhmin half/bfloat will end up being widened to f32.
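
To make the doubling concrete, here is a minimal standalone sketch of the arithmetic, not the actual TTI code. `Elt`, `ScalableVec` and `regUsageForType` are hypothetical stand-ins for LLVM's types, and only scalable vectors are modelled; the constant 64 is assumed to match `RISCV::RVVBitsPerBlock`.

```cpp
#include <cassert>

// Assumed to match RISCV::RVVBitsPerBlock in LLVM.
constexpr unsigned RVVBitsPerBlock = 64;

// Same rounding as llvm::divideCeil.
unsigned divideCeil(unsigned Numerator, unsigned Denominator) {
  assert(Denominator != 0 && "division by zero");
  return (Numerator + Denominator - 1) / Denominator;
}

// Hypothetical, simplified model of a scalable vector type.
enum class Elt { Half, BFloat, Float };
struct ScalableVec {
  Elt Element;
  unsigned MinElts; // known minimum element count, e.g. 4 for <vscale x 4 x half>
};

unsigned eltBits(Elt E) { return E == Elt::Float ? 32 : 16; }

// Sketch of the conservative estimate: half without native f16 vector
// arithmetic (zvfhmin only) and bfloat are costed as if already widened to f32.
unsigned regUsageForType(ScalableVec Ty, bool HasVInstructionsF16) {
  bool Promoted = (Ty.Element == Elt::Half && !HasVInstructionsF16) ||
                  Ty.Element == Elt::BFloat;
  unsigned Bits = (Promoted ? 32 : eltBits(Ty.Element)) * Ty.MinElts;
  return divideCeil(Bits, RVVBitsPerBlock);
}
```

Under these assumptions a `<vscale x 4 x half>` value costs one vector register with zvfh and two with only zvfhmin, and a `<vscale x 4 x bfloat>` value always costs two.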

llvmbot · 2024-09-12T11:38:31Z

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-transforms

Author: Luke Lau (lukel97)

Changes

A half vector with only zvfhmin, or a bfloat vector, ends up being promoted to f32 for most instructions.

Unless the loop consists only of memory ops and permutation instructions, which don't need to be promoted (is this common?), we'll end up using double the LMUL of what getRegUsageForType currently returns.

Since this is used by the loop vectorizer, it seems better to be conservative and assume that any use of a zvfhmin half/bfloat will end up being widened to f32.

Full diff: https://github.com/llvm/llvm-project/pull/108370.diff

3 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+8-1)
(added) llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll (+31)
(added) llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-f16.ll (+37)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 2b5e7c47279284..3303534ecb4968 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2030,8 +2030,15 @@ void RISCVTTIImpl::getPeelingPreferences(Loop *L, ScalarEvolution &SE,
 }
 
 unsigned RISCVTTIImpl::getRegUsageForType(Type *Ty) {
-  TypeSize Size = DL.getTypeSizeInBits(Ty);
   if (Ty->isVectorTy()) {
+    // f16 w/ zvfhmin and bf16 types will be promoted to f32
+    Type *EltTy = cast<VectorType>(Ty)->getElementType();
+    if ((EltTy->isHalfTy() && !ST->hasVInstructionsF16()) ||
+        EltTy->isBFloatTy())
+      Ty = VectorType::get(Type::getFloatTy(Ty->getContext()),
+                           cast<VectorType>(Ty));
+
+    TypeSize Size = DL.getTypeSizeInBits(Ty);
     if (Size.isScalable() && ST->hasVInstructions())
       return divideCeil(Size.getKnownMinValue(), RISCV::RVVBitsPerBlock);
 
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll b/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll
new file mode 100644
index 00000000000000..89514431278a74
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll
@@ -0,0 +1,31 @@
+; RUN: opt -passes=loop-vectorize -mtriple riscv64 -mattr=+v,+zvfbfmin -debug-only=loop-vectorize -riscv-v-register-bit-width-lmul=1 -S < %s 2>&1 | FileCheck %s
+
+define void @add(ptr noalias nocapture readonly %src1, ptr noalias nocapture readonly %src2, i32 signext %size, ptr noalias nocapture writeonly %result) {
+; CHECK-LABEL: add
+; CHECK:       LV(REG): Found max usage: 2 item
+; CHECK-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 2 registers
+; CHECK-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 4 registers
+; CHECK-NEXT:  LV(REG): Found invariant usage: 1 item
+; CHECK-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 1 registers
+
+entry:
+  %conv = zext i32 %size to i64
+  %cmp10.not = icmp eq i32 %size, 0
+  br i1 %cmp10.not, label %for.cond.cleanup, label %for.body
+
+for.cond.cleanup:
+  ret void
+
+for.body:
+  %i.011 = phi i64 [ %add4, %for.body ], [ 0, %entry ]
+  %arrayidx = getelementptr inbounds bfloat, ptr %src1, i64 %i.011
+  %0 = load bfloat, ptr %arrayidx, align 4
+  %arrayidx2 = getelementptr inbounds bfloat, ptr %src2, i64 %i.011
+  %1 = load bfloat, ptr %arrayidx2, align 4
+  %add = fadd bfloat %0, %1
+  %arrayidx3 = getelementptr inbounds bfloat, ptr %result, i64 %i.011
+  store bfloat %add, ptr %arrayidx3, align 4
+  %add4 = add nuw nsw i64 %i.011, 1
+  %exitcond.not = icmp eq i64 %add4, %conv
+  br i1 %exitcond.not, label %for.cond.cleanup, label %for.body
+}
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-f16.ll b/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-f16.ll
new file mode 100644
index 00000000000000..ceedcfba4691e1
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-f16.ll
@@ -0,0 +1,37 @@
+; RUN: opt -passes=loop-vectorize -mtriple riscv64 -mattr=+v,+zvfh -debug-only=loop-vectorize -riscv-v-register-bit-width-lmul=1 -S < %s 2>&1 | FileCheck %s --check-prefix=ZVFH
+; RUN: opt -passes=loop-vectorize -mtriple riscv64 -mattr=+v,+zvfhmin -debug-only=loop-vectorize -riscv-v-register-bit-width-lmul=1 -S < %s 2>&1 | FileCheck %s --check-prefix=ZVFHMIN
+
+define void @add(ptr noalias nocapture readonly %src1, ptr noalias nocapture readonly %src2, i32 signext %size, ptr noalias nocapture writeonly %result) {
+; CHECK-LABEL: add
+; ZVFH:       LV(REG): Found max usage: 2 item
+; ZVFH-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 2 registers
+; ZVFH-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 2 registers
+; ZVFH-NEXT:  LV(REG): Found invariant usage: 1 item
+; ZVFH-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 1 registers
+; ZVFHMIN:       LV(REG): Found max usage: 2 item
+; ZVFHMIN-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 2 registers
+; ZVFHMIN-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 4 registers
+; ZVFHMIN-NEXT:  LV(REG): Found invariant usage: 1 item
+; ZVFHMIN-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 1 registers
+
+entry:
+  %conv = zext i32 %size to i64
+  %cmp10.not = icmp eq i32 %size, 0
+  br i1 %cmp10.not, label %for.cond.cleanup, label %for.body
+
+for.cond.cleanup:
+  ret void
+
+for.body:
+  %i.011 = phi i64 [ %add4, %for.body ], [ 0, %entry ]
+  %arrayidx = getelementptr inbounds half, ptr %src1, i64 %i.011
+  %0 = load half, ptr %arrayidx, align 4
+  %arrayidx2 = getelementptr inbounds half, ptr %src2, i64 %i.011
+  %1 = load half, ptr %arrayidx2, align 4
+  %add = fadd half %0, %1
+  %arrayidx3 = getelementptr inbounds half, ptr %result, i64 %i.011
+  store half %add, ptr %arrayidx3, align 4
+  %add4 = add nuw nsw i64 %i.011, 1
+  %exitcond.not = icmp eq i64 %add4, %conv
+  br i1 %exitcond.not, label %for.cond.cleanup, label %for.body
+}

topperc

Seems reasonable to me.

jacquesguan

LGTM

llvm-ci · 2024-09-17T06:14:15Z

LLVM Buildbot has detected a new failure on builder fuchsia-x86_64-linux running on fuchsia-debian-64-us-central1-a-1 while building llvm at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/11/builds/5124

Here is the relevant piece of the build log for the reference

Step 4 (annotate) failure: 'python ../llvm-zorg/zorg/buildbot/builders/annotated/fuchsia-linux.py ...' (failure)
...
[1322/1327] Building CXX object unittests/Transforms/Scalar/CMakeFiles/ScalarTests.dir/LoopPassManagerTest.cpp.o
clang++: warning: optimization flag '-ffat-lto-objects' is not supported [-Wignored-optimization-argument]
[1323/1327] Linking CXX executable unittests/Transforms/Scalar/ScalarTests
[1323/1327] Running the LLVM regression tests
llvm-lit: /var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/ld.lld
llvm-lit: /var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/lld-link
llvm-lit: /var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/ld64.lld
llvm-lit: /var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/wasm-ld
-- Testing: 55717 tests, 60 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 
FAIL: LLVM :: Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll (44761 of 55717)
******************** TEST 'LLVM :: Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: /var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/opt -passes=loop-vectorize -mtriple riscv64 -mattr=+v,+zvfbfmin -debug-only=loop-vectorize -riscv-v-register-bit-width-lmul=1 -S < /var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll 2>&1 | /var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/FileCheck /var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll
+ /var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/opt -passes=loop-vectorize -mtriple riscv64 -mattr=+v,+zvfbfmin -debug-only=loop-vectorize -riscv-v-register-bit-width-lmul=1 -S
+ /var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/FileCheck /var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll
/var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll:4:16: error: CHECK-LABEL: expected string not found in input
; CHECK-LABEL: add
               ^
<stdin>:1:1: note: scanning from here
opt: Unknown command line argument '-debug-only=loop-vectorize'. Try: '/var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/opt --help'
^
<stdin>:1:18: note: possible intended match here
opt: Unknown command line argument '-debug-only=loop-vectorize'. Try: '/var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/opt --help'
                 ^

Input file: <stdin>
Check file: /var/lib/buildbot/fuchsia-x86_64-linux/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: opt: Unknown command line argument '-debug-only=loop-vectorize'. Try: '/var/lib/buildbot/fuchsia-x86_64-linux/build/llvm-build-2f042fl1/bin/opt --help' 
label:4'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
label:4'1                      ?                                                                                                                                       possible intended match
           2: opt: Did you mean '--debug-pass=loop-vectorize'? 
label:4'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 
FAIL: LLVM :: Transforms/LoopVectorize/RISCV/reg-usage-f16.ll (44763 of 55717)
******************** TEST 'LLVM :: Transforms/LoopVectorize/RISCV/reg-usage-f16.ll' FAILED ********************
Step 7 (check) failure: check (failure)
...

topperc · 2024-09-17T06:15:44Z

llvm/test/Transforms/LoopVectorize/RISCV/reg-usage-bf16.ll

@@ -0,0 +1,31 @@
+; RUN: opt -passes=loop-vectorize -mtriple riscv64 -mattr=+v,+zvfbfmin -debug-only=loop-vectorize -riscv-v-register-bit-width-lmul=1 -S < %s 2>&1 | FileCheck %s


Need REQUIRES: asserts

lukel97 · 2024-09-17

Done in 30d7dcc
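
For context on why the buildbot rejected `-debug-only`: that option and the `LV(REG)` debug output only exist in builds with assertions enabled, so a test that greps for them has to be restricted to such builds. The following is a rough sketch of the mechanism, not LLVM's actual code. `SKETCH_DEBUG`, `DebugCompiledIn` and `reportRegUsage` are hypothetical names, with `SKETCH_DEBUG` loosely modelled on `LLVM_DEBUG`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for LLVM_DEBUG: the wrapped statement is compiled in
// only when NDEBUG is not defined, i.e. in an assertions-enabled build.
#ifndef NDEBUG
#define SKETCH_DEBUG(stmt) do { stmt; } while (false)
constexpr bool DebugCompiledIn = true;
#else
#define SKETCH_DEBUG(stmt) do { } while (false)
constexpr bool DebugCompiledIn = false;
#endif

// Stand-in for the register-usage line that the new tests match with FileCheck.
std::string reportRegUsage(unsigned VectorRegs) {
  std::string Log;
  SKETCH_DEBUG(Log += "LV(REG): RegisterClass: RISCV::VRRC, " +
                      std::to_string(VectorRegs) + " registers\n");
  (void)VectorRegs; // unused when the debug statement is compiled out
  return Log;
}
```

In a release build without assertions the log stays empty, so there is nothing for the CHECK lines to match.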

lukel97 added 2 commits September 12, 2024 19:15

Precommit tests

c2790b9

lukel97 requested review from preames, kito-cheng, jacquesguan, topperc and wangpc-pp September 12, 2024 11:37

llvmbot added backend:RISC-V llvm:transforms labels Sep 12, 2024

Clarify wording in comment

c005980

topperc approved these changes Sep 12, 2024

jacquesguan approved these changes Sep 13, 2024

lukel97 merged commit 41f1b46 into llvm:main Sep 17, 2024
8 checks passed

topperc reviewed Sep 17, 2024
