[LLVM][SVE] Implement isel for bfloat fptoi and itofp operations. #129713
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5465,6 +5465,14 @@ multiclass sve_int_dup_fpimm_pred<string asm> { | |
(!cast<Instruction>(NAME # _S) $zd, $pg, fpimm32:$imm8)>; | ||
def : Pat<(nxv2f64 (vselect nxv2i1:$pg, (splat_vector fpimm64:$imm8), nxv2f64:$zd)), | ||
(!cast<Instruction>(NAME # _D) $zd, $pg, fpimm64:$imm8)>; | ||
|
||
// Some half precision immediates alias with bfloat (e.g. f16(1.875) == bf16(1.0)). | ||
This comment implies that some don't, so what happens if fpimmbf16 matches a value that the fp16 variant doesn't have? Or should the comment actually be something like
Not sure I understand. The comment is saying "some" half precision immediates alias with bfloat and the way this is achieved is by using the
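To make the aliasing in the code comment concrete, here is a minimal sketch that decodes the same 16-bit pattern as both formats. It is not code from this patch: `decodeF16` and `decodeBF16` are hypothetical helpers written for this illustration, and they only handle normal numbers.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: decode an IEEE half-precision bit pattern
// (1 sign bit, 5 exponent bits, 10 fraction bits, bias 15).
// Only normal numbers are handled, which is enough for this illustration.
double decodeF16(uint16_t bits) {
  int sign = bits >> 15;
  int exp = (bits >> 10) & 0x1F;
  int frac = bits & 0x3FF;
  double value = std::ldexp(1.0 + frac / 1024.0, exp - 15);
  return sign ? -value : value;
}

// Hypothetical helper: decode a bfloat16 bit pattern
// (1 sign bit, 8 exponent bits, 7 fraction bits, bias 127).
// Only normal numbers are handled.
double decodeBF16(uint16_t bits) {
  int sign = bits >> 15;
  int exp = (bits >> 7) & 0xFF;
  int frac = bits & 0x7F;
  double value = std::ldexp(1.0 + frac / 128.0, exp - 127);
  return sign ? -value : value;
}
```

The bit pattern `0x3F80` decodes to 1.0 with `decodeBF16` and to 1.875 with `decodeF16`, which is the pair quoted in the comment.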
|
||
def : Pat<(nxv8bf16 (vselect nxv8i1:$pg, (splat_vector fpimmbf16:$imm8), nxv8bf16:$zd)), | ||
(!cast<Instruction>(NAME # _H) $zd, $pg, (fpimm16XForm bf16:$imm8))>; | ||
def : Pat<(nxv4bf16 (vselect nxv4i1:$pg, (splat_vector fpimmbf16:$imm8), nxv4bf16:$zd)), | ||
(!cast<Instruction>(NAME # _H) $zd, $pg, (fpimm16XForm bf16:$imm8))>; | ||
def : Pat<(nxv2bf16 (vselect nxv2i1:$pg, (splat_vector fpimmbf16:$imm8), nxv2bf16:$zd)), | ||
(!cast<Instruction>(NAME # _H) $zd, $pg, (fpimm16XForm bf16:$imm8))>; | ||
} | ||
|
||
class sve_int_dup_imm_pred<bits<2> sz8_64, bit m, string asm, | ||
|
I assume the reason you've put this after the i1 case above is because you believe that when converting nxv8bf16 -> nxv8i1 it's better to do:
FP_EXTEND: nxv8bf16 -> nxv8f32
VectorFP_TO_INT: nxv8f32 -> nxv8i32
SETNE: nxv8f32, zero -> nxv8i1
than
FP_EXTEND: nxv8bf16 -> nxv8f32
VectorFP_TO_INT: nxv8f32 -> nxv8i1
Presumably because you think SETNE will do a better job of splitting with an i1 result element type, than VectorFP_TO_INT?
Not really. I only put the bail out code here because it's the next blob of code that definitely doesn't support MVT::nxv8f32. When I move it before the i1 handling the output changes thusly:
Looking at the Neoverse SWOG the two compares of the new output look like they'll be serialised and so I might have hit the better output by fluke rather than judgement?
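As a rough per-element model of the first sequence discussed in this thread (extend, convert to a wider integer, then compare against zero), the following scalar sketch may help. It is an illustration only and not the lowering code in the patch: `extendBF16` and `bf16ToI1` are hypothetical names, and the scalar cast is only defined for inputs whose truncated value fits in `int32_t`.

```cpp
#include <cstdint>
#include <cstring>

// Models FP_EXTEND for one element: a bfloat16 value is the top 16 bits
// of the equivalent IEEE single-precision value.
float extendBF16(uint16_t bits) {
  uint32_t wide = static_cast<uint32_t>(bits) << 16;
  float result;
  std::memcpy(&result, &wide, sizeof(result));
  return result;
}

// Models FP_TO_INT to a 32-bit integer followed by SETNE against zero.
// Assumption: the input is finite and in range for int32_t; otherwise the
// C++ cast is undefined behaviour.
bool bf16ToI1(uint16_t bits) {
  float wide = extendBF16(bits);
  int32_t asInt = static_cast<int32_t>(wide); // truncates toward zero
  return asInt != 0;
}
```

For example, the bit pattern `0x3F80` (1.0) gives `true`, while `0x3F00` (0.5) truncates to zero and gives `false`.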