Avx512f #901
merge from base
r? @Amanieu (rust_highfive has picked a reviewer for you, use r? to override)
// Constifies 8 bits along with an sae option without rounding control.
// This macro enforces that.
#[allow(unused)]
macro_rules! constify_imm8_sae {
How is this different from just constify_imm8?
($imm4:expr, $imm2:expr) => {
...
$imm2 << 2 | $imm4,
The LLVM function merges the two parameters into one. Clang accepts (0..15, 0..3), but the definition uses only 2 bits of imm4 and 2 bits of imm2. So should we follow Clang, or just use (0..3, 0..3) for the function?
But this macro only has ($imm8:expr, $expand:ident), just like constify_imm8.
Sorry. constify_imm8 covers [0..255]:
->254 => $expand!(254),
->_ => $expand!(255),
constify_imm8_sae has an out-of-range check:
->255 => $expand!(255),
->_ => panic!("Invalid sae value"),
Do you think this is necessary?
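The difference described in this comment can be sketched with a reduced pair of macros that cover only 2 bits instead of 8. The macro names `constify_imm2_wildcard` and `constify_imm2_checked`, the helper functions, and the `call!` stand-in are all hypothetical and exist only for illustration; they are not the code from this PR.

```rust
// Hypothetical 2-bit analogues; the real macros enumerate every value up to 255.

// Style of constify_imm8: the wildcard arm silently maps any
// unexpected value to the last constant.
macro_rules! constify_imm2_wildcard {
    ($imm:expr, $expand:ident) => {
        match $imm {
            0 => $expand!(0),
            1 => $expand!(1),
            2 => $expand!(2),
            _ => $expand!(3),
        }
    };
}

// Style of constify_imm8_sae: every valid value has its own arm,
// and the wildcard arm rejects everything else.
macro_rules! constify_imm2_checked {
    ($imm:expr, $expand:ident) => {
        match $imm {
            0 => $expand!(0),
            1 => $expand!(1),
            2 => $expand!(2),
            3 => $expand!(3),
            _ => panic!("Invalid sae value"),
        }
    };
}

// `call!` stands in for an intrinsic that needs a compile-time constant.
fn with_wildcard(imm: i32) -> i32 {
    macro_rules! call {
        ($c:expr) => {
            $c * 10
        };
    }
    constify_imm2_wildcard!(imm, call)
}

fn with_check(imm: i32) -> i32 {
    macro_rules! call {
        ($c:expr) => {
            $c * 10
        };
    }
    constify_imm2_checked!(imm, call)
}
```

In this sketch, `with_wildcard(7)` quietly behaves like `with_wildcard(3)`, while `with_check(7)` panics, which is the behavioral difference the naming or a comment would need to convey.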
Ah OK. I think the naming could be clearer, or at least a comment explaining the difference.
ok. I will make it clearer.
crates/core_arch/src/x86/avx512f.rs
let e = _mm512_setr_epi32(0, -2, 2, -4, 4, -6, 6, -8, 0, 0, 0, 0, 0, 0, 0, 0);
assert_eq_m512i(r, e);
}
*/
Why are these tests commented out?
LLVM uses a different way to implement float16 (ph). I will remove those lines.
(15, 1) => $expand!(15, 1),
(15, 2) => $expand!(15, 2),
(15, 3) => $expand!(15, 3),
(_, _) => panic!("Invalid sae value"),
Should this be invalid mantissa instead?
($imm4:expr, $imm2:expr) => {
...
$imm2 << 2 | $imm4,
The LLVM function merges the two parameters into one. Clang accepts (0..15, 0..3), but the definition uses only 2 bits of imm4 and 2 bits of imm2. So should we follow Clang, or just use (0..3, 0..3) for the function?
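To make the question concrete, here is a small sketch of the packing expression `$imm2 << 2 | $imm4` written as a plain function. The names `pack_imm` and `fits_in_low_two_bits` are hypothetical and the sample values are arbitrary; the sketch only shows that values of the first argument above 3 spill into the bits occupied by the second.

```rust
// Hypothetical helper mirroring `$imm2 << 2 | $imm4` from the macro.
// In Rust, `<<` binds tighter than `|`, so this is `(imm2 << 2) | imm4`.
fn pack_imm(imm4: i32, imm2: i32) -> i32 {
    imm2 << 2 | imm4
}

// Returns true when the first argument fits in the low 2 bits and
// therefore cannot collide with the bits holding the second argument.
fn fits_in_low_two_bits(imm4: i32) -> bool {
    (imm4 & !0b11) == 0
}
```

With these definitions, `pack_imm(4, 0)` and `pack_imm(0, 1)` both produce 4, which is the overlap that becomes possible once the first argument is allowed to exceed 3.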
When in doubt you should check what GCC accepts. It is generally closer to the spec than Clang.
Yes, I also tried the Intel compiler. GCC is closer to ICC, I guess.
Btw, Clang, GCC, and ICC all accept (0..15)(0..3).
In Rust, we use enum(0x00..0x03) and enum(0x00..0x02). So should we do (0..4)(0..3), or just use (0..16)(0..3) in this case?
Just use (0..15)(0..3) like the C compilers.
Why? Compatibility?
Yes
add constify_imm8_sae to fix rotation
getmant: ps,pd
getmant_round: ps,pd
roundps_pd
cvtps_pd