Skip to content

Add remaining integer AVX 512f comparison intrinsics #864

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Daniel-B-Smith
Copy link
Contributor

@Daniel-B-Smith Daniel-B-Smith commented May 31, 2020

This is just a draft. I created the PR in order to get CI to run.

@rust-highfive
Copy link

r? @gnzlbg

(rust_highfive has picked a reviewer for you, use r? to override)

@Daniel-B-Smith Daniel-B-Smith marked this pull request as ready for review June 6, 2020 19:09
Copy link
Contributor Author

@Daniel-B-Smith Daniel-B-Smith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be ready for review.

#[target_feature(enable = "avx512f")]
#[rustc_args_required_const(3)]
#[cfg_attr(test, assert_instr(vpcmp, op = 0))]
pub unsafe fn _mm512_mask_cmp_epu64_mask(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably, all of the other intrinsics should be implemented on top of this intrinsic. My preference would be to declare victory and move on, but I can make the change if you would like.

#[target_feature(enable = "avx512f")]
#[rustc_args_required_const(2)]
#[cfg_attr(test, assert_instr(vpcmp, op = 0))]
pub unsafe fn _mm512_cmp_epu64_mask(a: __m512i, b: __m512i, op: _MM_CMPINT_ENUM) -> __mmask8 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of annoying to have the full implementation duplicated here. When I attempted to call _mm512_mask_cmp_epu64_mask here, the constness of op was not propagated along, so I got compiler errors that the argument needed to be a constant.

@Amanieu
Copy link
Member

Amanieu commented Jun 7, 2020

Are you planning on adding the epi32/epu32 intrinsics as well. Those are also part of AVX512F.

@Daniel-B-Smith
Copy link
Contributor Author

I'm a little confused by the documentation for the 32 bit comparisons. Most of the 32 bit comparisons read CPUID Flags: AVX512F for AVX-512, KNCNI for KNC, e.g.:

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=727,1063,4909,1062,1062,1063,874,850,779&avx512techs=AVX512F&text=_mm512_cmpeq

The cmplt_epi32 variants have separate AVX 512f and KNCNI documentation entries (which is why those ended up on Alex Crichton's list):

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=727,1063,4909,1062,1062,1063,874,850,779,1029,1029,1030&text=_mm512_cmplt_epi32_mask

I have been reading that documentation as being an implicit AND between AVX512f and Knight's Corner, but it may be an implicit OR (clang implements all of the combined comparisons in the AVX512f header). If it's the latter, we will also need to update the verification tests to accept avx512f for avx512f/kncni (see CI results).

@Daniel-B-Smith Daniel-B-Smith changed the title Add remaining AVX 512f comparison intrinsics Add remaining integer AVX 512f comparison intrinsics Jun 7, 2020
@Amanieu
Copy link
Member

Amanieu commented Jun 8, 2020

I believe the condition is AVX512F OR KNC. I would ignore KNC for now and just gate it on AVX512F only.

@Daniel-B-Smith
Copy link
Contributor Author

I've added the 32 bit integer comparisons and updated stdarch-verify to accept "avx512f/kncni" intrinsics to be gated by solely "avx512f". Now that I know it is an OR, I will update my list so that I get all of the mixed intrinsics going forward.

@Daniel-B-Smith
Copy link
Contributor Author

Closing since #866 also contains these changes.

@Daniel-B-Smith Daniel-B-Smith deleted the moar-avx512f-cmp branch June 14, 2020 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants