Implement AVX512f floating point comparisons #869

Merged
merged 56 commits into from
Jul 15, 2020

Daniel-B-Smith
Contributor

Daniel-B-Smith commented Jun 14, 2020

This adds the AVX512F floating-point comparison intrinsics. With these, all of the AVX512F comparison intrinsics should be implemented.

@bors
Contributor

bors commented Jun 16, 2020

☔ The latest upstream changes (presumably b17efd8) made this pull request unmergeable. Please resolve the merge conflicts.

Daniel-B-Smith changed the title from "DRAFT: Implement AVX512f floating point comparisons" to "Implement AVX512f floating point comparisons" on Jul 5, 2020
Daniel-B-Smith marked this pull request as ready for review on July 5, 2020, 23:31
@Daniel-B-Smith
Contributor Author

This should be ready for review. One rough edge remains: if you pass an invalid value for sae, the intrinsic fails to link and LLVM gives a very unhelpful error message. In particular, passing 0 in the assert_instr failed to link.

@Amanieu
Member

Amanieu commented Jul 6, 2020

It seems that not all values are valid for the sae argument. You should replicate the exact rules (from here) in a custom constify macro.
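
For illustration, a minimal sketch of what such a macro could look like. The macro name `constify_sae_sketch` and the `fake_intrinsic` stand-in are hypothetical, and the accepted set (only `_MM_FROUND_CUR_DIRECTION` = 0x04 and `_MM_FROUND_NO_EXC` = 0x08) is an assumption rather than something stated in this PR; the linked rules are authoritative.

```rust
// Stand-in for an LLVM intrinsic that needs `sae` as a compile-time immediate.
// Hypothetical: it only echoes the immediate back so the dispatch can be observed.
fn fake_intrinsic<const SAE: i32>() -> i32 {
    SAE
}

// Hypothetical macro: maps a runtime `sae` value onto a literal immediate and
// rejects everything else up front instead of failing at link time.
// Assumption: only 0x04 (_MM_FROUND_CUR_DIRECTION) and 0x08 (_MM_FROUND_NO_EXC)
// are valid for `sae`.
macro_rules! constify_sae_sketch {
    ($sae:expr, $expand:ident) => {
        match $sae {
            0x04 => $expand!(0x04),
            0x08 => $expand!(0x08),
            _ => panic!("invalid sae value"),
        }
    };
}

// Dispatches a runtime `sae` to the stand-in intrinsic through the macro.
fn dispatch_sae(sae: i32) -> i32 {
    macro_rules! call {
        ($imm:expr) => {
            fake_intrinsic::<{ $imm }>()
        };
    }
    constify_sae_sketch!(sae, call)
}
```

With this shape, `dispatch_sae(0)` panics with a readable message rather than producing an unlinkable intrinsic call.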

@Daniel-B-Smith
Contributor Author

Done. Thanks! I looked for where and how the sae parameter is validated, but couldn't find it.

vcmpps(a.as_f32x16(), b.as_f32x16(), $imm5, neg_one, $imm4)
};
}
let r = constify_imm5_sae!(op, _MM_FROUND_CUR_DIRECTION, call);
Member

Why not just use constify_imm5 here and hard-code the rounding mode in the vcmpps call above? This code creates many unnecessary match branches.

Contributor Author

Just a bad refactoring. I had it like that originally. Should be fixed now.
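
A rough sketch of the shape suggested in this thread: dispatch only over the comparison operator and hard-code the rounding mode inside the call macro. Everything here is hypothetical and simplified. `constify_imm2_sketch` covers a 2-bit immediate instead of the 5-bit one, `fake_cmp` stands in for `vcmpps`, and `SAE_CUR_DIRECTION` assumes the value 0x04 for `_MM_FROUND_CUR_DIRECTION`.

```rust
// Assumed value of _MM_FROUND_CUR_DIRECTION.
const SAE_CUR_DIRECTION: i32 = 0x04;

// Stand-in for vcmpps: takes both immediates as const generics and echoes them.
fn fake_cmp<const OP: i32, const SAE: i32>() -> (i32, i32) {
    (OP, SAE)
}

// Simplified stand-in for constify_imm5: one match arm per operator value,
// here for a 2-bit immediate to keep the sketch short.
macro_rules! constify_imm2_sketch {
    ($imm:expr, $expand:ident) => {
        match ($imm) & 0b11 {
            0 => $expand!(0),
            1 => $expand!(1),
            2 => $expand!(2),
            _ => $expand!(3),
        }
    };
}

// Only `op` varies at runtime; the rounding mode is fixed in `call!`,
// so no match arms are generated for it.
fn cmp_sketch(op: i32) -> (i32, i32) {
    macro_rules! call {
        ($imm:expr) => {
            fake_cmp::<{ $imm }, { SAE_CUR_DIRECTION }>()
        };
    }
    constify_imm2_sketch!(op, call)
}
```

Dispatching over both immediates would multiply the arm counts, while fixing the rounding mode keeps the match at one arm per operator value.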

@Daniel-B-Smith
Contributor Author

CI is fixed now.

Amanieu merged commit fa96710 into rust-lang:master on Jul 15, 2020
@Amanieu
Member

Amanieu commented Jul 15, 2020

Thanks!

5 participants