Add AVX 512f gather instructions #862
Closed
30 commits
37a37e2 Add 64 bit AVX512f le and ge comparisons
3f88738 Checkpointing first gather implementation
cf3e316 Fix interface to be consistent
72959dd Merge remote-tracking branch 'upstream/master' into avx-512-cmp
01102d7 Fix instruction assert
79dee01 Add _mm512_mask_i32gather_epi64
0d3a19b Add pd gather intrinsics
f244d2e Add 64 bit index variants
9b90883 Add 32 bit output gather intrinsics
0238065 Fix comments
d7e2afa Fix comparison comments
dcf5d47 s/unsigned/signed/ for epi64
d9d0fc9 Add neq integer comparisons
9a1200d Remove feature that wasn't added
ed9bbe4 Merge branch 'master' into moar-avx512f-cmp
f70f643 Constanting the arguments
e29e2ba Merge branch 'avx-512-cmp' of github.com:Daniel-B-Smith/stdarch into …
c5cec2d Fix comment
f775ef1 Make instruction check less specific for CI
2957e2e Add comparison operator integer comparisons
7538c0f Fix comments
33a4dd5 Allow non camel case types
a74886b Add cmplt_ep(i|u)32
e8cfdb8 Allow AVX512f or KNC intrinsics to be gated by avx512f
690a03c Add remaining 32bit integer comparisons
45aa0bd Merge branch 'moar-avx512f-cmp' into avx-512-cmp
475c51d Merge remote-tracking branch 'upstream/master' into moar-avx512f-cmp
832166a Fix verify test with updated XML
1c81797 Merge branch 'moar-avx512f-cmp' into avx-512-cmp
c761d6f Add remaining gather intrinsics
You should use `_mm512_undefined` here instead to match what Clang is doing.

Hmm, actually it seems that Clang defines `_mm512_undefined` as zero-initialization, so it doesn't matter either way.

Are you sure? I see it defined as a particular builtin, but `_mm512_setzero` is explicitly defined as zero initialization. I'm not sure of the behavior of `__builtin_ia32_undef512`, however.

https://github.com/llvm/llvm-project/blob/1b02db52b79e01f038775f59193a49850a34184d/clang/lib/Headers/avx512fintrin.h#L189
https://github.com/llvm/llvm-project/blob/a3dc9490004ce1601fb1bc67cf218b86a6fdf652/clang/include/clang/Basic/BuiltinsX86.def#L40
https://github.com/llvm/llvm-project/blob/1b02db52b79e01f038775f59193a49850a34184d/clang/lib/Headers/avx512fintrin.h#L259
https://github.com/llvm/llvm-project/blob/1b02db52b79e01f038775f59193a49850a34184d/clang/lib/Headers/avx512fintrin.h#L253

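
For readers following the thread, here is a rough scalar model of why the choice of source vector may not change the result of an unmasked gather. It is only a sketch: `mask_gather_model` and `gather_model` are hypothetical names, plain arrays stand in for `__m512i`, and the scale parameter of the real intrinsics is left out (indices count whole 64-bit elements).

```rust
/// Scalar model of a masked 8-lane gather of 64-bit values with 32-bit indices.
/// Lane i is loaded from `base[idx[i]]` when bit i of `mask` is set;
/// otherwise it keeps the value already present in `src`.
/// (Hypothetical helper, not part of stdarch.)
fn mask_gather_model(src: [i64; 8], mask: u8, idx: [i32; 8], base: &[i64]) -> [i64; 8] {
    let mut dst = src;
    for lane in 0..8 {
        if (mask >> lane) & 1 == 1 {
            dst[lane] = base[idx[lane] as usize];
        }
    }
    dst
}

/// Unmasked gather modeled as the masked form with every mask bit set.
fn gather_model(idx: [i32; 8], base: &[i64]) -> [i64; 8] {
    // The source lanes are placeholders: a full mask overwrites all of them,
    // so zero is used here only because some value has to be passed.
    mask_gather_model([0; 8], 0xff, idx, base)
}
```

In this model the source lanes only survive where a mask bit is clear, so with a full mask a zeroed source and any other source give the same result.
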
LLVM should be able to optimize away the dead store, but I'm happy to change the code regardless. I'm not quite sure how/if I can implement `_mm512_undefined`, since my reading of `std::mem::MaybeUninit` is that I couldn't create an uninitialized `__m512i` without inviting UB. Assuming the calling convention allows it, I should be able to create a `MaybeUninit<__m512i>` and pass that to `vpgatherdq`.
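
To illustrate the `MaybeUninit` concern, here is a minimal sketch of two ways to obtain a value without reading uninitialized memory. `Vec512` is a hypothetical stand-in for `__m512i` so that the example compiles on any target, and the function names are made up for illustration. Calling `assume_init()` directly on `MaybeUninit::uninit()` for an integer-based type is undefined behavior according to the `MaybeUninit` documentation, so it is not shown.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for `__m512i`: 64 bytes of plain integer data.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(64))]
struct Vec512([i64; 8]);

/// Zero-initialized value. Every bit pattern is valid for `Vec512`,
/// so calling `assume_init` on zeroed memory is sound.
fn zeroed_vec() -> Vec512 {
    unsafe { MaybeUninit::<Vec512>::zeroed().assume_init() }
}

/// Leaves the storage uninitialized until it is fully written through a raw pointer.
fn init_through_pointer(value: i64) -> Vec512 {
    let mut slot = MaybeUninit::<Vec512>::uninit();
    unsafe {
        // Writing through the pointer never reads the uninitialized bytes.
        slot.as_mut_ptr().write(Vec512([value; 8]));
        // Sound because the write above initialized the whole slot.
        slot.assume_init()
    }
}
```

Whether a `MaybeUninit<__m512i>` could be handed to the gather builtin directly depends on that builtin's signature and calling convention, which this sketch does not attempt to model.
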