Documents arithmetic reduction semantics #412
Merged
This PR documents the arithmetic reductions and updates `stdsimd` to the latest changes in the RFC:

- `sum` -> `wrapping_sum`
- `product` -> `wrapping_product`
- `min` -> `min_element`
- `max` -> `max_element`
Currently there are two open issues that affect computations involving `NaN`s:

- `llvm.experimental.vector.reduce.f{min,max}` do not behave as we'd like with respect to `NaN`s, see #408 ("min_element / max_element produce incorrect results for NaNs in the last place"), which points to the LLVM bug. Basically, we want these to behave like `min`/`max` for floating-point numbers, that is, they should always return a number unless all elements in a vector are `NaN`, just like IEEE-754 `{min,max}Num` and `{minimum,maximum}Number` do. Right now, they behave more like `fcmp` + `select`, producing different results depending on where the `NaN`s are situated inside a vector. For example, `(NaN, 1., 2., -1.).min_element()` returns `-1.` but `(-1., 1., 2., NaN).min_element()` returns `NaN`.
- `llvm.experimental.vector.reduce.f{add,mul}` are a bit broken upstream and only work with fast-math flags enabled, see #409 ("floating-point sum / product are buggy w.r.t. NaNs"), which points to the LLVM bug. They do produce correct results AFAICT, but because fast-math flags are enabled, the code around these makes incorrect assumptions about their results when they are `NaN`. For example, `(1., 2., NaN, 4.).wrapping_sum()` returns `NaN` but `(1., 2., NaN, 4.).wrapping_sum().is_nan()` returns `false`.

We should work around these in `stdsimd`.
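
To make the two behaviours concrete, here is a minimal scalar sketch in plain Rust. It does not use `stdsimd` or the LLVM intrinsics; the helper names `min_fcmp_select`, `min_ignoring_nan`, and `sum_model` are hypothetical, and modelling the current reduction as a left-to-right `acc < x` compare-and-select is an assumption chosen because it reproduces the two `min_element` results quoted above.

```rust
/// Hypothetical scalar model of an `fcmp` + `select` style reduction:
/// keep the accumulator only when `acc < x` is true. Any comparison
/// involving NaN is false, so the result depends on where the NaN sits.
fn min_fcmp_select(v: [f32; 4]) -> f32 {
    v[1..].iter().fold(v[0], |acc, &x| if acc < x { acc } else { x })
}

/// Hypothetical scalar model of the desired semantics: `f32::min`
/// returns the other operand when one operand is NaN, so the result
/// is a number unless every element is NaN.
fn min_ignoring_nan(v: [f32; 4]) -> f32 {
    v.iter().copied().fold(f32::NAN, f32::min)
}

/// Hypothetical scalar model of a sum without fast-math assumptions:
/// a NaN element propagates, and `is_nan()` on the result reports it.
fn sum_model(v: [f32; 4]) -> f32 {
    v.iter().sum()
}
```

Under this model, `min_fcmp_select` gives `-1.` for `[NaN, 1., 2., -1.]` and `NaN` for `[-1., 1., 2., NaN]`, while `min_ignoring_nan` gives `-1.` for both, and `sum_model([1., 2., NaN, 4.]).is_nan()` is `true`. A workaround in `stdsimd` would need to match the last two.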