Avx512 avx512vl #999

minybot · 2021-02-09T17:43:40Z

permute_ps,pd: mm256,mm; permutevar_ps,pd: mm256,mm
permutex_epi64,pd: mm256
permutexvar_epi32,epi64,ps,pd: mm256
permutex2var_epi32,epi64,ps,pd: mm256,mm
i32gather_epi32,epi64,ps,pd: mm256,mm
i64gather_epi32,epi64,ps,pd: mm256,mm

rust-highfive · 2021-02-09T17:43:43Z

r? @Amanieu

(rust-highfive has picked a reviewer for you, use r? to override)

Amanieu · 2021-02-10T01:53:47Z

crates/core_arch/src/x86/avx512f.rs

+        };
+    }
+    let r = constify_imm8_gather!(scale, call);
+    transmute(simd_select_bitmask(mask, r.as_f32x8(), src.as_f32x8()))


simd_select_bitmask is incorrect in this case. We need to use the special LLVM intrinsic llvm.x86.avx512.mask.gather3siv4.sf.

The difference here is that the LLVM intrinsic will only access memory addresses which are not masked out. However your version will access memory addresses that are masked out, which could cause a crash if a masked value is an invalid offset.
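
As a rough illustration of the difference described above, here is a scalar sketch in safe Rust. It is not the stdarch implementation: the names `masked_gather` and `gather_then_select` are hypothetical, lanes are modeled as array elements, and a slice bounds check (returning `None`) stands in for the invalid memory access that could crash a real gather.

```rust
/// Scalar model of a true masked gather (hypothetical helper, for illustration).
/// A lane touches memory only when its mask bit is set; masked-out lanes keep
/// the value from `src` and their index is never dereferenced.
pub fn masked_gather(
    src: [f32; 8],
    mask: u8,
    idx: [usize; 8],
    base: &[f32],
) -> Option<[f32; 8]> {
    let mut out = src;
    for (lane, slot) in out.iter_mut().enumerate() {
        if (mask >> lane) & 1 == 1 {
            // Only active lanes are read; `None` models a faulting access.
            *slot = *base.get(idx[lane])?;
        }
    }
    Some(out)
}

/// Scalar model of "gather every lane, then select by mask" (hypothetical helper).
/// Every index is dereferenced first, so an invalid index in a masked-out lane
/// still fails even though its value would have been discarded.
pub fn gather_then_select(
    src: [f32; 8],
    mask: u8,
    idx: [usize; 8],
    base: &[f32],
) -> Option<[f32; 8]> {
    let mut gathered = [0.0f32; 8];
    for (lane, slot) in gathered.iter_mut().enumerate() {
        // Unconditional read: this is the step that can fault.
        *slot = *base.get(idx[lane])?;
    }
    let mut out = src;
    for (lane, slot) in out.iter_mut().enumerate() {
        if (mask >> lane) & 1 == 1 {
            *slot = gathered[lane];
        }
    }
    Some(out)
}
```

With all indices valid the two functions agree. They diverge only when a masked-out lane carries an invalid index, which is the case the review comment is concerned with.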

minybot · 2021-02-10

True, thanks for the reminder.
llvm.x86.avx512.mask.gather3siv4.sf requires i1, so I need to remove those masked gather functions.

minybot added 5 commits February 8, 2021 11:02

permute_ps,pd: mm256,mm; permutevar_ps,pd: mm256,mm

5189374

permutex_epi64,pd: mm256

bb21398

permutexvar_epi32,epi64,ps,pd: mm256

4c82521

permutex2var_epi32,epi64,ps,pd: mm256,mm

3a907e7

i32gather_epi32,epi64,ps,pd: mm256,mm; i64gather_epi32,epi64,ps,pd: mm256,mm

84228ae

rust-highfive assigned Amanieu Feb 9, 2021

minybot added 4 commits February 9, 2021 13:00

try to pass msvc CI

21e4611

try to pass msvc CI 2

3a722e1

remove github osX CI

5d750bf

update avx512f.md

f60814d

Amanieu reviewed Feb 10, 2021

remove i32gather_epi32,epi64,ps,pd: mm256,mm; i64gather_epi32,epi64,ps,pd: mm256,mm

c200d44

Amanieu merged commit 11fd33d into rust-lang:master Feb 10, 2021

minybot deleted the avx512_avx512vl branch February 10, 2021 13:10
