Commit 6d08983

Added small amount of documentation to vectorized_convenience_functions.
1 parent 747382d commit 6d08983

2 files changed, +53 -1 lines changed

docs/src/getting_started.md

Lines changed: 0 additions & 1 deletion
@@ -42,7 +42,6 @@ Aside from loops, `LoopVectorization.jl` also supports broadcasting.

```julia
julia> using LoopVectorization, BenchmarkTools
-[ Info: Precompiling LoopVectorization [bdcacae8-1622-11e9-2a5c-532679323890]

julia> M, K, N = 47, 73, 7;

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
# Convenient Vectorized Functions

## vmap

## vfilter

This function requires LLVM 7 or greater and is only likely to give better performance if your CPU has AVX512. It relies on the compressed store intrinsic, which was added in LLVM 7. AVX512 provides a corresponding instruction, making the operation fast; other instruction sets must emulate it, so on those CPUs `LoopVectorization.vfilter` is likely to perform about the same as `Base.filter`.

```julia
julia> using LoopVectorization, BenchmarkTools

julia> x = rand(997);

julia> y1 = filter(a -> a > 0.7, x);

julia> y2 = vfilter(a -> a > 0.7, x);

julia> y1 == y2
true

julia> @benchmark filter(a -> a > 0.7, $x)
BenchmarkTools.Trial:
  memory estimate: 7.94 KiB
  allocs estimate: 1
  --------------
  minimum time: 955.389 ns (0.00% GC)
  median time: 1.050 μs (0.00% GC)
  mean time: 1.191 μs (9.72% GC)
  maximum time: 82.799 μs (94.92% GC)
  --------------
  samples: 10000
  evals/sample: 18

julia> @benchmark vfilter(a -> a > 0.7, $x)
BenchmarkTools.Trial:
  memory estimate: 7.94 KiB
  allocs estimate: 1
  --------------
  minimum time: 477.487 ns (0.00% GC)
  median time: 575.166 ns (0.00% GC)
  mean time: 711.526 ns (17.87% GC)
  maximum time: 9.257 μs (79.17% GC)
  --------------
  samples: 10000
  evals/sample: 193
```
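
To make the compressed store idea concrete, here is a minimal scalar sketch of what such an operation does. It is not how `vfilter` is implemented: the names `compress_store!` and `chunked_filter` and the chunk width `W` are hypothetical, chosen only for illustration. Each chunk of `W` elements yields a bit mask, and the selected lanes are written contiguously to the output. On AVX512 that write can map to a single instruction, whereas this sketch (like an emulation on other instruction sets) handles one lane at a time.

```julia
# Hypothetical helper, not part of LoopVectorization: a scalar model of a
# compressed store. Lanes of `chunk` whose bit is set in `mask` are written
# contiguously into `out` starting at `pos`; returns the next free index.
function compress_store!(out, pos, chunk, mask)
    for (lane, v) in enumerate(chunk)
        if (mask >> (lane - 1)) & 1 == 1
            out[pos] = v
            pos += 1
        end
    end
    return pos
end

# Hypothetical chunked filter built on the helper above. `W` stands in for
# the SIMD vector width and is an assumption of this sketch.
function chunked_filter(pred, x::AbstractVector{T}; W::Int = 8) where {T}
    out = Vector{T}(undef, length(x))   # worst case: every element is kept
    pos = 1
    for start in 1:W:length(x)
        chunk = @view x[start:min(start + W - 1, length(x))]
        mask = 0
        for (lane, v) in enumerate(chunk)
            pred(v) && (mask |= 1 << (lane - 1))   # one mask bit per lane
        end
        pos = compress_store!(out, pos, chunk, mask)
    end
    return resize!(out, pos - 1)        # trim to the number of kept elements
end
```

For any predicate, `chunked_filter(a -> a > 0.7, x)` should agree with `filter(a -> a > 0.7, x)`. The sketch only models the semantics; it says nothing about the speed of the real vectorized code.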
