Commit 2fe75b3

[shuffle] Stand back! I'm about to (try to) do math!

Especially with blends and large tree heights there was a problem with the fuzzer where it would end up with enough undef shuffle elements in enough parts of the tree that in a birthday-attack kind of way we ended up regularly having large numbers of undef elements in the result. I was seeing reasonably frequent cases of *all* results being undef which prevents us from doing any correctness checking at all. While having undef lanes is important, this was too much.

So I've tried to apply some math to the probabilities of having an undef lane and balance them against the tree height. Please be gentle, I'm really terrible at math. I probably made a bunch of amateur mistakes here. Fixes, etc. are quite welcome. =D At least in running it some, it seems to be producing more interesting (for correctness testing) results.

llvm-svn: 215540
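
To put rough numbers on the problem described in the message, the sketch below evaluates the uniform-selection formula quoted in the patch comment, `1 - (shuffle_range/(shuffle_range+1))^max_shuffle_height`. The function name and the sample range and heights are illustrative assumptions, not part of the commit.

```python
def uniform_undef_probability(shuffle_range, height):
    """Chance that a result lane is undef when every shuffle on its path
    picks uniformly among -1 and the shuffle_range defined indices."""
    a = float(shuffle_range)
    defined_per_node = a / (a + 1.0)
    # A lane stays defined only if every shuffle along its path picked a
    # defined element, hence the power of the height.
    return 1.0 - defined_per_node ** height


if __name__ == "__main__":
    # Hypothetical example: 8 selectable indices (e.g. width 4 with blends).
    for height in (1, 4, 16):
        print(height, uniform_undef_probability(8, height))
```

With a single shuffle the undef chance is just `1/(shuffle_range+1)`, but it climbs quickly as the height grows, which is consistent with the mostly-undef results the message reports.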
1 parent 74c2c8f commit 2fe75b3

1 file changed, 39 additions and 2 deletions

llvm/utils/shuffle_fuzz.py

Lines changed: 39 additions & 2 deletions
@@ -57,9 +57,46 @@ def main():
       'f32': 1 << 32, 'f64': 1 << 64}[element_type]

   shuffle_range = (2 * width) if args.blends else width
-  shuffle_indices = [-1] + range(shuffle_range)

-  shuffle_tree = [[[random.choice(shuffle_indices)
+  # Because undef (-1) saturates and is indistinguishable when testing the
+  # correctness of a shuffle, we want to bias our fuzz toward having a decent
+  # mixture of non-undef lanes in the end. With a deep shuffle tree, the
+  # probabilities aren't good so we need to bias things. The math here is that if
+  # we uniformly select between -1 and the other inputs, each element of the
+  # result will have the following probability of being undef:
+  #
+  #   1 - (shuffle_range/(shuffle_range+1))^max_shuffle_height
+  #
+  # More generally, for any probability P of selecting a defined element in
+  # a single shuffle, the end result is:
+  #
+  #   1 - P^max_shuffle_height
+  #
+  # The power of the shuffle height is the real problem, as we want:
+  #
+  #   1 - shuffle_range/(shuffle_range+1)
+  #
+  # So we bias the selection of undef at any given node based on the tree
+  # height. Below, let 'A' be 'shuffle_range', 'C' be 'max_shuffle_height',
+  # and 'B' be the bias we use to compensate for C, namely
+  # '((A+1)*A^(1/C))/(A*(A+1)^(1/C))':
+  #
+  #   1 - ((B * A)/(A + 1))^C = 1 - A/(A + 1)
+  #
+  # So at each node we use:
+  #
+  #   1 - (B * A)/(A + 1)
+  # = 1 - ((A + 1) * A * A^(1/C))/(A * (A + 1) * (A + 1)^(1/C))
+  # = 1 - ((A + 1) * A^((C + 1)/C))/(A * (A + 1)^((C + 1)/C))
+  #
+  # This is the formula we use to select undef lanes in the shuffle.
+  A = float(shuffle_range)
+  C = float(args.max_shuffle_height)
+  undef_prob = 1.0 - (((A + 1.0) * pow(A, (C + 1.0)/C)) /
+                      (A * pow(A + 1.0, (C + 1.0)/C)))
+
+  shuffle_tree = [[[-1 if random.random() <= undef_prob
+                    else random.choice(range(shuffle_range))
                     for _ in itertools.repeat(None, width)]
                    for _ in itertools.repeat(None, args.max_shuffle_height - i)]
                   for i in xrange(args.max_shuffle_height)]
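
As a sanity check on the formula in the patch, the sketch below recomputes `undef_prob` exactly as the diff does, compares it with the algebraically simpler form `1 - (A/(A+1))^(1/C)`, and then compounds it over `C` levels using the patch's own model (`1 - P^max_shuffle_height`). The function names and sample values are illustrative assumptions, and the code is a standalone check rather than part of `shuffle_fuzz.py`.

```python
def patch_undef_probability(shuffle_range, height):
    """Per-node undef probability, written the way the patch computes it."""
    A = float(shuffle_range)
    C = float(height)
    return 1.0 - (((A + 1.0) * pow(A, (C + 1.0) / C)) /
                  (A * pow(A + 1.0, (C + 1.0) / C)))


def simplified_undef_probability(shuffle_range, height):
    """The same quantity after cancelling the common (A + 1)/A factors."""
    A = float(shuffle_range)
    return 1.0 - pow(A / (A + 1.0), 1.0 / float(height))


def end_to_end_undef_probability(per_node_undef, height):
    """Undef chance for a result lane under the model in the patch comment:
    the lane is defined only if all `height` selections were defined."""
    return 1.0 - pow(1.0 - per_node_undef, height)


if __name__ == "__main__":
    # Hypothetical example values, not defaults taken from the script.
    A, C = 8, 16
    p = patch_undef_probability(A, C)
    print(p, simplified_undef_probability(A, C))
    print(end_to_end_undef_probability(p, C), 1.0 / (A + 1.0))
```

Under that model the compounded probability comes back to `1/(A+1)`, the single-shuffle uniform rate the comment names as the target, so the bias appears to do what the comment intends. The model treats each result lane as depending on exactly `C` independent selections.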
