Fix RandomSample crash when 0 was out from random. #70

sidepelican · 2021-01-25T09:41:44Z

Fix #69 .

I changed the random ranges from [0, 1) to (0, 1] .
w must not be 0 so fixed nextW too. (in log(1 - w), log(1) is 0 so zero division happens.)

Checklist

I've added at least one test that validates that my change is working, if appropriate
I've followed the code style of the rest of the project
I've read the Contribution Guidelines
I've updated the documentation if necessary

timvermeulen · 2021-02-03T00:33:42Z

Thanks for your contribution, @sidepelican! This looks great.

1 - .random(in: 0..<1, using: &rng) is a good way of excluding 0 from the result. It does now include the upper bound, 1, which the original algorithm does not, but as far as I can see this does not affect the algorithm in any meaningful way.

Your seeded number generator correctly produces 0 as its first random number, and this tests the change you made in nextW (which is where rng is first used). However, note that it does not test the change in nextOffset because at that point rng no longer generates 0. Can you think of a way to test both changes, perhaps using a custom RandomNumberGenerator type?

sidepelican · 2021-02-03T16:50:42Z

Can you think of a way to test both changes, perhaps using a custom RandomNumberGenerator type?

I can make the RandomNumberGenerator type its anytime returns 0, but it doesn't work correctly becase Int.random(in: 0..<N, using: &g) (N > 2) falls into infinit loop.

swift-algorithms/Sources/Algorithms/RandomSample.swift

Line 140 in cda6fdd

let j = Int.random(in: 0..<result.count, using: &rng)

The test stops at this line when using such a generator.

The fundamental loop is here.
https://github.com/apple/swift/blob/a73a8087968f9111149073107c5242d83635107a/stdlib/public/core/Random.swift#L104

Based on the above, I changed the test to use internal function directly. 5acda56
How about this? @timvermeulen

Or, should make a generator that return 0 first 2 times?

timvermeulen

I can make the RandomNumberGenerator type its anytime returns 0, but it doesn't work correctly becase Int.random(in: 0..<N, using: &g) (N > 2) falls into infinit loop.

Indeed, what I had in mind was either a generator that starts by generating at least two 0s like you mentioned, or one that generates a 0 with a certain (high) probability.

Based on the above, I changed the test to use internal function directly.

This I like even more! Now we're not making any assumptions about the exact order in which these functions are called.

stephentyrone · 2021-02-03T19:53:29Z

I think this means that we can't test in release any longer, right? @natecook1000, how do you feel about that? Do we already use a testable import elsewhere?

(n.b. this comment is not a "we shouldn't do this," it's a "we should be aware that we are making a choice here.")

stephentyrone · 2021-02-05T17:38:25Z

We chatted, and we're fine with this. Approved.

stephentyrone · 2021-02-07T16:14:31Z

Upon more detailed review, this fixes the most obvious problem, but is not actually a complete solution. log(1-w) can underflow to zero without w being exactly zero; we simply need to have a sequence of w values whose product is less than .ulpOfOne/2. This should be replaced with log(onePlus: -w) to resolve the problem (numerically this only matters for very large collections--more than 2**53 elements, which can't happen on most systems supported by Swift--but it's trivially the right change to make, so we should do it).

The zero division itself is actually fine. Division by zero in floating-point is not an error; it produces infinity or NaN, which is fine. It's the conversion to Int that causes the problem. So we should be able to keep using 0..<1, but instead use:

internal func nextOffset<G: RandomNumberGenerator>(
  w: Double, using rng: inout G
) -> Int {
  let offset = Double.log(.random(in: 0..<1, using: &rng)) / .log(onePlus: -w)
  return offset < Double(Int.max) ? Int(offset) : Int.max
}

We should also have a clamping initializer for Int from BinaryFloatingPoint, so that people don't have to do this themselves. That belongs in the standard library, but I'll land it in Numerics first and we can switch over to using it in algorithms.

timvermeulen · 2021-02-08T23:14:23Z

Nice catch, and TIL about log(onePlus:)!

numerically this only matters for very large collections--more than 2**53 elements

We can trivially construct a range larger than that, so we have real reasons to fix this.

sidepelican · 2021-02-09T11:34:58Z

TIL about log(onePlus:) too!

I tried #70 (comment) , its looks fine but, I found crash when returning Int.max in nextOffset because arithmetic overflow happens at here.

swift-algorithms/Sources/Algorithms/RandomSample.swift

Line 204 in cda6fdd

var offset = nextOffset(w: w, using: &rng) + 1

I fixed this by just adding & but I'm not smart about the algorithm there so is this right?

stephentyrone · 2021-02-09T14:09:42Z

Can't we eliminate the +1 there and use while offset > 0 in the loop following?

(&+ is wrong, because it would wrap to Int.min causing us to try to select an earlier element, which is broken.)

Fix RandomSample crash when 0 was out from random.

7a302e8

natecook1000 requested a review from stephentyrone January 26, 2021 04:02

stephentyrone requested a review from timvermeulen February 1, 2021 19:22

test internal function directly

5acda56

timvermeulen approved these changes Feb 3, 2021

View reviewed changes

stephentyrone approved these changes Feb 5, 2021

View reviewed changes

use log(onePlus:) and add clamping

ca5e225

Remove offset + 1 to avoid overflow

7a1f51e

kylemacomber merged commit 1a3c706 into apple:main Feb 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix RandomSample crash when 0 was out from random. #70

Fix RandomSample crash when 0 was out from random. #70

Uh oh!

sidepelican commented Jan 25, 2021

Uh oh!

timvermeulen commented Feb 3, 2021

Uh oh!

sidepelican commented Feb 3, 2021 •

edited

Loading

Uh oh!

timvermeulen left a comment

Uh oh!

stephentyrone commented Feb 3, 2021 •

edited

Loading

Uh oh!

stephentyrone commented Feb 5, 2021

Uh oh!

stephentyrone commented Feb 7, 2021 •

edited

Loading

Uh oh!

timvermeulen commented Feb 8, 2021

Uh oh!

sidepelican commented Feb 9, 2021

Uh oh!

stephentyrone commented Feb 9, 2021 •

edited

Loading

Uh oh!

Uh oh!

Fix RandomSample crash when 0 was out from random. #70

Fix RandomSample crash when 0 was out from random. #70

Uh oh!

Conversation

sidepelican commented Jan 25, 2021

Checklist

Uh oh!

timvermeulen commented Feb 3, 2021

Uh oh!

sidepelican commented Feb 3, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

timvermeulen left a comment

Choose a reason for hiding this comment

Uh oh!

stephentyrone commented Feb 3, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stephentyrone commented Feb 5, 2021

Uh oh!

stephentyrone commented Feb 7, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

timvermeulen commented Feb 8, 2021

Uh oh!

sidepelican commented Feb 9, 2021

Uh oh!

stephentyrone commented Feb 9, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

sidepelican commented Feb 3, 2021 •

edited

Loading

stephentyrone commented Feb 3, 2021 •

edited

Loading

stephentyrone commented Feb 7, 2021 •

edited

Loading

stephentyrone commented Feb 9, 2021 •

edited

Loading