Add bifurcate(_:) #151

Closed
wants to merge 3 commits into from
60 changes: 60 additions & 0 deletions Guides/Bifurcate.md
@@ -0,0 +1,60 @@
# Bifurcate

[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/Bifurcate.swift) |
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/BifurcateTests.swift)]

Methods for splitting a sequence in two.

The standard library’s existing `filter(_:)` method provides similar functionality, but returns only the elements that match the predicate (those for which it returns `true`). `bifurcate(_:)` returns both the elements that match the predicate and those that don’t, as a tuple.

```swift
let cast = ["Vivien", "Marlon", "Kim", "Karl"]
let (shortNames, longNames) = cast.bifurcate({ $0.count < 5 })
print(shortNames)
// Prints "["Kim", "Karl"]"
print(longNames)
// Prints "["Vivien", "Marlon"]"
```
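
For comparison, here is a minimal sketch of the same split written with two `filter(_:)` calls from the standard library. The name `isShort` is introduced here purely for illustration. The result is the same pair of arrays, but `cast` is traversed twice:

```swift
let cast = ["Vivien", "Marlon", "Kim", "Karl"]

// Illustrative predicate; the second call uses its negation.
let isShort: (String) -> Bool = { $0.count < 5 }

let shortNames = cast.filter(isShort)
let longNames = cast.filter { !isShort($0) }

print(shortNames)
// Prints "["Kim", "Karl"]"
print(longNames)
// Prints "["Vivien", "Marlon"]"
```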

There’s also a function to bifurcate a collection into a prefix and a suffix, up to but not including a given index:

```swift
let cast = ["Vivien", "Marlon", "Kim", "Karl"]
let (callbacks, alternates) = cast.bifurcate(upTo: 2)
print(callbacks)
// Prints "["Vivien", "Marlon"]"
print(alternates)
// Prints "["Kim", "Karl"]"
```
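
The standard library’s `prefix(upTo:)` and `suffix(from:)` produce the same two slices one at a time. The sketch below is not part of this package; it only shows the equivalent calls, and that the slices keep the indices of the original array:

```swift
let cast = ["Vivien", "Marlon", "Kim", "Karl"]

// Everything before index 2, and everything from index 2 onward.
let callbacks = cast.prefix(upTo: 2)
let alternates = cast.suffix(from: 2)

print(callbacks)
// Prints "["Vivien", "Marlon"]"
print(alternates)
// Prints "["Kim", "Karl"]"

// Both results are `ArraySlice` values that share the original indices,
// so the second slice starts at index 2 rather than 0.
print(alternates.startIndex)
// Prints "2"
```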

## Detailed Design

The primary method is declared as an extension to `Sequence`, but has an optimized version for `Collection`.

```swift
extension Sequence {
  public func bifurcate(_ belongsInFirstCollection: (Element) throws -> Bool) rethrows -> ([Element], [Element])
}
```

The other function is an extension to `Collection`, as it works with indices.

```swift
extension Collection {
  public func bifurcate(upTo index: Index) -> (SubSequence, SubSequence)
}
```
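
Because the method takes the collection’s own `Index` type, it isn’t limited to integer offsets. As a rough illustration using only standard-library slicing (the same two ranges the implementation in this pull request builds), here is a `String` split at the index of a separator. The names `entry`, `separator`, `key`, and `rest` are made up for this example:

```swift
let entry = "name=Vivien"

// `String.Index` is not an integer, so the index comes from a search.
// The force-unwrap is safe only because this literal contains "=".
let separator = entry.firstIndex(of: "=")!

// The element at `separator` lands in the second slice, not the first.
let key = entry[entry.startIndex..<separator]
let rest = entry[separator..<entry.endIndex]

print(key)
// Prints "name"
print(rest)
// Prints "=Vivien"
```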

### Complexity and Performance

`bifurcate(_:)` is an O(_n_) operation, where _n_ is the number of elements in the original sequence.

`bifurcate(upTo:)` is an O(1) operation.

`bifurcate(_:)` is more efficient than calling `filter(_:)` twice with mutually exclusive predicates (one the negation of the other) for two reasons:

1. It only requires a single pass through the elements.

2. When operating on a `Collection`, the combined size of the two returned arrays equals the size of the original collection, so the output buffer can be allocated once up front and never needs to be resized.
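
The first point can be made concrete by counting predicate evaluations. The sketch below uses a stand-in function, `splitInOnePass`, written for this example rather than taken from the package; `numbers`, `evaluations`, and `isEven` are likewise illustrative names:

```swift
// Hypothetical stand-in for a single-pass split; not the package's code.
func splitInOnePass<S: Sequence>(
  _ source: S,
  by belongsInFirst: (S.Element) -> Bool
) -> ([S.Element], [S.Element]) {
  var first: [S.Element] = []
  var second: [S.Element] = []
  for element in source {
    if belongsInFirst(element) {
      first.append(element)
    } else {
      second.append(element)
    }
  }
  return (first, second)
}

let numbers = Array(1...10)
var evaluations = 0
let isEven: (Int) -> Bool = { number in
  evaluations += 1
  return number % 2 == 0
}

// Two passes: the predicate runs twice per element.
let evens = numbers.filter(isEven)
let odds = numbers.filter { !isEven($0) }
let twoPassEvaluations = evaluations

// One pass: the predicate runs once per element.
evaluations = 0
let (evensOnce, oddsOnce) = splitInOnePass(numbers, by: isEven)
let onePassEvaluations = evaluations

print(twoPassEvaluations, onePassEvaluations)
// Prints "20 10"
```

Both approaches return the same arrays here; only the number of predicate calls differs.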

If you ever find yourself calling `filter(_:)` and also needing the elements that didn’t match the predicate, `bifurcate(_:)` is the optimal choice. When testing with compiler optimizations enabled (`-O`, `-Ofast`), the results are consistently faster, taking less than half the time of two `filter(_:)` calls (between 33% and 45%).
1 change: 1 addition & 0 deletions README.md
@@ -26,6 +26,7 @@ Read more about the package, and the intent behind it, in the [announcement on s

#### Subsetting operations

- [`bifurcate(_:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Bifurcate.md): Separates elements in a sequence or collection into two groups based on whether each element matches a given predicate.
- [`compacted()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Compacted.md): Drops the `nil`s from a sequence or collection, unwrapping the remaining elements.
- [`randomSample(count:)`, `randomSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection.
- [`randomStableSample(count:)`, `randomStableSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection, preserving their original relative order.
152 changes: 152 additions & 0 deletions Sources/Algorithms/Bifurcate.swift
@@ -0,0 +1,152 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift Algorithms open source project
//
// Copyright (c) 2021 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//

extension Sequence {
  /// Returns two arrays containing, in order, the elements of the sequence that
  /// do and don’t satisfy the given predicate, respectively.
  ///
  /// In this example, `bifurcate(_:)` is used to separate the names shorter
  /// than five characters from the rest:
  ///
  ///     let cast = ["Vivien", "Marlon", "Kim", "Karl"]
  ///     let (shortNames, longNames) = cast.bifurcate({ $0.count < 5 })
  ///     print(shortNames)
  ///     // Prints "["Kim", "Karl"]"
  ///     print(longNames)
  ///     // Prints "["Vivien", "Marlon"]"
  ///
  /// - Parameter belongsInFirstCollection: A closure that takes an element of
  ///   the sequence as its argument and returns a Boolean value indicating
  ///   whether the element should be included in the first returned array.
  ///   Otherwise, the element will appear in the second returned array.
  ///
  /// - Returns: Two arrays with all of the elements of the receiver. The
  ///   first array contains all the elements that `belongsInFirstCollection`
  ///   allowed, and the second array contains all the elements that
  ///   `belongsInFirstCollection` didn’t allow.
  ///
  /// - Complexity: O(*n*), where *n* is the length of the sequence.
  ///
  /// - Note: This algorithm is a bit slower than the specialized version on
  ///   `Collection`, because the size of a sequence is not known ahead of
  ///   time, so the result arrays may need to grow as elements are appended.
  @inlinable
  public func bifurcate(
    _ belongsInFirstCollection: (Element) throws -> Bool
  ) rethrows -> ([Element], [Element]) {
    var lhs = ContiguousArray<Element>()
    var rhs = ContiguousArray<Element>()

    for element in self {
      if try belongsInFirstCollection(element) {
        lhs.append(element)
      } else {
        rhs.append(element)
      }
    }

    return _tupleMap((lhs, rhs), { Array($0) })
  }
}

extension Collection {
  // This is a specialized version of the same algorithm on `Sequence` that
  // avoids reallocation of arrays since `count` is known ahead of time.
  @inlinable
  public func bifurcate(
    _ belongsInFirstCollection: (Element) throws -> Bool
  ) rethrows -> ([Element], [Element]) {
    guard !self.isEmpty else {
      return ([], [])
    }

    // Since a collection’s size is known ahead of time (and access to `count`
    // is constant time, O(1), for a `RandomAccessCollection`), we can allocate
    // one array of size `self.count`, then insert items at the beginning or
    // end of that contiguous block. This way, we don’t have to do any dynamic
    // array resizing. Since we insert the right-hand elements on the right
    // side in reverse order, we need to reverse them back to the original
    // order at the end.

    let count = self.count

    // Inside of the `initializer` closure, we set what the actual mid-point is.
    // We will use this to bifurcate the single array into two in constant time.
    var midPoint: Int = 0

    let elements = try [Element](
      unsafeUninitializedCapacity: count,
      initializingWith: { buffer, initializedCount in
        var lhs = buffer.baseAddress!
        var rhs = lhs + buffer.count
        do {
          for element in self {
            if try belongsInFirstCollection(element) {
              lhs.initialize(to: element)
              lhs += 1
            } else {
              rhs -= 1
              rhs.initialize(to: element)
            }
          }

          let rhsIndex = rhs - buffer.baseAddress!
          buffer[rhsIndex...].reverse()
          initializedCount = buffer.count

          midPoint = rhsIndex
        } catch {
          // The predicate threw: deinitialize whatever was written at either
          // end of the buffer before propagating the error.
          let lhsCount = lhs - buffer.baseAddress!
          let rhsCount = (buffer.baseAddress! + buffer.count) - rhs
          buffer.baseAddress!.deinitialize(count: lhsCount)
          rhs.deinitialize(count: rhsCount)
          throw error
        }
      })

    let collections = elements.bifurcate(upTo: midPoint)
    return _tupleMap(collections, { Array($0) })
  }
}

extension Collection {
  /// Splits the receiving collection into two at the specified index.
  ///
  /// - Parameter index: The index within the receiver at which to split the
  ///   collection.
  /// - Returns: A tuple with the first and second parts of the receiving
  ///   collection after splitting it.
  /// - Note: The first subsequence in the returned tuple does *not* include
  ///   the element at `index`. That element is in the second subsequence.
  /// - Complexity: O(1)
  @inlinable
  public func bifurcate(upTo index: Index) -> (SubSequence, SubSequence) {
    return (
      self[self.startIndex..<index],
      self[index..<self.endIndex]
    )
  }
}

/// Returns a tuple containing the results of mapping the given closure over
/// each of the tuple’s elements.
/// - Parameters:
///   - x: The tuple to transform.
///   - transform: A mapping closure. `transform` accepts an element of this
///     tuple as its parameter and returns a transformed value.
/// - Returns: A tuple containing the transformed elements of this tuple.
@usableFromInline
internal func _tupleMap<T, U>(
  _ x: (T, T),
  _ transform: (T) throws -> U
) rethrows -> (U, U) {
  return (
    try transform(x.0),
    try transform(x.1)
  )
}
73 changes: 73 additions & 0 deletions Tests/SwiftAlgorithmsTests/BifurcateTests.swift
@@ -0,0 +1,73 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift Algorithms open source project
//
// Copyright (c) 2021 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//

import XCTest
import Algorithms

class BifurcateTests: XCTestCase {
  func testEmpty() {
    let input: [Int] = []

    let s0 = input.bifurcate({ _ in return true })

    XCTAssertTrue(s0.0.isEmpty)
    XCTAssertTrue(s0.1.isEmpty)
  }

  func testExample() throws {
    let cast = ["Vivien", "Marlon", "Kim", "Karl"]
    let (shortNames, longNames) = cast.bifurcate({ $0.count < 5 })
    XCTAssertEqual(shortNames, ["Kim", "Karl"])
    XCTAssertEqual(longNames, ["Vivien", "Marlon"])
  }

  func testWithPredicate() throws {
    let s0 = ["A", "B", "C", "D"].bifurcate({ $0 == $0.lowercased() })
    let s1 = ["a", "B", "C", "D"].bifurcate({ $0 == $0.lowercased() })
    let s2 = ["a", "B", "c", "D"].bifurcate({ $0 == $0.lowercased() })
    let s3 = ["a", "B", "c", "d"].bifurcate({ $0 == $0.lowercased() })

    XCTAssertEqual(s0.0, [])
    XCTAssertEqual(s0.1, ["A", "B", "C", "D"])

    XCTAssertEqual(s1.0, ["a"])
    XCTAssertEqual(s1.1, ["B", "C", "D"])

    XCTAssertEqual(s2.0, ["a", "c"])
    XCTAssertEqual(s2.1, ["B", "D"])

    XCTAssertEqual(s3.0, ["a", "c", "d"])
    XCTAssertEqual(s3.1, ["B"])
  }

  func testWithIndex() throws {
    let s0 = ["A", "B", "C", "D"].bifurcate(upTo: 0)
    let s1 = ["A", "B", "C", "D"].bifurcate(upTo: 1)
    let s2 = ["A", "B", "C", "D"].bifurcate(upTo: 2)
    let s3 = ["A", "B", "C", "D"].bifurcate(upTo: 3)
    let s4 = ["A", "B", "C", "D"].bifurcate(upTo: 4)

    XCTAssertEqual(s0.0, [])
    XCTAssertEqual(s0.1, ["A", "B", "C", "D"])

    XCTAssertEqual(s1.0, ["A"])
    XCTAssertEqual(s1.1, ["B", "C", "D"])

    XCTAssertEqual(s2.0, ["A", "B"])
    XCTAssertEqual(s2.1, ["C", "D"])

    XCTAssertEqual(s3.0, ["A", "B", "C"])
    XCTAssertEqual(s3.1, ["D"])

    XCTAssertEqual(s4.0, ["A", "B", "C", "D"])
    XCTAssertEqual(s4.1, [])
  }
}