[benchmark] Add ReplaceSubrange benchmark #25310

Merged: 10 commits, Aug 2, 2019
1 change: 1 addition & 0 deletions benchmark/CMakeLists.txt
@@ -164,6 +164,7 @@ set(SWIFT_BENCH_MODULES
single-source/StringInterpolation
single-source/StringMatch
single-source/StringRemoveDupes
single-source/StringReplaceSubrange
single-source/StringTests
single-source/StringWalk
single-source/Substring
75 changes: 75 additions & 0 deletions benchmark/single-source/StringReplaceSubrange.swift
@@ -0,0 +1,75 @@
//===--- StringReplaceSubrange.swift -------------------------------------------===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2014 - 2019 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===----------------------------------------------------------------------===//

import TestsUtils

let tags: [BenchmarkCategory] = [.validation, .api, .String]

public let StringReplaceSubrange = [
  BenchmarkInfo(
    name: "String.replaceSubrange.String.Small",
    runFunction: { replaceSubrange($0, smallString, with: "t") },
    tags: tags
Contributor Author

Would you mind explaining the difference between the small literal string and the large managed string? Is this related to the small string optimization, where a string that fits within 15 ASCII characters is not allocated on the heap?

Contributor

Correct. See: _SmallString and _StringGuts for implementation details if you're interested.

Contributor Author

@keitaito keitaito Jul 4, 2019

Thanks for the links! I will take a look at them.

Member

The small form accommodates up to 15 UTF-8 code units (not just ASCII).

Contributor Author

Thank you for clarifying the length of the small string, Michael 😄
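
To make the point in this thread concrete, here is a minimal sketch that measures the two benchmark strings in UTF-8 code units. The constant `assumedSmallCapacity` and the helper `fitsAssumedSmallForm` are hypothetical names introduced for illustration; they only compare lengths against the 15-code-unit figure quoted above and do not inspect how the standard library actually stores the string.

```swift
// Strings taken from the benchmark in this PR.
let smallString = "coffee"
let largeString = "coffee\u{301}coffeecoffeecoffeecoffee"

// Assumption: the 15-code-unit capacity quoted in the review thread.
// This constant is illustrative, not a standard library API.
let assumedSmallCapacity = 15

// Hypothetical helper: checks only the UTF-8 length against that capacity.
func fitsAssumedSmallForm(_ s: String) -> Bool {
  return s.utf8.count <= assumedSmallCapacity
}

print(smallString.utf8.count, fitsAssumedSmallForm(smallString))  // 6 true
print(largeString.utf8.count, fitsAssumedSmallForm(largeString))  // 32 false

// Non-ASCII text is measured in UTF-8 code units, not characters.
let accented = "caf\u{E9}"
print(accented.count, accented.utf8.count)  // 4 5
```

The combining accent `\u{301}` in `largeString` takes two UTF-8 code units, so the string is 32 code units long and well past the quoted capacity.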

  ),
  BenchmarkInfo(
    name: "String.replaceSubrange.String",
    runFunction: { replaceSubrange($0, largeString, with: "t") },
    tags: tags
  ),
  BenchmarkInfo(
    name: "String.replaceSubrange.Substring.Small",
    runFunction: { replaceSubrange($0, smallString, with: "t"[...]) },
    tags: tags
  ),
  BenchmarkInfo(
    name: "String.replaceSubrange.Substring",
    runFunction: { replaceSubrange($0, largeString, with: "t"[...]) },
    tags: tags
  ),
  BenchmarkInfo(
    name: "String.replaceSubrange.ArrChar.Small",
    runFunction: { replaceSubrange($0, smallString, with: arrayCharacter) },
    tags: tags
  ),
  BenchmarkInfo(
    name: "String.replaceSubrange.ArrChar",
    runFunction: { replaceSubrange($0, largeString, with: arrayCharacter) },
    tags: tags
  ),
  BenchmarkInfo(
    name: "String.replaceSubrange.RepChar.Small",
    runFunction: { replaceSubrange($0, smallString, with: repeatedCharacter) },
    tags: tags
Contributor Author

The benchmark name "Str.replaceSubrange.SmallLiteral.RepeatedChar" is longer than 40 characters, but I couldn't think of a better name that fits within 40. It could be something like "Str.replaceSubrange.LargeManagedRepChar", but I was concerned that "RepChar" makes it hard to tell that it means Repeated<Character>. @palimondo, what do you think about this naming? Please let me know if you have any suggestions 🙂

Contributor

Looking at what @milseman writes in SR-8905:

replaceSubrange<C: Collection>(_:C)

  • Arguments of types String, Substring, Array<Character>, Repeated<Character>, etc

I'd say the naming convention calls for a base name of String.replaceSubrange, followed by the argument type (String, Substring, ArrChar, RepChar) for the general case of large strings. We'll then denote the special optimization for small strings with a simple .Small suffix, which gives us these benchmarks:

  • String.replaceSubrange.String
  • String.replaceSubrange.Substring
  • String.replaceSubrange.ArrChar
  • String.replaceSubrange.RepChar
  • String.replaceSubrange.String.Small
  • String.replaceSubrange.Substring.Small
  • String.replaceSubrange.ArrChar.Small
  • String.replaceSubrange.RepChar.Small

The longest one is String.replaceSubrange.Substring.Small at 38 characters, just under the 40-character limit.

Contributor Author

That's an awesome naming idea. I will use them. Thanks for your suggestion!
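
The name lengths discussed in this thread can be checked with a short sketch like the one below. The array `benchmarkNames` and the constant `nameLimit` are illustrative names, not part of the benchmark suite; the 40-character limit is taken from the discussion above.

```swift
// The eight names proposed in this thread.
let benchmarkNames = [
  "String.replaceSubrange.String",
  "String.replaceSubrange.Substring",
  "String.replaceSubrange.ArrChar",
  "String.replaceSubrange.RepChar",
  "String.replaceSubrange.String.Small",
  "String.replaceSubrange.Substring.Small",
  "String.replaceSubrange.ArrChar.Small",
  "String.replaceSubrange.RepChar.Small",
]

// Assumption: the 40-character limit mentioned in the review comments.
let nameLimit = 40

// Print each name with its length.
for name in benchmarkNames {
  print(name.count, name)
}

// Find the longest name and any names over the limit.
let longest = benchmarkNames.max(by: { $0.count < $1.count })!
let tooLong = benchmarkNames.filter { $0.count > nameLimit }

print(longest, longest.count)  // String.replaceSubrange.Substring.Small 38
print(tooLong.isEmpty)         // true
```

By the same count, the originally proposed "Str.replaceSubrange.SmallLiteral.RepeatedChar" comes to 45 characters.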

  ),
  BenchmarkInfo(
    name: "String.replaceSubrange.RepChar",
    runFunction: { replaceSubrange($0, largeString, with: repeatedCharacter) },
    tags: tags
  ),
]

let smallString = "coffee"
let largeString = "coffee\u{301}coffeecoffeecoffeecoffee"

let arrayCharacter = Array<Character>(["t"])
let repeatedCharacter = repeatElement(Character("t"), count: 1)

@inline(never)
private func replaceSubrange<C: Collection>(
  _ N: Int, _ string: String, with newElements: C
) where C.Element == Character {
  var copy = getString(string)
  let range = string.startIndex..<string.index(after: string.startIndex)
  for _ in 0 ..< 500 * N {
    copy.replaceSubrange(range, with: newElements)
  }
}
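
As a rough illustration of the operation being measured, the sketch below replaces the first character of a string once with each of the four argument types covered by the benchmark. It is a standalone example, not part of the PR: `firstCharacterRange` is a hypothetical helper, and the range is recomputed before every call as a conservative choice, since a mutation may invalidate previously obtained indices.

```swift
// Hypothetical helper: the range covering the first character of `s`.
func firstCharacterRange(of s: String) -> Range<String.Index> {
  return s.startIndex..<s.index(after: s.startIndex)
}

var text = "coffee"

// Replacement given as a String.
var range = firstCharacterRange(of: text)
text.replaceSubrange(range, with: "t")
print(text)  // toffee

// Replacement given as a Substring.
range = firstCharacterRange(of: text)
text.replaceSubrange(range, with: "c"[...])
print(text)  // coffee

// Replacement given as an Array<Character>.
range = firstCharacterRange(of: text)
text.replaceSubrange(range, with: Array<Character>(["t"]))
print(text)  // toffee

// Replacement given as a Repeated<Character>.
range = firstCharacterRange(of: text)
text.replaceSubrange(range, with: repeatElement(Character("c"), count: 1))
print(text)  // coffee
```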
2 changes: 2 additions & 0 deletions benchmark/utils/main.swift
@@ -159,6 +159,7 @@ import StringEnum
import StringInterpolation
import StringMatch
import StringRemoveDupes
import StringReplaceSubrange
import StringTests
import StringWalk
import Substring
@@ -339,6 +340,7 @@ registerBenchmark(StringInterpolationManySmallSegments)
registerBenchmark(StringMatch)
registerBenchmark(StringNormalization)
registerBenchmark(StringRemoveDupes)
registerBenchmark(StringReplaceSubrange)
registerBenchmark(StringTests)
registerBenchmark(StringWalk)
registerBenchmark(SubstringTest)