Skip to content

Improve performance for comparing AttributedStrings with differing character counts #1224

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 24, 2025

Conversation

jmschonfeld
Copy link
Contributor

Two AttributedStrings with differing character counts will never compare equal. When we can cheaply determine that the character counts are inequal, we should return early from AttributedString.== rather than comparing each runs contents. This situation can be quite common (for example, just appending a single character to the end of the last run like when a user types into a text box) and it turns a full run-by-run comparison of the entire string into a single constant time evaluation. The benchmark results highlight that for this worst-case, but common, scenario the wins are significant:

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│          Throughput (# / s) (M)          │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│                   main                   │      92 │      91 │      90 │      88 │      83 │      71 │      71 │      89 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                  branch                  │  480077 │  461567 │  453119 │  452863 │  444671 │  353023 │   11964 │   10000 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │  479985 │  461476 │  453029 │  452775 │  444588 │  352952 │   11893 │    9911 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │  521723 │  507116 │  503366 │  514517 │  535648 │  497115 │   16751 │    9911 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

Note: there's an explanation left in the comments of the code but this optimization is currently only performed for AttributedString.== and not for AttributedSubstring.== or DiscontiguousAttributedSubstring.==.

@jmschonfeld
Copy link
Contributor Author

@swift-ci please test

Copy link
Member

@lorentey lorentey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can we replace the fullString: argument with a simple _isPartial (or similar) predicate on AttributedString.Runs?

It's fast/easy to test if the instance is over a full string; we do not need to pass this bit down explicitly. Eliminating the argument allows the fast path to trigger on substrings that happen to cover the entire string. It also simplifies flow, keeping all comparison logic in the _characterwiseIsEqual function.

@jmschonfeld
Copy link
Contributor Author

@lorentey sure thing! Updated the implementation to add an _isPartial property rather than passing it from the caller and confirmed that the benchmark results are unchanged.

@jmschonfeld
Copy link
Contributor Author

@swift-ci please test

@jmschonfeld jmschonfeld merged commit 95a222f into swiftlang:main Mar 24, 2025
3 checks passed
@jmschonfeld jmschonfeld deleted the attrstr/equatable-perf branch March 24, 2025 23:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants