Improve performance of enumerating constraint-bound attributes consistent across many runs #1226

Merged (1 commit) on Mar 28, 2025

jmschonfeld

Attributes bound to particular run boundaries (paragraphs, characters, etc.) are guaranteed to have a consistent value between consecutive boundaries. This invariant is enforced at mutation time so that code reading the attributed string can rely on it. One place where we don't yet take advantage of this is when enumerating runs sliced to a particular attribute (for example, attrStr.runs[\.someParagraphConstrainedAttribute]). Currently, we follow these steps to find the end of a coalesced run:

  • Iterate run-by-run through the storage comparing attribute values until we find the run where the attribute value has changed
  • If we have iterated past the end of the current chunk in a discontiguous slice, jump back to the end of the chunk
  • Scan from the original location through the current location looking for a constraint boundary, and jump back to that index if one is found
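
As a rough illustration of the three steps above, here is a minimal sketch on a simplified model. It is not the swift-foundation implementation: the `Run` type, the flat `[Character]` text, the function name, and treating only "\n" as a paragraph boundary are all assumptions made for this example.

```swift
// Hypothetical, simplified model of attributed string storage.
// Run lengths are assumed to sum to the length of the text.
struct Run {
    var length: Int   // number of characters this storage run covers
    var value: Int?   // value of the single attribute being sliced
}

// Finds the end of the coalesced run starting at `start` by comparing values.
// `chunkEnd` is the end of the current chunk of a discontiguous slice.
func coalescedRunEndByComparing(
    from start: Int, runs: [Run], text: [Character], chunkEnd: Int
) -> Int {
    // Locate the storage run that contains `start`.
    var runIndex = 0
    var runStart = 0
    while runStart + runs[runIndex].length <= start {
        runStart += runs[runIndex].length
        runIndex += 1
    }
    let value = runs[runIndex].value

    // Step 1: walk run-by-run until the attribute value changes.
    var end = runStart
    while runIndex < runs.count && runs[runIndex].value == value {
        end += runs[runIndex].length
        runIndex += 1
    }

    // Step 2: if we walked past the end of the chunk, jump back to it.
    end = min(end, chunkEnd)

    // Step 3: scan start..<end for a constraint boundary and jump back to it.
    // Assumption: a paragraph ends just after a "\n".
    if let newline = text[start..<end].firstIndex(of: "\n") {
        end = newline + 1
    }
    return end
}

// Three paragraphs; the first two happen to share the same value.
let sampleText = Array("ab\ncd\nef")
let sampleRuns = [
    Run(length: 1, value: 1), Run(length: 2, value: 1),
    Run(length: 3, value: 1), Run(length: 2, value: 2),
]
let firstEnd = coalescedRunEndByComparing(
    from: 0, runs: sampleRuns, text: sampleText, chunkEnd: sampleText.count)
```

In this model, step 1 walks through both the first and second paragraph (they share a value) before step 3 pulls the result back to the first paragraph break, so the value comparisons contribute nothing to the final answer.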

However, since we know that attribute values must be consistent up to the next constraint boundary, there's no need to look at the attribute values at all! If we are only dealing with a single kind of run boundary (for any number of attributes), we can just find the next constraint break (within the current chunk of the discontiguous slice) and return it, without spending time comparing attribute values (which can be quite expensive). I added some benchmarks; for short runs they show more than a 10x throughput improvement with these updates:

----------------------------------------------------------------------------------------------------------------------------
paragraphBoundSliceEnumeration-shortRuns metrics
----------------------------------------------------------------------------------------------------------------------------

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│          Throughput (# / s) (K)          │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│                   main                   │     395 │     377 │     372 │     366 │     356 │     278 │     262 │     367 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                  branch                  │    4859 │    4351 │    4347 │    4343 │    4283 │    4051 │    2841 │    4276 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │    4464 │    3974 │    3975 │    3977 │    3927 │    3773 │    2579 │    3909 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │    1130 │    1054 │    1069 │    1087 │    1103 │    1357 │     984 │    3909 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

----------------------------------------------------------------------------------------------------------------------------
paragraphBoundSliceEnumeration-shortRuns-reversed metrics
----------------------------------------------------------------------------------------------------------------------------

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│          Throughput (# / s) (K)          │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│                   main                   │     199 │     190 │     187 │     184 │     181 │     170 │     168 │     187 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                  branch                  │    2700 │    2639 │    2639 │    2619 │    2601 │    2475 │    1920 │    2607 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │    2501 │    2449 │    2452 │    2435 │    2420 │    2305 │    1752 │    2420 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │    1257 │    1289 │    1311 │    1323 │    1337 │    1356 │    1043 │    2420 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

The long-run benchmarks are unchanged, because they don't spend much time comparing attribute values, and these changes don't regress them in any way.
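
For comparison, the boundary-only lookup described earlier can be sketched on a simplified model. This is an illustration under stated assumptions, not the code in this PR: the text is a flat `[Character]`, only "\n" counts as a paragraph boundary, and the function names are hypothetical.

```swift
// Finds the end of the coalesced run starting at `start` without reading any
// attribute values: the run ends at the next constraint break, or at the end
// of the current chunk of the discontiguous slice, whichever comes first.
// Assumption: a paragraph ends just after a "\n".
func nextConstraintBreak(from start: Int, text: [Character], chunkEnd: Int) -> Int {
    if let newline = text[start..<chunkEnd].firstIndex(of: "\n") {
        return newline + 1
    }
    return chunkEnd
}

// Enumerates the coalesced run ranges inside one chunk of a slice.
func paragraphBoundRanges(in text: [Character], chunk: Range<Int>) -> [Range<Int>] {
    var result: [Range<Int>] = []
    var start = chunk.lowerBound
    while start < chunk.upperBound {
        let end = nextConstraintBreak(from: start, text: text, chunkEnd: chunk.upperBound)
        result.append(start..<end)
        start = end
    }
    return result
}

let demoText = Array("ab\ncd\nef")
let wholeRanges = paragraphBoundRanges(in: demoText, chunk: 0..<demoText.count)
let partialRanges = paragraphBoundRanges(in: demoText, chunk: 1..<5)
```

The sketch only applies when every attribute in the slice shares a single kind of run boundary, which is the case the description above calls out; it is correct only because the consistency invariant is enforced at mutation time.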

rdar://146903366

@jmschonfeld

@swift-ci please test

@jmschonfeld jmschonfeld merged commit b72fb0f into swiftlang:main Mar 28, 2025
3 checks passed
@jmschonfeld jmschonfeld deleted the attrstr/run-enum-perf branch March 28, 2025 16:37