Skip to content

Speedup for NSString -getCharacters:range: #1928

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 21, 2019
Merged

Speedup for NSString -getCharacters:range: #1928

merged 1 commit into from
Feb 21, 2019

Conversation

Catfish-Man
Copy link
Contributor

No description provided.

@Catfish-Man Catfish-Man self-assigned this Feb 19, 2019
@Catfish-Man
Copy link
Contributor Author

@swift-ci please test

@milseman
Copy link
Member

Do you have a benchmark for this? UBP's subscript is probably too slow, I've found it significantly faster to get the underlying UP and dereference that.

@Catfish-Man
Copy link
Contributor Author

Heh whoops. A little TOO untested there. Will fix it up later.

@milseman it shows up on the regex test in the benchmark game

@gottesmm
Copy link
Contributor

@Catfish-Man Do we have a version of this benchmark? (We should probably add it).

@gottesmm
Copy link
Contributor

Maybe we even add it before this PR and then we can use the PR to validate the speed up! = )

@millenomi
Copy link
Contributor

@swift-ci please test

1 similar comment
@millenomi
Copy link
Contributor

@swift-ci please test

@Catfish-Man Catfish-Man changed the title Untested speedup for NSString -getCharacters:range: Speedup for NSString -getCharacters:range: Feb 21, 2019
@Catfish-Man
Copy link
Contributor Author

@swift-ci please test

1 similar comment
@Catfish-Man
Copy link
Contributor Author

@swift-ci please test

@Catfish-Man
Copy link
Contributor Author

I whipped up a "simulated" benchmark for this since I don't have a Linux machine handy, and on my test at least the current patch is >10x faster than the old implementation.

Unfortunately we were also seeing crashes in NSKeyedArchiver tests with the previous version of it… we'll see how this one does.

@millenomi
Copy link
Contributor

We changed how testing occurs for SCF patches. Restarting test.

@millenomi
Copy link
Contributor

@swift-ci please test

@Catfish-Man
Copy link
Contributor Author

Hm. I wonder what "llvm-config: error: component libraries and shared library" means (from the linux build)

@spevans
Copy link
Contributor

spevans commented Feb 21, 2019

@Catfish-Man When running your unicopy branch under Xcode 10.2b3 with swift-DEVELOPMENT-snapshot-2019-02-19 I see the same crash in TestNSKeyedArchiver.test_archive_locale as in the Linux logs:

EXC_BAD_ACCESS in NSString.swift line 349

@Catfish-Man
Copy link
Contributor Author

Oh excellent, thank you

@Catfish-Man
Copy link
Contributor Author

Ok, finally got myself set up properly to do corelibs debugging rather than stdlib, and I see what's wrong here. Fix shortly, thank you for your patience :)

@Catfish-Man
Copy link
Contributor Author

Should be good to go now. It wasn't handling _NSCFConstantString.

@Catfish-Man
Copy link
Contributor Author

@swift-ci please test

@@ -343,8 +343,18 @@ open class NSString : NSObject, NSCopying, NSMutableCopying, NSSecureCoding, NSC

extension NSString {
public func getCharacters(_ buffer: UnsafeMutablePointer<unichar>, range: NSRange) {
for idx in 0..<range.length {
buffer[idx] = character(at: idx + range.location)
if type(of: self) == NSString.self || type(of: self) == NSMutableString.self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much does this type check slow things down? I suppose anything is better than character(at:)...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of things we could do to improve this further. My goal here is just to get it from n^2 to n so that it's not a huge problem. If we find important uses of it we can sink more time into it.

@Catfish-Man
Copy link
Contributor Author

@swift-ci please test

@millenomi millenomi merged commit a45ca7d into master Feb 21, 2019
@milseman
Copy link
Member

You can also get up to 2x improvement on contiguous UTF-8 strings in general and several X when it's ASCII, depending on how many code paths you want to maintain here.

Pseudo-code (you'll have to struggle with handle optionality, because optional interfaces are a bad idea annoying) for a dedicated ASCII path .

// Fast-path Handle ASCII prefixes
_storage.utf8.withContiguousStorageIfAvailable {
  for byte in $0 {
    guard byte < 0x80 else { return }
    output[i] = UInt16(truncatingIfNeeded: byte)
  }
}

// Now we calculate the index corresponding to our (potentially further advanced) offsets...

@parkera parkera deleted the unicopy branch July 30, 2024 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants