[String] Add UTF-8 fast-paths for Foundation initializers #21959

milseman · 2019-01-17T22:17:04Z

Many Foundation initializers could benefit from faster string
construction and subsequent reads in Swift 5. Add UTF-8 fast paths for
when constructing a string from a valid UTF-8 code units.

Many Foundation initializers could benefit from faster string construction and subsequent reads in Swift 5. Add UTF-8 fast paths for when constructing a string from a valid UTF-8 code units.

milseman · 2019-01-17T22:17:12Z

@swift-ci please test

milseman · 2019-01-17T22:17:19Z

@swift-ci please benchmark

swift-ci · 2019-01-17T23:02:23Z

Build comment file:

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
UTF8Decode_InitFromBytes_ascii	390	675	+73.1%	0.58x
UTF8Decode_InitFromData_ascii	544	648	+19.1%	0.84x (?)
Improvement
UTF8Decode_InitFromData	1084	214	-80.3%	5.07x
UTF8Decode_InitFromBytes	1053	228	-78.3%	4.62x
DictionaryBridgeToObjC_Access	968	780	-19.4%	1.24x (?)
Data.hash.Empty	64	58	-9.4%	1.10x (?)

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
UTF8Decode_InitFromBytes_ascii	432	701	+62.3%	0.62x
Improvement
UTF8Decode_InitFromData	1104	215	-80.5%	5.13x
UTF8Decode_InitFromBytes	1049	229	-78.2%	4.58x
Data.hash.Empty	66	61	-7.6%	1.08x (?)

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
UTF8Decode_InitFromBytes_ascii	423	691	+63.4%	0.61x
UTF8Decode_InitFromData_ascii	548	654	+19.3%	0.84x (?)
Improvement
UTF8Decode_InitFromData	1101	215	-80.5%	5.12x
UTF8Decode_InitFromBytes	1058	226	-78.6%	4.68x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 8-Core Intel Xeon E5
  Processor Speed: 3 GHz
  Number of Processors: 1
  Total Number of Cores: 8
  L2 Cache (per Core): 256 KB
  L3 Cache: 25 MB
  Memory: 64 GB

--------------

airspeedswift · 2019-01-17T23:56:38Z

Do we have test coverage for these paths?

milseman · 2019-01-18T00:53:44Z

@swift-ci please test

milseman · 2019-01-18T00:54:09Z

@swift-ci please benchmark

swift-ci · 2019-01-18T00:55:46Z

Build failed
Swift Test Linux Platform
Git Sha - a088e13

swift-ci · 2019-01-18T00:57:47Z

Build failed
Swift Test OS X Platform
Git Sha - a088e13

swift-ci · 2019-01-18T01:38:24Z

Build comment file:

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
UTF8Decode_InitFromBytes_ascii	441	586	+32.9%	0.75x (?)
Improvement
UTF8Decode_InitFromData	1213	192	-84.2%	6.32x
UTF8Decode_InitFromBytes	1166	207	-82.2%	5.63x
UTF8Decode_InitFromData_ascii	641	529	-17.5%	1.21x
LessSubstringSubstring	43	39	-9.3%	1.10x
EqualStringSubstring	43	39	-9.3%	1.10x (?)
EqualSubstringSubstringGenericEquatable	43	39	-9.3%	1.10x
EqualSubstringString	43	39	-9.3%	1.10x (?)
LessSubstringSubstringGenericComparable	43	39	-9.3%	1.10x
Data.hash.Empty	71	65	-8.5%	1.09x (?)
StringComparison_slowerPrenormal	1720	1580	-8.1%	1.09x (?)
StringBuilderSmallReservingCapacity	384	354	-7.8%	1.08x (?)
StringBuilder	368	340	-7.6%	1.08x (?)
EqualSubstringSubstring	43	40	-7.0%	1.07x (?)

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
UTF8Decode_InitFromBytes_ascii	449	608	+35.4%	0.74x
Improvement
UTF8Decode_InitFromData	1218	193	-84.2%	6.31x
UTF8Decode_InitFromBytes	1167	210	-82.0%	5.56x
DataAppendDataLargeToLarge	50800	37800	-25.6%	1.34x (?)
LessSubstringSubstringGenericComparable	44	39	-11.4%	1.13x
EqualSubstringSubstring	43	39	-9.3%	1.10x (?)
EqualStringSubstring	43	39	-9.3%	1.10x (?)
EqualSubstringString	43	39	-9.3%	1.10x
Data.hash.Empty	74	68	-8.1%	1.09x (?)
UTF8Decode_InitFromData_ascii	598	552	-7.7%	1.08x
LessSubstringSubstring	43	40	-7.0%	1.07x
EqualSubstringSubstringGenericEquatable	43	40	-7.0%	1.07x (?)
StringComparison_slowerPrenormal	1730	1610	-6.9%	1.07x

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
UTF8Decode_InitFromBytes_ascii	462	591	+27.9%	0.78x (?)
StrComplexWalk	6670	7360	+10.3%	0.91x (?)
Improvement
UTF8Decode_InitFromData	1220	194	-84.1%	6.29x
UTF8Decode_InitFromBytes	1176	209	-82.2%	5.63x
UTF8Decode_InitFromData_ascii	669	530	-20.8%	1.26x
ObjectiveCBridgeStubToNSDate2	1529	1314	-14.1%	1.16x (?)
DataAppendDataSmallToSmall	4520	4020	-11.1%	1.12x (?)
SortStringsUnicode	5305	4924	-7.2%	1.08x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

--------------

karwa · 2019-01-19T13:03:14Z

stdlib/public/Darwin/Foundation/NSStringAPI.swift

+    if let str = String(validatingUTF8: bytes) {
+      self = str
+      return
+    }
    if let ns = NSString(utf8String: bytes) {


Why not remove the NSString-and-bridge part of this entirely?
Then we could also make it inlineable since it trivially forwards to the stdlib function.

I'm trying to preserve behavior if the bytes are invalidly encoded. We should probably deprecate this initializer.

milseman · 2019-01-22T17:57:45Z

@swift-ci please test

milseman · 2019-01-22T18:27:56Z

@itaiferber these are some pretty exciting numbers, and they actually understate the benefit as they only measure creation time: reads of these fast-pathed strings will also be on all our fast-paths.

I'm working on the last regression currently, but we'd like to get this ready ASAP

itaiferber

Overall, looks good! Perf numbers are great 😄 I am somewhat concerned about memory binding here though — can we express this without the binding?

itaiferber · 2019-01-22T18:37:00Z

stdlib/public/Darwin/Foundation/NSStringAPI.swift

@@ -381,6 +395,14 @@ extension String {
  /// Returns a `String` initialized by converting given `data` into
  /// Unicode characters using a given `encoding`.
  public init?(data: __shared Data, encoding: Encoding) {
+    if encoding == .utf8,
+       let str = data.withUnsafeBytes({
+         String._tryFromUTF8($0.bindMemory(to: UInt8.self))


Mm, in previous discussions, we've agreed that the safety of binding Data is dubious given the fact that the underlying buffer may have been previously bound. Are we making the assertion that binding to a trivial type here is safe? /CC @atrick

Is it possible to express the UTF-8 validation in terms of raw buffer pointers instead of typed? (Or overload to make this possible safely?)

I don't know if this is something to be concerned with, but if it is:

The fix would be for Data to provide a way to access its contents as a UBP<UInt8>, since Data is the owner (or knows who the owner is) and is the only one allowed to do so. If String exposed something taking a URBP, then treated it as a collection of UInt8s, that seems like it's sweeping the potential time bomb under the living room rug.

Data might know who the owner is, but it doesn't necessarily make it safe to bind the contents to UInt8, since the previous owner may have bound the buffer to a nontrivial type, and Data cannot unbind it — URBPs specifically do allow reading as UInt8s via a raw interface, though, making it always safe to read from.

I'm not terribly concerned (as we've previously discussed — we can't prevent someone from doing something like reading arbitrary data bound to some non-trivial type U and trying to pass it along to be interpreted as UTF-8), but we did do a lot in Data to make sure we don't invoke undefined behavior like this silently on others' behalf. @atrick might have stronger opinions here, but I agree with him in that we shouldn't be repeating that here.

FYI withContiguousStorageIfAvailable provides a typed pointer, so Data could use that. Unfortunately, URBP doesn't support it (I guess because it doesn't want to bind random memory). EDIT: Oops, I thought Data supported that, but it's doing what URBP does.

I encountered that issue with #22028
Currently, if you write:

class MyClass {} var s = [MyClass(), MyClass(), MyClass()] let sbytes = withUnsafeBytes(of: &s) { String(decoding: $0, as: UTF8.self) }

you will invoke a fast-path which calls assumingMemoryBound(to: UInt8.self) on the raw pointer, which IIUC is technically undefined behaviour. But it was marked "unsafe"...

Any String utilities that we want to accept a raw byte stream, like Data, including validateUTF8 and _uncheckedFromUTF8 should either take URBP or some ContiguouslyStored protocol. Either will be source compatible with the existing APIs, so they can probably be fixed later.

swift-ci · 2019-01-22T18:50:26Z

Build failed
Swift Test Linux Platform
Git Sha - 98a38b4b3114668d56248049b5fecedd7c616f64

milseman · 2019-01-22T19:29:44Z

@swift-ci please benchmark

Catfish-Man · 2019-01-22T19:29:56Z

stdlib/public/core/StringCreate.swift

+  let address = Int(bitPattern: ptr)
+
+  while (address &+ i) % stride != 0 && i < count {
+    guard ptr[i] <= 0x7F else { return false }


This isn't really semantically important, but just for readability could we stick to (& 0x80…) rather than using that in one place and <= 0x7F in the other?

Makes sense

Depending on what the expected trip-count is here, it may not be worth explicitly aligning. Did you experiment with this at all?

No, and there's not really any time for much experimentation if this is going to ship. According to @jckarter, our loads and stores assumed aligned pointers, so I thought this was necessary for correctness.

Good enough for now, then.

Catfish-Man · 2019-01-22T19:30:45Z

stdlib/public/core/StringCreate.swift

+    let bytes: UInt = UnsafePointer(
+      bitPattern: address &+ i
+    )._unsafelyUnwrappedUnchecked.pointee
+    guard bytes & UInt(truncatingIfNeeded: 0x8080_8080_8080_8080 as UInt64)


We should give the compiler folks a bug to unroll this so we don't have to do it by hand ;)

(Also, does truncatingIfNeeded elide truncation checks when given a constant like this? We definitely don't want any unnecessary branches in this loop)

Do you mean do the whole operation, or just the promoting to a higher stride? CC @eeckstein @aschwaighofer

(As wacky as it sounds, truncatingIfNeeded is the way to skip checks)

Both would be nice! But I meant doing multiple words per loop, which this doesn't currently do. We squeezed another 1.5x out of CF's by doing that.

BTW, the first loop is unrolled AFAICT

Nice. That matches my mental model of the state of LLVM's loop unroller, which is that it handles known trip counts well, but mostly bails on unknown trip counts.

milseman · 2019-01-22T19:36:15Z

@swift-ci please test

swift-ci · 2019-01-22T19:38:47Z

Build failed
Swift Test Linux Platform
Git Sha - 98a38b4b3114668d56248049b5fecedd7c616f64

swift-ci · 2019-01-22T19:40:11Z

Build failed
Swift Test OS X Platform
Git Sha - 98a38b4b3114668d56248049b5fecedd7c616f64

milseman · 2019-01-22T20:05:43Z

@swift-ci please test

milseman · 2019-01-22T20:05:50Z

@swift-ci please benchmark

swift-ci · 2019-01-22T20:08:06Z

Build failed
Swift Test Linux Platform
Git Sha - 245cacb102faefeef2d7a18901f2568d8c088624

swift-ci · 2019-01-22T20:10:11Z

Build failed
Swift Test OS X Platform
Git Sha - 245cacb102faefeef2d7a18901f2568d8c088624

swift-ci · 2019-01-22T20:22:48Z

Build failed
Swift Test OS X Platform
Git Sha - 98a38b4b3114668d56248049b5fecedd7c616f64

swift-ci · 2019-01-22T21:02:03Z

Build comment file:

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
StringComparison_abnormal	1080	1240	+14.8%	0.87x (?)
Improvement
UTF8Decode_InitFromData	1210	233	-80.7%	5.19x
UTF8Decode_InitFromBytes	1158	246	-78.8%	4.71x
UTF8Decode_InitFromData_ascii	632	245	-61.2%	2.58x
UTF8Decode_InitDecoding_ascii	546	262	-52.0%	2.08x
CStringLongAscii	391	232	-40.7%	1.69x
UTF8Decode_InitFromBytes_ascii	442	303	-31.4%	1.46x
StringComparison_slowerPrenormal	1760	1600	-9.1%	1.10x
Data.hash.Empty	71	65	-8.5%	1.09x (?)
LessSubstringSubstring	42	39	-7.1%	1.08x (?)
LessSubstringSubstringGenericComparable	42	39	-7.1%	1.08x

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
DataCreateSmall	2000	3000	+50.0%	0.67x
StringComparison_abnormal	1100	1240	+12.7%	0.89x (?)
Improvement
UTF8Decode_InitFromData	1214	234	-80.7%	5.19x
UTF8Decode_InitFromBytes	1168	247	-78.9%	4.73x
UTF8Decode_InitFromData_ascii	612	246	-59.8%	2.49x
UTF8Decode_InitDecoding_ascii	541	256	-52.7%	2.11x
CStringLongAscii	387	229	-40.8%	1.69x
UTF8Decode_InitFromBytes_ascii	454	306	-32.6%	1.48x
StringComparison_slowerPrenormal	1780	1610	-9.6%	1.11x
LessSubstringSubstringGenericComparable	43	39	-9.3%	1.10x (?)
Data.hash.Empty	74	68	-8.1%	1.09x (?)
EqualSubstringSubstringGenericEquatable	43	40	-7.0%	1.07x

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
DataAppendDataLargeToLarge	38800	51000	+31.4%	0.76x (?)
Histogram	8304	9663	+16.4%	0.86x (?)
LazilyFilteredArrays2	125055	144597	+15.6%	0.86x
Breadcrumbs.MutatedUTF16ToIdx.ASCII	23	26	+13.0%	0.88x (?)
Breadcrumbs.UTF16ToIdxRange.longASCII	358	401	+12.0%	0.89x (?)
ChainedFilterMap	333783	373491	+11.9%	0.89x (?)
LazilyFilteredArrayContains	1342309	1499861	+11.7%	0.89x (?)
Breadcrumbs.UTF16ToIdx.longASCII	1704	1897	+11.3%	0.90x (?)
Breadcrumbs.UTF16ToIdx.longMixed	1933	2148	+11.1%	0.90x (?)
Breadcrumbs.UTF16ToIdxRange.longMixed	452	494	+9.3%	0.91x (?)
FatCompactMap	448180	489180	+9.1%	0.92x
Improvement
UTF8Decode_InitFromData	1216	235	-80.7%	5.17x
UTF8Decode_InitFromBytes	1164	247	-78.8%	4.71x
UTF8Decode_InitFromData_ascii	651	261	-59.9%	2.49x
UTF8Decode_InitDecoding_ascii	682	395	-42.1%	1.73x
CStringLongAscii	417	261	-37.4%	1.60x
UTF8Decode_InitFromBytes_ascii	482	314	-34.9%	1.54x
StrComplexWalk	7350	6690	-9.0%	1.10x (?)
StringBuilderWithLongSubstring	3240	2950	-9.0%	1.10x (?)

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

--------------

milseman · 2019-01-22T21:48:37Z

There we go, that gives us a good 2x for ASCII content. The regressions seem unrelated and flaky benchmarks.

Perform ASCII checking using pointer-width strides, making sure to align properly.

milseman · 2019-01-22T23:31:30Z

That force-push is just to squash the WIP commit in the middle.

@swift-ci please smoke test

[String] Add UTF-8 fast-paths for Foundation initializers

a088e13

Many Foundation initializers could benefit from faster string construction and subsequent reads in Swift 5. Add UTF-8 fast paths for when constructing a string from a valid UTF-8 code units.

airspeedswift approved these changes Jan 17, 2019

View reviewed changes

karwa reviewed Jan 19, 2019

View reviewed changes

milseman requested a review from itaiferber January 22, 2019 18:26

itaiferber reviewed Jan 22, 2019

View reviewed changes

Catfish-Man reviewed Jan 22, 2019

View reviewed changes

milseman force-pushed the fast_foundation branch from 245cacb to dc1d2d6 Compare January 22, 2019 20:05

[String] Speed up ASCII checking.

3df9291

Perform ASCII checking using pointer-width strides, making sure to align properly.

milseman force-pushed the fast_foundation branch from dc1d2d6 to 3df9291 Compare January 22, 2019 23:31

milseman mentioned this pull request Jan 22, 2019

[5.0][String] Add UTF-8 fast-paths for Foundation initializers #22051

Merged

milseman merged commit 81e87ac into swiftlang:master Jan 23, 2019

milseman deleted the fast_foundation branch January 23, 2019 00:26

milseman mentioned this pull request Jan 24, 2019

[SDK] Use withContiguousStorageIfAvailable in some overlay methods #22001

Closed

milseman mannequin mentioned this pull request Jan 22, 2019

[SR-7869] String.init(decoding:sourceEncoding:) Suffers Lack of Speed #50404

Closed

[String] Add UTF-8 fast-paths for Foundation initializers #21959

[String] Add UTF-8 fast-paths for Foundation initializers #21959

Uh oh!

Conversation

milseman commented Jan 17, 2019

Uh oh!

milseman commented Jan 17, 2019

Uh oh!

milseman commented Jan 17, 2019

Uh oh!

swift-ci commented Jan 17, 2019

Build comment file:

Performance: -O

Performance: -Osize

Performance: -Onone

Uh oh!

airspeedswift commented Jan 17, 2019

Uh oh!

milseman commented Jan 18, 2019

Uh oh!

milseman commented Jan 18, 2019

Uh oh!

swift-ci commented Jan 18, 2019

Uh oh!

swift-ci commented Jan 18, 2019

Uh oh!

swift-ci commented Jan 18, 2019

Build comment file:

Performance: -O

Performance: -Osize

Performance: -Onone

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

milseman commented Jan 22, 2019

Uh oh!

milseman commented Jan 22, 2019

Uh oh!

itaiferber left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

karwa Jan 22, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

swift-ci commented Jan 22, 2019

Uh oh!

milseman commented Jan 22, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

itaiferber left a comment •

edited

Loading

karwa Jan 22, 2019 •

edited

Loading