[stdlib] Make use of protocol requirements to convert from concrete floating-point types #33803

Merged (3 commits), Sep 5, 2020

Conversation

@xwu (Collaborator) commented Sep 4, 2020

This PR picks the low-hanging fruit for recovering performance in generic floating-point conversion. It is so low-hanging, in fact, that it is partly a reversion to the original placeholder implementation of generic floating-point conversion.

In brief, we never removed the requirements for init(_: Float), init(_: Double), and init(_: Float80) from the protocol, and now that we have ABI stability, these requirements will be present forever. So, we can call them from the generic implementation.
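
A minimal sketch of the idea, written as a free function rather than as the standard library's initializer. The name `convert(_:to:)` is hypothetical, and the sketch leaves out Float80 and any platform availability checks the real change has to deal with.

```swift
// Hypothetical helper, not the stdlib implementation: dispatch to the
// concrete protocol requirements when the source is a known type.
func convert<Source: BinaryFloatingPoint, Target: BinaryFloatingPoint>(
    _ value: Source, to _: Target.Type
) -> Target {
    switch value {
    case let value_ as Float:
        // Overload resolution should prefer the non-generic
        // `init(_: Float)` requirement over the generic initializer here.
        return Target(value_)
    case let value_ as Double:
        // Likewise for the `init(_: Double)` requirement.
        return Target(value_)
    default:
        // Any other source type takes the generic conversion path.
        return Target(value)
    }
}

let narrowed = convert(1.5 as Double, to: Float.self)
let widened = convert(Float(0.25), to: Double.self)
```

Because the concrete requirements are part of the protocol, the calls in the first two cases go to whatever the target type implements for `Float` and `Double`, and only unknown source types fall through to the slow generic path.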

There is one caveat to be highlighted: As in other situations (see #30417), the compiler regards the generic initializer itself as a suitable default implementation of init(_: Float) and friends, and therefore it will not warn if a concrete type does not provide its own implementations of the latter. Currently, this means that the concrete type will unwittingly have a very slow implementation; with this change, the concrete type will have an implementation that crashes at runtime due to infinite recursion. I think it's arguable as to which one is the worse outcome; this PR doesn't cause or otherwise change the underlying issue.
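
The witness-matching behavior behind this caveat can be illustrated with a toy protocol. Everything here (`Describing`, `Careful`, `Forgetful`, `viaRequirement`) is hypothetical and only stands in for `BinaryFloatingPoint` and its initializers; it is a sketch of the compiler behavior, not stdlib code.

```swift
// Hypothetical protocol standing in for BinaryFloatingPoint.
protocol Describing {
    // Concrete requirement, analogous to `init(_: Double)`.
    func describe(_ value: Double) -> String
}

extension Describing {
    // Generic member, analogous to the generic initializer. The compiler
    // accepts it as the default implementation of the requirement above.
    func describe<Source: BinaryFloatingPoint>(_ value: Source) -> String {
        return "generic fallback"
    }
}

// Provides its own implementation of the concrete requirement.
struct Careful: Describing {
    func describe(_ value: Double) -> String { return "dedicated" }
}

// Implements nothing, yet conforms without a diagnostic.
struct Forgetful: Describing {}

// Calls through the protocol requirement, as generic code would.
func viaRequirement<T: Describing>(_ instance: T) -> String {
    return instance.describe(1.0 as Double)
}

let careful = viaRequirement(Careful())
let forgetful = viaRequirement(Forgetful())
```

If the generic member were itself written to call `describe(_: Double)`, as the generic initializer does after this change, `Forgetful` would end up calling itself without bound.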

@xwu (Collaborator, Author) commented Sep 4, 2020

cc @troughton

I figure it's best to start small and chip away at this issue gradually. This should be an easy win.

@stephentyrone (Contributor)

@swift-ci benchmark

@@ -1890,7 +1890,18 @@ extension BinaryFloatingPoint {
   /// - Parameter value: A floating-point value to be converted.
   @inlinable
   public init<Source: BinaryFloatingPoint>(_ value: Source) {
-    self = Self._convert(from: value).value
+    switch value {
+    case let value_ as Float:
Contributor

This should handle Float16 as well (implemented as self = Self(Float(value_)) in that case).

Collaborator Author

@stephentyrone The required checks for the availability of Float16 are...not pretty. See the commit "🤮" for details.
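
For illustration, a hedged sketch of what routing Float16 through Float could look like outside the standard library. The function name `convertHalfAware(_:to:)` is hypothetical, and the platform condition and availability versions below are assumptions about where `Float16` can be used; they are not copied from the commit.

```swift
// Hypothetical helper, not the stdlib implementation: widen a Float16
// source to Float first, then use the concrete `init(_: Float)`.
func convertHalfAware<Source: BinaryFloatingPoint, Target: BinaryFloatingPoint>(
    _ value: Source, to _: Target.Type
) -> Target {
#if !((os(macOS) || targetEnvironment(macCatalyst)) && arch(x86_64))
    // Assumed availability: Float16 is treated as unavailable on Intel
    // macOS and as requiring these OS versions on Apple platforms.
    if #available(macOS 11.0, iOS 14.0, watchOS 7.0, tvOS 14.0, *) {
        if let half = value as? Float16 {
            return Target(Float(half))
        }
    }
#endif
    // Every other source type takes the ordinary generic conversion.
    return Target(value)
}

let viaGenericPath = convertHalfAware(2.5 as Double, to: Float.self)
```

The nesting of a compile-time platform condition around a run-time availability check is what makes the real version awkward to read.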

@xwu (Collaborator, Author) commented Sep 4, 2020

There's a benchmark PR not yet merged, by the way.

@xwu force-pushed the fp-init-specialization branch from c4ef4b6 to 70b68d6 on September 4, 2020, 20:16

@xwu (Collaborator, Author) commented Sep 4, 2020

@swift-ci test

@xwu (Collaborator, Author) commented Sep 4, 2020

@swift-ci test

@xwu (Collaborator, Author) commented Sep 5, 2020

@swift-ci test Linux platform

@xwu (Collaborator, Author) commented Sep 5, 2020

Hmm

@swift-ci please benchmark

@swift-ci (Contributor) commented Sep 5, 2020

Performance: -O

| Regression | OLD | NEW | DELTA | RATIO |
|---|---|---|---|---|
| EqualSubstringSubstring | 22 | 29 | +31.8% | 0.76x |
| LessSubstringSubstring | 22 | 29 | +31.8% | 0.76x (?) |
| EqualStringSubstring | 22 | 29 | +31.8% | 0.76x |
| EqualSubstringSubstringGenericEquatable | 22 | 29 | +31.8% | 0.76x |
| EqualSubstringString | 22 | 29 | +31.8% | 0.76x |
| LessSubstringSubstringGenericComparable | 22 | 29 | +31.8% | 0.76x |
| StringComparison_longSharedPrefix | 341 | 379 | +11.1% | 0.90x (?) |
| Set.isSuperset.Seq.Empty.Int | 49 | 54 | +10.2% | 0.91x (?) |
| Set.isDisjoint.Empty.Int | 84 | 92 | +9.5% | 0.91x (?) |
| CSVParsing.Scalar | 275 | 298 | +8.4% | 0.92x (?) |
| Set.isDisjoint.Empty.Box | 87 | 94 | +8.0% | 0.93x (?) |
 
| Improvement | OLD | NEW | DELTA | RATIO |
|---|---|---|---|---|
| ConvertFloatingPoint.GenericDoubleToDouble | 2129 | 57 | -97.3% | 37.35x |
| AngryPhonebook.ASCII2 | 144 | 110 | -23.6% | 1.31x |
| UTF8Decode_InitDecoding | 167 | 142 | -15.0% | 1.18x |
| UTF8Decode_InitFromCustom_contiguous | 163 | 145 | -11.0% | 1.12x (?) |
| UTF8Decode_InitFromCustom_noncontiguous | 318 | 294 | -7.5% | 1.08x (?) |
| StringHashing_fastPrenormal | 670 | 620 | -7.5% | 1.08x (?) |

Code size: -O

Performance: -Osize

| Regression | OLD | NEW | DELTA | RATIO |
|---|---|---|---|---|
| EqualSubstringSubstring | 22 | 29 | +31.8% | 0.76x |
| EqualSubstringSubstringGenericEquatable | 22 | 29 | +31.8% | 0.76x |
| EqualSubstringString | 22 | 29 | +31.8% | 0.76x |
| LessSubstringSubstring | 23 | 29 | +26.1% | 0.79x |
| EqualStringSubstring | 23 | 29 | +26.1% | 0.79x |
| LessSubstringSubstringGenericComparable | 23 | 29 | +26.1% | 0.79x (?) |
| StringComparison_longSharedPrefix | 340 | 380 | +11.8% | 0.89x (?) |
| Data.hash.Medium | 30 | 33 | +10.0% | 0.91x (?) |
| Set.isDisjoint.Seq.Empty.Int | 84 | 92 | +9.5% | 0.91x (?) |
| Set.isSuperset.Seq.Int0 | 118 | 128 | +8.5% | 0.92x (?) |
| Set.isDisjoint.Seq.Int.Empty | 50 | 54 | +8.0% | 0.93x (?) |
| Set.isDisjoint.Empty.Box | 88 | 95 | +8.0% | 0.93x (?) |
| Set.isSuperset.Seq.Int.Empty | 88 | 95 | +8.0% | 0.93x (?) |
| Set.isStrictSubset.Int.Empty | 51 | 55 | +7.8% | 0.93x (?) |
| DropFirstAnySequenceLazy | 1643 | 1768 | +7.6% | 0.93x (?) |
 
| Improvement | OLD | NEW | DELTA | RATIO |
|---|---|---|---|---|
| ConvertFloatingPoint.GenericDoubleToDouble | 2132 | 61 | -97.1% | 34.95x |
| AngryPhonebook.ASCII2 | 144 | 110 | -23.6% | 1.31x |
| UTF8Decode_InitDecoding | 167 | 144 | -13.8% | 1.16x |
| UTF8Decode_InitFromCustom_contiguous | 165 | 145 | -12.1% | 1.14x (?) |
| IterateData | 922 | 859 | -6.8% | 1.07x (?) |

Code size: -Osize

| Regression | OLD | NEW | DELTA | RATIO |
|---|---|---|---|---|
| FloatingPointConversion.o | 1681 | 1706 | +1.5% | 0.99x |

Performance: -Onone

| Regression | OLD | NEW | DELTA | RATIO |
|---|---|---|---|---|
| LessSubstringSubstring | 26 | 34 | +30.8% | 0.76x |
| EqualSubstringSubstringGenericEquatable | 26 | 33 | +26.9% | 0.79x |
| EqualSubstringSubstring | 27 | 34 | +25.9% | 0.79x (?) |
| EqualStringSubstring | 27 | 34 | +25.9% | 0.79x (?) |
| LessSubstringSubstringGenericComparable | 27 | 33 | +22.2% | 0.82x (?) |
| EqualSubstringString | 28 | 34 | +21.4% | 0.82x |
| Set.subtracting.Seq.Empty.Box | 1019 | 1160 | +13.8% | 0.88x (?) |
| Set.subtracting.Empty.Box | 196 | 221 | +12.8% | 0.89x (?) |
| Set.isDisjoint.Box.Empty | 1055 | 1187 | +12.5% | 0.89x (?) |
| Set.isDisjoint.Seq.Empty.Box | 816 | 915 | +12.1% | 0.89x (?) |
| Set.subtracting.Seq.Box.Empty | 1245 | 1394 | +12.0% | 0.89x (?) |
| Set.subtracting.Box.Empty | 221 | 247 | +11.8% | 0.89x (?) |
| Set.isDisjoint.Seq.Box.Empty | 1010 | 1128 | +11.7% | 0.90x (?) |
| SetIsSubsetBox0 | 1112 | 1233 | +10.9% | 0.90x (?) |
| Set.isStrictSubset.Empty.Int | 511 | 566 | +10.8% | 0.90x (?) |
| Set.isStrictSubset.Box0 | 1129 | 1247 | +10.5% | 0.91x (?) |
| Set.isDisjoint.Empty.Box | 971 | 1072 | +10.4% | 0.91x (?) |
| DictionaryLiteral | 8060 | 8780 | +8.9% | 0.92x (?) |
| Combos | 1826 | 1983 | +8.6% | 0.92x (?) |
| NSDictionaryCastToSwift | 3180 | 3450 | +8.5% | 0.92x (?) |
| Set.isStrictSubset.Int.Empty | 294 | 318 | +8.2% | 0.92x (?) |
| StackPromo | 69700 | 75100 | +7.7% | 0.93x (?) |
 
| Improvement | OLD | NEW | DELTA | RATIO |
|---|---|---|---|---|
| ConvertFloatingPoint.GenericDoubleToDouble | 2059 | 1132 | -45.0% | 1.82x |
| AngryPhonebook.ASCII2 | 145 | 111 | -23.4% | 1.31x |
| UTF8Decode_InitDecoding | 178 | 148 | -16.9% | 1.20x |
| UTF8Decode_InitFromCustom_contiguous | 180 | 151 | -16.1% | 1.19x (?) |

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
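
As a reading aid, the DELTA and RATIO columns appear to be derived from OLD and NEW as (NEW - OLD) / OLD and OLD / NEW respectively. That relationship is inferred from the reported numbers rather than stated by the benchmark tooling, and the helper names below are hypothetical.

```swift
// Hypothetical helper: recompute the DELTA and RATIO columns from OLD and NEW.
func deltaAndRatio(old: Double, new: Double) -> (deltaPercent: Double, ratio: Double) {
    let deltaPercent = (new - old) / old * 100  // negative means the new build is faster
    let ratio = old / new                       // above 1 means the new build is faster
    return (deltaPercent, ratio)
}

// Round to a fixed number of decimal places for comparison with the tables.
func roundTo(_ x: Double, places: Int) -> Double {
    var scale = 1.0
    for _ in 0..<places { scale *= 10 }
    return (x * scale).rounded() / scale
}

// ConvertFloatingPoint.GenericDoubleToDouble at -O: OLD 2129, NEW 57.
let improvement = deltaAndRatio(old: 2129, new: 57)
// EqualSubstringSubstring at -O: OLD 22, NEW 29.
let regression = deltaAndRatio(old: 22, new: 29)

print(roundTo(improvement.deltaPercent, places: 1), roundTo(improvement.ratio, places: 2))
print(roundTo(regression.deltaPercent, places: 1), roundTo(regression.ratio, places: 2))
```

Under that assumption the two rows reproduce the reported figures (-97.3% and 37.35x, +31.8% and 0.76x) after rounding.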

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: 6-Core Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

3 participants