Skip to content

[nfc] document legacy ABI in the implementation of sort #40828

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 8, 2022

Conversation

glessard
Copy link
Contributor

This adds documentation of legacy ABI in the implementation of sort(). A return value was originally added as a workaround for a compiler bug, and the workaround became ABI. The bug has long been fixed, and the workaround was recently removed (#40081), while leaving the ABI untouched.

No SR.

@glessard glessard requested a review from lorentey January 12, 2022 23:09
@glessard
Copy link
Contributor Author

@swift-ci please smoke test

Copy link
Contributor

@HassanElDesouky HassanElDesouky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding the documentation!

/// Merges the elements at `runs[i]` and `runs[i - 1]`, using `buffer` as
/// out-of-place storage.
///
/// The unused return value is legacy ABI. It was originally added as a
/// workaround for a compiler bug (now fixed). See SR-14750 (rdar://45044610).
Copy link
Collaborator

@xwu xwu Jan 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, for the user who stumbles upon the documentation of this (public, underscored) method and other similar ones, one might consider stating explicitly that the return value is an arbitrary value. I can imagine that "unused" could be confusing since it's up to the caller whether it's unused or not (and "legacy ABI" may not mean much to many folks in terms of clarifying this)--in this case, we're merely trying to say that the value is of no use.

I wonder if it'd be most fitting to write this information explicitly in a - Returns: section; it isn't essential, I think, to understand what the function is doing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh -- perhaps we should also change this method to be @inlinable internal. It doesn't seem like a good idea to expose internal implementation details as public entry points.

(BTW, I'm not sure why we ended up using an Array here for runs -- it'd be too painful to change it now, but I hope it gets stack allocated...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making them internal now is a good idea. I think that when they were added internal was still ill-advised for the standard library, though now it is fine. We can also further document the expectations with tests.

@glessard
Copy link
Contributor Author

glessard commented Feb 5, 2022

@swift-ci please smoke test

@glessard
Copy link
Contributor Author

glessard commented Feb 5, 2022

@swift-ci apple silicon benchmark

@swift-ci
Copy link
Contributor

swift-ci commented Feb 5, 2022

Performance (arm64): -O

Regression OLD NEW DELTA RATIO
SortIntPyramid 485 1975 +307.2% 0.25x
SortAdjacentIntPyramids 1290 4110 +218.6% 0.31x
Phonebook 721 1666 +131.1% 0.43x (?)
SortStrings 368 621 +68.7% 0.59x (?)
RGBHistogram 1400 1900 +35.7% 0.74x (?)
WordCountHistogramUTF16 2100 2800 +33.3% 0.75x (?)
WordCountHistogramASCII 2100 2700 +28.6% 0.78x
RGBHistogramOfObjects 5500 6500 +18.2% 0.85x (?)
CharIteration_punctuatedJapanese_unicodeScalars 240 280 +16.7% 0.86x (?)
SortStringsUnicode 1465 1705 +16.4% 0.86x (?)
RawBufferCopyBytes 12 13 +8.3% 0.92x (?)

Code size: -O

Improvement OLD NEW DELTA RATIO
SortIntPyramids.o 8758 6606 -24.6% 1.33x
Phonebook.o 8593 7105 -17.3% 1.21x
SortLettersInPlace.o 7505 6363 -15.2% 1.18x
StaticArray.o 9920 8444 -14.9% 1.17x
RGBHistogram.o 19488 17328 -11.1% 1.12x
NibbleSort.o 13768 12294 -10.7% 1.12x
SortLargeExistentials.o 15862 14342 -9.6% 1.11x
SortStrings.o 25693 23505 -8.5% 1.09x
Breadcrumbs.o 38917 36007 -7.5% 1.08x
WordCount.o 33603 32449 -3.4% 1.04x
DriverUtils.o 113440 109616 -3.4% 1.03x

Performance (arm64): -Osize

Regression OLD NEW DELTA RATIO
SortAdjacentIntPyramids 1600 4775 +198.4% 0.34x (?)
SortIntPyramid 915 2265 +147.5% 0.40x
Phonebook 721 1722 +138.8% 0.42x (?)
ArrayAppendReserved 190 310 +63.2% 0.61x (?)
SortStrings 380 619 +62.9% 0.61x (?)
RGBHistogram 1520 2140 +40.8% 0.71x
WordCountHistogramASCII 2200 2900 +31.8% 0.76x
WordCountHistogramUTF16 2300 3000 +30.4% 0.77x (?)
SortArrayInClass 58365 67420 +15.5% 0.87x (?)
SortStringsUnicode 1465 1690 +15.4% 0.87x (?)
RGBHistogramOfObjects 8500 9200 +8.2% 0.92x
StringToDataLargeUnicode 1250 1350 +8.0% 0.93x (?)
Breadcrumbs.UTF16ToIdxRange.longASCII 13 14 +7.7% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
StaticArray 2 1 -50.0% 2.00x (?)
NSArray.bridged.objectAtIndex 220 198 -10.0% 1.11x (?)

Code size: -Osize

Regression OLD NEW DELTA RATIO
SortIntPyramids.o 9042 9456 +4.6% 0.96x
 
Improvement OLD NEW DELTA RATIO
SortLettersInPlace.o 7465 6487 -13.1% 1.15x
StaticArray.o 9152 8130 -11.2% 1.13x
RGBHistogram.o 18874 16862 -10.7% 1.12x
SortLargeExistentials.o 13666 12246 -10.4% 1.12x
Phonebook.o 8593 7785 -9.4% 1.10x
NibbleSort.o 13952 12936 -7.3% 1.08x
SortStrings.o 25941 24143 -6.9% 1.07x
Breadcrumbs.o 35734 33852 -5.3% 1.06x
WordCount.o 34855 33841 -2.9% 1.03x
DriverUtils.o 116736 114210 -2.2% 1.02x

Performance (arm64): -Onone

Regression OLD NEW DELTA RATIO
ArrayPlusEqualArrayOfInt 170 190 +11.8% 0.89x (?)
NSArray.bridged.objectAtIndex 282 307 +8.9% 0.92x (?)
SortStrings 1095 1187 +8.4% 0.92x (?)

Code size: -swiftlibs

Improvement OLD NEW DELTA RATIO
libswiftSwiftOnoneSupport.dylib 147456 131072 -11.1% 1.12x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini9,1
  Total Number of Cores: 8 (4 performance and 4 efficiency)
  Memory: 16 GB

Copy link
Collaborator

@xwu xwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should still be inlinable, I think; I believe @lorentey indicated the same above too.

/// - Precondition: `runs.count > 1` and `i > 0`
/// - Precondition: `buffer` must have at least
/// `min(runs[i].count, runs[i - 1].count)` uninitialized elements.
@discardableResult
@inlinable
public mutating func _mergeRuns(
@usableFromInline
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@usableFromInline
@inlinable

/// - Precondition: `buffer` must have at least
/// `min(runs[i].count, runs[i - 1].count)` uninitialized elements.
/// - Precondition: The ranges in `runs` must be consecutive, such that for
/// any i, `runs[i].upperBound == runs[i + 1].lowerBound`.
@discardableResult
@inlinable
public mutating func _mergeTopRuns(
@usableFromInline
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@usableFromInline
@inlinable

/// - Precondition: `buffer` must have at least
/// `min(runs[i].count, runs[i - 1].count)` uninitialized elements.
/// - Precondition: The ranges in `runs` must be consecutive, such that for
/// any i, `runs[i].upperBound == runs[i + 1].lowerBound`.
@discardableResult
@inlinable
public mutating func _finalizeRuns(
@usableFromInline
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@usableFromInline
@inlinable

@glessard
Copy link
Contributor Author

glessard commented Feb 5, 2022

These should still be inlinable, I think; I believe @lorentey indicated the same above too.

There is a bad error message when putting both attributes:
warning: '@inlinable' declaration is already '@usableFromInline'
This implies that @usableFromInline includes the behaviour @inlinable, but that is incorrect. https://bugs.swift.org/browse/SR-15827

@xwu
Copy link
Collaborator

xwu commented Feb 6, 2022

There is a bad error message when putting both attributes:
warning: '@inlinable' declaration is already '@usableFromInline'

The meaning that the message is trying to convey is that @inlinable implies @usableFromInline, such that the latter is redundant—that is correct. I agree the wording could be interpreted incorrectly though.

@glessard
Copy link
Contributor Author

glessard commented Feb 7, 2022

@swift-ci please test

@glessard
Copy link
Contributor Author

glessard commented Feb 7, 2022

@swift-ci apple silicon benchmark

@swift-ci
Copy link
Contributor

swift-ci commented Feb 7, 2022

Performance (arm64): -O

Code size: -O

Performance (arm64): -Osize

Regression OLD NEW DELTA RATIO
DataCreateMediumArray 820 900 +9.8% 0.91x (?)
 
Improvement OLD NEW DELTA RATIO
ObjectiveCBridgeStubURLAppendPathRef2 1010 920 -8.9% 1.10x (?)
RawBufferCopyBytes 12 11 -8.3% 1.09x (?)

Code size: -Osize

Performance (arm64): -Onone

Regression OLD NEW DELTA RATIO
ArrayAppendToFromGeneric 160 190 +18.7% 0.84x (?)
ArrayAppendToGeneric 170 190 +11.8% 0.89x (?)

Code size: -swiftlibs

How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini9,1
  Total Number of Cores: 8 (4 performance and 4 efficiency)
  Memory: 16 GB

buffer -> Bool in
_ = buffer.initialize(from: 0..<10)
return withUnsafeTemporaryAllocation(of: Int.self, capacity: 10) {
_merge(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Huh, TIL we have -Xfrontend -disable-access-control in the %target-run-stdlib-swift macro.)

Copy link
Member

@lorentey lorentey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We could document what the returned values are, but now that these are internal, it seems not very important to do so.

@glessard
Copy link
Contributor Author

glessard commented Feb 8, 2022

I think the tests are documentation in this case. Thanks!

@glessard glessard merged commit 8552044 into swiftlang:main Feb 8, 2022
@glessard glessard deleted the sort-optimization branch February 8, 2022 23:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants