[stdlib] Optimize high-level Set operations #40012

lorentey · 2021-11-02T05:10:04Z

This revives and replaces #21300, dramatically speeding up some high-level Set operations while also restoring the (undocumented) behavior that Set.intersection always returns items from self, not its argument.

As discussed on the forums, Swift 5.5 changed Set.intersection's behavior as an optimization (#36678). This method does not guarantee which input set it uses to construct its results, so arguably, code that relies on the original behavior is in the wrong. However, there is no reason to break such code: we can implement intersections even faster than #36678, while also preserving the original behavior. This PR does this and more. (rdar://84831592)

(I believe it would be worth documenting this behavior for both Set.intersection and Set.union. However, that is an API change that likely requires a Swift evolution proposal.)
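To make the "items from self" behavior above concrete, here is a minimal sketch. The `Tagged` type and the `intersectionKeepingSelf` helper are hypothetical names invented for this illustration; the helper is not the stdlib implementation. It simply filters `self`, which by construction can only return members of `self`.

```swift
// Equality and hashing consider only `id`; `source` is extra payload that
// reveals which input set a surviving element came from.
struct Tagged: Hashable {
    let id: Int
    let source: String

    static func == (lhs: Tagged, rhs: Tagged) -> Bool { lhs.id == rhs.id }
    func hash(into hasher: inout Hasher) { hasher.combine(id) }
}

extension Set {
    // Hypothetical helper (not a stdlib API): an intersection that can only
    // return members of `self`, because it filters `self`.
    func intersectionKeepingSelf(_ other: Set<Element>) -> Set<Element> {
        filter { other.contains($0) }
    }
}

let a: Set = [Tagged(id: 1, source: "a"), Tagged(id: 2, source: "a")]
let b: Set = [Tagged(id: 2, source: "b"), Tagged(id: 3, source: "b")]
let common = a.intersectionKeepingSelf(b)
```

Code that inspects `common.first?.source` is the kind of code that depends on which input set the result is built from.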

This new incarnation of #21300 does not introduce a new type, which makes it possible to cleanly deploy these improvements to any shipping stdlib version. In exchange, the implementation became slightly more complicated in places: the existing _UnsafeBitset type does not have a cached count, so we need to maintain it explicitly as a standalone variable. Temporary bitmaps are now allocated using the new temporary allocation facility, so in most cases they will be allocated on the stack.
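As a rough illustration of the two points above (a temporary bitmap plus a count kept in a separate variable), here is a sketch built on the public `withUnsafeTemporaryAllocation(of:capacity:_:)` function, which requires Swift 5.6 or later. I am assuming that is the facility meant here; `countDistinct` is a hypothetical function made up for this example, and the internal `_UnsafeBitset` code is not reproduced.

```swift
// Counts how many distinct values in `values` fall in the range 0..<capacity,
// using one bit per possible value in a temporary (usually stack) allocation.
func countDistinct(_ values: [Int], below capacity: Int) -> Int {
    precondition(capacity > 0)
    let bitsPerWord = UInt.bitWidth
    let wordCount = (capacity + bitsPerWord - 1) / bitsPerWord
    return withUnsafeTemporaryAllocation(of: UInt.self, capacity: wordCount) { words -> Int in
        // The temporary memory starts out uninitialized, so clear every word.
        words.initialize(repeating: 0)
        // The bitmap has no cached count, so track it in a standalone variable.
        var count = 0
        for value in values where value >= 0 && value < capacity {
            let word = value / bitsPerWord
            let mask: UInt = 1 << UInt(value % bitsPerWord)
            if words[word] & mask == 0 {
                words[word] |= mask
                count += 1
            }
        }
        return count
    }
}

let distinct = countDistinct([1, 1, 70, 3, 200], below: 128)
```

Because `UInt` is a trivial type, the buffer needs no deinitialization before the closure returns.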

The primary drawback of all these optimizations is a code size increase. I believe the performance increase will be worth this extra cost, but let's keep an eye on benchmark reports.

The original PR promised the following improvements (these are very rough numbers, and they depend on the contents and size of the set):

Operation	Speedup
`Set.isSubset<S>(of:)`	3x
`Set.isStrictSubset<S>(of:)`	3x
`Set.isSuperset<S>(of:)`	4x
`Set.isStrictSuperset<S>(of:)`	4x
`Set.isDisjoint<S>(with:)`	4x
`Set.subtracting(_:)`	1-4x
`Set.filter(_:)`	1-4x
`Set.intersection<S>(_:)`	4-6x
`Set.intersection(_:)`	1-4x

`Set.isSubset<S>(of:)` and `Set.isStrictSubset<S>(of:)`: Use a temporary bitset to speed up the `Sequence` variant by roughly a factor of 3.
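The bitset idea for the `Sequence` variant of the subset check can be sketched as follows. This is an assumption-laden illustration, not the stdlib code: the real implementation marks buckets of the set's internal hash table, which is not reachable through public API, so the hypothetical `isSubsetSketch` below substitutes a dictionary from element to slot number and a `[Bool]` array for the bitset.

```swift
// Returns true if every member of `set` occurs somewhere in `possibleSuperset`.
func isSubsetSketch<Element: Hashable, S: Sequence>(
    _ set: Set<Element>, of possibleSuperset: S
) -> Bool where S.Element == Element {
    if set.isEmpty { return true }

    // Stand-in for hash table buckets: give each member a slot number.
    var slots = [Element: Int](minimumCapacity: set.count)
    for (offset, member) in set.enumerated() { slots[member] = offset }

    var seen = [Bool](repeating: false, count: set.count) // the "bitset"
    var seenCount = 0                                     // count kept by hand

    for element in possibleSuperset {
        // Skip elements that are not members, or that were already marked.
        guard let slot = slots[element], !seen[slot] else { continue }
        seen[slot] = true
        seenCount += 1
        // Stop as soon as every member has been found.
        if seenCount == set.count { return true }
    }
    return false
}
```

The sequence is traversed once, no intermediate `Set` is built from it, and the loop can exit early once all members have been found.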

`Set.isSuperset<S>(of:)`: Call into the specialized overload if the argument happens to be a `Set`.

`Set.isStrictSuperset<S>(of:)`:

- Use a temporary bitset to speed up the `Sequence` variant by roughly a factor of 4.
- Fix a logic error causing the `a == b` case for the set variant to be O(n) instead of O(1).

`Set.isDisjoint<S>(with:)`: Have the generic variant call out to the specialized overload if the argument happens to be a `Set`.
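The fast-path pattern could look roughly like this. The `isDisjointSketch` names are hypothetical and the bodies are simplified; the dynamic cast with `as?` is my assumption about how the check "happens to be a `Set`" might be expressed, not a quote from the patch.

```swift
extension Set {
    // Generic variant: accepts any sequence of elements.
    func isDisjointSketch<S: Sequence>(with other: S) -> Bool
    where S.Element == Element {
        // Fast path: if the argument is really a Set, use the specialized overload.
        if let otherSet = other as? Set<Element> {
            return isDisjointSketch(with: otherSet)
        }
        for element in other where contains(element) { return false }
        return true
    }

    // Specialized variant: iterate the smaller set, probe the larger one.
    func isDisjointSketch(with other: Set<Element>) -> Bool {
        let (smaller, larger) = count < other.count ? (self, other) : (other, self)
        for element in smaller where larger.contains(element) { return false }
        return true
    }
}
```

The fast path matters when a `Set` reaches the generic entry point through generic code, where the caller only knows it has some `Sequence`.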

`Set.subtracting(_:)`: Use a temporary bitset to avoid hashing elements more than once, and to prevent rehashings during the creation of the result set. This leads to a speedup of about 0-4x, depending on the number of elements removed.
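A sketch of the two-phase shape this describes: first mark what gets removed, then build the result once its final size is known. As with any public-API approximation, `subtractingSketch` is a hypothetical function, and the dictionary of slot numbers is a stand-in for the internal hash table, so this version does not reproduce the "hash each element only once" benefit. It only shows how knowing the surviving count up front avoids growing the result incrementally.

```swift
func subtractingSketch<Element: Hashable, S: Sequence>(
    _ set: Set<Element>, _ other: S
) -> Set<Element> where S.Element == Element {
    let members = Array(set)
    var slots = [Element: Int](minimumCapacity: members.count)
    for (offset, member) in members.enumerated() { slots[member] = offset }

    // Phase 1: clear the flag of every member that occurs in `other`.
    var keep = [Bool](repeating: true, count: members.count)
    var keepCount = members.count
    for element in other {
        guard let slot = slots[element], keep[slot] else { continue }
        keep[slot] = false
        keepCount -= 1
        if keepCount == 0 { return [] } // everything was removed
    }
    if keepCount == members.count { return set } // nothing was removed

    // Phase 2: the result size is known, so reserve it once and fill it.
    var result = Set<Element>(minimumCapacity: keepCount)
    for (offset, member) in members.enumerated() where keep[offset] {
        result.insert(member)
    }
    return result
}
```

The two early returns cover the extremes where either every element or no element is removed.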

`Set.filter(_:)`: This works the same way as `Set.subtracting<S>(_:)`, and has similar performance benefits.

`Set.intersection(_:)`: Use a temporary bitset to speed up the `Sequence` variant by roughly a factor of 4-6, and the set/set variant by a factor of 1-4, depending on the ratio of overlapping elements.

lorentey · 2021-11-02T05:10:18Z

@swift-ci test

lorentey · 2021-11-02T05:10:27Z

@swift-ci benchmark

swift-ci · 2021-11-02T06:41:51Z

Performance (x86_64): -O

Regression	OLD	NEW	DELTA	RATIO
Set.subtracting.Seq.Int.Empty	137	368	+168.6%	0.37x
Set.subtracting.Int.Empty	40	81	+102.5%	0.49x
Set.subtracting.Seq.Box.Empty	197	374	+89.8%	0.53x
Set.subtracting.Box.Empty	30	54	+80.0%	0.56x
Set.subtracting.Seq.Empty.Box	136	225	+65.4%	0.60x
Set.subtracting.Seq.Int25	420	575	+36.9%	0.73x
Set.subtracting.Empty.Box	15	19	+26.7%	0.79x
AnyHashableWithAClass	94000	119000	+26.6%	0.79x (?)
SetSubtractingInt25	98	122	+24.5%	0.80x
Set.subtracting.Seq.Empty.Int	134	164	+22.4%	0.82x
Set.subtracting.Empty.Int	27	32	+18.5%	0.84x (?)
CharacterLiteralsLarge	97	108	+11.3%	0.90x (?)

Improvement	OLD	NEW	DELTA	RATIO
Set.isStrictSuperset.Seq.Box0	10780	38	-99.6%	283.68x
Set.isStrictSuperset.Seq.Int0	4809	27	-99.4%	178.10x
Set.filter.Int100.28k	1940	50	-97.4%	38.80x
Set.filter.Int100.16k	1021	28	-97.3%	36.46x
Set.filter.Int100.20k	1142	35	-96.9%	32.63x
Set.filter.Int100.24k	1289	42	-96.7%	30.69x
Set.isSubset.Seq.Box25	1231	101	-91.8%	12.19x
Set.isStrictSubset.Seq.Box25	1236	103	-91.7%	12.00x
Set.isStrictSuperset.Seq.Box25	1085	102	-90.6%	10.64x
Set.isSubset.Seq.Int25	555	74	-86.7%	7.50x
Set.intersection.Seq.Int100	433	59	-86.4%	7.34x
Set.isStrictSubset.Seq.Int25	553	76	-86.3%	7.28x
Set.isStrictSuperset.Seq.Int25	481	74	-84.6%	6.50x
SetIntersectionInt100	340	62	-81.8%	5.48x
Set.intersection.Seq.Box0	364	78	-78.6%	4.67x
Set.isSubset.Seq.Int50	629	151	-76.0%	4.17x
Set.isStrictSubset.Seq.Int50	635	153	-75.9%	4.15x
Set.intersection.Seq.Box25	476	118	-75.2%	4.03x
Set.isStrictSuperset.Seq.Int100	1088	306	-71.9%	3.56x
Set.isStrictSubset.Seq.Int100	1084	308	-71.6%	3.52x
Set.isStrictSubset.Box0	342	104	-69.6%	3.29x
Set.isStrictSuperset.Seq.Int50	486	150	-69.1%	3.24x
Set.intersection.Seq.Int50	305	96	-68.5%	3.18x
Set.intersection.Seq.Int25	237	77	-67.5%	3.08x
Set.isSubset.Seq.Box0	1081	389	-64.0%	2.78x
Set.intersection.Seq.Int0	157	57	-63.7%	2.75x
Set.isStrictSubset.Seq.Box0	1086	416	-61.7%	2.61x
Set.isSubset.Seq.Int100	783	310	-60.4%	2.53x
Set.subtracting.Seq.Int100	752	302	-59.8%	2.49x
Set.isStrictSubset.Int0	256	104	-59.4%	2.46x
Set.filter.Int50.28k	1085	449	-58.6%	2.42x
Set.filter.Int50.16k	583	242	-58.5%	2.41x
Set.filter.Int50.20k	670	301	-55.1%	2.23x
SetSubtractingInt100	153	72	-52.9%	2.12x
Set.filter.Int50.24k	765	362	-52.7%	2.11x
SetIntersectionInt50	205	99	-51.7%	2.07x
SetIntersectionBox25	240	127	-47.1%	1.89x
Set.subtracting.Seq.Box0	745	413	-44.6%	1.80x
SetIntersectionInt25	138	81	-41.3%	1.70x
Set.isSubset.Seq.Int0	482	283	-41.3%	1.70x
Set.isStrictSubset.Seq.Int0	480	291	-39.4%	1.65x
Set.isStrictSuperset.Seq.Empty.Int	278	174	-37.4%	1.60x
SetIntersectionBox0	132	87	-34.1%	1.52x
Set.subtracting.Seq.Box25	1212	969	-20.0%	1.25x
DictionaryBridgeToObjC_Access	1272	1052	-17.3%	1.21x (?)
SetSubtractingInt50	121	105	-13.2%	1.15x
Set.isStrictSubset.Int.Empty	80	70	-12.5%	1.14x
SetSubtractingBox0	148	131	-11.5%	1.13x (?)
FlattenListFlatMap	6751	6103	-9.6%	1.11x (?)
NSStringConversion.Rebridge.UTF8	652	592	-9.2%	1.10x (?)
SetSubtractingInt0	68	62	-8.8%	1.10x
Set.isSubset.Seq.Int.Empty	193	176	-8.8%	1.10x (?)
Set.isStrictSubset.Seq.Int.Empty	192	178	-7.3%	1.08x (?)
StringBuilder	331	308	-6.9%	1.07x (?)
StringBuilderSmallReservingCapacity	340	317	-6.8%	1.07x (?)

Code size: -O

Regression	OLD	NEW	DELTA	RATIO
SetTests.o	131325	138749	+5.7%	0.95x

Improvement	OLD	NEW	DELTA	RATIO
DictTest3.o	15981	15085	-5.6%	1.06x

Performance (x86_64): -Osize

Regression	OLD	NEW	DELTA	RATIO
Set.subtracting.Seq.Int.Empty	138	367	+165.9%	0.38x
Set.subtracting.Int.Empty	39	81	+107.7%	0.48x
Set.subtracting.Seq.Box.Empty	195	397	+103.6%	0.49x
Set.subtracting.Box.Empty	30	54	+80.0%	0.56x
Set.subtracting.Seq.Empty.Box	137	232	+69.3%	0.59x
Set.subtracting.Empty.Int	27	45	+66.7%	0.60x
Set.subtracting.Seq.Empty.Int	134	222	+65.7%	0.60x
FlattenListFlatMap	4393	6858	+56.1%	0.64x (?)
Set.subtracting.Seq.Int25	435	573	+31.7%	0.76x
Set.subtracting.Empty.Box	15	19	+26.7%	0.79x
AnyHashableWithAClass	94000	119000	+26.6%	0.79x
SetSubtractingInt25	99	122	+23.2%	0.81x
Array2D	7520	8112	+7.9%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
Set.isStrictSuperset.Seq.Box0	12906	45	-99.7%	286.79x
Set.isStrictSuperset.Seq.Int0	4699	29	-99.4%	162.03x
Set.filter.Int100.28k	1915	52	-97.3%	36.83x
Set.filter.Int100.16k	1005	29	-97.1%	34.65x
Set.filter.Int100.20k	1125	36	-96.8%	31.25x
Set.filter.Int100.24k	1265	43	-96.6%	29.42x
Set.isSubset.Seq.Box25	1467	166	-88.7%	8.84x
Set.isStrictSubset.Seq.Box25	1469	168	-88.6%	8.74x
Set.isStrictSuperset.Seq.Box25	1307	162	-87.6%	8.07x
Set.isSubset.Seq.Int25	551	74	-86.6%	7.45x
Set.intersection.Seq.Int100	430	59	-86.3%	7.29x
Set.isStrictSubset.Seq.Int25	552	77	-86.1%	7.17x
Set.isStrictSuperset.Seq.Int25	475	74	-84.4%	6.42x
SetIntersectionInt100	342	63	-81.6%	5.43x
Set.isSubset.Seq.Int50	630	152	-75.9%	4.14x
Set.isStrictSubset.Seq.Int50	629	155	-75.4%	4.06x
Set.isStrictSuperset.Seq.Int100	1090	305	-72.0%	3.57x
Set.isStrictSubset.Box0	352	101	-71.3%	3.49x
Set.isStrictSubset.Seq.Int100	1089	317	-70.9%	3.44x
Set.isStrictSuperset.Seq.Int50	473	149	-68.5%	3.17x
Set.intersection.Seq.Int50	303	96	-68.3%	3.16x
Set.intersection.Seq.Box25	542	175	-67.7%	3.10x
Set.intersection.Seq.Int25	234	77	-67.1%	3.04x
Set.intersection.Seq.Box0	407	136	-66.6%	2.99x
Set.intersection.Seq.Int0	156	57	-63.5%	2.74x
Set.subtracting.Seq.Int100	771	304	-60.6%	2.54x
Set.isStrictSubset.Int0	259	104	-59.8%	2.49x
Set.isSubset.Seq.Int100	786	316	-59.8%	2.49x
Set.filter.Int50.28k	1094	440	-59.8%	2.49x
Set.filter.Int50.16k	588	239	-59.4%	2.46x
Set.filter.Int50.20k	675	296	-56.1%	2.28x
SetSubtractingInt100	157	70	-55.4%	2.24x
Set.filter.Int50.24k	771	356	-53.8%	2.17x
SetIntersectionInt50	207	101	-51.2%	2.05x
Set.isStrictSubset.Seq.Box0	1300	682	-47.5%	1.91x
Set.isSubset.Seq.Box0	1298	692	-46.7%	1.88x
SetIntersectionBox25	277	163	-41.2%	1.70x
Set.isSubset.Seq.Int0	478	284	-40.6%	1.68x
SetIntersectionInt25	138	82	-40.6%	1.68x
Set.isStrictSubset.Seq.Int0	479	303	-36.7%	1.58x
Set.isStrictSuperset.Seq.Empty.Int	279	202	-27.6%	1.38x
Set.subtracting.Seq.Box0	895	683	-23.7%	1.31x
SetSubtractingBox0	178	137	-23.0%	1.30x
SetIntersectionBox0	147	117	-20.4%	1.26x
Set.isStrictSubset.Int.Empty	84	71	-15.5%	1.18x
SetSubtractingInt50	124	105	-15.3%	1.18x
SetSubtractingBox25	291	251	-13.7%	1.16x
Set.subtracting.Seq.Box25	1410	1239	-12.1%	1.14x (?)
Set.subtracting.Seq.Int50	544	486	-10.7%	1.12x (?)
DictionaryBridgeToObjC_Access	1193	1079	-9.6%	1.11x (?)
RemoveWhereFilterString	322	293	-9.0%	1.10x
SetSubtractingInt0	69	63	-8.7%	1.10x (?)
ObjectiveCBridgeStubFromNSString	1766	1623	-8.1%	1.09x (?)
Set.isStrictSubset.Empty.Int	169	157	-7.1%	1.08x (?)

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
SetTests.o	110066	120179	+9.2%	0.92x

Performance (x86_64): -Onone

Regression	OLD	NEW	DELTA	RATIO
Set.isDisjoint.Seq.Empty.Int	435	733	+68.5%	0.59x
Set.isDisjoint.Seq.Empty.Box	456	747	+63.8%	0.61x
Set.isSuperset.Seq.Empty.Int	624	894	+43.3%	0.70x (?)
ArrayAppendAscii	15334	21624	+41.0%	0.71x (?)
ArrayAppendUTF16	15402	21624	+40.4%	0.71x
Set.isDisjoint.Seq.Int.Empty	698	973	+39.4%	0.72x (?)
ArrayAppendLatin1	15640	21692	+38.7%	0.72x (?)
Set.isDisjoint.Seq.Box.Empty	722	1001	+38.6%	0.72x
Set.isSubset.Seq.Empty.Int	311	418	+34.4%	0.74x
Set.subtracting.Box.Empty	149	189	+26.8%	0.79x
Set.subtracting.Int.Empty	160	200	+25.0%	0.80x
AnyHashableWithAClass	129500	157000	+21.2%	0.82x (?)
Set.isDisjoint.Seq.Int100	1846	2166	+17.3%	0.85x (?)
Set.isSuperset.Seq.Int0	1744	2041	+17.0%	0.85x (?)
Set.isSuperset.Seq.Int.Empty	1686	1969	+16.8%	0.86x (?)
SetSubtractingBox25	855	993	+16.1%	0.86x (?)
Set.isSuperset.Seq.Box0	2007	2304	+14.8%	0.87x (?)
Set.subtracting.Seq.Box.Empty	894	980	+9.6%	0.91x (?)
Set.subtracting.Seq.Box25	8726	9540	+9.3%	0.91x (?)
Set.subtracting.Seq.Int.Empty	872	953	+9.3%	0.92x (?)
StringInterpolation	11500	12400	+7.8%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
Set.isStrictSuperset.Seq.Box0	84279	255	-99.7%	330.50x
Set.isStrictSuperset.Seq.Int0	64464	227	-99.6%	283.98x
Set.filter.Int100.28k	5348	326	-93.9%	16.40x
Set.filter.Int100.16k	2886	185	-93.6%	15.60x
Set.filter.Int100.20k	3390	230	-93.2%	14.74x
Set.filter.Int100.24k	3969	275	-93.1%	14.43x
SetIntersectionInt100	876	117	-86.6%	7.49x
Set.isStrictSubset.Seq.Int25	6707	1320	-80.3%	5.08x
Set.isSubset.Seq.Int25	6648	1319	-80.2%	5.04x
Set.isStrictSuperset.Seq.Int25	6467	1314	-79.7%	4.92x
Set.isSubset.Seq.Box25	9282	1916	-79.4%	4.84x
Set.isStrictSubset.Seq.Box25	9257	1936	-79.1%	4.78x
Set.isStrictSuperset.Seq.Box25	8547	1927	-77.5%	4.44x
SetSubtractingInt100	491	167	-66.0%	2.94x
Set.filter.Int50.28k	3108	1059	-65.9%	2.93x
SetIntersectionInt50	529	181	-65.8%	2.92x
Set.filter.Int50.16k	1701	609	-64.2%	2.79x
Set.isStrictSubset.Box0	858	318	-62.9%	2.70x
Set.isStrictSubset.Seq.Int50	6875	2599	-62.2%	2.65x
Set.isSubset.Seq.Int50	6888	2619	-62.0%	2.63x
Set.filter.Int50.20k	2012	796	-60.4%	2.53x
Set.isStrictSuperset.Seq.Int50	6521	2608	-60.0%	2.50x
Set.filter.Int50.24k	2366	1019	-56.9%	2.32x
SetIntersectionInt25	345	149	-56.8%	2.32x
Set.isStrictSubset.Int0	603	287	-52.4%	2.10x
Set.intersection.Seq.Int100	2163	1039	-52.0%	2.08x
SetSubtractingInt0	307	156	-49.2%	1.97x
SetSubtractingInt50	433	239	-44.8%	1.81x
Set.intersection.Seq.Box25	2772	1607	-42.0%	1.72x
SetIntersectionBox25	1049	638	-39.2%	1.64x
Set.intersection.Seq.Int50	1832	1129	-38.4%	1.62x
Set.intersection.Seq.Box0	2227	1454	-34.7%	1.53x
Set.isStrictSubset.Seq.Int100	7957	5198	-34.7%	1.53x
Set.isStrictSuperset.Seq.Int100	7958	5233	-34.2%	1.52x
Set.intersection.Seq.Int25	1651	1095	-33.7%	1.51x
Set.isStrictSubset.Int.Empty	421	291	-30.9%	1.45x
SetSubtractingInt25	391	281	-28.1%	1.39x
Set.isSubset.Seq.Int100	7267	5255	-27.7%	1.38x
Set.intersection.Seq.Int0	1438	1053	-26.8%	1.37x
Set.subtracting.Seq.Int100	7037	5195	-26.2%	1.35x
SetSubtractingBox0	682	522	-23.5%	1.31x
Set.subtracting.Seq.Empty.Int	610	477	-21.8%	1.28x
Set.subtracting.Seq.Empty.Box	634	497	-21.6%	1.28x
Set.isStrictSubset.Seq.Int0	6575	5258	-20.0%	1.25x
Set.isSubset.Seq.Int0	6465	5220	-19.3%	1.24x
Set.isStrictSuperset.Seq.Empty.Int	1374	1122	-18.3%	1.22x (?)
SetIntersect	1400	1150	-17.9%	1.22x
SetIntersectionInt0	140	115	-17.9%	1.22x
Set.isStrictSubset.Seq.Box0	8519	7254	-14.8%	1.17x
Set.isSubset.Seq.Box0	8500	7253	-14.7%	1.17x
Set.subtracting.Seq.Int50	6525	5641	-13.5%	1.16x
Set.subtracting.Seq.Int0	6060	5263	-13.2%	1.15x (?)
Set.subtracting.Seq.Box0	8103	7235	-10.7%	1.12x (?)
Set.isStrictSubset.Seq.Int.Empty	1267	1133	-10.6%	1.12x (?)
Set.subtracting.Seq.Int25	6343	5885	-7.2%	1.08x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

swift-ci · 2021-11-02T06:58:08Z

Build failed
Swift Test OS X Platform
Git Sha - 2e3e88c

lorentey · 2021-11-02T07:29:12Z

Huh, a mere 5-10% overall code size increase on SetTests.o seems like a welcome surprise!

I have some assertion failures I'll need to investigate. (That isn't cause for alarm, this PR is a draft for a reason. 🙈)

lorentey · 2021-11-03T03:17:21Z

@swift-ci benchmark

lorentey · 2021-11-03T03:17:28Z

@swift-ci test

swift-ci · 2021-11-03T04:15:46Z

Performance (x86_64): -O

Regression	OLD	NEW	DELTA	RATIO
Set.subtracting.Empty.Box	15	36	+140.0%	0.42x
Set.subtracting.Seq.Int.Empty	137	242	+76.6%	0.57x
Set.subtracting.Seq.Empty.Int	135	233	+72.6%	0.58x
Set.subtracting.Seq.Empty.Box	136	233	+71.3%	0.58x
Set.subtracting.Empty.Int	27	46	+70.4%	0.59x
Set.subtracting.Seq.Int25	421	582	+38.2%	0.72x (?)
Set.subtracting.Seq.Box.Empty	197	272	+38.1%	0.72x
SetSubtractingInt25	97	123	+26.8%	0.79x
UTF8Decode_InitDecoding_ascii	253	288	+13.8%	0.88x (?)
NSStringConversion.Rebridge.UTF8	603	673	+11.6%	0.90x (?)
CharacterLiteralsLarge	97	108	+11.3%	0.90x (?)
LessSubstringSubstring	39	43	+10.3%	0.91x (?)
EqualSubstringSubstring	39	42	+7.7%	0.93x (?)
EqualStringSubstring	39	42	+7.7%	0.93x (?)
EqualSubstringString	39	42	+7.7%	0.93x (?)
ConvertFloatingPoint.MockFloat64Exactly2	26	28	+7.7%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
Set.isStrictSuperset.Seq.Box0	10790	38	-99.6%	283.94x
Set.isStrictSuperset.Seq.Int0	4804	26	-99.5%	184.76x
Set.filter.Int100.28k	1915	50	-97.4%	38.30x
Set.filter.Int100.16k	1007	28	-97.2%	35.96x
Set.filter.Int100.20k	1126	35	-96.9%	32.17x
Set.filter.Int100.24k	1267	42	-96.7%	30.17x
Set.isSubset.Seq.Box25	1230	101	-91.8%	12.18x
Set.isStrictSubset.Seq.Box25	1236	103	-91.7%	12.00x
Set.isStrictSuperset.Seq.Box25	1087	102	-90.6%	10.66x
Set.isSubset.Seq.Int25	555	74	-86.7%	7.50x
Set.intersection.Seq.Int100	434	59	-86.4%	7.36x
Set.isStrictSubset.Seq.Int25	553	76	-86.3%	7.28x
Set.isStrictSuperset.Seq.Int25	481	73	-84.8%	6.59x
SetIntersectionInt100	342	62	-81.9%	5.52x
Set.intersection.Seq.Box0	364	79	-78.3%	4.61x
Set.isSubset.Seq.Int50	630	151	-76.0%	4.17x
Set.isStrictSubset.Seq.Int50	636	154	-75.8%	4.13x
Set.intersection.Seq.Box25	476	118	-75.2%	4.03x
Set.isStrictSuperset.Seq.Int100	1086	306	-71.8%	3.55x
Set.isStrictSubset.Seq.Int100	1083	307	-71.7%	3.53x
Set.isStrictSubset.Box0	340	104	-69.4%	3.27x
Set.isStrictSuperset.Seq.Int50	487	149	-69.4%	3.27x
Set.intersection.Seq.Int50	305	96	-68.5%	3.18x
Set.intersection.Seq.Int25	237	77	-67.5%	3.08x
Set.intersection.Seq.Int0	157	56	-64.3%	2.80x
Set.isSubset.Seq.Box0	1081	391	-63.8%	2.76x
Set.isStrictSubset.Seq.Box0	1085	415	-61.8%	2.61x
Set.isSubset.Seq.Int100	782	310	-60.4%	2.52x
Set.isStrictSubset.Int0	257	104	-59.5%	2.47x
Set.subtracting.Seq.Int100	753	307	-59.2%	2.45x
Set.filter.Int50.28k	1086	450	-58.6%	2.41x
Set.filter.Int50.16k	583	244	-58.1%	2.39x
SetSubtractingInt100	153	67	-56.2%	2.28x
Set.filter.Int50.20k	670	302	-54.9%	2.22x
Set.filter.Int50.24k	765	363	-52.5%	2.11x
SetIntersectionInt50	204	100	-51.0%	2.04x
SetIntersectionBox25	240	125	-47.9%	1.92x
Set.isSubset.Seq.Int0	481	280	-41.8%	1.72x
SetIntersectionInt25	138	81	-41.3%	1.70x
Set.subtracting.Seq.Box0	744	444	-40.3%	1.68x
Set.isStrictSubset.Seq.Int0	480	292	-39.2%	1.64x
Set.isStrictSuperset.Seq.Empty.Int	278	174	-37.4%	1.60x
SetIntersectionBox0	132	84	-36.4%	1.57x
Set.subtracting.Seq.Box25	1214	956	-21.3%	1.27x
SetSubtractingInt50	121	104	-14.0%	1.16x
Set.isStrictSubset.Int.Empty	80	70	-12.5%	1.14x
SetSubtractingBox0	148	130	-12.2%	1.14x (?)
SetSubtractingInt0	68	61	-10.3%	1.11x (?)
Set.isSubset.Seq.Int.Empty	193	174	-9.8%	1.11x (?)
Breadcrumbs.MutatedUTF16ToIdx.Mixed	307	279	-9.1%	1.10x (?)
Set.isStrictSubset.Seq.Int.Empty	192	178	-7.3%	1.08x (?)
StringBuilder	331	308	-6.9%	1.07x (?)

Code size: -O

Regression	OLD	NEW	DELTA	RATIO
SetTests.o	131325	142173	+8.3%	0.92x

Improvement	OLD	NEW	DELTA	RATIO
DictTest3.o	15981	15085	-5.6%	1.06x

Performance (x86_64): -Osize

Regression	OLD	NEW	DELTA	RATIO
Set.subtracting.Empty.Box	15	36	+140.0%	0.42x
Set.subtracting.Seq.Empty.Box	137	237	+73.0%	0.58x
Set.subtracting.Seq.Int.Empty	138	238	+72.5%	0.58x
Set.subtracting.Seq.Empty.Int	134	226	+68.7%	0.59x
Set.subtracting.Empty.Int	27	45	+66.7%	0.60x
Set.subtracting.Seq.Box.Empty	195	274	+40.5%	0.71x
Set.subtracting.Seq.Int25	438	580	+32.4%	0.76x
AnyHashableWithAClass	93500	119000	+27.3%	0.79x (?)
SetSubtractingInt25	99	122	+23.2%	0.81x
LessSubstringSubstring	40	44	+10.0%	0.91x (?)
Set.isSubset.Seq.Int.Empty	195	212	+8.7%	0.92x
Array2D	7520	8112	+7.9%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
Set.isStrictSuperset.Seq.Box0	12899	45	-99.7%	286.64x
Set.isStrictSuperset.Seq.Int0	4697	30	-99.4%	156.56x
Set.filter.Int100.28k	1912	51	-97.3%	37.49x
Set.filter.Int100.16k	1004	29	-97.1%	34.62x
Set.filter.Int100.20k	1124	35	-96.9%	32.11x
Set.filter.Int100.24k	1263	42	-96.7%	30.07x
Set.isSubset.Seq.Box25	1469	167	-88.6%	8.80x
Set.isStrictSubset.Seq.Box25	1469	168	-88.6%	8.74x
Set.isStrictSuperset.Seq.Box25	1307	162	-87.6%	8.07x
Set.isSubset.Seq.Int25	552	74	-86.6%	7.46x
Set.intersection.Seq.Int100	430	59	-86.3%	7.29x
Set.isStrictSubset.Seq.Int25	553	76	-86.3%	7.28x
Set.isStrictSuperset.Seq.Int25	476	74	-84.5%	6.43x
SetIntersectionInt100	342	64	-81.3%	5.34x
Set.isSubset.Seq.Int50	629	152	-75.8%	4.14x
Set.isStrictSubset.Seq.Int50	629	154	-75.5%	4.08x
Set.isStrictSuperset.Seq.Int100	1094	307	-71.9%	3.56x
Set.isStrictSubset.Box0	352	101	-71.3%	3.49x
Set.isStrictSubset.Seq.Int100	1090	313	-71.3%	3.48x
Set.intersection.Seq.Int50	303	96	-68.3%	3.16x
Set.isStrictSuperset.Seq.Int50	472	150	-68.2%	3.15x
Set.intersection.Seq.Box25	542	174	-67.9%	3.11x
Set.intersection.Seq.Int25	234	77	-67.1%	3.04x
Set.intersection.Seq.Box0	407	136	-66.6%	2.99x
Set.intersection.Seq.Int0	156	57	-63.5%	2.74x
Set.isStrictSubset.Int0	260	101	-61.2%	2.57x
Set.isSubset.Seq.Int100	785	311	-60.4%	2.52x
Set.subtracting.Seq.Int100	769	308	-59.9%	2.50x
Set.filter.Int50.28k	1094	447	-59.1%	2.45x
Set.filter.Int50.16k	588	243	-58.7%	2.42x
SetSubtractingInt100	158	68	-57.0%	2.32x
Set.filter.Int50.20k	675	302	-55.3%	2.24x
Set.filter.Int50.24k	771	361	-53.2%	2.14x
SetIntersectionInt50	208	101	-51.4%	2.06x
Set.isStrictSubset.Seq.Box0	1299	679	-47.7%	1.91x
Set.isSubset.Seq.Box0	1298	688	-47.0%	1.89x
SetIntersectionInt25	139	82	-41.0%	1.70x
SetIntersectionBox25	277	164	-40.8%	1.69x
Set.isSubset.Seq.Int0	478	286	-40.2%	1.67x
Set.isStrictSubset.Seq.Int0	477	306	-35.8%	1.56x
FlattenListLoop	2540	1659	-34.7%	1.53x (?)
Set.isStrictSuperset.Seq.Empty.Int	278	207	-25.5%	1.34x
Set.subtracting.Seq.Box0	894	672	-24.8%	1.33x
SetSubtractingBox0	178	136	-23.6%	1.31x
SetIntersectionBox0	146	119	-18.5%	1.23x
SetSubtractingInt50	124	104	-16.1%	1.19x
Set.isStrictSubset.Int.Empty	84	71	-15.5%	1.18x
SetSubtractingBox25	291	251	-13.7%	1.16x
SetSubtractingInt0	69	61	-11.6%	1.13x
Set.subtracting.Seq.Box25	1410	1249	-11.4%	1.13x (?)
Set.subtracting.Seq.Int50	547	492	-10.1%	1.11x
StringBuilder	330	304	-7.9%	1.09x (?)
String.data.LargeUnicode	120	112	-6.7%	1.07x (?)

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
SetTests.o	110066	124097	+12.7%	0.89x

Performance (x86_64): -Onone

Regression	OLD	NEW	DELTA	RATIO
Set.isDisjoint.Seq.Empty.Int	436	730	+67.4%	0.60x
Set.isDisjoint.Seq.Empty.Box	455	753	+65.5%	0.60x
Set.isSuperset.Seq.Empty.Int	626	907	+44.9%	0.69x
Set.isDisjoint.Seq.Int.Empty	700	997	+42.4%	0.70x (?)
ArrayAppendAscii	15334	21794	+42.1%	0.70x (?)
ArrayAppendUTF16	15436	21624	+40.1%	0.71x (?)
Set.isDisjoint.Seq.Box.Empty	724	1012	+39.8%	0.72x
ArrayAppendLatin1	15640	21658	+38.5%	0.72x (?)
Set.isSubset.Seq.Empty.Int	311	420	+35.0%	0.74x
SetSubtractingBox25	855	1127	+31.8%	0.76x
Set.subtracting.Seq.Box25	8687	10268	+18.2%	0.85x (?)
Set.isDisjoint.Seq.Int100	1864	2186	+17.3%	0.85x (?)
Set.subtracting.Empty.Box	116	135	+16.4%	0.86x (?)
Set.subtracting.Empty.Int	129	147	+14.0%	0.88x
DataCreateMediumArray	6360	7220	+13.5%	0.88x (?)
RandomDoubleOpaqueDef	51100	56000	+9.6%	0.91x (?)
RandomDoubleDef	50700	55300	+9.1%	0.92x (?)

Improvement	OLD	NEW	DELTA	RATIO
Set.isStrictSuperset.Seq.Box0	84431	254	-99.7%	332.40x
Set.isStrictSuperset.Seq.Int0	65051	228	-99.6%	285.31x
Set.filter.Int100.28k	5356	326	-93.9%	16.43x
Set.filter.Int100.16k	2890	185	-93.6%	15.62x
Set.filter.Int100.20k	3392	231	-93.2%	14.68x
Set.filter.Int100.24k	3970	276	-93.0%	14.38x
SetIntersectionInt100	875	119	-86.4%	7.35x
Set.isSubset.Seq.Int25	6733	1311	-80.5%	5.14x
Set.isStrictSubset.Seq.Int25	6716	1324	-80.3%	5.07x
Set.isStrictSuperset.Seq.Int25	6508	1310	-79.9%	4.97x
Set.isSubset.Seq.Box25	9267	1916	-79.3%	4.84x
Set.isStrictSubset.Seq.Box25	9271	1947	-79.0%	4.76x
Set.isStrictSuperset.Seq.Box25	8615	1918	-77.7%	4.49x
Set.filter.Int50.28k	3109	1053	-66.1%	2.95x
SetIntersectionInt50	528	184	-65.2%	2.87x
Set.filter.Int50.16k	1700	602	-64.6%	2.82x
Set.isStrictSubset.Box0	859	307	-64.3%	2.80x
Set.isSubset.Seq.Int50	6908	2599	-62.4%	2.66x
Set.isStrictSubset.Seq.Int50	6849	2610	-61.9%	2.62x
Set.filter.Int50.20k	2013	790	-60.8%	2.55x
Set.isStrictSuperset.Seq.Int50	6525	2591	-60.3%	2.52x
Set.filter.Int50.24k	2365	1020	-56.9%	2.32x
SetIntersectionInt25	344	150	-56.4%	2.29x
Set.isStrictSubset.Int0	603	287	-52.4%	2.10x
Set.intersection.Seq.Int100	2176	1039	-52.3%	2.09x
SetSubtractingInt0	307	157	-48.9%	1.96x
SetSubtractingInt100	491	279	-43.2%	1.76x
Set.intersection.Seq.Box25	2787	1610	-42.2%	1.73x
SetIntersectionBox25	1050	637	-39.3%	1.65x
Set.intersection.Seq.Int50	1838	1123	-38.9%	1.64x
Set.intersection.Seq.Box0	2237	1443	-35.5%	1.55x
Set.isStrictSubset.Seq.Int100	7976	5187	-35.0%	1.54x
Set.isStrictSuperset.Seq.Int100	7955	5230	-34.3%	1.52x
Set.intersection.Seq.Int25	1656	1093	-34.0%	1.52x
Set.isStrictSubset.Int.Empty	422	291	-31.0%	1.45x
Set.isSubset.Seq.Int100	7297	5226	-28.4%	1.40x (?)
Set.intersection.Seq.Int0	1447	1052	-27.3%	1.38x
SetSubtractingBox0	683	518	-24.2%	1.32x
Set.isSubset.Seq.Int0	6569	5266	-19.8%	1.25x
Set.isStrictSubset.Seq.Int0	6509	5290	-18.7%	1.23x
SetIntersectionInt0	140	115	-17.9%	1.22x
Set.subtracting.Seq.Int100	7060	5843	-17.2%	1.21x (?)
SetIntersect	1400	1160	-17.1%	1.21x
Set.isStrictSuperset.Seq.Empty.Int	1374	1152	-16.2%	1.19x (?)
Set.isSubset.Seq.Box0	8527	7203	-15.5%	1.18x
Set.isStrictSubset.Seq.Box0	8516	7249	-14.9%	1.17x
SetSubtractingInt50	433	374	-13.6%	1.16x
Set.subtracting.Seq.Int0	6085	5267	-13.4%	1.16x (?)
Breadcrumbs.MutatedUTF16ToIdx.Mixed	316	287	-9.2%	1.10x (?)
Set.subtracting.Seq.Box0	7948	7231	-9.0%	1.10x (?)
Set.isStrictSubset.Seq.Int.Empty	1266	1152	-9.0%	1.10x (?)
Breadcrumbs.MutatedIdxToUTF16.Mixed	330	304	-7.9%	1.09x (?)
Data.init.Sequence.64kB.Count0.RE.I	31942	29574	-7.4%	1.08x (?)
Data.append.Sequence.64kB.Count0.RE.I	31806	29524	-7.2%	1.08x (?)
Data.append.Sequence.809B.Count0.RE	39439	36642	-7.1%	1.08x (?)
Data.append.Sequence.809B.Count0.RE.I	39356	36714	-6.7%	1.07x (?)
Data.init.Sequence.64kB.Count0.RE	31730	29603	-6.7%	1.07x (?)
Data.init.Sequence.809B.Count0.RE	39648	37033	-6.6%	1.07x (?)

Code size: -swiftlibs

swift-ci · 2021-11-03T04:46:48Z

Build failed
Swift Test Linux Platform
Git Sha - af06df9

lorentey · 2021-11-03T19:00:42Z

@swift-ci test

lorentey added 10 commits November 1, 2021 21:38

[stdlib] _UnsafeBitset.withTemporaryBitset: New internal function

e2415f7

[stdlib] _SetVariant.convertedToNative: New convenience function

6b1f6c0

[stdlib] Optimize Set.isSubset<S>(of: S)

5e642d2

Use a temporary bitset to speed up the Sequence variant by roughly a factor of 3.

[stdlib] Optimize Set.isStrictSubset<S>(of: S)

9043c71

Use a temporary bitset to speed up the Sequence variant by roughly a factor of 3.

[stdlib] Optimize Set.isSuperset<S>(of: S)

725ee55

Call into the specialized overload if the argument happens to be a `Set`.

[stdlib] Optimize Set.isStrictSuperset(of:)

a2540c2

- Use a temporary bitset to speed up the `Sequence` variant by roughly a factor of 4.
- Fix a logic error causing the `a == b` case for the set variant to be O(n) instead of O(1).

[stdlib] Add a fast path to Set.isDisjoint<S>(with: S)

8612f2f

Have the generic variant call out to the specialized overload if the argument happens to be a `Set`.

[stdlib] Optimize Set.subtracting(_:)

80296bb

Use a temporary bitset to avoid hashing elements more than once, and to prevent rehashings during the creation of the result set. This leads to a speedup of about 0-4x, depending on the number of elements removed.

[stdlib] Optimize Set.filter(_:)

db80c78

This works the same way as `Set.subtracting<S>(_:)`, and has similar performance benefits.

[stdlib] Optimize Set.intersection(_:)

2e3e88c

Use a temporary bitset to speed up the `Sequence` variant by roughly a factor of ~4-6, and the set/set variant by a factor of ~1-4, depending on the ratio of overlapping elements.

lorentey mentioned this pull request Nov 2, 2021

[stdlib] Optimize some Set operations #21300

Closed

lorentey added 3 commits November 2, 2021 19:30

[stdlib] Delete bogus assert

f82b299

[stdlib][test] Add a bit more extensive tests for high-level set operations

8e4b53b

[stdlib] Optimize Set.subtracting even more

af06df9

lorentey marked this pull request as ready for review November 3, 2021 03:19

[test] Fix error in non-Darwin builds

172b1b8

lorentey requested review from timvermeulen, glessard and natecook1000 November 3, 2021 20:54

lorentey merged commit 6d33683 into swiftlang:main Nov 5, 2021

lorentey deleted the set-on-fire2 branch November 5, 2021 20:48

lorentey mentioned this pull request Dec 9, 2021

Counted Set apple/swift-collections#132

Draft

7 tasks

lorentey mentioned this pull request Jan 13, 2025

✨ Add remove(where:) Method to Set and Dictionary for Seamless Element Removal ✨ #78600

Closed
