Skip to content

[Performance] iterate the smaller set during Set.intersect() #576

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 20, 2015

Conversation

dduan
Copy link
Contributor

@dduan dduan commented Dec 16, 2015

For Set's intersect() operation, iterate over the smaller of the two.

This change introduces about 1% overhead in speed for average case a small constant cost, but greatly optimizes worst case.

Benchmark result:

master, -R

small ∩ large * 100 -- 0.268759667873383
large ∩ small * 100 -- 7.67079079151154
small ∩ small * 100 -- 0.279738128185272
large ∩ large * 100 -- 17.1418876051903
total time spent ----- 25.3611761927605

optimized, -R

small ∩ large * 100 -- 0.275687038898468
large ∩ small * 100 -- 0.267830967903137
small ∩ small * 100 -- 0.27216911315918
large ∩ large * 100 -- 17.287498831749
total time spent ----- 18.1031859517097

Note: the benchmark was written for intersectInPlace(), which simply calls intersect(). Tests in *validation-test/stdlib/Set.swift * passes.

@dduan dduan changed the title iterate the smaller set during Set.intersect() [Performance] iterate the smaller set during Set.intersect() Dec 16, 2015
@dabrahams
Copy link
Contributor

Do you have any theory about why there's even a 1% slowdown in the average case?

I should note that the implementation before your patch was suboptimal in other ways. It's really a bad idea to convert the other sequence to a Set. We should use underestimateCount() on other instead of building a Set and using count so we are not traversing it twice or creating needless storage. We should (probably) allocate the newSet with an initial capacity equal to the smaller of the two sizes. Care to try that arrangement? I'd be happy to merge this pull request first if you think that's best.

@dduan
Copy link
Contributor Author

dduan commented Dec 16, 2015

@dabrahams I just ran the benchmark a few more times. It seems that 1% is within normal fluctuation between runs of the same test. Bad reporting here.

@dduan
Copy link
Contributor Author

dduan commented Dec 16, 2015

@dabrahams My understanding is contains() would consume a singe-pass sequence, therefore we must deal with this somehow. Converting other to Set, therefore, is an OK option. Correct me if I'm wrong.

@dabrahams
Copy link
Contributor

@dduan Regarding the 1%, that's reassuring, and what I would have expected. Regarding converting other to Set, it's "OK" semantically, but it is very wasteful. If other is not a set (e.g. an array) we should treat it as the smaller of the two arguments.

@dduan dduan force-pushed the set_intersect_optimization branch from dfb9e14 to cd6d28f Compare December 16, 2015 08:15
@dduan
Copy link
Contributor Author

dduan commented Dec 16, 2015

@dabrahams Done!

@dduan dduan force-pushed the set_intersect_optimization branch 2 times, most recently from c89da4c to 5612d83 Compare December 16, 2015 09:57
}
} else {
// sequence is ether a Set with a smaller count or not a Set at all.
newSet = Set<Element>(minimumCapacity: sequence.underestimateCount())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to use min(sequence.underestimateCount(), count) for minimumCapacity. In that way you don't allocate too much memory for the new set, if the sequence is large and the set is small.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Fixing now.

@dduan dduan force-pushed the set_intersect_optimization branch 2 times, most recently from 104fdce to 27f01a3 Compare December 17, 2015 04:09
var newSet: Set<Element>

if let other = sequence as? Set<Element> {
newSet = Set<Element>(minimumCapacity: min(other.count, count))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if requesting capacity is the right thing here. We don't have a guarantee that the result will have that many elements. The result could even be empty.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I think we should just create an empty set without requesting capacity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gribozavr the resulting size could be anywhere between 0 and min(other.count, count).

The closer we allocate towards min(other.count, count), the more we risk wasting space.
The closer we allocate towards 0, the more we risk wasting time resizing the storage.

I don't think there's a "right" way to do this. Before I make the change you suggested, just want to point out this trade off.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gribozavr @florianreinhart I ran benchmark for both arrangements and here's the result. It only represent the time aspect, but it does confirm my theory above.

newSet = Set<Element>()
small ∩ large * 100 -- 0.313429653644562
large ∩ small * 100 -- 9.32736319303513
small ∩ small * 100 -- 0.30213850736618
large ∩ large * 100 -- 19.0307229161263
total time spent ----- 28.9736542701721

newSet = Set<Element>(minimumCapacity: min(other.count, count))
small ∩ large * 100 -- 0.262056231498718
large ∩ small * 100 -- 9.16748160123825
small ∩ small * 100 -- 0.256339490413666
large ∩ large * 100 -- 15.4998977780342
total time spent ----- 25.1857751011848

@dduan dduan force-pushed the set_intersect_optimization branch 4 times, most recently from 2e22552 to 2c022f9 Compare December 18, 2015 19:26
@dduan
Copy link
Contributor Author

dduan commented Dec 19, 2015

@gribozavr If memory is deemed more important than speed here, I'll make the change. Please let me know 😬

@gribozavr
Copy link
Contributor

@dduan Thank you for the benchmarks! Swift is being used on platforms that are memory-constrained, like iOS devices, so yes, my preference is to be conservative and not over-allocate.

For example, we don't reserve space in filter(), even though we have an upper bound on the number of elements. Continuing to do this in intersect() would be consistent with current practices.

@dduan dduan force-pushed the set_intersect_optimization branch 2 times, most recently from e1a8771 to d7fbe51 Compare December 19, 2015 06:51
@dduan
Copy link
Contributor Author

dduan commented Dec 19, 2015

@gribozavr Great, I've made the changes. Thank you for clarifying!

@florianreinhart
Copy link
Contributor

@dduan Now you aren't iterating over the smaller collection anymore!? If sequence is a Set you iterate over self. If sequence is not a Set you iterate over sequence.

Instead, I think we should do the following: If sequence is a Set compare the counts and iterate over the smaller Set. And if sequence isn't a Set, just iterate over sequence. Wasn't that your initial idea?

This introduces a small constant in speed, but it's a big win for
worst case scenario.
@dduan dduan force-pushed the set_intersect_optimization branch from d7fbe51 to d919f4c Compare December 19, 2015 12:27
@dduan
Copy link
Contributor Author

dduan commented Dec 19, 2015

@florianreinhart hmm, restored. Thanks for catching it. I ran the benchmark and tests on Set again just as a sanity check this time.

small ∩ large * 100 -- 0.269078969955444
large ∩ small * 100 -- 0.276514112949371
small ∩ small * 100 -- 0.286798119544983
large ∩ large * 100 -- 17.9460356235504
total time spent ----- 18.7784268260002

@gribozavr

@florianreinhart
Copy link
Contributor

Looks good to me!

@dduan
Copy link
Contributor Author

dduan commented Dec 19, 2015

@gribozavr any other concerns?

@gribozavr
Copy link
Contributor

@dduan Looks good, thanks!

gribozavr added a commit that referenced this pull request Dec 20, 2015
[Performance] iterate the smaller set during Set.intersect()
@gribozavr gribozavr merged commit 6f8bee1 into swiftlang:master Dec 20, 2015
freak4pc pushed a commit to freak4pc/swift that referenced this pull request Sep 28, 2022
XFAIL projects no longer passing after Xcode update
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants