-
-
Notifications
You must be signed in to change notification settings - Fork 5.9k
Use subquery to instead In #10874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use subquery to instead In #10874
Conversation
@@ -1153,6 +1155,14 @@ func (opts *IssuesOptions) setupSession(sess *xorm.Session) { | |||
} | |||
} | |||
} | |||
|
|||
if len(opts.IncludedLabelNames) > 0 { | |||
sess.In("issue.id", BuildLabelNamesIssueIDsCondition(opts.IncludedLabelNames)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guillep2k can we do better here? I'm suspicious that this In
could be quite expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WHERE EXISTS (SELECT ...)
is the equivalent idiom, but depending on the engine it could be better optimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It all depends on how sophisticated the engine is. Some engines will rearrange the whole query and be very clever on optimizing the access... others will just take your statement more or less literally, and therefore your syntax might influence the results drastically. And versions are important too. I read from time to time "XX SQL now takes this construct and treats it like xxx for better performance".
Formally, WHERE EXISTS (SELECT ...)
is equivalent to IN (SELECT ....)
; but if the engine is not too smart, EXISTS
is a kind of join that is more optimizable than a plain LOOP JOIN (e.g. using a HASH JOIN).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But performing the IN()
in a subquery is definitely more performant that SELECT ids
, and then `SELECT ... WHERE ... IN(id list).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @guillep2k said, it depends on the database engine. The sub query maybe cached or not. It may be faster than before or not. But it will give the right result however many there are labels. But for In
with id
, when there are over 500 variables, database will return error.
Codecov Report
@@ Coverage Diff @@
## master #10874 +/- ##
=======================================
Coverage 43.41% 43.42%
=======================================
Files 593 593
Lines 83332 83344 +12
=======================================
+ Hits 36178 36190 +12
+ Misses 42660 42659 -1
- Partials 4494 4495 +1
Continue to review full report at Codecov.
|
Sometimes id will be more than 500 that will result in 500.