-
Notifications
You must be signed in to change notification settings - Fork 625
Add ordering to Index implementation #2995
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Coverage ReportAffected SDKs
Test Logs
NotesHTML coverage reports can be produced locally with Head commit (9259321c5) is created by Prow via merging commits: d5eec48 5957bd7. |
Binary Size ReportAffected SDKs
Test Logs
NotesHead commit (9259321c5) is created by Prow via merging commits: d5eec48 5957bd7. |
46ebaa6
to
99a91e5
Compare
99a91e5
to
82b91a0
Compare
82b91a0
to
26e680d
Compare
/test smoke-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address the comment in SQLiteSchema.java
first, all others are insignificant.
@@ -115,6 +114,27 @@ public boolean hasLimit() { | |||
return endAt; | |||
} | |||
|
|||
/** Returns the list of values that are used in ARRAY_CONTAINS and ARRAY_CONTAINS_ANY filter. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultranit: _CONTAINS or ARRAY_CONTAINS_ANY filter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -142,26 +162,17 @@ public Bound getLowerBound(FieldIndex fieldIndex) { | |||
case LESS_THAN_OR_EQUAL: | |||
filterValue = Values.getLowerBound(fieldFilter.getValue().getValueTypeCase()); | |||
break; | |||
case NOT_EQUAL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOT_EQUAL is also deleted here because we do not support it yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can user the default value for this and other cases (see line 131 in the old code, 151 in the new code).
String lowerBoundOp, | ||
@Nullable Object[] upperBounds, | ||
String upperBoundOp) { | ||
int statementCount = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this method (and below one) deserve some implementation comments, to explain what it is doing and how. I found them pretty hard to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Comment added.
+ "uid TEXT, " // user id or null if there are no pending mutations | ||
+ "document_name TEXT, " | ||
+ "PRIMARY KEY (index_id, index_value, uid, document_name))"); | ||
+ "PRIMARY KEY (index_id, array_value, directional_value, uid, document_name))"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having both array_value
and directional_value
would lead less efficient queries actually:
sqlite> create table index_entries_test ( index_id INTEGER, directional_value TEXT, array_value TEXT, uid TEXT, document_id TEXT, PRIMARY KEY (index_id, array_value, directional_value, uid, document_id));
QUERY PLAN
`--SEARCH TABLE sqlite_master USING INTEGER PRIMARY KEY (rowid=?)
sqlite> select * from index_entries_test where (index_id=1 and directional_value == 'x') order by directional_value limit 2;
QUERY PLAN
`--SEARCH TABLE index_entries_test USING COVERING INDEX sqlite_autoindex_index_entries_test_1 (index_id=?)
sqlite> select * from index_entries_test where (index_id=1 and array_value == 'x') order by directional_value limit 2;
QUERY PLAN
`--SEARCH TABLE index_entries_test USING COVERING INDEX sqlite_autoindex_index_entries_test_1 (index_id=? AND array_value=?)
Note filtering on directional_value
without filter array_value
leads to full index scan.
If I am not mistaken, both order by
and limit
are served by our own code currently? So we can do both without this PR right?
If we do want to do this, directional_value
should be before array_value
as it is more common. For array queries we would then need to create a separate index on index_id, array_value
and tell SQLite to use that explicitly.
I am leaning towards the status before this PR though, because other platforms will have to do it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only applying the query limit locally can be really inefficient since we would always read all matching values even if the limit is low. This then imposes ordering, which unfortunately makes it necessary to split up array values from the other field values (as explained in the PR description).
We can actually use the array_value here even if it is not set by simply enforcing that it is null. We already don't use filters that have an array_value for non ArrayContains/ArrayContains any queries.
EXPLAIN QUERY PLAN select * from index_entries_test where (index_id=1 and array_value IS NULL AND directional_value == 'x') order by directional_value limit 2;
QUERY PLAN
`--SEARCH TABLE index_entries_test USING COVERING INDEX sqlite_autoindex_index_entries_test_1 (index_id=? AND array_value=? AND directional_value=?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, always having a filter on array_value
should fix this.
Please add a comment to document this in SQLiteIndexManager when the query is built.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, limit
is probably still done in memory, by SQLite, but I think we can live with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will merge this PR as is and add the comment in the follow up PR (the one that fixes IN queries).
@schmidt-sebastian: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
This PR adds
limit()
support to our query engine. Unfortunately, this means that we have to enforce ordering as we need to filter the "best n" matches directly in the database layer.This again changes our indexing format. This test is a good example as to why:
This filters by the new "array_value" column and then sorts by "directional_value". "array_value" contains one value per array element while "directional_value" contains the full array as is.