-
Notifications
You must be signed in to change notification settings - Fork 943
Use Read-Time Index for Multi-Tab #2128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Read-Time Index for Multi-Tab #2128
Conversation
@@ -1144,7 +1144,7 @@ describeSpec('Listens:', [], () => { | |||
.client(1) | |||
.stealPrimaryLease() | |||
.expectListen(query, 'resume-token-2000') | |||
.watchAcksFull(query, 2000, docC) | |||
.watchAcksFull(query, 3000, docC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changed since we can use the read time to get a change diff, rather than a separate counter.
This should fix the CI failure in #2128
This should fix the CI failure in #2128
This should fix the CI failure in #2128
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM but a few nits and I think we should sanity-check with @wilhuff that we're okay with web storing sentinel delete documents that iOS / Android do not.
@@ -1237,7 +1211,9 @@ export class IndexedDbLruDelegate implements ReferenceDelegate, LruDelegate { | |||
upperBound: ListenSequenceNumber | |||
): PersistencePromise<number> { | |||
const documentCache = this.db.getRemoteDocumentCache(); | |||
const changeBuffer = documentCache.newChangeBuffer(); | |||
const changeBuffer = documentCache.newChangeBuffer({ | |||
createSentinelDocumentsToTrackDeletes: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps change "Deletes" => "Removals" or "RemovedEntries" since this corresponds to removeEntry() calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (trackRemovals as per later comment)
@@ -486,15 +447,22 @@ export class IndexedDbRemoteDocumentCache implements RemoteDocumentCache { | |||
promises.push(this.documentCache.addEntry(transaction, key, doc)); | |||
} else { | |||
sizeDelta -= previousSize!; | |||
promises.push(this.documentCache.removeEntry(transaction, key)); | |||
if (this.trackDeletes) { | |||
// A sentinel delete is symbolized by a NoDocument with a version of 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe expand on this comment?
In order to track deletes, we store "sentinel delete" entries in the Remote Document Cache. These entries are represented by a NoDocument with a version of 0 and are ignored by maybeDecodeDocument() but preserved in getNewDocumentChanges().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for the comment :)
newChangeBuffer(): RemoteDocumentChangeBuffer { | ||
return new IndexedDbRemoteDocumentCache.RemoteDocumentChangeBuffer(this); | ||
newChangeBuffer(options: { | ||
createSentinelDocumentsToTrackDeletes: boolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- You say "sentinelDocuments" here but refer to them as "sentinel deletes" elsewhere. Please use consistent terminology.
- We have this long name here but just
trackDeletes
in the RemoteDocumentChangeBuffer constructor. I am not sure which I prefer, but I can't think of a great reason for them to be different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you seem to be ok with the shorter name, I opted for "trackRemovals" everywhere.
DbClientMetadata.store, | ||
DbRemoteDocumentChanges.store | ||
]; | ||
export const V4_STORES = [...V3_STORES, DbClientMetadata.store]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm... I'm having uneasy feelings about revising history like this. I wonder if we should keep some hint about the prior existence of this store to keep our history of schema changes in-tact in this file. I'm not sure though. It also seems strange to keep dead code. Maybe ask @wilhuff what he thinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per our offline discussion, let's keep this as is. I added a comment about the prior existence of this store. Theoretically though it should not matter since we drop the store, so we can recreate it with the same name and use for a different purpose later.
@@ -345,7 +345,9 @@ export class LocalStore { | |||
'readwrite-primary', | |||
txn => { | |||
const affected = batchResult.batch.keys(); | |||
const documentBuffer = this.remoteDocuments.newChangeBuffer(); | |||
const documentBuffer = this.remoteDocuments.newChangeBuffer({ | |||
createSentinelDocumentsToTrackDeletes: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of me wonders if this should be the default behavior and GC should explicitly opt-out (e.g. flip the logic and rename to suppressSentinelDeleteDocuments
or something) since reading this code, I would have trouble guessing what this is for and why it is enabled in this code path. By contrast, the GC code could have a comment explaining why we don't want to track deletes in that case.
Alternatively, since this option probably corresponds 100% to whether this is GC code or not, we could consider specializing the name to match (e.g. forGarbageCollection: true
or something)
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little torn on this. I think the current naming makes the behavior more explicit (compared to forGarbageCollection
) and I am not a big fan of flipping the behavior since we are turning on an extra feature after all. I do see your point though and can probably be won over somewhat easily.
For now, I made the argument optional and added a comment in the two places that requires us to turn trackRemovals
on.
@@ -177,7 +178,9 @@ export class MemoryRemoteDocumentCache implements RemoteDocumentCache { | |||
return PersistencePromise.resolve(changedDocs); | |||
} | |||
|
|||
newChangeBuffer(): RemoteDocumentChangeBuffer { | |||
newChangeBuffer(options: { | |||
createSentinelDocumentsToTrackDeletes: boolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps there should be a comment indicating we explicitly ignore this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. This makes me thing we should never have added the change log to the MemoryRemoteDocumentCache.
} | ||
) | ||
.next(() => { | ||
// The old documents are not included since readTime was not populated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice if this test added some documents with index entries but that were older than version=1 to verify that are not included either...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extracted this into a separate test ("can get recent document changes") that does.
* Update typescript-eslint monorepo to v2 * fix breaking changes
This should fix the CI failure in #2128
4879484
to
fc9ed49
Compare
fc9ed49
to
8142123
Compare
I used "Sebastian's guide to screw with your commit history for this PR". Ugh, I totally screwed this up. Do you mind reviewing the full diff again? I also had to add the readTime as an argument for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM but in order to be consistent with my previous feedback, I'd like to see if we can remove the version number from removeEntry()... again. 😬
changeBuffer.removeEntry( | ||
docKey, | ||
SnapshotVersion.forDeletedDoc() | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per chat conversation, I will not stand for removeEntry() requiring a version that is MEANINGLESS!!! j/k... But I am realizing that there is exactly one place where we want to track removed entries and that is for failed limbo resolution, so I'm wondering if we should:
- Remove the
trackRemovals
option. - Make it so removeEntry() doesn't require a version.
- Let you optionally pass an entry in. This could be
removeEntry(key, version)
or if we want to make it a little more explicit (and borrow a bit from the existing shape of the API), mayberemoveEntry(key, {trackRemovalVersion: version})
or something.
Alternatively, since this such a specific case, we could consider not storing the sentinel delete and inventing a localStorage notification for it instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As always, this didn't quite work out in practice :/ If I do want to do this, then I have to track the readTime per document. If our semantic are going to be "when a read time is provided, create a fake NoDocument" then we should allow for the fact that some removals have readTimes while others do not.
I still made the second argument optional. If no read time is provided when it is needed, the assert in get readTime()
should provide some insight.
@@ -131,24 +110,18 @@ export class IndexedDbRemoteDocumentCache implements RemoteDocumentCache { | |||
} | |||
|
|||
/** | |||
* Updates the document change log and adds the given delta to the cached current size. | |||
* Updates the current size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe "current cache size" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
newChangeBuffer(): RemoteDocumentChangeBuffer { | ||
return new IndexedDbRemoteDocumentCache.RemoteDocumentChangeBuffer(this); | ||
newChangeBuffer(options?: { | ||
trackRemovals: boolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per feedback elsewhere, I recommend dropping this option and "moving" it to the removeEntry() call.
@@ -617,6 +630,12 @@ export class DbRemoteDocument { | |||
|
|||
static collectionReadTimeIndexPath = ['parentPath', 'readTime']; | |||
|
|||
// TODO: We are currently storing full document keys almost three times | |||
// (once as part of the primary key, once - partly - as `parentPath` and once | |||
// inside the encoded documents). During our next migration, we should |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We store it 3 more times if you include the indexes (collectionReadTimeIndex stores it twice, readTimeIndex stores it once). Hopefully people don't use deeply-nested subcollections and small documents!
It occurs to me that we could maintain our own collectionReadTimeIndex that instead of being a true index was just an object store containing {collectionPath, readTime, documentId}
objects. This would save us storing the full documentKey. It's probably not worth doing for web since it'd be less efficient than a real index, but with LevelDB (where we have to implement our own indexes anyway), it probably makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So much redundancy! So much uptime!
I am planning on using that format for LevelDB, but I prefer using existing functionality here even if it is slightly less storage efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after merging #2145
This uses the read time we persist for Index-Free queries in the change log for multi-tab. The big benefit is that we no longer need to GC the change log (since LRU GC does it for us) and we no longer need to recover from GCed change logs. The downside is that we now need to store NoDocuments for deleted rows, but these are internal to the IndexedDbRemoteDocumentStore (I added a new option to the RemoteDocumentChangeBuffer for that).
This is a follow-up to #2125