Skip to content

Use sequence numbers instead of read time in collection group #3111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

thebrianchen
Copy link

@thebrianchen thebrianchen commented Nov 8, 2021

Instead of using read times to track the ordering of collection groups in the collection_group_update_times table, use a sequence counter instead.

Looking for naming suggestions:

  • collection_group_indexing
  • collection_group_sequence_numbers
  • collection_group_sequences

@thebrianchen thebrianchen self-assigned this Nov 8, 2021
@google-cla google-cla bot added the cla: yes Override cla label Nov 8, 2021
@google-oss-bot
Copy link
Contributor

google-oss-bot commented Nov 8, 2021

Coverage Report

Affected SDKs

  • firebase-firestore

    SDK overall coverage changed from ? (4e336c6) to 45.01% (e4ee5a67) by ?.

    Click to show coverage changes in 299 files.
    Filename Base (4e336c6) Head (e4ee5a67) Diff
    AbstractStream.java ? 37.02% ?
    ActivityScope.java ? 0.00% ?
    AndroidConnectivityMonitor.java ? 39.51% ?
    ApiUtil.java ? 12.50% ?
    ArrayContainsAnyFilter.java ? 100.00% ?
    ArrayContainsFilter.java ? 100.00% ?
    ArrayTransformOperation.java ? 86.49% ?
    ArrayValue.java ? 48.60% ?
    ArrayValueOrBuilder.java ? 0.00% ?
    Assert.java ? 57.14% ?
    AsyncEventListener.java ? 0.00% ?
    AsyncQueue.java ? 77.11% ?
    AutoValue_FieldIndex.java ? 55.56% ?
    AutoValue_FieldIndex_Segment.java ? 33.33% ?
    BackgroundQueue.java ? 100.00% ?
    BasePath.java ? 98.08% ?
    BatchGetDocumentsRequest.java ? 0.00% ?
    BatchGetDocumentsRequestOrBuilder.java ? 0.00% ?
    BatchGetDocumentsResponse.java ? 0.00% ?
    BatchGetDocumentsResponseOrBuilder.java ? 0.00% ?
    BeginTransactionRequest.java ? 0.00% ?
    BeginTransactionRequestOrBuilder.java ? 0.00% ?
    BeginTransactionResponse.java ? 0.00% ?
    BeginTransactionResponseOrBuilder.java ? 0.00% ?
    Blob.java ? 76.92% ?
    Bound.java ? 33.93% ?
    BundleCache.java ? 0.00% ?
    BundleCallback.java ? 0.00% ?
    BundleDocument.java ? 90.00% ?
    BundleElement.java ? 0.00% ?
    BundleElementOrBuilder.java ? 0.00% ?
    BundleLoader.java ? 100.00% ?
    BundleMetadata.java ? 0.00% ?
    BundleMetadataOrBuilder.java ? 0.00% ?
    BundleProto.java ? 0.00% ?
    BundleReader.java ? 95.29% ?
    BundleSerializer.java ? 89.43% ?
    BundledDocumentMetadata.java ? 0.00% ?
    BundledDocumentMetadataOrBuilder.java ? 0.00% ?
    BundledQuery.java ? 34.57% ?
    BundledQueryOrBuilder.java ? 0.00% ?
    ByteBufferInputStream.java ? 83.33% ?
    CollectionReference.java ? 13.64% ?
    CommitRequest.java ? 0.00% ?
    CommitRequestOrBuilder.java ? 0.00% ?
    CommitResponse.java ? 0.00% ?
    CommitResponseOrBuilder.java ? 0.00% ?
    CommonProto.java ? 0.00% ?
    ComponentProvider.java ? 100.00% ?
    ConnectivityMonitor.java ? 0.00% ?
    Consumer.java ? 0.00% ?
    CreateDocumentRequest.java ? 0.00% ?
    CreateDocumentRequestOrBuilder.java ? 0.00% ?
    CredentialsProvider.java ? 100.00% ?
    Cursor.java ? 35.29% ?
    CursorOrBuilder.java ? 0.00% ?
    CustomClassMapper.java ? 84.37% ?
    DatabaseId.java ? 70.37% ?
    DatabaseInfo.java ? 90.91% ?
    Datastore.java ? 32.89% ?
    DatastoreTestTrace.java ? 0.00% ?
    DefaultQueryEngine.java ? 97.83% ?
    DeleteDocumentRequest.java ? 0.00% ?
    DeleteDocumentRequestOrBuilder.java ? 0.00% ?
    DeleteMutation.java ? 90.00% ?
    DirectionalIndexByteEncoder.java ? 100.00% ?
    Document.java ? 100.00% ?
    DocumentChange.java ? 67.61% ?
    DocumentChangeOrBuilder.java ? 0.00% ?
    DocumentCollections.java ? 83.33% ?
    DocumentDelete.java ? 34.62% ?
    DocumentDeleteOrBuilder.java ? 0.00% ?
    DocumentId.java ? 0.00% ?
    DocumentKey.java ? 100.00% ?
    DocumentMask.java ? 40.66% ?
    DocumentMaskOrBuilder.java ? 0.00% ?
    DocumentOrBuilder.java ? 0.00% ?
    DocumentOverlayCache.java ? 0.00% ?
    DocumentProto.java ? 0.00% ?
    DocumentReference.java ? 13.24% ?
    DocumentRemove.java ? 29.23% ?
    DocumentRemoveOrBuilder.java ? 0.00% ?
    DocumentSet.java ? 83.78% ?
    DocumentSnapshot.java ? 37.50% ?
    DocumentTransform.java ? 29.44% ?
    DocumentTransformOrBuilder.java ? 0.00% ?
    DocumentViewChange.java ? 90.91% ?
    DocumentViewChangeSet.java ? 87.88% ?
    EncodedPath.java ? 93.22% ?
    EventListener.java ? 0.00% ?
    EventManager.java ? 96.15% ?
    Exclude.java ? 0.00% ?
    Executors.java ? 100.00% ?
    ExistenceFilter.java ? 80.00% ?
    ExistenceFilterOrBuilder.java ? 0.00% ?
    ExponentialBackoff.java ? 30.43% ?
    FieldFilter.java ? 88.46% ?
    FieldIndex.java ? 100.00% ?
    FieldMask.java ? 55.56% ?
    FieldPath.java ? 89.66% ?
    FieldTransform.java ? 64.71% ?
    FieldValue.java ? 82.14% ?
    FileUtil.java ? 0.00% ?
    Filter.java ? 100.00% ?
    FirebaseAppCheckTokenProvider.java ? 93.75% ?
    FirebaseAuthCredentialsProvider.java ? 100.00% ?
    FirebaseClientGrpcMetadataProvider.java ? 44.00% ?
    FirebaseFirestore.java ? 37.71% ?
    FirebaseFirestoreException.java ? 83.72% ?
    FirebaseFirestoreSettings.java ? 75.00% ?
    FirestoreCallCredentials.java ? 18.60% ?
    FirestoreChannel.java ? 14.91% ?
    FirestoreClient.java ? 31.34% ?
    FirestoreGrpc.java ? 2.62% ?
    FirestoreIndexValueWriter.java ? 77.11% ?
    FirestoreMultiDbComponent.java ? 100.00% ?
    FirestoreProto.java ? 0.00% ?
    FirestoreRegistrar.java ? 100.00% ?
    Function.java ? 0.00% ?
    GeoPoint.java ? 91.67% ?
    GetDocumentRequest.java ? 0.00% ?
    GetDocumentRequestOrBuilder.java ? 0.00% ?
    GrpcCallProvider.java ? 46.59% ?
    GrpcMetadataProvider.java ? 0.00% ?
    IgnoreExtraProperties.java ? 0.00% ?
    InFilter.java ? 100.00% ?
    IncomingStreamObserver.java ? 0.00% ?
    Index.java ? 25.92% ?
    IndexBackfiller.java ? 77.38% ?
    IndexByteEncoder.java ? 88.57% ?
    IndexCursor.java ? 0.00% ?
    IndexEntry.java ? 0.00% ?
    IndexManager.java ? 0.00% ?
    IndexOrBuilder.java ? 0.00% ?
    IndexProto.java ? 0.00% ?
    IndexRange.java ? 0.00% ?
    IndexedQueryEngine.java ? 65.63% ?
    IntMath.java ? 41.67% ?
    KeyFieldFilter.java ? 66.67% ?
    KeyFieldInFilter.java ? 0.00% ?
    KeyFieldNotInFilter.java ? 0.00% ?
    LimboDocumentChange.java ? 70.59% ?
    ListCollectionIdsRequest.java ? 0.00% ?
    ListCollectionIdsRequestOrBuilder.java ? 0.00% ?
    ListCollectionIdsResponse.java ? 0.00% ?
    ListCollectionIdsResponseOrBuilder.java ? 0.00% ?
    ListDocumentsRequest.java ? 0.00% ?
    ListDocumentsRequestOrBuilder.java ? 0.00% ?
    ListDocumentsResponse.java ? 0.00% ?
    ListDocumentsResponseOrBuilder.java ? 0.00% ?
    ListenRequest.java ? 10.77% ?
    ListenRequestOrBuilder.java ? 0.00% ?
    ListenResponse.java ? 32.64% ?
    ListenResponseOrBuilder.java ? 0.00% ?
    ListenSequence.java ? 100.00% ?
    Listener.java ? 0.00% ?
    ListenerRegistration.java ? 0.00% ?
    ListenerRegistrationImpl.java ? 0.00% ?
    LoadBundleTask.java ? 83.16% ?
    LoadBundleTaskProgress.java ? 81.40% ?
    LocalDocumentsView.java ? 97.70% ?
    LocalSerializer.java ? 90.73% ?
    LocalStore.java ? 99.38% ?
    LocalViewChanges.java ? 100.00% ?
    LocalWriteResult.java ? 100.00% ?
    Logger.java ? 75.00% ?
    LruDelegate.java ? 0.00% ?
    LruGarbageCollector.java ? 84.11% ?
    MapValue.java ? 53.00% ?
    MapValueOrBuilder.java ? 0.00% ?
    MemoryBundleCache.java ? 100.00% ?
    MemoryComponentProvider.java ? 100.00% ?
    MemoryDocumentOverlayCache.java ? 100.00% ?
    MemoryEagerReferenceDelegate.java ? 100.00% ?
    MemoryIndexManager.java ? 64.29% ?
    MemoryLruReferenceDelegate.java ? 96.00% ?
    MemoryMutationQueue.java ? 98.57% ?
    MemoryOverlayMigrationManager.java ? 100.00% ?
    MemoryPersistence.java ? 100.00% ?
    MemoryRemoteDocumentCache.java ? 100.00% ?
    MemoryTargetCache.java ? 100.00% ?
    MetadataChanges.java ? 100.00% ?
    MutableDocument.java ? 100.00% ?
    Mutation.java ? 100.00% ?
    MutationBatch.java ? 89.06% ?
    MutationBatchResult.java ? 100.00% ?
    MutationQueue.java ? 0.00% ?
    MutationResult.java ? 100.00% ?
    NamedQuery.java ? 0.00% ?
    NamedQueryOrBuilder.java ? 0.00% ?
    NotInFilter.java ? 85.71% ?
    NumberComparisonHelper.java ? 100.00% ?
    NumericIncrementTransformOperation.java ? 87.50% ?
    ObjectValue.java ? 99.06% ?
    OnProgressListener.java ? 0.00% ?
    OnlineState.java ? 100.00% ?
    OnlineStateTracker.java ? 98.11% ?
    OrderBy.java ? 96.55% ?
    OrderedCodeWriter.java ? 68.52% ?
    OverlayMigrationManager.java ? 0.00% ?
    PatchMutation.java ? 98.39% ?
    Persistence.java ? 100.00% ?
    Precondition.java ? 38.46% ?
    PreconditionOrBuilder.java ? 0.00% ?
    Preconditions.java ? 56.25% ?
    PropertyName.java ? 0.00% ?
    Query.java ? 4.06% ?
    QueryDocumentSnapshot.java ? 64.71% ?
    QueryEngine.java ? 0.00% ?
    QueryListener.java ? 100.00% ?
    QueryProto.java ? 0.00% ?
    QueryPurpose.java ? 100.00% ?
    QueryResult.java ? 100.00% ?
    QuerySnapshot.java ? 76.36% ?
    QueryView.java ? 100.00% ?
    ReferenceDelegate.java ? 0.00% ?
    ReferenceSet.java ? 92.45% ?
    RemoteDocumentCache.java ? 0.00% ?
    RemoteEvent.java ? 92.31% ?
    RemoteSerializer.java ? 84.63% ?
    RemoteStore.java ? 89.56% ?
    ResourcePath.java ? 94.74% ?
    RollbackRequest.java ? 0.00% ?
    RollbackRequestOrBuilder.java ? 0.00% ?
    RunQueryRequest.java ? 0.00% ?
    RunQueryRequestOrBuilder.java ? 0.00% ?
    RunQueryResponse.java ? 0.00% ?
    RunQueryResponseOrBuilder.java ? 0.00% ?
    SQLiteBundleCache.java ? 90.00% ?
    SQLiteComponentProvider.java ? 100.00% ?
    SQLiteDocumentOverlayCache.java ? 91.11% ?
    SQLiteIndexManager.java ? 95.45% ?
    SQLiteLruReferenceDelegate.java ? 98.63% ?
    SQLiteMutationQueue.java ? 82.76% ?
    SQLiteOverlayMigrationManager.java ? 61.90% ?
    SQLitePersistence.java ? 85.11% ?
    SQLiteRemoteDocumentCache.java ? 95.65% ?
    SQLiteSchema.java ? 96.31% ?
    SQLiteTargetCache.java ? 98.47% ?
    Scheduler.java ? 0.00% ?
    ServerTimestamp.java ? 0.00% ?
    ServerTimestampOperation.java ? 100.00% ?
    ServerTimestamps.java ? 79.17% ?
    SetMutation.java ? 97.14% ?
    SetOptions.java ? 0.00% ?
    SnapshotMetadata.java ? 68.75% ?
    SnapshotVersion.java ? 87.50% ?
    Source.java ? 0.00% ?
    Stream.java ? 100.00% ?
    StructuredQuery.java ? 32.44% ?
    StructuredQueryOrBuilder.java ? 0.00% ?
    Supplier.java ? 0.00% ?
    SyncEngine.java ? 93.57% ?
    Target.java ? 96.12% ?
    TargetCache.java ? 0.00% ?
    TargetChange.java ? 80.00% ?
    TargetChangeOrBuilder.java ? 0.00% ?
    TargetData.java ? 77.50% ?
    TargetIdGenerator.java ? 100.00% ?
    TargetIndexMatcher.java ? 100.00% ?
    TargetOrBuilder.java ? 0.00% ?
    TargetState.java ? 97.78% ?
    ThrottledForwardingExecutor.java ? 100.00% ?
    ThrowOnExtraProperties.java ? 0.00% ?
    Timestamp.java ? 80.39% ?
    Token.java ? 0.00% ?
    Transaction.java ? 0.00% ?
    TransactionOptions.java ? 0.00% ?
    TransactionOptionsOrBuilder.java ? 0.00% ?
    TransactionRunner.java ? 0.00% ?
    TransformOperation.java ? 0.00% ?
    UpdateDocumentRequest.java ? 0.00% ?
    UpdateDocumentRequestOrBuilder.java ? 0.00% ?
    User.java ? 78.57% ?
    UserData.java ? 66.00% ?
    UserDataReader.java ? 70.48% ?
    UserDataWriter.java ? 51.02% ?
    Util.java ? 62.28% ?
    Value.java ? 42.48% ?
    ValueOrBuilder.java ? 0.00% ?
    Values.java ? 95.58% ?
    VerifyMutation.java ? 50.00% ?
    View.java ? 96.82% ?
    ViewChange.java ? 100.00% ?
    ViewSnapshot.java ? 88.33% ?
    WatchChange.java ? 70.51% ?
    WatchChangeAggregator.java ? 98.18% ?
    WatchStream.java ? 19.23% ?
    Write.java ? 36.56% ?
    WriteBatch.java ? 0.00% ?
    WriteOrBuilder.java ? 0.00% ?
    WriteProto.java ? 0.00% ?
    WriteRequest.java ? 10.41% ?
    WriteRequestOrBuilder.java ? 0.00% ?
    WriteResponse.java ? 11.70% ?
    WriteResponseOrBuilder.java ? 0.00% ?
    WriteResult.java ? 0.00% ?
    WriteResultOrBuilder.java ? 0.00% ?
    WriteStream.java ? 31.82% ?

Test Logs

Notes

HTML coverage reports can be produced locally with ./gradlew <product>:checkCoverage.
Report files are located at <product-build-dir>/reports/jacoco/.

Head commit (e4ee5a67) is created by Prow via merging commits: 4e336c6 1a5aaa9.

@google-oss-bot
Copy link
Contributor

Binary Size Report

Affected SDKs

  • firebase-firestore

    Type Base (9e93fff) Head (027297d5) Diff
    aar 1.22 MB 1.22 MB +270 B (+0.0%)
    apk (release) 3.32 MB 3.32 MB +432 B (+0.0%)

Test Logs

Notes

Head commit (027297d5) is created by Prow via merging commits: 9e93fff 7e6b736.

Copy link
Contributor

@schmidt-sebastian schmidt-sebastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR needs to wait until we have a holistic view on how we keep track of mutation batches and partial updates (e.g. we processed 10 out of 100 documents for a single read time). I am no longer certain that sequence numbers are the right solution here, so it would be good to confirm that they are before we proceed.

String getNextCollectionGroupToUpdate(int currentMaxSequenceNumber);

/** Returns the highest sequence number in the collection groups table. */
int getMaxCollectionGroupSequenceNumber();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getIndexBackfillSequenceNumber? I think it might be more interesting to say what the sequence number is for.

I am 50/50 on whether we need this method in our interface though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what to name it. The Index Manager is just getting the highest sequence number, but the IndexBackfiller is assigning it meaning by storing the most recently written collection group to it.

I think it's best to leave it in the manager since the IndexBackfiller currently delegates all SQL calls to the index manager.


// TODO(indexing): Handle pausing and resuming from the correct document if backfilling hits the
// max doc limit while processing docs for a certain read time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very big todo :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just backfilling it in from before :)

Will tackle this in a subsequent PR

@@ -55,6 +54,7 @@
/** A persisted implementation of IndexManager. */
final class SQLiteIndexManager implements IndexManager {
private static final String TAG = SQLiteIndexManager.class.getSimpleName();
private static final int NEW_COLLECTION_GROUP_SEQUENCE_NUMBER = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use -1 here? 0 poses the question whether we have indexed the first entry (with a zero-based index).

// TODO(indexing): Store progress with a counter rather than a timestamp. If using a timestamp,
// use the read time of the last document read in the loop.
setCollectionGroupUpdateTime(nextCollectionGroup[0], Timestamp.now());
setCollectionGroup(nextCollectionGroup[0], getMaxCollectionGroupSequenceNumber() + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this should only be set after we are done with the update.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also refactored the code so that the IndexBackfiller does not have to deal with sequence numbers.

Copy link
Author

@thebrianchen thebrianchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some revisions:

  • Mark collection group as completed only after indexing all documents in the collection.
  • Removed getMaxCollectionGroupSequenceNumber() from index manager and instead let the backfiller track when all collection groups have been processed.
  • Added test to verify that collection groups that are partially indexed are processed first.


// TODO(indexing): Handle pausing and resuming from the correct document if backfilling hits the
// max doc limit while processing docs for a certain read time.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just backfilling it in from before :)

Will tackle this in a subsequent PR

String getNextCollectionGroupToUpdate(int currentMaxSequenceNumber);

/** Returns the highest sequence number in the collection groups table. */
int getMaxCollectionGroupSequenceNumber();
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what to name it. The Index Manager is just getting the highest sequence number, but the IndexBackfiller is assigning it meaning by storing the most recently written collection group to it.

I think it's best to leave it in the manager since the IndexBackfiller currently delegates all SQL calls to the index manager.

// TODO(indexing): Store progress with a counter rather than a timestamp. If using a timestamp,
// use the read time of the last document read in the loop.
setCollectionGroupUpdateTime(nextCollectionGroup[0], Timestamp.now());
setCollectionGroup(nextCollectionGroup[0], getMaxCollectionGroupSequenceNumber() + 1);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also refactored the code so that the IndexBackfiller does not have to deal with sequence numbers.

// Mark the collection group as fully indexed if all documents in the collection have been
// indexed during this backfill iteration.
if (documentsRemaining >= documents.size()) {
indexManager.markCollectionGroupIndexed(collectionGroup);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is different from how I implemented it. I always increment the sequence number regardless of whether I am done or not - it's a tiny bit simpler but may not be as optimal as what you are doing. We should discuss.

@google-oss-bot
Copy link
Contributor

@thebrianchen: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
binary-size 1a5aaa9 link /run binary-size
smoke-tests 1a5aaa9 link /test smoke-tests
device-check-changed 1a5aaa9 link /test device-check-changed

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@thebrianchen
Copy link
Author

Closing the PR since this was incorporated into #3153

@thebrianchen thebrianchen deleted the bc/use-counter branch November 25, 2021 00:28
@firebase firebase locked and limited conversation to collaborators Dec 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants