Skip to content

Feature FirestoreItemKeyedDatasource #1599

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
Closed

Feature FirestoreItemKeyedDatasource #1599

wants to merge 9 commits into from

Conversation

abond
Copy link

@abond abond commented Mar 13, 2019

Hello!
I want to share my implementation of DataSourse for Firestore that support data refresh on real time events.

Related issue #1524

@abond abond requested a review from samtstern as a code owner March 13, 2019 06:03
@samtstern
Copy link
Contributor

@abond thank you for sending this PR! Before I even begin to take a look I want to warn you that we can't ship Kotlin in this library yet as it will force our many Java developers to make the switch. However I am happy to look over the implementation and decide if we want to pursue converting it to Java and including it in the library.

Also thanks for bringing ItemKeyedDataSource to my attention! It looks like something new in the paging library that would definitely have been useful to us back when we were first implementing pagination support.

@abond
Copy link
Author

abond commented Mar 13, 2019

But Kotlin fully interoperablewith java. Nor users of library won't notice anything, nor other developers will be forced to use kotlin.

@samtstern
Copy link
Contributor

samtstern commented Mar 13, 2019 via email

@abond
Copy link
Author

abond commented Mar 13, 2019

Users of library written in kotlin can use it in java project like any other jvm lib. They don't need to enable kotlin in their project. Kotlin was developed with this goal in mind. It also developed to support step by step migration in project that start using it. That means that u can write new code in kotlin without converting all Java files to kotlin. Bite code is fully interoperable. There is some features that you can't use from Java, but I was trying my best to implement this feature Java friendly.

Copy link
Contributor

@samtstern samtstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all I really appreciate your sending this PR! I know this is a very complex topic.

Before I go and review the details I have some high-level questions about your implementation and I also left one or two comments:

  1. How does this deal with items that move? For example if my query is collection(users).orderBy(score) and users scores are changing in real-time, what happens? What if the user at the "page boundary" which is used to load the next page changes score dramatically?

  2. How many pages are kept in memory at once? When are they loaded/unloaded?

*
* @property fieldPath The field to sort by.
*/
private abstract class Filter(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the informal rules we have on the team is "don't re-implement the query interface". The reason is that this library will always be chasing the official SDK as queries are added / changed. Plus as you can see by your implementation, it's a lot of code.

From what I can tell this system is used to be able to create a reversible query. Have you considered just asking the developer to pass in both a forward and reverse Query object (with no limit(), startAt() or endAt() clauses) when they create the recycler adapter?

Copy link
Author

@abond abond Mar 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we can ask for backward query, but as I understand only a few people understand that to query documents before provided you should create reversed query, not just call endAt + limit.
To tell the truth I myself find out this only while working on this feature. =)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this whole file should probably be removed. 🙂

private inline fun <TResult> Task<TResult>.await() {
try {
Tasks.await(this)
} catch (e: Throwable) { }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What sorts of errors are you ignoring here and why?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It waits for completion of the Task that tries to get key befor one that is provided on DS recreation after invalidation(that happens after any document change, creadion or delition, if this change is in the range of existing listener thar previously load page). If it fails DS just start load new portion of documents beginning from provided key.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry! It's not true right now. Described in the previous comment is how it was working at first. Now it wait for whole initial data load finished. And error handling occurred in ResultsListener.onResults method. This code only needed because if we don't wait for this load synchronously Page library erase Recyclerview by submitting empty results.

@abond
Copy link
Author

abond commented Mar 14, 2019

First of all I really appreciate your sending this PR! I know this is a very complex topic.

Before I go and review the details I have some high-level questions about your implementation and I also left one or two comments:

  1. How does this deal with items that move? For example if my query is collection(users).orderBy(score) and users scores are changing in real-time, what happens? What if the user at the "page boundary" which is used to load the next page changes score dramatically?
  2. How many pages are kept in memory at once? When are they loaded/unloaded?
  1. All this "load when" logic is done by Paging library. I just implement DataSource by recomendations that is given in Paging Library documentation.
    Here https://developer.android.com/reference/android/arch/paging/DataSource#updating-paged-data in last paragraph written how to implement DS for services like firestore:

If you have more granular update signals, such as a network API signaling an update to a single item in the list, it's recommended to load data from network into memory. Then present that data to the PagedList via a DataSource that wraps an in-memory snapshot. Each time the in-memory copy changes, invalidate the previous DataSource, and a new one wrapping the new state of the snapshot can be created.

So my DS add data listener for each requested page which treat first data load event as fresh page and second as data change and invalidate DS. DS invalidation leads to recreation of PagedList+DataSource pair. During recreation PagedList provide last accessed document and requestedLoadSize(by default pageSize*3, but you can change it by PagedList.Config) as a parameters to DS. DS tries to load new portion of data around provided document and pass it back to PL which submit it to PagedAdapter. PagedAdapter compares old list with new and generete and apply update sequense which gently updates RecyclerView by calling notifyItemXXX or notifyRangeXXX functions.

  1. You can't control number of pages kept in memory before androidx library if I'm not wrong. In androidx PagedList.Config theree is maxSize paramerer which control how many items PL holds in memory.
    And as I said even before androidx pages are dropped on each data source change.

@samtstern
Copy link
Contributor

@abond thanks for your explanations! I have to apologize I am super busy right now and will need a few more days until I can get back to this PR.

While I'm being slow, maybe @SUPERCILEX or @puf has some thoughts on this.

Copy link
Collaborator

@SUPERCILEX SUPERCILEX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool concept, but I have some pretty significant reservations.

@@ -31,4 +32,13 @@ dependencies {
androidTestImplementation(Config.Libs.Test.rules)
androidTestImplementation(Config.Libs.Test.mockito)
androidTestImplementation(Config.Libs.Arch.paging)
implementation(kotlin("stdlib-jdk7", kotlin_version))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoa there! @samtstern Is FUI accepting Kotlin now? From the FAQ:
image

Typically, libs will be written entirely in Kotlin (not postfixed like this) and only if it offers some sort of syntactic benefit that isn't available in Java (like coroutines or multi-platform stuff, see https://github.com/square/okio). Here, I have trouble seeing what's worth the 1MB and cognitive overhead of ensuring our APIs stay idiomatic in both Java and Kotlin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I am pushing the Kotlin question to the end. If the PR is worth having we can translate it back to Java at the last step.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM 👍


private fun removeAllPageLoaders() {
registrations.forEach { it.remove() }
registrations.clear()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe one of the reasons this hasn't been done yet is because of how difficult it is to properly subscribe and unsub from these listeners. Can you explain a bit more how and when invalidate is called?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I just forget to unsubscribe.

I wrote it previously. By design of Paging library to refresh data in UI you should invalidate DS.
Here quote from Paging library documentation:

A PagedList / DataSource pair are a snapshot of the data set. A new pair of PagedList / DataSource must be created if an update occurs, such as a reorder, insert, delete, or content update occurs. A DataSource must detect that it cannot continue loading its snapshot (for instance, when Database query notices a table being invalidated), and call invalidate(). Then a new PagedList / DataSource pair would be created to load data from the new state of the Database query.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I understand what's going on now. 👍

I'm a little concerned about cost. Correct me if I'm wrong, but here's what I think is going on: when a page is being loaded, it adds a snapshot listener which submits the data the first time it gets triggered. Then, on the second trigger, it invalidates the entire data set and removes all the listeners. This causes the adapter to requery everything from scratch and start the process anew. Is that correct?

If so, this feature would be extremely expensive to users. They wouldn't be paying for the diff—every single update would cost them the entire query. Coming from an official Firebase library, that seems a little abusive. @samtstern WDYT?

Copy link
Author

@abond abond Mar 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you right that it requery data on every change, but not all the data only recent pageSize*3 items by default or can be specified by PagedList.Config.initialLoadSizeHint. What is more when you finish #1388 there will be PagedList.Config.maxSize which limits number of item currently kept in memory and which will be involved in old with new data comparison.

What about requerying - this 3 resent pages most likely will be loaded from cache if I understand correctly how firebase works.

upd: Here is quote from firestore pricing page:

Also, if the listener is disconnected for more than 30 minutes (for example, if the user goes offline), you will be charged for reads as if you had issued a brand-new query.

According to it only if some new documents will be queried after data change detection there will be extra charges. As I understand even if we requery whole data from the beginning there will be no extra charge during immideate requery.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some testing I come to conclusion that I was wrong - every requery leeds to extra charges.
I was wrongly assumed that since offline access is enabled by default recently accessed documents will be fetched from local cache and do not count as priced reads. But unfortunately thats not true.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you right, but problem is that exactly same means with the same limit and start/end conditions.
But in this implementation we can't avoid creating new queries starting from different positions so I think it is expensive to use.
I have an idea how to solve this issue. If I succeed I send new PR.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assume that the query result is identical

execute exact same query multiple times doesn't charge for the entire doc set multiple times

Is this still true? @samtstern

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's true, identical queries issued within 30m of each other don't double-charge unless there are new results.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we have 1 new document in the query result, for example, the first query return 3 documents, the second query return the same 3 documents + 1 new document. In this case, how many documents read will be charged?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case 4 total reads. 3 for the initial result. 1 for the update later.

*
* @property fieldPath The field to sort by.
*/
private abstract class Filter(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this whole file should probably be removed. 🙂

@samtstern samtstern changed the base branch from version-4.3.2-dev to version-4.4.0-dev March 26, 2019 16:26
@samtstern
Copy link
Contributor

Since this PR is very old and we just decided to move to Paging 3 for Firebase UI v8.0.0 I am going to close this. @abond sorry we never resolved all the outstanding questions, we still really appreciate this contribution!

@samtstern samtstern closed this Jun 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants