Skip to content

Retrying transactions with backoff #698

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Aug 13, 2019
Merged

Retrying transactions with backoff #698

merged 11 commits into from
Aug 13, 2019

Conversation

thebrianchen
Copy link

Good call on recommending that I try and implement Android first before merging web. This has a been a painful, but educational experience.

A couple things about Android:

  • The recursive method seemed like the only viable option here, since we cannot await backoffs in the same thread as the one calling it (something that I realized after a bit too much struggling). That being said, I still think the web approach is cleaner and more readable compared to Android. How do you feel about keeping the implementations different? I'm also not sure how I would port the whole TaskCompletionSource strategy used by Android for recursion over to web.
  • The skipDelaysForTimerIds approach of web doesn't port over to Android b/c it relies on runDelayedTasksUntil, which blocks until the delay is completed. This means that we must call delays from outside the AsyncQueue, or else it deadlocks. Overriding the backoff is the only other approach that works aside from calling a runDelayedTasksUntil on a 10ms interval on a separate thread. Thoughts on this approach?

@thebrianchen thebrianchen changed the title Bc/tx backoff Retrying transactions with backoff Aug 12, 2019
Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM but I'm not psyched about removeTransactionBackoffs(). I think we can make skipDelaysForTimerId() work.

Also, I think this code is portable to JS, and it would be worthwhile to change JS to run transaction code on the AsyncQueue anyway (other than the user's updateFunction) in order to make the platforms more consistent. As-is, it's not clear what is an intentional deviation and what is accidental.

If the recursive transaction update code is getting too gnarly, I think we could probably simplify things by introducing a TransactionRunner class that encapsulates the logic of running / retrying with backoff. Right now, SyncEngine.transaction() is getting a bit unwieldy with 5 arguments, when outside callers (i.e. FirestoreClient.transaction()) really only need to provide asyncQueue and updateFunction. The rest are just there to enable the method to call itself recursively, I think.

@mikelehen mikelehen assigned thebrianchen and unassigned mikelehen Aug 12, 2019
Copy link
Author

@thebrianchen thebrianchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! Quick question: from a code health and maintainability perspective, why is it bad to introduce new test hooks, like I did with removeTransactionBackoffs? I can see how skipDelaysForTimerId is a more elegant solution, but when is it appropriate/preferred to add new test hooks?

Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reworking! I have some minor feedback, but overall I'm liking this approach better.

}

/** Returns the result of the transaction after it has been run. */
public Task<TResult> getTask() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this to be a separate function? I think I'd just make runTransaction() return the Task<TResult>.

This will also probably port better to iOS (runTransaction() will just accept a completion callback instead of returning a Task).

@mikelehen mikelehen assigned thebrianchen and unassigned mikelehen Aug 12, 2019
@mikelehen
Copy link
Contributor

Oh! By the way, I realized I forgot to answer your question:

from a code health and maintainability perspective, why is it bad to introduce new test hooks, like I did with removeTransactionBackoffs? I can see how skipDelaysForTimerId is a more elegant solution, but when is it appropriate/preferred to add new test hooks?

Uh, this is pretty subjective. I guess my general philosophy is that test hooks are bad for a variety of reasons, including:

  1. You end up shipping test-only code as part of your product which means it's essentially "dead code."
  2. It means your tests are modifying the behavior of the client which necessarily means it's not a fully realistic test so there's risk of masking bugs or confusing future devs working on the code / tests.

And so in general, it's good to have as few test hooks as possible, keep the scope as limited as possible, and make them reusable so you don't have to invent more test hooks later.

In our case, we've already chosen the AsyncQueue as a useful place to allow tests to hook in to deal with time-related issues in tests (i.e. fast-forwarlding time via runDelayedOperationsEarly()). So adding new test hooks to the AsyncQueue is preferable to a new transaction-specific test hook. It's more consistent and is more likely to be reusable in the future.

Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few remaining nits, but this basically LGTM.

@thebrianchen
Copy link
Author

/test new-smoke-tests

Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for typo.

@@ -27,8 +26,7 @@
import com.google.firebase.firestore.util.ExponentialBackoff;

/**
* TransactionRunner encapsulates the logic needed to run and retry transactions so that the caller
* does not have to manage the backoff and retry count through recursive calls.
* TransactionRunner encapsulates the logic needed to run and retry transactions without backoff.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without => with ? 😄

@thebrianchen thebrianchen removed the cla: yes Override cla label Aug 13, 2019
@googlebot googlebot added the cla: yes Override cla label Aug 13, 2019
@thebrianchen thebrianchen merged commit 137fb97 into master Aug 13, 2019
@thebrianchen thebrianchen deleted the bc/tx-backoff branch August 13, 2019 23:08
@firebase firebase locked and limited conversation to collaborators Oct 8, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants