Allow firestore to recover quicker when a network change occurs #217

rsgowman · 2019-01-28T17:05:23Z

E.g. going from airplane mode to wifi enabled. Previously, Firestore would
use exponential backoff to determine when to attempt a reconnect.
That backoff is now reset and the connections are retried immediately
upon a network change.
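
To illustrate the behaviour described above, here is a minimal sketch (not the code in this PR) of a backoff helper whose delay doubles after each failed attempt and which a network-change event can reset so that the next attempt happens immediately. The class name `BackoffSketch`, its methods, and the delay values are hypothetical and chosen only for illustration.

```java
/** Hypothetical sketch: exponential backoff that a network-change event can reset. */
final class BackoffSketch {
  private final long initialDelayMs;
  private final long maxDelayMs;
  private long currentDelayMs = 0; // 0 means "retry immediately"

  BackoffSketch(long initialDelayMs, long maxDelayMs) {
    this.initialDelayMs = initialDelayMs;
    this.maxDelayMs = maxDelayMs;
  }

  /** Returns the delay before the next attempt, then grows the delay for the one after. */
  long nextDelayMs() {
    long delay = currentDelayMs;
    currentDelayMs =
        (currentDelayMs == 0) ? initialDelayMs : Math.min(currentDelayMs * 2, maxDelayMs);
    return delay;
  }

  /** Called on a network change: forget the accumulated backoff. */
  void reset() {
    currentDelayMs = 0;
  }

  public static void main(String[] args) {
    BackoffSketch backoff = new BackoffSketch(1000, 60000);
    for (int i = 0; i < 4; i++) {
      System.out.println("failed attempt, next retry in " + backoff.nextDelayMs() + " ms");
    }
    backoff.reset(); // e.g. wifi came back
    System.out.println("after network change, next retry in " + backoff.nextDelayMs() + " ms");
  }
}
```

Without the `reset()` call, a device that had been offline for a while could sit at the maximum delay even though the network is already back.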

firebase-firestore/src/main/java/com/google/firebase/firestore/remote/RemoteStore.java

...firestore/src/main/java/com/google/firebase/firestore/remote/NetworkReachabilityMonitor.java

...re/src/main/java/com/google/firebase/firestore/remote/AndroidNetworkReachabilityMonitor.java

...ase-firestore/src/androidTest/java/com/google/firebase/firestore/remote/RemoteStoreTest.java

var-const · 2019-01-28T18:37:30Z

...re/src/main/java/com/google/firebase/firestore/remote/AndroidNetworkReachabilityMonitor.java

+    }
+
+    @Override
+    public void onLost(Network network) {


Question: are onAvailable and onLost guaranteed to never fire twice in a row? (similar question about the pre-N implementation)

var-const · 2019-01-28T18:56:05Z

firebase-firestore/src/main/java/com/google/firebase/firestore/remote/RemoteStore.java

@@ -201,6 +203,17 @@ public void onClose(Status status) {
                handleWriteStreamClose(status);
              }
            });
+
+    networkReachabilityMonitor.onNetworkReachabilityChange(


I think this is fine. It's pretty different from what iOS does, but my understanding is that the Android implementation has significantly less control over gRPC, so trying to bring this in line with iOS is probably impractical. One difference is that iOS does an error stop while this implementation performs a graceful stop. Graceful stop resets backoff, which is better for reestablishing connection, though it would result in more unsuccessful attempts to connect in case of losing connection.

wilhuff · 2019-01-28T22:55:33Z

Also, don't forget the changelog entry.

- apply renames to spec test
- fix broken test. Previously, the connectivity monitor callback would reset the network (enableNetwork(); disableNetwork();) which caused the network state to change. Now that we go via the internals, the state transitions from UNKNOWN to UNKNOWN, which doesn't trigger the *OnlineStateTracker* callback. (The right thing was still happening... we just couldn't detect it due to that callback not firing.) Overriding the state "fixes" this.
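
The UNKNOWN-to-UNKNOWN case described above can be illustrated with a small sketch. It assumes a tracker that notifies its listener only when the state actually changes; `StateTrackerSketch` and its members are hypothetical stand-ins, not the real OnlineStateTracker, and `java.util.function.Consumer` is used here only to keep the example self-contained.

```java
import java.util.function.Consumer;

/** Hypothetical sketch: a tracker that notifies only when the state actually changes. */
final class StateTrackerSketch {
  enum OnlineState { UNKNOWN, ONLINE, OFFLINE }

  private OnlineState state = OnlineState.UNKNOWN;
  private final Consumer<OnlineState> listener;

  StateTrackerSketch(Consumer<OnlineState> listener) {
    this.listener = listener;
  }

  void updateState(OnlineState newState) {
    if (newState == state) {
      return; // e.g. UNKNOWN -> UNKNOWN: nothing changed, so no callback fires
    }
    state = newState;
    listener.accept(newState);
  }
}
```

A test that waits on the listener would therefore never be woken by an UNKNOWN-to-UNKNOWN transition, which is why the test needs to force a distinct state.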

rsgowman · 2019-01-28T19:36:06Z

...re/src/main/java/com/google/firebase/firestore/remote/AndroidNetworkReachabilityMonitor.java

+    }
+
+    @Override
+    public void onLost(Network network) {


I wouldn't expect them to be called twice in a row under normal circumstances, at least, not for the same network. But when going into (or coming out of) airplane mode (while in a wifi area) it seems possible for onLost to be called twice; once for wifi and again for cell.

The behaviour we'd take in response is to reset our streams twice (which I think is correct, since cell comes up first, and Android will kill cell connections shortly after wifi is re-established). Worst case: we do a little extra work when the networks are changing.

Does this seem reasonable, or am I missing a case?
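
A rough sketch of the reasoning above, under the assumption that every connectivity event simply triggers a stream restart. `RepeatedEventsSketch`, its `String` network names, and the `restartCount` field are hypothetical; the real callbacks receive an Android `Network` object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: repeated connectivity events just cause repeated stream restarts. */
final class RepeatedEventsSketch {
  enum NetworkStatus { UNREACHABLE, REACHABLE }

  private final List<Consumer<NetworkStatus>> callbacks = new ArrayList<>();
  int restartCount = 0;

  RepeatedEventsSketch() {
    // Each event resets the streams; a duplicate event only costs one extra restart.
    callbacks.add(status -> restartCount++);
  }

  /** Stand-in for the platform reporting that a network became available. */
  void onAvailable(String networkName) {
    raise(NetworkStatus.REACHABLE);
  }

  /** Stand-in for the platform reporting that a network was lost. */
  void onLost(String networkName) {
    raise(NetworkStatus.UNREACHABLE);
  }

  private void raise(NetworkStatus status) {
    for (Consumer<NetworkStatus> callback : callbacks) {
      callback.accept(status);
    }
  }
}
```

Calling `onLost("wifi")` followed by `onLost("cell")` leaves `restartCount` at 2: the second event is redundant but harmless.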

rsgowman · 2019-01-29T02:55:46Z

...ase-firestore/src/androidTest/java/com/google/firebase/firestore/remote/RemoteStoreTest.java

+    waitFor(networkChangeSemaphore);
+    drain(testQueue);
+
+    waitFor(testQueue.enqueue(() -> remoteStore.forceEnableNetwork()));


I wasn't super wild about this test before, and now I'm liking it even less. It doesn't quite test what we want (though actually testing that the network recovers quickly when coming out of airplane mode is a bit challenging, since you can't programmatically alter the airplane mode state on API 17+). I may have to rethink this a bit... But in the meantime, if anyone has a shorter-term idea for making this test better, let me know.

rsgowman · 2019-01-29T03:04:41Z

re changelog: done (thanks)

firebase-firestore/CHANGELOG.md

wilhuff · 2019-01-29T16:18:19Z

...firestore/src/main/java/com/google/firebase/firestore/remote/AndroidConnectivityMonitor.java

+  private final List<Consumer<NetworkStatus>> callbacks = new ArrayList<>();
+
+  public AndroidConnectivityMonitor(Context context) {
+    // This notnull restriction could be eliminated... the pre-N method doesn't


A possible future refactoring (not for this PR) that would address this would just be to split this into two classes with a shared callback manager helper.

FWIW: AndroidChannelBuilder resolves this by (a) checking for null before setting the connectivityManager, and then (b) checking if connectivityManager is null prior to using the N+ variant. (I've preserved that check, though it's not technically necessary here. See line 64.)

wilhuff

LGTM

wilhuff · 2019-01-29T18:45:59Z

firebase-firestore/src/main/java/com/google/firebase/firestore/remote/RemoteStore.java

+  @VisibleForTesting
+  void forceEnableNetwork() {
+    enableNetwork();
+    onlineStateTracker.updateState(OnlineState.ONLINE);


This could be better handled by injecting the onlineStateTracker and allowing the test to manipulate it directly. Let's refactor later though.
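
One possible shape for that later refactoring, as a sketch only: the store receives its tracker through the constructor, so a test can hold a reference to the tracker and set its state without a test-only method on the store. `InjectionSketch`, `Tracker`, and `Store` are hypothetical names, not the real classes.

```java
/** Hypothetical sketch: the store receives its tracker, so a test can drive it directly. */
final class InjectionSketch {
  enum OnlineState { UNKNOWN, ONLINE, OFFLINE }

  static final class Tracker {
    private OnlineState state = OnlineState.UNKNOWN;

    void updateState(OnlineState newState) {
      state = newState;
    }

    OnlineState getState() {
      return state;
    }
  }

  static final class Store {
    private final Tracker tracker;

    // The tracker is injected rather than constructed internally.
    Store(Tracker tracker) {
      this.tracker = tracker;
    }

    boolean isOnline() {
      return tracker.getState() == OnlineState.ONLINE;
    }
  }
}
```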

wilhuff · 2019-01-29T18:47:23Z

firebase-firestore/src/main/java/com/google/firebase/firestore/remote/RemoteStore.java

    }
  }

+  private void restartNetwork() {


Consider moving this closer to the other network-controlling methods like disableNetworkInternal so that it's more obvious that a choice is available.

wilhuff · 2019-01-29T18:50:15Z

Oh and even though github isn't flagging this as conflicting, it's probably worth merging master here before squashing because there is a conflict it's not flagging on the CHANGELOG.

(The SQLite blob thing is in Unreleased section in master and should be moved into the 18.0.1 section you've created.)

We don't care about this in production, but our integration tests fail due to 'Too many NetworkRequests filed' because of the way they create and tear down firestore as a whole.

wilhuff

LGTM

We do care about this in more than the integration tests (e.g. google3-internal conformance tests), and eventually we need to support Firebase-wide app deletion so this is a worthwhile addition regardless.

wilhuff · 2019-01-29T19:21:47Z

...firestore/src/main/java/com/google/firebase/firestore/remote/AndroidConnectivityMonitor.java

@@ -58,21 +59,37 @@ public void addCallback(Consumer<NetworkStatus> callback) {
    callbacks.add(callback);
  }

+  @Override
+  public void shutdown() {
+    if (unregisterRunnable != null) {


This is fine for now, but we should split these two versions so that we're not manually implementing virtual functions just to keep these two different implementations in the same class.
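
A sketch of what such a split might look like, assuming one class per platform variant behind a common interface plus a shared callback helper. All names here (`SplitMonitorSketch`, `Monitor`, `CallbackManager`, the two monitor classes, `create`, `simulate`) are hypothetical, and the Android registration calls are replaced by comments so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: one class per platform variant, sharing a callback helper. */
final class SplitMonitorSketch {
  enum NetworkStatus { UNREACHABLE, REACHABLE }

  interface Monitor {
    void addCallback(Consumer<NetworkStatus> callback);

    void shutdown();
  }

  /** Shared helper that stores callbacks and fans events out to them. */
  static final class CallbackManager {
    private final List<Consumer<NetworkStatus>> callbacks = new ArrayList<>();

    void add(Consumer<NetworkStatus> callback) { callbacks.add(callback); }

    void clear() { callbacks.clear(); }

    void raise(NetworkStatus status) {
      for (Consumer<NetworkStatus> callback : callbacks) callback.accept(status);
    }
  }

  /** Stand-in for the N+ variant; real code would register a network callback. */
  static final class CallbackBasedMonitor implements Monitor {
    private final CallbackManager manager = new CallbackManager();

    @Override public void addCallback(Consumer<NetworkStatus> callback) { manager.add(callback); }

    @Override public void shutdown() { manager.clear(); } // real code would also unregister

    void simulate(NetworkStatus status) { manager.raise(status); }
  }

  /** Stand-in for the pre-N variant; real code would register a broadcast receiver. */
  static final class BroadcastBasedMonitor implements Monitor {
    private final CallbackManager manager = new CallbackManager();

    @Override public void addCallback(Consumer<NetworkStatus> callback) { manager.add(callback); }

    @Override public void shutdown() { manager.clear(); } // real code would also unregister

    void simulate(NetworkStatus status) { manager.raise(status); }
  }

  /** Picks the variant once, so neither class needs version checks (N is API level 24). */
  static Monitor create(int sdkInt) {
    return sdkInt >= 24 ? new CallbackBasedMonitor() : new BroadcastBasedMonitor();
  }
}
```

With this shape, each variant has its own `shutdown()` and no stored `unregisterRunnable` is needed to remember which cleanup path applies.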

googlebot added the cla: yes Override cla label Jan 28, 2019

google-oss-bot added the size/L label Jan 28, 2019

rsgowman commented Jan 28, 2019

wilhuff mentioned this pull request Jan 28, 2019

[Firestore] Connection not recovering quickly after being offline #204

Closed

rsgowman force-pushed the rsgowman/reconnect_faster branch from ba787b8 to df51beb January 28, 2019 17:16

rsgowman force-pushed the rsgowman/reconnect_faster branch from df51beb to def6f93 January 28, 2019 17:17

rsgowman assigned rsgowman, var-const and wilhuff Jan 28, 2019

rsgowman requested review from var-const and wilhuff January 28, 2019 17:58

Rework resetting of the network to respect user requests

6e373af

rsgowman removed their assignment Jan 28, 2019

wilhuff reviewed Jan 28, 2019

Rename NetworkReachabilityMonitor to ConnectivityMonitor

5d60975

(Filenames only, to make subsequent diffs easier to review in case GitHub's code review tool doesn't handle this well.)

var-const reviewed Jan 28, 2019

rsgowman added 2 commits January 28, 2019 13:59

Review feedback: mostly naming

b6cee9e

format

ddc47dd

wilhuff reviewed Jan 28, 2019

rsgowman added 2 commits January 28, 2019 14:22

Review feedback: mostly naming (pt2: tests)

e917162

j.u.f.Consumer -> c.g.f.f.u.Consumer

f0e7b07

var-const assigned rsgowman and unassigned var-const Jan 28, 2019

rsgowman commented Jan 29, 2019

changelog

8705ea2

rsgowman assigned var-const Jan 29, 2019

rsgowman added 3 commits January 28, 2019 22:08

changelog + issue number

64314e8

Fix resulting from manual test: Restart network on worker queue.

0df8901

Remove debug lines and format

a41cbdb

wilhuff reviewed Jan 29, 2019

fix changelog

ac528dc

wilhuff approved these changes Jan 29, 2019

var-const approved these changes Jan 29, 2019

var-const removed their assignment Jan 29, 2019

rsgowman added 3 commits January 29, 2019 14:13

Hookup unregistering the network listener.

8aa6daf

We don't care about this in production, but our integration tests fail due to 'Too many NetworkRequests filed' because of the way they create and tear down firestore as a whole.

Merge remote-tracking branch 'origin/master' into rsgowman/reconnect_…

8772e4f

…faster

Move unreleased changelog entry into 18.0.1 section

e608da3

wilhuff approved these changes Jan 29, 2019

var-const approved these changes Jan 29, 2019

rsgowman added 2 commits January 29, 2019 14:31

format

a1591fd

Move restartNetwork

1936237

rsgowman merged commit 1a39d8c into master Jan 29, 2019

rsgowman deleted the rsgowman/reconnect_faster branch January 29, 2019 20:07

firebase locked and limited conversation to collaborators Oct 12, 2019
