Fix gossip using `gossip_timestamp_filter` instead of queries #1382

TheBlueMatt · 2022-03-24T03:01:10Z

See large comment added for more details

TheBlueMatt · 2022-03-24T03:03:37Z

CC @t-bast please tell me there's something substantial I've missed here and there is a way to keep the graph in sync that is smarter than this.

codecov-commenter · 2022-03-24T03:25:24Z

Codecov Report

Merging #1382 (83ebc1d) into main (f6fa8e9) will increase coverage by 0.03%.
The diff coverage is 90.90%.

❗ Current head 83ebc1d differs from pull request most recent head 1386a0c. Consider uploading reports for the commit 1386a0c to get more accurate results

@@            Coverage Diff             @@
##             main    #1382      +/-   ##
==========================================
+ Coverage   90.81%   90.84%   +0.03%     
==========================================
  Files          73       73              
  Lines       41209    41294      +85     
  Branches    41209    41294      +85     
==========================================
+ Hits        37424    37514      +90     
+ Misses       3785     3780       -5

Impacted Files	Coverage Δ
lightning/src/routing/network_graph.rs	`89.03% <90.90%> (-0.51%)`	⬇️
lightning/src/ln/chanmon_update_fail_tests.rs	`98.07% <0.00%> (+0.35%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f6fa8e9...1386a0c. Read the comment docs.

t-bast

Granted, this is far from ideal, but I'm not sure we can do much better today.

t-bast · 2022-03-24T14:53:23Z

lightning/src/routing/network_graph.rs

+		//
+		// Obnoxiously, `gossip_timestamp_filter` isn't *just* a filter, but its also a request for
+		// your peer to send you the full routing graph. Thus, in order to tell a peer to send you
+		// any updates as it sees them, you have to also ask for the full routing graph to be


I don't understand that part, what do you mean here by "full routing graph"?

If you sent a gossip_timestamp_filter set to now() - 1 hour for example, your peer should initially send you what changed in the last hour, and will then send new updates as it sees them.

IIUC it sends you what messages have a timestamp field in the last hour, not what messages it has received in the last hour. Given gossip can take some time to propagate, presumably this could lead to missing updates.

That's correct, but when you say its also a request for your peer to send you the full routing graph that sentence is quite misleading, it's only a request for your peer to send you all the updates starting from that timestamp (it's unrelated to the fact that you may miss updates by providing a timestamp that's too close from now)

Right, thanks, I added a (subject to the filter).

t-bast · 2022-03-24T15:02:09Z

lightning/src/routing/network_graph.rs

+		// In an attempt to cut a middle ground between always fetching the full graph from all of
+		// our peers and never receiving gossip from peers at all, we send all of our peers a
+		// `gossip_timestamp_filter`, with the filter time set either two weeks ago or an hour ago.


This is indeed quite a hard problem and I believe this is a good middle ground. You don't know what you previously missed, so you should somewhat regularly do a full sync (gossip_timestamp_filter of two weeks). It also depends a lot on whether your node is mostly online or mostly offline. If you're mostly online, you will receive updates through staggered broadcast, which can make up for some of the previous updates that you missed, even if you don't do a two weeks sync. If you're mostly offline though, you should do more frequent full syncs (also because the last time you attempted a full sync, you may have disconnected before it was complete).

You could also imagine more complex strategies, where you would have some heuristics based on the graph size and/or path-finding success to trigger a full sync (if path-finding fails to find routes too often, you're probably missing things so you may want to initiate a full sync), but I believe that doing a full sync with at least one of your peers when you reconnect is a wasteful but much simpler way to cover this.

IIUC, the please-include-timestamps extension to gossip queries would let us actually figure out exactly what we're missing, except (a) lnd doesn't support it, I think? and (b) after we did that we'd really like to send a gossip timestamp filter of 0/2 weeks ago without getting a full graph dump, but it seems you can't do that in the current protocol?

(a) lnd doesn't support it, I think?

Damn, I didn't remember that (I haven't checked though). But that's why we did support the timestamps extension in eclair, because it made it almost practical to sync on eclair-mobile when connected to nodes that supported this extension (like ours).

(b) after we did that we'd really like to send a gossip timestamp filter of 0/2 weeks ago without getting a full graph dump

There may be something I'm misunderstanding here when you say "getting a full graph dump".
At least in eclair, we don't do anything proactively when you send us a gossip_timestamp_filter, we only attach this value to the connection and will not propagate updates that happened before this timestamp through staggered broadcast. But we don't actively send you anything in response to gossip_timestamp_filter...?

But we don't actively send you anything in response to gossip_timestamp_filter...?

Ah, you interpreted the spec the way I did...which is different from c-lightning and lnd. The spec says this:

The receiver:
SHOULD send all gossip messages whose timestamp is greater or equal to first_timestamp, and less than first_timestamp plus timestamp_range.
MAY wait for the next outgoing gossip flush to send these.

I (and apparently you) thought that just meant "going forward", c-lightning and lnd think that means "send it now". Thus, if you set the first_timestampfield to two weeks ago, both will send you a full gossip dump!

Interesting! I now understand your comments better, I never realized c-lightning and lnd understood it this way (which shows I didn't do enough cross-compat testing of gossip...). I did understand it the same way as you did: gossip_timestamp_filter is just something that applies to (future) staggered broadcast.

It would make sense to clarify the spec and see which behavior is more desirable.

t-bast · 2022-03-24T15:03:34Z

lightning/src/routing/network_graph.rs

+		// peer receives after you connect, you must send a `gossip_timestamp_filter` message. This
+		// message, as the name implies, tells the peer to not forward any gossip messages with a
+		// timestamp older than a given value (not the time the peer received the filter, but the
+		// timestamp in the update message, which is often hours behind when the peer received the


Do you have data on this? The last time Rusty did some analysis, I believe it took a lot less than an hour for most updates to propagate on mainnet.

I don't have any new data, no, my recollection from Rusty's last analysis was there was quite a long tail, even though most updates propagated relatively quickly. That said, given nodes may use block time instead of wall clock (LDK does this), timestamps may be quite a while back when a message is generated. Worse, clock-off-by-an-hour is a pretty common thing for users who mis-set clocks, though a bit less so with time sync these days.

See large comment added for more details

TheBlueMatt · 2022-03-25T17:12:25Z

Squashed without change.

t-bast reviewed Mar 24, 2022

View reviewed changes

jkczyz added this to the 0.0.106 milestone Mar 24, 2022

jkczyz previously approved these changes Mar 25, 2022

View reviewed changes

Fix gossip using gossip_timestamp_filter instead of queries

1386a0c

See large comment added for more details

TheBlueMatt dismissed jkczyz’s stale review via 1386a0c March 25, 2022 17:12

TheBlueMatt force-pushed the 2022-03-gossip-queries-sucks branch from 83ebc1d to 1386a0c Compare March 25, 2022 17:12

jkczyz approved these changes Mar 25, 2022

View reviewed changes

valentinewallace approved these changes Mar 25, 2022

View reviewed changes

TheBlueMatt merged commit 77cab14 into lightningdevkit:main Mar 25, 2022

TheBlueMatt mentioned this pull request Apr 18, 2022

Inconsistent behavior around graph dump on gossip_timestamp_filter. lightning/bolts#980

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix gossip using `gossip_timestamp_filter` instead of queries #1382

Fix gossip using `gossip_timestamp_filter` instead of queries #1382

Uh oh!

TheBlueMatt commented Mar 24, 2022

Uh oh!

TheBlueMatt commented Mar 24, 2022

Uh oh!

codecov-commenter commented Mar 24, 2022 •

edited

Loading

Uh oh!

t-bast left a comment

Uh oh!

t-bast Mar 24, 2022

Uh oh!

TheBlueMatt Mar 24, 2022

Uh oh!

t-bast Mar 24, 2022

Uh oh!

TheBlueMatt Mar 24, 2022

Uh oh!

t-bast Mar 24, 2022

Uh oh!

TheBlueMatt Mar 24, 2022

Uh oh!

t-bast Mar 24, 2022

Uh oh!

TheBlueMatt Mar 24, 2022

Uh oh!

t-bast Mar 25, 2022

Uh oh!

t-bast Mar 24, 2022

Uh oh!

TheBlueMatt Mar 24, 2022

Uh oh!

TheBlueMatt commented Mar 25, 2022

Uh oh!

Uh oh!

Fix gossip using gossip_timestamp_filter instead of queries #1382

Fix gossip using gossip_timestamp_filter instead of queries #1382

Uh oh!

Conversation

TheBlueMatt commented Mar 24, 2022

Uh oh!

TheBlueMatt commented Mar 24, 2022

Uh oh!

codecov-commenter commented Mar 24, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

t-bast left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt commented Mar 25, 2022

Uh oh!

Uh oh!

Fix gossip using `gossip_timestamp_filter` instead of queries #1382

Fix gossip using `gossip_timestamp_filter` instead of queries #1382

codecov-commenter commented Mar 24, 2022 •

edited

Loading