-
Notifications
You must be signed in to change notification settings - Fork 412
Use unique per-node "node_counter"s rather than a node hashmap in routing #3104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use unique per-node "node_counter"s rather than a node hashmap in routing #3104
Conversation
0d4a06f
to
515d757
Compare
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #3104 +/- ##
==========================================
+ Coverage 89.82% 91.56% +1.74%
==========================================
Files 119 121 +2
Lines 97991 113379 +15388
Branches 97991 113379 +15388
==========================================
+ Hits 88016 103816 +15800
+ Misses 7400 7032 -368
+ Partials 2575 2531 -44 ☔ View full report in Codecov by Sentry. |
IMO, feel free to squash, if that's fine with @valentinewallace? |
Yup, feel free to squash and rebase! |
Rebased and squashed fixups. |
515d757
to
402fcf8
Compare
lightning/src/routing/gossip.rs
Outdated
node_entry.into_mut().channels.push(short_channel_id); | ||
let node = node_entry.into_mut(); | ||
node.channels.push(short_channel_id); | ||
**node_counter = node.node_counter; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not following this because the node should already have a counter set, I think? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
node_counter
is a reference to the channel's node_counter field, not the node's one. I updated the variable name to be clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically LGTM
if target_node_counter.is_none() { | ||
break 'path_walk; | ||
} | ||
if target_node_counter == Some(payee_node_counter) { break 'path_walk; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous code was a bit more readable, is this change necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the next line unwrap
s the target_node_counter
, so it seemed nice to have the sanity check here that its set?
Btw, some warnings were introduced in CI |
402fcf8
to
8a2ad24
Compare
These counters are simply a unique number describing each node. They have no specific meaning, but start at 0 and count up, with counters being reused after a node has been deleted.
In the next commit, we'll use the new `node_counter`s to remove a `HashMap` from the router, using a `Vec` to store all our per-node information. In order to make finding entries in that `Vec` cheap, here we store the source and destintaion `node_counter`s in `ChannelInfo`, givind us the counters for both ends of a channel without doing a second `HashMap` lookup.
8a2ad24
to
7bbac2e
Compare
Squashed fixups. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a first high-level pass for the new approach, but have yet to get fully into the details. Having the builder already makes a lot of difference w.r.t. readability/maintainability.
As this is still a lot of added complexity, do you have any benchmark numbers indicating how significant the performance improvement actually is?
|
||
impl PartialEq for ChannelInfo { | ||
fn eq(&self, o: &ChannelInfo) -> bool { | ||
self.features == o.features && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea if/how we could ensure we don't forget to add any new fields to these PartialEq
implementations for ChannelInfo
/NodeInfo
in the future? Maybe it would be worth to create CountedNodeInfo
/CountedChannelInfo
wrapper structs and maintain the simple derives
on the inner structs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, there's no way with standard rust outside of what you describe. I'm a bit heasitant to jump to a substruct given we currently expose the node/channel maps publicly, and having some kind of wrapper struct would then get exposed in our public API, which would suck :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, rather than exposing the IndexedMap
directly (which, due to its custom nature doesn't play nice with a lot of the Rust ecosystem anyways), we might want to provide iterators over the underlying data directly, and possibly a few other accessors that would allow us to keep only exposing NodeInfo
/ChannelInfo
, while the internal representation includes the wrapper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be a lot of work...I'm not quite sure why the IndexedMap
doesn't play nice with existing Rust stuff - it has methods almost the same as a standard Rust map (and we could certainly add more), exposes iterators and Entry
s and everything. Doing away with the map being exposed would mean we'd need to replicate most of that in the NetworkGraph
API with iterators and accessors, adding a handful of methods. I'm not really sure its worth it here :(
lightning/src/routing/router.rs
Outdated
}) | ||
} | ||
|
||
fn node_counter_from_pubkey(&mut self, pubkey: PublicKey) -> u32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're adding helpers, could we also add a newtype for the counter itself? Could improve readability over having yet another non-descript u32
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from CLTVs, I think this is our only u32
? I certainly can add a newtype, but given I believe its only ever accessed/stored in fields titled *_counter
I'm not really clear on it improving readability/reducing the chance of bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the block height values are u32
, all the fee values are u32
, CLTVs, Timestamps are u32
s, etc...
IMO we should start to introduce newtypes for a good part of the ubiquitous u64
/u32
values internally, but especially when it comes to the API. Besides such instances as here, Amount
/ FeeRate
might be very valuable to avoid any conversion errors, as discussed elsewhere before.
Another example would be to have a ShortChannelId
data type that would hide the (somewhat internal/LDK-specific u64
representation) and could display the different kinds of SCIDs differently and allow encoding as BOLT7 BBBxTTTxOOO
(without external helpers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should start to introduce newtypes for a good part of the ubiquitous u64/u32 values
There's a lot of value in newtypes when there's room for confusion, but I don't think that's always the case just because two things have the same type. Otherwise its just boilerplate .0
s everywhere for the sake of it and the need to go double-check the type isn't doing something out from under us when we use it.
Besides such instances as here, Amount / FeeRate might be very valuable to avoid any conversion errors, as discussed elsewhere before.
Indeed, mostly because feerates are multiplied to create amounts, so we use two elements of the different types in the same line, but maybe more importantly it would let us consolidate our feerate -> fee amount conversion logic, which we have pasted in a lot of places right now.
Another example would be to have a ShortChannelId data type that would hide the (somewhat internal/LDK-specific u64 representation) and could display the different kinds of SCIDs differently and allow encoding as BOLT7 BBBxTTTxOOO (without external helpers).
Right, this would be nice for the display implementation.
Very unscientific benchmarks on my laptop (comparing current git with this PR rebased on git) show mean improvement between 14% and ~0. Note of course that if we further optimize the scorer (which we can do quite a bit) these numbers would look better.
|
7bbac2e
to
3db5f1e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to do another closer pass on the routing logic updates.
In general, I'm still not the biggest fan of embedding these internal counters into publicly exposed datatypes that users might use differently than expected by us. If we could find a way to further encapsulate the node counter logic and leave the public types untouched, it would be preferable IMO.
I also ran some benchmarks, which however paint a pretty clear picture (so far, the nonlinear
case is still running). On an older Linux Server (Intel Xeon E3-1226 v3 @ 3.30Ghz), with 10000 samples:
Benchmarking generate_routes_with_zero_penalty_scorer: Collecting 10000 samples in estimated 1032.3 s (10k iterations)
Benchmarking generate_routes_with_zero_penalty_scorer: Analyzing
generate_routes_with_zero_penalty_scorer
time: [100.72 ms 101.46 ms 102.20 ms]
change: [+14.158% +15.781% +17.449%] (p = 0.00 < 0.05)
Performance has regressed.
Found 200 outliers among 10000 measurements (2.00%)
200 (2.00%) high mild
Benchmarking generate_mpp_routes_with_zero_penalty_scorer: Collecting 10000 samples in estimated 1032.9 s (10k iterations)
Benchmarking generate_mpp_routes_with_zero_penalty_scorer: Analyzing
generate_mpp_routes_with_zero_penalty_scorer
time: [100.76 ms 101.61 ms 102.48 ms]
change: [+11.333% +12.728% +14.099%] (p = 0.00 < 0.05)
Performance has regressed.
Found 200 outliers among 10000 measurements (2.00%)
200 (2.00%) high mild
Benchmarking generate_routes_with_probabilistic_scorer: Collecting 10000 samples in estimated 1427.9 s (10k iterations)
Benchmarking generate_routes_with_probabilistic_scorer: Analyzing
generate_routes_with_probabilistic_scorer
time: [141.39 ms 142.10 ms 142.80 ms]
change: [-55.456% -55.182% -54.911%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking generate_mpp_routes_with_probabilistic_scorer: Collecting 10000 samples in estimated 1365.2 s (10k iterations)
Benchmarking generate_mpp_routes_with_probabilistic_scorer: Analyzing
generate_mpp_routes_with_probabilistic_scorer
time: [131.85 ms 132.59 ms 133.33 ms]
change: [-57.247% -56.991% -56.732%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking generate_large_mpp_routes_with_probabilistic_scorer: Collecting 10000 samples in estimated 3147.2 s (10k iterations)
Benchmarking generate_large_mpp_routes_with_probabilistic_scorer: Analyzing
generate_large_mpp_routes_with_probabilistic_scorer
time: [352.87 ms 356.96 ms 361.08 ms]
change: [-73.539% -73.138% -72.724%] (p = 0.00 < 0.05)
Performance has improved.
Found 200 outliers among 10000 measurements (2.00%)
200 (2.00%) high mild
I think I'll rerun these over the weekend as they almost seem to good to be true.
/// | ||
/// These IDs allow the router to avoid a `HashMap` lookup by simply using this value as an | ||
/// index in a `Vec`, skipping a big step in some of the hottest code when routing. | ||
pub(crate) node_counter: u32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I'm still not too keen on embedding this index in the NodeInfo
type itself, as it semantically clearly should live in the enclosing data structure.
As this is part of the public API and Clone
, users might hang on to NodeInfo
s whose node_counter
is stale as we might have dropped the node already from the network graph and reused by now. If we'd ever accept NodeInfo
inputs to some API, we might read these stale counters. In fact, users might already run into this if they'd use our serialization logic in a manner we don't predict.
IIUC, a wrapper struct would allow us to clearly distinguish the LDK-internal accounting data (node_counter
) from the user-facing data type that users may want to reuse independently of our current NetworkGraph
state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we don't, however, expose the counters nor does LDK access them via NodeInfo
s which the user might hold. Thus, it seems like a stretch to be too concerned about it now, though I can see why its a bit awkward. Obviously if the router were to ever take NodeInfo
s from users we'd be screwed, but I don't see any reason why we'd ever do that. Worse, wrapping NodeInfo
s would mean we can't directly expose the graph's node map anymore, which would be a pretty annoying API change.
lightning/src/routing/router.rs
Outdated
}) | ||
} | ||
|
||
fn node_counter_from_pubkey(&mut self, pubkey: PublicKey) -> u32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the block height values are u32
, all the fee values are u32
, CLTVs, Timestamps are u32
s, etc...
IMO we should start to introduce newtypes for a good part of the ubiquitous u64
/u32
values internally, but especially when it comes to the API. Besides such instances as here, Amount
/ FeeRate
might be very valuable to avoid any conversion errors, as discussed elsewhere before.
Another example would be to have a ShortChannelId
data type that would hide the (somewhat internal/LDK-specific u64
representation) and could display the different kinds of SCIDs differently and allow encoding as BOLT7 BBBxTTTxOOO
(without external helpers).
|
||
impl PartialEq for ChannelInfo { | ||
fn eq(&self, o: &ChannelInfo) -> bool { | ||
self.features == o.features && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, rather than exposing the IndexedMap
directly (which, due to its custom nature doesn't play nice with a lot of the Rust ecosystem anyways), we might want to provide iterators over the underlying data directly, and possibly a few other accessors that would allow us to keep only exposing NodeInfo
/ChannelInfo
, while the internal representation includes the wrapper?
I believe none of the new counters are public, otherwise I'd definitely agree with you :).
50% from this alone does seem a bit high, but I wouldn't have been surprised to see that on the zero-penalty-scorer cases. The router does a ton of lookups in |
4b34513
to
3654954
Compare
The second round of benches seems consistent:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, in general I'm still not the biggest fan of introducing this additional complexity and adding private fields to public datatypes (which has several drawbacks/consequences, such as prohibiting that the types are constructed outside of the crate's context, that we'd ever add an API taking them as an argument, etc.).
That said, the performance benefits shown by the benchmarks seem significant, and the changes here look good to me I think. I'll let @valentinewallace have another look, but happy to ACK from my side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, feel free to squash.
lightning/src/routing/router.rs
Outdated
} | ||
}) | ||
.collect::<Vec<_>>(); | ||
match &payment_params.payee { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to separate these checks into a method, given that get_route
is ~1200 lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIce
In the next commit we'll stop using `NodeId`s to look up nodes when routing, instead using the new per-node counters. Here we take the first step, adding a local struct which tracks temporary counters for route hints/source/destination nodes. Because we must ensure we have a 1-to-1 mapping from node ids to `node_counter`s, even across first-hop and last-hop hints, we have to be careful to check the network graph first, then a new `private_node_id_to_node_counter` map to ensure we only ever end up with one counter per node id.
With the new `NodeCounters` have have a all the `NodeId`s we'll need during routing, so there's no need to keep the `private_hop_key_cache` which existed to provide references to `NodeId`s which are needed during routing.
The router's `introduction_node_id_cache` is used to cache the `NodeId`s of blinded path introduction points so that we don't have to look them up every time we go around the main router loop. When using it, if the introduction point isn't a public node we then look up the introduction in our first-hops map. In either case, we have to end up with a reference to a `NodeId` that outlives our `dist` map. Here we consolidate both the initial cache building and the first-hops map lookup to one place, storing only a reference to a `NodeId` either in the `NetworkGraph` or in the new `NodeCounters` to get the required lifetime without needing to reference into the first-hops map. We then take this opportunity to avoid `clone`ing the first-hops map entries as we now no longer reference into it.
Now that we have unique, dense, 32-bit identifiers for all the nodes in our network graph, we can store the per-node information when routing in a simple `Vec` rather than a `HashMap`. This avoids the overhead of hashing and table scanning entirely, for a nice "simple" optimization win.
Now that `PathBuildingHop` is stored in a `Vec` (as `Option`s), rather than `HashMap` entries, they can grow to fill a full two cache lines without a memory access performance cost. In the next commit we'll take advantage of this somewhat, but here we update the assertions and drop the `repr(C)`, allowing rust to lay the memory out as it wishes.
This marginally reduces the size of `get_route` by moving a the blinded path introduction point resolution and blinded path checks into a helper method.
3654954
to
4f5e17b
Compare
Squashed without change and then added one more commit:
|
78c0eaa
into
lightningdevkit:main
During routing, we spend most of our time doing hashmap lookups. It turns out, we can drop two of them, the first requires a good bit of work - assigning each node in memory a random u32 "node counter", we can then drop the main per-node routefinding state map and replace it with a vec. Once we do that, we can also drop the first-hop hashmap lookup that we do on a per node basis as we walk the network graph, replacing it with a check in the same vec.
This is just the minimal node_id -> counter change from #2803