
Add probing_diversity_penalty_msat to the scorer parameters #3422


Merged

Conversation

TheBlueMatt
Collaborator

@TheBlueMatt TheBlueMatt commented Nov 23, 2024

When doing background probing, it's important to get a diverse view of the network graph to ensure you have as many options as possible when pathfinding.

Sadly, the naive probing we currently recommend users do does not accomplish that - using the same success-optimized pathfinder when sending as when probing results in always testing the same (good) path over and over and over again.

Instead, here, we add a probing_diversity_penalty_msat parameter to the scorer, allowing us to penalize channels for which we already have recent data. It applies a penalty which scales with the square of the inverse time since the channel was last updated, up to one day.
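
For concreteness, here is a minimal standalone sketch of that penalty calculation (an illustration matching the formula described above, not the scorer's actual method; the function name and signature are just for this example):

fn probing_diversity_penalty(
    probing_diversity_penalty_msat: u64, now_secs: u64, last_datapoint_secs: u64,
) -> u64 {
    // Channels with no datapoint in the last day get no penalty at all.
    const ONE_DAY_SECS: u64 = 86_400;
    let elapsed = now_secs.saturating_sub(last_datapoint_secs);
    // max(0, 86400 - elapsed): how "fresh" our data for this channel still is.
    let remaining = ONE_DAY_SECS.saturating_sub(elapsed);
    // Scale the configured penalty by the square of the remaining fraction of the
    // day, using u128 intermediates to avoid overflow for large penalty values.
    let num = probing_diversity_penalty_msat as u128 * remaining as u128 * remaining as u128;
    (num / (ONE_DAY_SECS as u128 * ONE_DAY_SECS as u128)) as u64
}

So a channel probed moments ago pays the full configured penalty, one probed 12 hours ago pays a quarter of it, and one probed more than a day ago pays nothing.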

@TheBlueMatt TheBlueMatt marked this pull request as draft November 23, 2024 17:11
@TheBlueMatt
Collaborator Author

Needs a test, should otherwise work, plan to start using this asap tho.

@tnull
Contributor

tnull commented Nov 23, 2024

Instead, here, we add a probing_diversity_penalty_msat parameter to the scorer, allowing us to penalize channels for which we already have recent data. It applies a penalty which scales with the square of the inverse time since the channel was last updated, up to one day.

Hmm. I think we had previously discussed this and I expressed the need for path diversity in background probing to explore the wider network. However, I'm not sure why this should be a parameter on our regular scorer we use to send payments.

Shouldn't this rather be handled by target/path selection of the background probing software?

@TheBlueMatt TheBlueMatt force-pushed the 2024-11-probing-diversity branch from bc8a318 to 3fe0c91 Compare November 23, 2024 21:02
@TheBlueMatt
Collaborator Author

Hmm. I think we had previously discussed this and I expressed the need for path diversity in background probing to explore the wider network. However, I'm not sure why this should be a parameter on our regular scorer we use to send payments.

I'm not sure exactly how we'd build it as a separate router. I think "just do the normal thing but try to explore backup options" seems like a pretty good strategy. Doing that as a separate router/scorer would be kinda hard - it really needs access to the internal scorer fields and scorer support.

Shouldn't this rather be handled by target/path selection of the background probing software?

It still would be. No sensible use of the scorer would ever set this parameter, you'd only ever set it when background probing.

@TheBlueMatt TheBlueMatt marked this pull request as ready for review November 23, 2024 21:08

codecov bot commented Nov 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.00%. Comparing base (1b281f1) to head (a245ca7).
Report is 341 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3422      +/-   ##
==========================================
+ Coverage   88.66%   90.00%   +1.34%     
==========================================
  Files         149      151       +2     
  Lines      116893   130296   +13403     
  Branches   116893   130296   +13403     
==========================================
+ Hits       103640   117273   +13633     
+ Misses      10753    10463     -290     
- Partials     2500     2560      +60     


@TheBlueMatt TheBlueMatt force-pushed the 2024-11-probing-diversity branch from 3fe0c91 to 957f93f Compare November 23, 2024 21:11
@TheBlueMatt
Collaborator Author

Any further thoughts here, @tnull?

@TheBlueMatt TheBlueMatt added the weekly goal Someone wants to land this this week label Jan 30, 2025
/// will be applied.
///
/// It obviously does not make sense to assign a non-0 value here unless you are using the
/// pathfinding result for background probing.
Contributor

This way of probing where you set this exploration/exploitation trade-off is just one way to do it, isn't it?

From https://lightningdevkit.org/probing:

Recall that earlier, we discussed the concept of using multiple scorers. One scorer that we actually save results to and another un-updated scorer that can be used to identify routes with a "beginner's mind" - so to speak. If we'd like to leverage this approach, we can pass an un-updated scorer here and then update a different scorer with the results of the probe.

This is a different approach where no scores are used for probe route selection, which is different again from what is in the PR description (using a single scorer).

What's your current thinking on what's best?

Collaborator Author

Hmm, it certainly depends on your goal, but I think one model of probing I've kinda come around to is the idea of finding "backup routes" to destinations you will likely pay in the future. I.e., it's great when we repeatedly pay a destination and have a route we always use that works basically all the time, but what happens when that route suddenly fails? Now we have no data on any other path because we were always paying over that same route.

This PR is certainly not a thorough way to get good "backup routes", of course, but with regular probing my gut says it should do at least an okay job.

Contributor

I can follow your thoughts there, but I am quite uncomfortable with the level of gut feeling involved in these decisions. This is probably something that keeps coming back. In #2709 we were also guessing a bit at the best approach, and there are probably more areas where it's not very clear what the best approach or parameters are.

It does make me think more about the idea of setting up a test framework that involves a virtual network of nodes.

Collaborator Author

Yea, there's definitely a bit of a gut approach here, and a sim-framework might help, but the amount of gut-feeling that would go into building a sim-framework would, I assume, end up producing an even more gut-driven result than this :/. IMO we'd be better off A/B testing two nodes with identical channels to the same peers, probing on the same interval and making test payments on the same interval, to evaluate such things, but that requires a lot of time to get results.

Contributor

Maybe that is true. With a sim network though, I think the discussion will move to a higher level. We'd discuss whether the network is realistic enough, and parameters / probe methods / pathfinding algorithms will follow from that.

To me, A/B testing sounds even more prone to getting non-representative results and is probably too slow to scan parameter spaces.

Collaborator Author

I fear that in a sim network the impact of guesses/gut-informed decisions is multiplicative (each guess compounds the effects of previous ones), but the only way to find out would be to build it and see what we get, plus compare with reality (via some dataset... not sure what dataset...). Maybe comparing with my probing dataset would work; when we shipped the new scorer in 0.1, a data scientist who works on c= helped run some analysis and at least found that our results were robust across a number of dimensions (channel size, etc), so there's at least some reason to think those results are moderately representative of the broader network.

@TheBlueMatt
Collaborator Author

Oh grrrr, I forgot about the first commit here; we should backport that since it fixes the build for some 32-bit platforms with newer rustc.

@joostjager
Contributor

joostjager commented Feb 3, 2025

Needs a test, should otherwise work, plan to start using this asap tho.

How are you planning to use this? Using it sounds like testing as well, and perhaps that is a more functional test than the unit test in this PR.

@joostjager joostjager closed this Feb 3, 2025
@joostjager joostjager reopened this Feb 3, 2025
@TheBlueMatt
Collaborator Author

I've been running this patch on my personal node for a few weeks now. I haven't tried to do any analysis of whether it resulted in higher success rates, though.

@joostjager
Contributor

I've been running this patch on my personal node for a few weeks now. I haven't tried to do any analysis of whether it resulted in higher success rates, though.

Do you think this is sufficient for merging what's in this PR? And, do you have thoughts on how that analysis could be carried out?

@TheBlueMatt
Collaborator Author

Do you think this is sufficient for merging what's in this PR?

I mean, up to you and @tnull, but I think the maintenance cost of this code is relatively small (and we can drop it in the future if we have a better method of probing), and it seems unlikely to me that using the new penalty when doing regular (e.g. minute-ly) probes would get you materially worse data - fundamentally you trade off gathering data for your "preferred" paths against paths you have less recent data for. This would be a bad tradeoff if you aren't gathering enough datapoints for paths you otherwise care about, but if you're doing focused probing relatively often you gather quite a few datapoints (e.g. probing the top 100 destinations that you pay, once a minute, is 1,440 probes per day, or about 14 attempts to each destination per day), so ISTM this is pretty obviously a useful feature at least for that kind of probing. Is it the best way to do probing? Probably not, but it seems pretty Obvious (TM) to me that it's better than the current logic.

And, do you have thoughts on how that analysis could be carried out?

I'd have to A/B test, probably.

/// pathfinding result for background probing.
///
/// Specifically, the following penalty is applied
/// `probing_diversity_penalty_msat * max(0, (86400 - current time + last update))^2 / 86400^2` is
Contributor

Just talked offline with Joost a bit about this, and during that I wondered what a good magic value would be for the penalty, i.e., from what threshold does it actually have an impact on the diversity of the picked paths?

Could we maybe have the applied penalty be proportional to the number of datapoints tracked (maybe in the last day)? This way, even if we started with the penalty having no impact on diversity and probed the same path over and over again, it would accrue over time and eventually the applied penalty would be big enough to have an impact, having us move on to the next likely path at least?

Collaborator Author

Just talked offline with Joost a bit about this, and during that I wondered what a good magic value would be for the penalty, i.e., from what threshold does it actually have an impact on the diversity of the picked paths?

I set mine to 100_000. 100 sats is about the max of where fees lie anyway, so a penalty between 25 sats (for a channel last updated 12 hours ago, i.e. 100 sats * 0.5^2) and 100 sats (the absolute max, for a channel just updated) should have a material impact. In general, I'm not sure I understand your question - it's the same type of penalty as any of our other penalties, and we have to pick penalties which appropriately prefer other paths with lower fees, i.e. they need to be somehow proportional to fees on the network.

Could we maybe have the applied penalty be proportional to the number of datapoints tracked (maybe in the last day)? This way, even if we started with the penalty having no impact on diversity and probed the same path over and over again, it would accrue over time and eventually the applied penalty would be big enough to have an impact, having us move on to the next likely path at least?

We currently don't track any information that would allow us to do that, and tracking it would probably be nontrivial in size (e.g. a list of the last N times, as u64s, we got a result for each channel). I'm not entirely convinced it makes a difference, though - the theory here is that we probably don't need more results for a channel than once or twice a day if we're probing constantly over the course of months. Achieving that by counting the number of times we probed a channel in the last day vs just looking at the last time we probed a channel should come out in the wash.
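
As a usage sketch of the 100_000 value mentioned above (type and field names follow this PR; the two-copies pattern is just illustrative, not prescribed): keep a separate set of fee parameters for background probing and leave the parameters used for real payments at their default of 0, so payment routing is unaffected.

use lightning::routing::scoring::ProbabilisticScoringFeeParameters;

// Parameters passed only to the route-finding call used for background probes.
let mut probing_params = ProbabilisticScoringFeeParameters::default();
probing_params.probing_diversity_penalty_msat = 100_000; // ~100 sats max penalty, as above

// Parameters used for actual payments keep the default (no diversity penalty).
let payment_params = ProbabilisticScoringFeeParameters::default();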

@joostjager
Contributor

How about if you'd - on your node - probe the same destination let's say 20 times and do that with and without this patch? Perhaps this is exactly your A/B test suggestion, but not sure if you also wanted to simplify to probing just a single node without randomization.

Looking at the paths taken with and without should give some confidence that paths are indeed diversified. We can merge this code, and then go from there.

@TheBlueMatt
Collaborator Author

I started comparing paths between a scorer with and without this, and once it got going it picked a different path in 4 of the last 16 probes. I looked at a few of the hops on two of them and we generally preferred hops with less total data (I don't currently print the last-updated time).

@joostjager
Contributor

joostjager commented Feb 6, 2025

What are your thoughts on this result? If there isn't much time in between, I'd think that 12 duplicate probes is maybe not good enough? Or an indication that probing_diversity_penalty_msat needs to be higher.

What do you mean by 'once it got going' by the way?

@TheBlueMatt
Collaborator Author

What are your thoughts on this result? If there isn't much time in between, I'd think that 12 duplicate probes is maybe not good enough? Or an indication that probing_diversity_penalty_msat needs to be higher.

My node is picking random nodes in the graph to probe and I have 55 active channels, so I imagine for many probes it will pick a path that I haven't explored in a while (in part because many nodes are hard to pay/don't have any inbound liquidity so it should constantly be trying new paths).

To be clear, the thing I was looking at was just whether or not the path it picks is different with or without the new penalty - so none of these paths are repeated over time; rather, for those probes, the scorer with the penalty simply picked the same path as the one without it.

On a slightly longer time horizon, this morning I saw 14/52 probes take a different path with double the penalty I'd originally applied (maxing out at 200 sats), which at least indicates that the penalty logic is working somewhat correctly.

@joostjager
Contributor

joostjager commented Feb 6, 2025

Thanks for that clarification. And good that you got at least some indication that it is working. We already established that it isn't an exact science at the moment, and also that improving that situation probably requires a significant investment (test framework, robust A/B testing, etc).

In that light, would you say let's merge this PR to at least open up for some experimentation? I do sense that it is worth being healthily cautious about future intuitive additions that further complicate the scorer.

@TheBlueMatt
Collaborator Author

Yea, IMO we should land it and move on. We should give more thought to probing broadly, and maybe adopt some kind of testing approach if we add more knobs for probing. For non-probing results we at least have a test framework now, which is helpful.

}

/// The amount of padding required to make [`ChannelLiquidity`] (plus a u64) a full 4 cache lines.
const LIQ_PADDING_LEN: usize = (256 - ::core::mem::size_of::<(u64, DummyLiquidity)>()) / 8;
Contributor

Do you have any idea on the effect size of this optimization?

Collaborator Author

I don't recall exactly anymore, but the commit that introduced it (4fd07ec) said "a few percent".

Contributor

Not sure it is worth it then - writing, reviewing, understanding, and maintaining the code, and making sure that it remains a win on all platforms.

Collaborator Author

Sorry for the delay here, was re-running benchmarks to see what's up. I only have some pretty old hardware handy, but on two old Intel machines: one seems to see no change (-3% to +3%, with an outlier at +8%); the other sees around a 10% win from doing this (and some related prefetches I keep meaning to upstream, but which don't make much sense without the alignment) when doing linear scoring, and is relatively flat when doing nonlinear scoring. Though linear scoring isn't used very much, the performance gain there (which was actually robust, unlike the noise on the other machine, probably because it's NUMA) is nontrivial and likely indicative of gains on other platforms (I don't have an ARM machine, especially a relatively cheap one, handy to benchmark on atm).

Contributor

Linear scoring - what does this mean exactly? That liquidity information is retrieved for channels in the order in which they are stored in the hashmap? Interested to hear where 'linear scoring' is happening in LDK.

Collaborator Author

The core PDF in LDK is switchable - you can use the crazy M + (x - 0.5)^N thing or you can just use the naive Rene scoring. The naive Rene scoring sticks with integer math, so it exercises a different part of the CPU and can be a bit faster due to the lack of float <-> int conversions (i.e. the conversions and float math appear to be the bottleneck on the machine in question, rather than the memory loads, but once we remove that bottleneck, or if we used a faster machine, the memory loads become the bottleneck). When we added the nonlinear PDF we weren't particularly confident in it, as we hadn't done much formal side-by-side testing of it, and thus made it switchable. Now that we have, we can maybe remove the option and only have nonlinear scoring.
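
(For readers following along, a hedged sketch of the "naive Rene" linear estimate mentioned above - the function below is illustrative, not LDK's exact code: assuming the channel's liquidity is uniformly distributed between the currently-known min and max bounds, the success probability is kept as an integer numerator/denominator pair so no float conversions are needed.)

fn linear_success_probability(
    amount_msat: u64, min_liquidity_msat: u64, max_liquidity_msat: u64,
) -> (u64, u64) {
    if amount_msat <= min_liquidity_msat {
        return (1, 1); // amount fits below the known minimum - certain success
    }
    if amount_msat >= max_liquidity_msat {
        return (0, 1); // amount exceeds the known maximum - certain failure
    }
    // P(success) = (max - amount) / (max - min), returned as (numerator, denominator).
    (max_liquidity_msat - amount_msat, max_liquidity_msat - min_liquidity_msat)
}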

Contributor

You are saying that when linear scoring is removed, non-linear floating point conversion takes over, and this cache optimization isn't significant anymore. Is it worth keeping then?
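
(As background for the thread above, a self-contained illustration of the cache-line padding idea - the struct and names below are hypothetical stand-ins, not LDK's actual types:)

#[repr(C)]
struct HotFields {
    // Stand-ins for the liquidity bounds and timestamps the scorer touches first.
    min_liquidity_offset_msat: u64,
    max_liquidity_offset_msat: u64,
    last_updated_secs: u64,
}

// Round the record up to a whole number of 64-byte cache lines with trailing padding...
const PAD_LEN: usize = (64 - core::mem::size_of::<HotFields>() % 64) % 64;

#[repr(C)]
struct PaddedLiquidity {
    hot: HotFields,
    _padding: [u8; PAD_LEN],
}

// ...and assert the layout at compile time - this is the kind of assertion the
// commit message below describes removing, since `Duration`'s size differs on
// some 32-bit Android platforms.
const _: () = assert!(core::mem::size_of::<PaddedLiquidity>() % 64 == 0);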

read_tlv_fields!(r, {
(0, channel_liquidities, required),
});
let mut last_duration_since_epoch = Duration::from_secs(0);
Contributor

The idea is that when there is no data, the first pathfinding operation works with time zero for all the penalty calculations?

Collaborator Author

I think that makes sense, yea.
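
For reference, the deserialization logic under discussion (mirroring the hunk in the squash diff later in this thread) looks roughly like:

let channel_liquidities = ChannelLiquidities::read(r)?;
// Recover the scorer's notion of "now" from the newest stored datapoint; with no
// stored liquidities this stays at zero, so the first pathfinding pass simply
// works with time zero for all the penalty calculations, as discussed above.
let mut last_update_time = Duration::from_secs(0);
for (_, liq) in channel_liquidities.0.iter() {
    last_update_time = cmp::max(last_update_time, liq.last_updated);
}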

@TheBlueMatt
Collaborator Author

Rebased.

@TheBlueMatt TheBlueMatt force-pushed the 2024-11-probing-diversity branch from e1f41be to 40f6175 Compare February 12, 2025 18:45
@tnull
Contributor

tnull commented Feb 13, 2025

@TheBlueMatt given there is some back-and-forth here, mind splitting out the first commit into a dedicated PR to make sure it can land (and get backported) independently ASAP?

@TheBlueMatt
Collaborator Author

@TheBlueMatt given there is some back-and-forth here, mind splitting out the first commit into a dedicated PR to make sure it can land (and get backported) independently ASAP?

There's back and forth on the first commit too :)

@TheBlueMatt
Collaborator Author

Went ahead and addressed the build issues in 0.1 with #3610 so this doesn't need to be backported. Was gonna drop the first commit, but realized the changes to the struct would also break the build, so it's easier to at least hash out whether we want the first commit or whether we want to drop the assertions here, and then move forward for 0.2.

@joostjager
Contributor

I'd say remove all the cache line optimization here if it breaks with the struct changes in this PR, and move reintroduction/discussion to a different PR?

We generally expect `ChannelLiquidity` to be exactly three cache
lines to ensure the first bytes we need are all one one cache line.
This improves performance very marginally on some machines, but
the assertions that this is true do not work on some Android 32-bit
machines due to differing `Duration` sizes.

Here we simply remove the assertions to fix build on platforms
where the struct size isn't exactly on cache lines. This may
marginally harm performance but it shouldn't be that critical.

Fixes lightningdevkit#3415
@TheBlueMatt TheBlueMatt force-pushed the 2024-11-probing-diversity branch from 40f6175 to 0dc5fae Compare February 20, 2025 14:47
@TheBlueMatt
Collaborator Author

Okay, yea, sorry, was checking if more strict alignment than what this PR did would actually show up in benchmarks and it really didn't, so went ahead and swapped the first commit for the one in #3610. We can revisit alignment later if we have benchmarks showing it helps much (but sadly I think the floating point performance is the limiting factor).

joostjager
joostjager previously approved these changes Feb 21, 2025
Contributor

@joostjager joostjager left a comment

Spoke about this PR offline and went through and settled the remaining discussions. Just two remarks about naming, non-blocking.

In the next commit we'll enable users to assign a penalty to
channels which we don't have any recent information for to better
enable probing diversity. In order to do so, we need to actually
track the last time we received new information for a channel,
which we do here.
When doing background probing, it's important to get a diverse view
of the network graph to ensure you have as many options when
pathfinding as possible.

Sadly, the naive probing we currently recommend users do does not
accomplish that - using the same success-optimized pathfinder when
sending as when probing results in always testing the same (good)
path over and over and over again.

Instead, here, we add a `probing_diversity_penalty_msat` parameter
to the scorer, allowing us to penalize channels for which we
already have recent data. It applies a penalty which scales with
the square of the inverse time since the channel was last updated,
up to one day.
@TheBlueMatt
Collaborator Author

Squashed with only the two requested renames added. Figure we should just land this with one ack since this is all off-by-default stuff and not wildly complicated implementation-wise (low risk of imploding something), and at least tnull has looked at it.

$ git diff-tree -U1 0dc5faeb3 a245ca733
diff --git a/lightning/src/routing/scoring.rs b/lightning/src/routing/scoring.rs
index 9627521ff..d88bce4f5 100644
--- a/lightning/src/routing/scoring.rs
+++ b/lightning/src/routing/scoring.rs
@@ -480,3 +480,3 @@ where L::Target: Logger {
 	/// decayed every liquidity bound up to that time.
-	last_duration_since_epoch: Duration,
+	last_update_time: Duration,
 }
@@ -934,3 +934,3 @@ struct ChannelLiquidity {
 	/// ignoring decays).
-	last_datapoint: Duration,
+	last_datapoint_time: Duration,
 }
@@ -945,3 +945,3 @@ struct DirectedChannelLiquidity<L: Deref<Target = u64>, HT: Deref<Target = Histo
 	offset_history_last_updated: T,
-	last_datapoint: T,
+	last_datapoint_time: T,
 }
@@ -957,3 +957,3 @@ impl<G: Deref<Target = NetworkGraph<L>>, L: Deref> ProbabilisticScorer<G, L> whe
 			channel_liquidities: ChannelLiquidities::new(),
-			last_duration_since_epoch: Duration::from_secs(0),
+			last_update_time: Duration::from_secs(0),
 		}
@@ -1189,3 +1189,3 @@ impl ChannelLiquidity {
 			offset_history_last_updated: last_updated,
-			last_datapoint: last_updated,
+			last_datapoint_time: last_updated,
 		}
@@ -1222,3 +1222,3 @@ impl ChannelLiquidity {
 			offset_history_last_updated: &self.offset_history_last_updated,
-			last_datapoint: &self.last_datapoint,
+			last_datapoint_time: &self.last_datapoint_time,
 		}
@@ -1246,3 +1246,3 @@ impl ChannelLiquidity {
 			offset_history_last_updated: &mut self.offset_history_last_updated,
-			last_datapoint: &mut self.last_datapoint,
+			last_datapoint_time: &mut self.last_datapoint_time,
 		}
@@ -1414,3 +1414,3 @@ DirectedChannelLiquidity< L, HT, T> {
 	fn penalty_msat(
-		&self, amount_msat: u64, inflight_htlc_msat: u64, last_duration_since_epoch: Duration,
+		&self, amount_msat: u64, inflight_htlc_msat: u64, last_update_time: Duration,
 		score_params: &ProbabilisticScoringFeeParameters,
@@ -1497,9 +1497,8 @@ DirectedChannelLiquidity< L, HT, T> {
 		if score_params.probing_diversity_penalty_msat != 0 {
-			// We use `last_duration_since_epoch` as a stand-in for the current time as we don't
-			// want to fetch the current time in every score call (slowing things down
-			// substantially on some platforms where a syscall is required), don't want to add an
-			// unnecessary `std` requirement. Assuming we're probing somewhat regularly, it should
-			// reliably be close to the current time, (and using the last the last time we probed
-			// is also fine here).
-			let time_since_update = last_duration_since_epoch.saturating_sub(*self.last_datapoint);
+			// We use `last_update_time` as a stand-in for the current time as we don't want to
+			// fetch the current time in every score call (slowing things down substantially on
+			// some platforms where a syscall is required), don't want to add an unnecessary `std`
+			// requirement. Assuming we're probing somewhat regularly, it should reliably be close
+			// to the current time, (and using the last the last time we probed is also fine here).
+			let time_since_update = last_update_time.saturating_sub(*self.last_datapoint_time);
 			let mul = Duration::from_secs(60 * 60 * 24).saturating_sub(time_since_update).as_secs();
@@ -1606,3 +1605,3 @@ DirectedChannelLiquidity<L, HT, T> {
 		*self.last_updated = duration_since_epoch;
-		*self.last_datapoint = duration_since_epoch;
+		*self.last_datapoint_time = duration_since_epoch;
 	}
@@ -1616,3 +1615,3 @@ DirectedChannelLiquidity<L, HT, T> {
 		*self.last_updated = duration_since_epoch;
-		*self.last_datapoint = duration_since_epoch;
+		*self.last_datapoint_time = duration_since_epoch;
 	}
@@ -1660,3 +1659,3 @@ impl<G: Deref<Target = NetworkGraph<L>>, L: Deref> ScoreLookUp for Probabilistic
 		let capacity_msat = usage.effective_capacity.as_msat();
-		let time = self.last_duration_since_epoch;
+		let time = self.last_update_time;
 		self.channel_liquidities
@@ -1711,3 +1710,3 @@ impl<G: Deref<Target = NetworkGraph<L>>, L: Deref> ScoreUpdate for Probabilistic
 		}
-		self.last_duration_since_epoch = duration_since_epoch;
+		self.last_update_time = duration_since_epoch;
 	}
@@ -1739,3 +1738,3 @@ impl<G: Deref<Target = NetworkGraph<L>>, L: Deref> ScoreUpdate for Probabilistic
 		}
-		self.last_duration_since_epoch = duration_since_epoch;
+		self.last_update_time = duration_since_epoch;
 	}
@@ -1752,3 +1751,3 @@ impl<G: Deref<Target = NetworkGraph<L>>, L: Deref> ScoreUpdate for Probabilistic
 		self.channel_liquidities.time_passed(duration_since_epoch, self.decay_params);
-		self.last_duration_since_epoch = duration_since_epoch;
+		self.last_update_time = duration_since_epoch;
 	}
@@ -2409,5 +2408,5 @@ ReadableArgs<(ProbabilisticScoringDecayParameters, G, L)> for ProbabilisticScore
 		let channel_liquidities = ChannelLiquidities::read(r)?;
-		let mut last_duration_since_epoch = Duration::from_secs(0);
+		let mut last_update_time = Duration::from_secs(0);
 		for (_, liq) in channel_liquidities.0.iter() {
-			last_duration_since_epoch = cmp::max(last_duration_since_epoch, liq.last_updated);
+			last_update_time = cmp::max(last_update_time, liq.last_updated);
 		}
@@ -2418,3 +2417,3 @@ ReadableArgs<(ProbabilisticScoringDecayParameters, G, L)> for ProbabilisticScore
 			channel_liquidities,
-			last_duration_since_epoch,
+			last_update_time,
 		})
@@ -2435,3 +2434,3 @@ impl Writeable for ChannelLiquidity {
 			(9, self.offset_history_last_updated, required),
-			(11, self.last_datapoint, required),
+			(11, self.last_datapoint_time, required),
 		});
@@ -2452,3 +2451,3 @@ impl Readable for ChannelLiquidity {
 		let mut offset_history_last_updated = None;
-		let mut last_datapoint = None;
+		let mut last_datapoint_time = None;
 		read_tlv_fields!(r, {
@@ -2462,3 +2461,3 @@ impl Readable for ChannelLiquidity {
 			(9, offset_history_last_updated, option),
-			(11, last_datapoint, option),
+			(11, last_datapoint_time, option),
 		});
@@ -2487,3 +2486,3 @@ impl Readable for ChannelLiquidity {
 			offset_history_last_updated: offset_history_last_updated.unwrap_or(last_updated),
-			last_datapoint: last_datapoint.unwrap_or(last_updated),
+			last_datapoint_time: last_datapoint_time.unwrap_or(last_updated),
 		})
@@ -2661,3 +2660,3 @@ mod tests {
 		let offset_history_last_updated = Duration::ZERO;
-		let last_datapoint = Duration::ZERO;
+		let last_datapoint_time = Duration::ZERO;
 		let network_graph = network_graph(&logger);
@@ -2668,3 +2667,3 @@ mod tests {
 					min_liquidity_offset_msat: 700, max_liquidity_offset_msat: 100,
-					last_updated, offset_history_last_updated, last_datapoint,
+					last_updated, offset_history_last_updated, last_datapoint_time,
 					liquidity_history: HistoricalLiquidityTracker::new(),
@@ -2674,3 +2673,3 @@ mod tests {
 					min_liquidity_offset_msat: 700, max_liquidity_offset_msat: 100,
-					last_updated, offset_history_last_updated, last_datapoint,
+					last_updated, offset_history_last_updated, last_datapoint_time,
 					liquidity_history: HistoricalLiquidityTracker::new(),
@@ -2741,3 +2740,3 @@ mod tests {
 		let offset_history_last_updated = Duration::ZERO;
-		let last_datapoint = Duration::ZERO;
+		let last_datapoint_time = Duration::ZERO;
 		let network_graph = network_graph(&logger);
@@ -2748,3 +2747,3 @@ mod tests {
 					min_liquidity_offset_msat: 200, max_liquidity_offset_msat: 400,
-					last_updated, offset_history_last_updated, last_datapoint,
+					last_updated, offset_history_last_updated, last_datapoint_time,
 					liquidity_history: HistoricalLiquidityTracker::new(),
@@ -2802,3 +2801,3 @@ mod tests {
 		let offset_history_last_updated = Duration::ZERO;
-		let last_datapoint = Duration::ZERO;
+		let last_datapoint_time = Duration::ZERO;
 		let network_graph = network_graph(&logger);
@@ -2809,3 +2808,3 @@ mod tests {
 					min_liquidity_offset_msat: 200, max_liquidity_offset_msat: 400,
-					last_updated, offset_history_last_updated, last_datapoint,
+					last_updated, offset_history_last_updated, last_datapoint_time,
 					liquidity_history: HistoricalLiquidityTracker::new(),
@@ -2915,3 +2914,3 @@ mod tests {
 		let offset_history_last_updated = Duration::ZERO;
-		let last_datapoint = Duration::ZERO;
+		let last_datapoint_time = Duration::ZERO;
 		let network_graph = network_graph(&logger);
@@ -2929,3 +2928,3 @@ mod tests {
 					min_liquidity_offset_msat: 40, max_liquidity_offset_msat: 40,
-					last_updated, offset_history_last_updated, last_datapoint,
+					last_updated, offset_history_last_updated, last_datapoint_time,
 					liquidity_history: HistoricalLiquidityTracker::new(),

@TheBlueMatt TheBlueMatt requested review from joostjager and tnull April 1, 2025 18:26
if score_params.probing_diversity_penalty_msat != 0 {
// We use `last_update_time` as a stand-in for the current time as we don't want to
// fetch the current time in every score call (slowing things down substantially on
// some platforms where a syscall is required), don't want to add an unnecessary `std`
Contributor

Re-reading this now, I am not so sure anymore that it is important to avoid std for probing specifically. It seems likely that probing is done only on servers that have std?

For attr failures we do a similar thing where we only report hold time data if the platform has std. This would avoid the extra disk writes, and make for a simpler changeset in general - not just fewer lines of code, but also less reasoning about the time approximation (this comment, basically).

Collaborator Author

I'm incredibly skeptical of making exceptions for more and more use-cases. While it's true that most active probers will be built with std, I'd be pretty skeptical of calling it a guarantee. There's not much reason to think that some edge-nodes will never decide to do some probing to improve their graph knowledge, and even if we assume they never would, we could also end up at some point in the future having forwarding nodes that happen to be running in a weird environment (e.g. some high-security, high-funds node running in a secure enclave or similar that might not like regular std SystemTime).

We can cut some corners here and there, but diverging too much in terms of functionality between std and non-std builds really dilutes the value of LDK, IMO.

Contributor

@joostjager joostjager Apr 3, 2025

The alternative approach is to implement the minimum that is needed today for probing and not support non-std probing diversity yet. Then, if any of the potential requirements you list above materializes, add it then. In the case of this PR, it wouldn't increase the total amount of work much: in this PR you'd just fetch the time, and then in the potential future non-std extension, delete that one line and add the time derivation.

Collaborator Author

I'd much rather do it right the first time than have a user complain that something isn't working in their build and have to rush out a release to get it working. If it were a ton of complexity or not practical, that would be one thing, but it's fairly straightforward here.

Contributor

@tnull tnull left a comment

Two non-blocking naming nits, otherwise LGTM


/// The last time when the liquidity bounds were updated with new payment information (i.e.
/// ignoring decays).
last_datapoint_time: Duration,
Contributor

Can we name this last_datapoint_duration_since_epoch to clarify the format since Duration is a ubiquitous type that doesn't tell us how time is measured?

Contributor

@tnull tnull Apr 7, 2025

LOL, I just realized this is exactly what @joostjager had you change the variable names from in his naming nits above.

Collaborator Author

Yea, I figured _time was pretty clear (what else could it possibly be?)

@@ -476,6 +476,9 @@ where L::Target: Logger {
network_graph: G,
logger: L,
channel_liquidities: ChannelLiquidities,
/// The last time we were given via a [`ScoreUpdate`] method. This does not imply that we've
/// decayed every liquidity bound up to that time.
last_update_time: Duration,
Contributor

Same here, why not last_update_duration_since_epoch?

@TheBlueMatt
Collaborator Author

Landing but will add test coverage in a trivial followup.

@TheBlueMatt TheBlueMatt merged commit 187a8b7 into lightningdevkit:main Apr 7, 2025
25 of 26 checks passed
@TheBlueMatt
Collaborator Author

Should backport the first commit to 0.1 if we do another point release.

@TheBlueMatt
Collaborator Author

Backported in #3610
