Skip to content

Don't include below-dust inbound HTLCs in commit tx fee calculation #762

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

valentinewallace
Copy link
Contributor

Closes #757

Does this fix look right? Seems like an oops from the original channel reserve PR. Let me know and I'll write tests.

@codecov
Copy link

codecov bot commented Nov 29, 2020

Codecov Report

Merging #762 (fc784f4) into main (de58bcf) will increase coverage by 0.03%.
The diff coverage is 95.37%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #762      +/-   ##
==========================================
+ Coverage   90.76%   90.80%   +0.03%     
==========================================
  Files          44       44              
  Lines       24282    24466     +184     
==========================================
+ Hits        22040    22216     +176     
- Misses       2242     2250       +8     
Impacted Files Coverage Δ
lightning/src/ln/channel.rs 87.70% <94.68%> (+0.44%) ⬆️
lightning/src/ln/functional_tests.rs 96.99% <100.00%> (+0.06%) ⬆️
lightning/src/ln/channelmanager.rs 84.90% <0.00%> (-0.10%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update de58bcf...1079753. Read the comment docs.

@TheBlueMatt
Copy link
Collaborator

This looks good to me. I'll let the fuzzer spin on this + 756 for a bit and see what it thinks.

@TheBlueMatt
Copy link
Collaborator

Fuzzers didn't find anything after 15 CPU-days, so probably at least fixes that failing case. Obviously can't rely on fuzzers so should add an explicit test (yay lightning/src/ln/chanmon_update_fail_tests.rs), but otherwise looks good.

@TheBlueMatt TheBlueMatt added this to the 0.0.13 milestone Dec 3, 2020
@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch 4 times, most recently from fa57462 to 84149fd Compare December 4, 2020 21:09
@valentinewallace
Copy link
Contributor Author

Still needs a bit more testing but I pushed up one test and an additional fix (the fix being: when calculating the commit tx fee, don't include the HTLC currently being added if it's below dust).

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from 84149fd to 1faf0da Compare December 5, 2020 20:53
@valentinewallace valentinewallace marked this pull request as ready for review December 7, 2020 20:08
let mut their_acked_htlcs = self.pending_inbound_htlcs.len();
let mut addl_htlcs = 1;
if fee_spike_buffer_htlc.is_some() { addl_htlcs += 1; }
if new_htlc_amount / 1000 <= self.counterparty_dust_limit_satoshis {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the counterparty/holder dust limit references in this PR be swapped? If its easy, would be nice to get a test which would catch such an inversion.

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from 1faf0da to 15a92be Compare December 16, 2020 16:53
@valentinewallace
Copy link
Contributor Author

Found some more missing pieces so I'm restructuring to reuse some of the logic in build_commitment_transaction. Will update when this is ready for review again.

@TheBlueMatt
Copy link
Collaborator

If you're refactoring bits of build_commitment_transaction you may want to build on/anticipate #742.

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from 15a92be to f165fca Compare December 16, 2020 23:51
@TheBlueMatt
Copy link
Collaborator

Is there a way we can do a more thorough test of this? eg in cfg(test) store the latest value here and when we call build_commitment_transaction check that the latest value matches? It may require a bunch of "do we need to check here" cases, but if its not too bad it may be worth it.

@valentinewallace
Copy link
Contributor Author

Is there a way we can do a more thorough test of this? eg in cfg(test) store the latest value here and when we call build_commitment_transaction check that the latest value matches? It may require a bunch of "do we need to check here" cases, but if its not too bad it may be worth it.

That makes sense, currently trying to implement this!

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from f165fca to 068543e Compare December 22, 2020 21:30
@valentinewallace
Copy link
Contributor Author

Is there a way we can do a more thorough test of this? eg in cfg(test) store the latest value here and when we call build_commitment_transaction check that the latest value matches? It may require a bunch of "do we need to check here" cases, but if its not too bad it may be worth it.

Ok, I believe I've approximately added this testing.

@devrandom devrandom mentioned this pull request Dec 27, 2020
@TheBlueMatt
Copy link
Collaborator

We should probably enable the new fee caching stuff in fuzztarget, too.

@TheBlueMatt
Copy link
Collaborator

Note that the new tests, with just the cfg changed to run in fuzztarget, fail on at least 2c090d2c2c2c340c13fffe039f42f3e7.

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch 2 times, most recently from ba5ea8c to 5754f42 Compare January 27, 2021 01:58
@TheBlueMatt
Copy link
Collaborator

Ah, I take that back, the full_stack_target failure definitely looks like a bug either in the new tests or in the new logic.

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch 2 times, most recently from 3b02a56 to 076137d Compare January 27, 2021 23:39
@valentinewallace
Copy link
Contributor Author

Note that the new tests, with just the cfg changed to run in fuzztarget, fail on at least 2c090d2c2c2c340c13fffe039f42f3e7.

OK I believe I've fixed this and full_stack_target

@TheBlueMatt
Copy link
Collaborator

It looks like 2a441414192a441414190a0a2a441414190d61c9de5c021d is failing chanmon_consistency.

@valentinewallace
Copy link
Contributor Author

It looks like 2a441414192a441414190a0a2a441414190d61c9de5c021d is failing chanmon_consistency.

I think my current approach in 076137d causes too many bespoke exceptions to be necessary. Gonna try to rework it.

@TheBlueMatt
Copy link
Collaborator

Right, I was kinda afraid of that. I'm mostly OK with a ton of exceptions in #[cfg(test)] blocks, but if you can find a simpler way that would be cool too.

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from 076137d to e6a3fe3 Compare February 5, 2021 18:09
@valentinewallace
Copy link
Contributor Author

Right, I was kinda afraid of that. I'm mostly OK with a ton of exceptions in #[cfg(test)] blocks, but if you can find a simpler way that would be cool too.

Lmk what you think of the new approach.

@TheBlueMatt
Copy link
Collaborator

Lmk what you think of the new approach.

Right, much better, thanks! I am a little worried about the "info.num_htlcs == counterparty_commitment_tx.1" checks neutering some of the value - isn't part of the reason for the checks to make sure we have the same number of HTLCs as we thought we would? Is there a way to take a bit of both approaches and maybe just drop that check and then reset the states on RAA or commitment_signed messages?

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from e6a3fe3 to 3b6000d Compare February 8, 2021 21:57
@valentinewallace
Copy link
Contributor Author

valentinewallace commented Feb 8, 2021

Lmk what you think of the new approach.

Right, much better, thanks! I am a little worried about the "info.num_htlcs == counterparty_commitment_tx.1" checks neutering some of the value - isn't part of the reason for the checks to make sure we have the same number of HTLCs as we thought we would? Is there a way to take a bit of both approaches and maybe just drop that check and then reset the states on RAA or commitment_signed messages?

Adding those specific checks^ still had test failures. I agree the info.num_htlcs == counterparty_commitment_tx.1 did neuter some of the value though, so I've updated it to instead make sure the total number of pending HTLCs is the same. I think this should capture the full value of the check, but lmk if I'm off on that.

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Looks good to me! I added one note about some manual mutation testing that didn't get caught, if its easy would be nice, but if its a pain I'm ok without.

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from 137346c to ebf5a06 Compare February 9, 2021 20:16
&& info.next_holder_htlc_id == self.next_holder_htlc_id
&& info.next_counterparty_htlc_id == self.next_counterparty_htlc_id
&& info.feerate == self.feerate_per_kw {
assert_eq!(total_fee, info.fee / 1000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to write a test that checks that these agree instead of adding to the production code (albeit, only conditionally for tests). Generally, writing a test would be preferable if it is possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the point of this block is more to check it in fuzz/functional tests than one-off. Would be great to get the test code our of channel.rs, I agree, but not sure how to do that in the fuzzer/functional tests.

@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from ebf5a06 to d6860ec Compare February 17, 2021 21:28
@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch 2 times, most recently from 2e4297b to 2cc8aa7 Compare February 18, 2021 18:09
Also remove part of the holding cell channel reserve test that's newly failing but
a bit of a redundant test anyway.
@valentinewallace valentinewallace force-pushed the chan-reserve-fuzz-failure branch from 2cc8aa7 to 1079753 Compare February 18, 2021 18:09
@TheBlueMatt TheBlueMatt merged commit 13e990e into lightningdevkit:main Feb 19, 2021
Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Post-merge Code Review ACK 1079753

Overall, a call for better comments given how much off-chain computation is tricky. Accounting a HTLC for fee computation is function of its state, and even if we follow the most sound reasoning IMO, I wouldn't be surprised that other implementation have weird recipes ("What if they have another definition of AwaitingRemoteRevokeToRemove ?"), thus leading to unexpected closure in case of edge-cases. Good documentation would help track down that kind of cases if they show up.

I think I've also found few test coverage holes.

assert!(self.is_outbound());

let mut their_acked_htlcs = self.pending_inbound_htlcs.len();
let real_dust_limit_success_sat = (self.feerate_per_kw as u64 * HTLC_SUCCESS_TX_WEIGHT / 1000) + self.holder_dust_limit_satoshis;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to document what does real dust_limit means, it's quite buried in BOLT3. I even think the spec should scope the HTLC output fee contribution to the commitment one, but it doesn't seem to be recommended anywhere...

OutboundHTLCState::RemoteRemoved {..} => included_htlcs += 1,
// We don't include AwaitingRemoteRevokeToRemove HTLCs because our next commitment
// transaction won't be generated until they send us their next RAA, which will mean
// dropping any HTLCs in this state.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is correct, sequence where you might have HTLC in AwaitingRemoteRevokeToRemote are either us sending update_fail/success_htlc, commitment_signed and yet trying to send one more HTLC or receive an update from remote.

Note, "dropping any HTLCs in this state" doesn't mean this HTLC won't materialize onchain. Receiving counterparty RAA doesn't guarantee then to receive a commitment_signedwith such HTLC effectively removed for holder commitment tx.

// Get the commitment tx fee for the remote's next commitment transaction based on the number of
// pending HTLCs that are on track to be in their next commitment tx, plus an additional HTLC if
// `fee_spike_buffer_htlc` is Some, plus a new HTLC given by `new_htlc_amount`. Dust HTLCs are
// excluded.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May we prefix mention of dust with "economically" or something like this to mark the difference with Core' s notion of IsDust to evaluate a transaction standardness ?

// +1 for this HTLC.
self.next_remote_commit_tx_fee_msat(1)
let htlc_candidate = HTLCCandidate::new(msg.amount_msat, HTLCInitiator::RemoteOffered);
self.next_remote_commit_tx_fee_msat(htlc_candidate, None) // Don't include the extra fee spike buffer HTLC in calculations
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No test coverage, mutating here

diff --git a/lightning/src/ln/channel.rs b/lightning/src/ln/channel.rs
index d3a372e4..b7f5d6a2 100644
--- a/lightning/src/ln/channel.rs
+++ b/lightning/src/ln/channel.rs
@@ -1969,7 +1969,7 @@ impl<ChanSigner: ChannelKeys> Channel<ChanSigner> {
                // feerate_per_kw, while maintaining their channel reserve (as required by the spec).
                let remote_commit_tx_fee_msat = if self.is_outbound() { 0 } else {
                        let htlc_candidate = HTLCCandidate::new(msg.amount_msat, HTLCInitiator::RemoteOffered);
-                       self.next_remote_commit_tx_fee_msat(htlc_candidate, None) // Don't include the extra fee spike buffer HTLC in calculations
+                       self.next_remote_commit_tx_fee_msat(htlc_candidate, Some(())) // Don't include the extra fee spike buffer HTLC in calculations
                };
                if pending_remote_value_msat - msg.amount_msat < remote_commit_tx_fee_msat {
                        return Err(ChannelError::Close("Remote HTLC add would not leave enough to pay for fees".to_owned()));

OutboundHTLCState::RemoteRemoved {..} => their_acked_htlcs += 1,
OutboundHTLCState::Committed => included_htlcs += 1,
OutboundHTLCState::RemoteRemoved {..} => included_htlcs += 1,
OutboundHTLCState::LocalAnnounced { .. } => included_htlcs += 1,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No test coverage, mutating here

diff --git a/lightning/src/ln/channel.rs b/lightning/src/ln/channel.rs
index d3a372e4..6751b973 100644
--- a/lightning/src/ln/channel.rs
+++ b/lightning/src/ln/channel.rs
@@ -1871,7 +1871,7 @@ impl<ChanSigner: ChannelKeys> Channel<ChanSigner> {
                        match htlc.state {
                                OutboundHTLCState::Committed => included_htlcs += 1,
                                OutboundHTLCState::RemoteRemoved {..} => included_htlcs += 1,
-                               OutboundHTLCState::LocalAnnounced { .. } => included_htlcs += 1,
+                               OutboundHTLCState::LocalAnnounced { .. } => {}, //included_htlcs += 1,
                                _ => {},
                        }
                }

for ref htlc in self.pending_outbound_htlcs.iter() {
if htlc.amount_msat / 1000 <= self.counterparty_dust_limit_satoshis {
if htlc.amount_msat / 1000 <= real_dust_limit_success_sat {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to comment this subtlety around which dust limit apply in function of which viewpoint the fee is computed for.

"When computing the fee as paid by counterparty, if they have to claim onchain HTLC by broadcasting their commitment, they will have to spend inbound through a HTLC-timeout, outbound through a HTLC-Success.
When computing the fee as paid by holder, if we have to claim onchain HTLC by broadcasting our commitment, we will have to spend inbound through a HTLC-success, outbound through a HTLC-timeout"

Well, that's an attempt, but we should make it clear that even if our channel variables represent the holder direction-side (pending_inbound_htlcs/pending_outbound_htlcs) the actual fee cost is function from which commitment they will be claim.

@valentinewallace
Copy link
Contributor Author

Thanks for the post-merge review @ariard, will attempt to address soon^tm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

chanmon_consistency found force-close
4 participants