Skip to content

Handle monitor update failures in msg-recv functions #263

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

TheBlueMatt
Copy link
Collaborator

Not 100% confident that the tests cover everything, but they're not bad and should have the common cases covered.

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally seems good to me, but just have a quick skim on tests, don't know stuff enough to know if it covers everything

// broadcast HTLC-Timeout and pay the associated fees to get their funds back than
// us bother trying to claim it just to forward on to another peer. If we're
// splitting hairs we'd prefer to claim things that went *to us*, but we haven't
// given up the preimage yet, so might as well just wait until the payment is
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we may lost a HTLC routed through us there ? Because nothing guarantee us that peer will choose us again to route its payment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what your point is here? We will, indeed, forget about a payment which was in-flight through us, but only for an HTLC that we hadn't yet forwarded onwards, so its not really our problem to track it. Ie in the case of A -> B -> C, when we're B, and we receive the final RAA from A which is supposed to allow us to forward the payment to C, but we fail a monitor update and force-close on chain, we don't really care about tracking the HTLC. We never forwarded it onwards to C, so we have no way to learn the preimage, so there is no way we can claim the payment for ourselves, so we just let A figure it out.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah no wasn't talking about internal HTLC tracking, I ticked on last sentence of comment "If we're splitting...", which say that payment may be retried (by origin node) but seems to me if we are unreliable doesn't seem that we're gonna be again in payment route and we won't see this payment a second time. But well doesn't seems we can do better if our Monitor is dead (sending back update_fail is useless, if we can't revoke).

Reading again, new point, why speaking about "we'd prefer to claim things that went to us" ? Seems to me not doable even with more complexity as we never forwarded HTLC.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda implicitly assume there that the payment went through multiple hops and we only failed one of our channels. So if the sender wants to pay us they'll retry via a different channel. I was referring to payments that we received is our own payments. We can't really claim it cause we never generated a PaymentReceived event which means we never got the preimage from the user, but in theory we could do that, use the preimage to claim the now-on-chain HTLC and take the payment, but thats complicated and probably not for much gain.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I clarified the comment a bit, hopefully.

@@ -1675,6 +1712,11 @@ impl ChannelManager {
if let Err(e) = self.monitor.add_update_monitor(chan_monitor.get_funding_txo().unwrap(), chan_monitor) {
match e {
ChannelMonitorUpdateErr::PermanentFailure => {
// TODO: There may be some pending HTLCs that we intended to fail
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm normally thanks to #198, forward HTLCs (and so backwards if they fail) will get updated to ChannelMonitor into Channel::commitment_signed, but does with current add_update_monitor API could we know which monitors failed to update among remote/in-memory ones.. ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what your point is here, but I think #198 practically solves this issue. The ChannelMonitorUpdateErr API says that if your local update fails to commit (ie the in-memory update failed) you must return PermanentFailure irrespective of remote monitor(s). Thus, we should only ever get here if the in-memory monitor update succeeded so the latest-tx-broadcast should result in a fail_backwards once the HTLC-Timeout commits. This needs a test, though, obviously, once they're both merged.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes got it reading ChannelMonitorUpdateErr again, that we reach PermanentFailure if persistent local storage of in-memory ChannelMonitor fail, maybe PermanentFailure could be a little bit clearer (right now seems that TemporaryFailure comment is covering both). Obviously need to be tested.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noted it in Permanent failure's comment. Indeed need more tests on these things.

@TheBlueMatt TheBlueMatt force-pushed the 2018-12-monitor-fail-2 branch 3 times, most recently from 1d4dfbc to 4b76f4f Compare December 9, 2018 18:36
This resolves an API bug where send_payment may return a
MonitorUpdateFailed Err both when the payment will not be sent and
when the HTLC will be retried automatically when monitor updating
is restored. This makes it impossible for a client to know when
they should retry a payment and when they should not.
This adds a few TODOs around further message rebroadcasting which
needs to be implemented as well as some loss of tracking of HTLCs
on permanent channel failure which needs to get transferred over to
the appropriate in-memory ChannelMonitor.
@TheBlueMatt TheBlueMatt force-pushed the 2018-12-monitor-fail-2 branch from 4b76f4f to 42da58c Compare December 11, 2018 18:18
@TheBlueMatt TheBlueMatt merged commit 0bf783e into lightningdevkit:master Dec 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants