Skip to content

Rescan dependent transactions in ChainMonitor #840

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 28, 2021

Conversation

jkczyz
Copy link
Contributor

@jkczyz jkczyz commented Mar 12, 2021

Electrum clients primarily operate by subscribing to notifications of transactions by script pubkeys. Therefore, they will send filtered transaction data without including dependent transactions. Outputs for such dependent transactions must be explicitly registered with these clients.

This PR updates chain::Filter::register_output to return dependent transaction data as Option<TransactionData>, allowing ChainMonitor to register more outputs that are found within them.

@jkczyz
Copy link
Contributor Author

jkczyz commented Mar 12, 2021

It occurred to me that register_output could only return one transaction, since returning more would be a double spend. Does that sound right?

@codecov
Copy link

codecov bot commented Mar 12, 2021

Codecov Report

Merging #840 (d8d9eaf) into main (8a8c75a) will decrease coverage by 0.00%.
The diff coverage is 89.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #840      +/-   ##
==========================================
- Coverage   90.59%   90.59%   -0.01%     
==========================================
  Files          51       51              
  Lines       27109    27193      +84     
==========================================
+ Hits        24560    24636      +76     
- Misses       2549     2557       +8     
Impacted Files Coverage Δ
lightning/src/chain/mod.rs 100.00% <ø> (ø)
lightning/src/ln/functional_test_utils.rs 95.21% <ø> (ø)
lightning/src/util/test_utils.rs 82.95% <80.95%> (-0.70%) ⬇️
lightning/src/chain/chainmonitor.rs 93.96% <97.50%> (+1.86%) ⬆️
lightning/src/chain/channelmonitor.rs 95.42% <100.00%> (+0.01%) ⬆️
lightning/src/ln/functional_tests.rs 96.88% <0.00%> (+0.03%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8a8c75a...d8d9eaf. Read the comment docs.

@jkczyz jkczyz force-pushed the 2021-03-rescan-logic branch from 5f9bb0d to eb21a87 Compare March 12, 2021 18:50
@jkczyz
Copy link
Contributor Author

jkczyz commented Mar 12, 2021

It occurred to me that register_output could only return one transaction, since returning more would be a double spend. Does that sound right?

Updated with this change.

@TheBlueMatt If this looks right to you, I can start writing some tests. Also, should register_output take a BlockHash as discussed?

@jkczyz jkczyz requested review from TheBlueMatt and ariard March 12, 2021 18:50
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, will take some deep review time on it when you're ready, though.

fn register_output(&self, outpoint: &OutPoint, script_pubkey: &Script);
///
/// Optionally, returns any transaction dependent on the output. This is useful for Electrum
/// clients to facilitate registering in-block descendants.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you decide against including a block hash to limit returned results to just the current block? Presumably there will be more docs here with an example or such.

@TheBlueMatt
Copy link
Collaborator

TheBlueMatt commented Mar 12, 2021

Also, should register_output take a BlockHash as discussed?

Oh, just saw this. I think so. I think we may want some kind of ElectrumClientWrapper struct that can map script_pubkey requests to a queue of transactions to connect and converts them to some kind of block events. In such a case, a block hash would be useful.

@jkczyz
Copy link
Contributor Author

jkczyz commented Mar 13, 2021

Also, should register_output take a BlockHash as discussed?

Oh, just saw this. I think so. I think we may want some kind of ElectrumClientWrapper struct that can map script_pubkey requests to a queue of transactions to connect and converts them to some kind of block events. In such a case, a block hash would be useful.

Hmmm... so one thing that we didn't consider is that register_output is called by watch_channel (via load_outputs_to_watch) for the funding output. However, we don't have a block hash there because the funding transaction may not have been confirmed on chain yet. It's also called there for any other outputs to watch that had been serialized with the ChannelMonitor.

I suppose I can make it an Option<BlockHash> for these cases or maybe a separate function would be better? Let me know if you have a strong preference one way or the other.

@TheBlueMatt
Copy link
Collaborator

I suppose I can make it an Option for these cases or maybe a separate function would be better? Let me know if you have a strong preference one way or the other.

Yea, I don't have a strong preference, though I agree that kinda sucks. This may be a good candidate for a default-implementation if you go the multi-function approach.

@jkczyz jkczyz force-pushed the 2021-03-rescan-logic branch from eb21a87 to d292d00 Compare March 16, 2021 01:10
Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurred to me that register_output could only return one transaction, since returning more would be a double spend. Does that sound right?

Yes this sounds right to me, unless you have some buggy chain::Filter implem not disconnecting block and corresponding outputs sanely during a reorg and the output being spent differently.

outpoint: OutPoint { txid, index: *idx as u16 },
script_pubkey: output.script_pubkey.clone(),
};
if let Some(tx) = chain_source.register_output(output) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where I've a doubt about the operations of an Electrum client. At this point, if you don't have locally the in-block descendants what should you do ?

  • a) the local Electrum client should emit new queries to its Electrum servers and return None ?
  • b) the local Electrum client when learning about the spent of a register output should immediately subscribe spent outputs, and thus learn descendants asynchronously from register_output caller ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Electrum client needs to query the server for dependent transactions synchronously and block until it gets a response back. Otherwise, the user would need to call block_connected again for the same header, which we'd like to avoid.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, option c) emit new queries and block until it gets a response back. I think this query-and-block requirement should be documented around chain::Filter::register_output ?

Also, I think documentation should note that query to a trusted-but-not-too-much server should be on a time-out, otherwise that's a vector to stale operations of an Electrum client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... currently the return value is Option<Transaction>, so there's no mechanism to timeout. What do you propose should happen in that scenario?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to return a None, but I think the timeout should be implemented by the Electrum local client, not by us. We should just make this point clear in chainFilter implementation documentation imo.

I think <30s is enough to consider the output spent isn't known by the Electrum server and that we should return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that is shouldn't be us. But returning None doesn't allow us to differentiate between "Electrum doesn't know" and "Electrum is unresponsive". What should the user do in case of the latter?

/// Optionally, when `output.block_hash` is set, should return any transaction spending the
/// output that is found in the corresponding block at the given index.
///
/// This return value is useful for Electrum clients in order to supply in-block descendant
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how the Electrum client is leveraging block hash presence. Do you assume it stores the in-block descendant, even before register_output caller ask for them ?

Otherwise, I don't think you need block hash to register already-confirmed scriptpubkeys to an Electrum server API, see https://electrumx.readthedocs.io/en/latest/protocol-methods.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the block_hash is to limit transactions to a given block. Essentially:

  • user calls block_connected for block X with a subset of txdata supplied by Electrum
  • register_output indicates there is a new output in txdata to watch
  • Electrum client needs to determine if that output was spent in block X

It should return None if not spent in block X even if spent in some later block Electrum knows about.

Copy link
Contributor Author

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if my responses seem feasible from your understanding of Electrum.

outpoint: OutPoint { txid, index: *idx as u16 },
script_pubkey: output.script_pubkey.clone(),
};
if let Some(tx) = chain_source.register_output(output) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Electrum client needs to query the server for dependent transactions synchronously and block until it gets a response back. Otherwise, the user would need to call block_connected again for the same header, which we'd like to avoid.

/// Optionally, when `output.block_hash` is set, should return any transaction spending the
/// output that is found in the corresponding block at the given index.
///
/// This return value is useful for Electrum clients in order to supply in-block descendant
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the block_hash is to limit transactions to a given block. Essentially:

  • user calls block_connected for block X with a subset of txdata supplied by Electrum
  • register_output indicates there is a new output in txdata to watch
  • Electrum client needs to determine if that output was spent in block X

It should return None if not spent in block X even if spent in some later block Electrum knows about.

Electrum clients primarily operate by subscribing to notifications of
transactions by script pubkeys. Therefore, they will send filtered
transaction data without including dependent transactions. Outputs for
such transactions must be explicitly registered with these clients.

Therefore, upon block_connected, provide a mechanism for an Electrum-
backed chain::Filter to return new transaction data to scan.
@jkczyz jkczyz force-pushed the 2021-03-rescan-logic branch from d292d00 to 83b341d Compare March 21, 2021 05:05
@jkczyz jkczyz marked this pull request as ready for review March 21, 2021 05:37
@jkczyz
Copy link
Contributor Author

jkczyz commented Mar 21, 2021

Ok, added a test for processing dependent transactions returned by chain::Filter::register_output. I couldn't find a way to test the deduplication logic. Let me know if there's a simple way to trigger this scenario. Namely, two registered outputs are spent by the same transaction.

@jkczyz jkczyz force-pushed the 2021-03-rescan-logic branch from 83b341d to ce1bc8e Compare March 21, 2021 05:49
Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I'm the best person to review, but I think this code looks reasonable :) modulo ideally testing deduplication as @jkczyz already pointed out. Also, I think ideally it'd be tested with Electrum before merge (has Overtorment tested it out?).


// Notify nodes[1] that nodes[0]'s revoked commitment transaction was mined. The chain
// source should return the dependent HTLC transaction when the HTLC output is registered.
mine_transaction(&nodes[1], &commitment_tx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double checked that "expectations" are registered and then cleared before and after this tx is mined

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall SGTM, minor points

outpoint: OutPoint { txid, index: *idx as u16 },
script_pubkey: output.script_pubkey.clone(),
};
if let Some(tx) = chain_source.register_output(output) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, option c) emit new queries and block until it gets a response back. I think this query-and-block requirement should be documented around chain::Filter::register_output ?

Also, I think documentation should note that query to a trusted-but-not-too-much server should be on a time-out, otherwise that's a vector to stale operations of an Electrum client.

// Recursively call for any dependent transactions that were identified by the chain source.
if !dependent_txdata.is_empty() {
dependent_txdata.sort_unstable_by_key(|(index, _tx)| *index);
dependent_txdata.dedup_by_key(|(index, _tx)| *index);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, ChannelMonitor::block_connected will never return a registration request for outputs spent by transactions it has just been called to parse ?

From where duplication could have been introduced ? Though I agree to keep it as belt-and-suspender.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, ChannelMonitor::block_connected will never return a registration request for outputs spent by transactions it has just been called to parse ?

I'm not sure I follow. Could you explain why it will not do so? If fact, doesn't my test demonstrate that it does? (i.e., a commitment transaction is processed and its outputs are registered)

From where duplication could have been introduced ? Though I agree to keep it as belt-and-suspender.

The duplication may occur across channel monitors. For instance, two channel monitors register two different outputs which happen to be spent by the same in-block descendent transaction. So each call to register_output should return this transaction, which would need to be de-duped.

Would this ever happen in practice? Maybe not? But it is not an impossible scenario, right?

Copy link

@ariard ariard Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If fact, doesn't my test demonstrate that it does? (i.e., a commitment transaction is processed and its outputs are registered)

I meant a channel monitor won't ask to register outputs spent by the processed commitment transactions, only outputs for the next stage of spending ? Those outputs spent are already registered, otherwise we won't have received a transaction spending them.

Would this ever happen in practice? Maybe not? But it is not an impossible scenario, right?

I don't believe this can happen right now because we don't coalesce justice transactions across channels. Although thought about doing it for fee optimizations a while back but not worthy the complexity.

Beyond, I'm also pretty sure that dual-funding/splicing will enable that kind of scenario where channel monitors register different outputs which happen to be spent by the same in-block descendent transaction. A splicing transaction could lawfully spend two funding outpoints.

AFAICT, I think those scenarios aren't possible for now, but really likely in the future so better to keep the dedup ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant a channel monitor won't ask to register outputs spent by the processed commitment transactions, only outputs for the next stage of spending ? Those outputs spent are already registered, otherwise we won't have received a transaction spending them.

Ah, right. Sorry, I misread your message.

I don't believe this can happen right now because we don't coalesce justice transactions across channels. Although thought about doing it for fee optimizations a while back but not worthy the complexity.

Beyond, I'm also pretty sure that dual-funding/splicing will enable that kind of scenario where channel monitors register different outputs which happen to be spent by the same in-block descendent transaction. A splicing transaction could lawfully spend two funding outpoints.

AFAICT, I think those scenarios aren't possible for now, but really likely in the future so better to keep the dedup ?

Gotcha, will keep as is. Though note that these transactions are published by the counterparty, which may be another implementation.

For a de-dup test, I suppose if I set up two channels then I may be able to form a transaction that spends from both commitment transaction's outputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a de-dup test, I suppose if I set up two channels then I may be able to form a transaction that spends from both commitment transaction's outputs.

Actually, I think one channel would be sufficient here. A transaction spending the commitment transaction's to_local output and the HTLC transaction's output should suffice. The Electrum client would return this twice (once for each call to register_output), which would trigger the de-dup logic.

Is there an easy way to form such a transaction using our API?

@jkczyz jkczyz force-pushed the 2021-03-rescan-logic branch from ce1bc8e to 46f5bb6 Compare March 25, 2021 04:30
@ariard
Copy link

ariard commented Mar 25, 2021

Code Review ACK 46f5bb6 modulo squash.

We might have to fix chain::Filter documentation, can be done later when we've landed all block-connection API-related PRs.

fn register_output(&self, output: WatchedOutput) -> Option<(usize, Transaction)>;
}

/// A transaction output watched by a [`ChannelMonitor`] for spends on-chain.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the persistent confusion all of us seem to have, would be good if this mentioned "spends" more prominently, or maybe explicitly - "Indicates any transaction spending the outpoint described here..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it is a bit vague. Rewrote to be more explicit on how this is used.

@TheBlueMatt
Copy link
Collaborator

Two super minor comments, feel free to ignore them, but either way needs a squash.

@jkczyz jkczyz force-pushed the 2021-03-rescan-logic branch from 46f5bb6 to 0901def Compare March 27, 2021 22:19
jkczyz added 4 commits March 27, 2021 18:20
When registering a watched transaction output, any in-block descendant
transactions spending the output must be supplied. Give the block hash
when registering such outputs such that this is possible. Otherwise,
spends from other blocks may be returned inadvertently.
Electrum clients will only provide transaction data for outputs that
have been explicitly registered. Hence, upon registering new outputs,
recursively register any outputs to watch contained within dependent
transactions from the same block.
Add a method to TestChainSource to test chain::Filter expectations. This
is limited to register_output, allowing tests to assert that the method
was called with a specific output and dictate what the return value is.

Multiple expectations are checked in the order in which they were added.
Failure occurs if a call doesn't match the next expectation or if there
are unsatisfied expectations. If not expectations are added, then no
calls are checked.
chain::Filter::register_output may return an in-block dependent
transaction that spends the output. Test the scenario where the txdata
given to ChainMonitor::block_connected includes a commitment transaction
whose HTLC output is spent in the same block but not included in txdata.
Instead, it is returned by chain::Filter::register_output when given the
commitment transaction's HTLC output. This is a common scenario for
Electrum clients, which provided filtered txdata.
@jkczyz jkczyz force-pushed the 2021-03-rescan-logic branch from 0901def to d8d9eaf Compare March 27, 2021 22:22
@TheBlueMatt
Copy link
Collaborator

Only diff from Antoine's ack are simple doc fixups.

@TheBlueMatt TheBlueMatt merged commit 6fcac8b into lightningdevkit:main Mar 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants