Rescan dependent transactions in ChainMonitor #840

jkczyz · 2021-03-12T03:22:41Z

Electrum clients primarily operate by subscribing to notifications of transactions by script pubkeys. Therefore, they will send filtered transaction data without including dependent transactions. Outputs for such dependent transactions must be explicitly registered with these clients.

This PR updates chain::Filter::register_output to return dependent transaction data as Option<TransactionData>, allowing ChainMonitor to register more outputs that are found within them.

jkczyz · 2021-03-12T03:24:23Z

It occurred to me that register_output could only return one transaction, since returning more would be a double spend. Does that sound right?

codecov · 2021-03-12T03:37:39Z

Codecov Report

Merging #840 (d8d9eaf) into main (8a8c75a) will decrease coverage by 0.00%.
The diff coverage is 89.65%.

@@            Coverage Diff             @@
##             main     #840      +/-   ##
==========================================
- Coverage   90.59%   90.59%   -0.01%     
==========================================
  Files          51       51              
  Lines       27109    27193      +84     
==========================================
+ Hits        24560    24636      +76     
- Misses       2549     2557       +8

Impacted Files	Coverage Δ
lightning/src/chain/mod.rs	`100.00% <ø> (ø)`
lightning/src/ln/functional_test_utils.rs	`95.21% <ø> (ø)`
lightning/src/util/test_utils.rs	`82.95% <80.95%> (-0.70%)`	⬇️
lightning/src/chain/chainmonitor.rs	`93.96% <97.50%> (+1.86%)`	⬆️
lightning/src/chain/channelmonitor.rs	`95.42% <100.00%> (+0.01%)`	⬆️
lightning/src/ln/functional_tests.rs	`96.88% <0.00%> (+0.03%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8a8c75a...d8d9eaf. Read the comment docs.

jkczyz · 2021-03-12T18:50:46Z

It occurred to me that register_output could only return one transaction, since returning more would be a double spend. Does that sound right?

Updated with this change.

@TheBlueMatt If this looks right to you, I can start writing some tests. Also, should register_output take a BlockHash as discussed?

TheBlueMatt

Concept ACK, will take some deep review time on it when you're ready, though.

TheBlueMatt · 2021-03-12T22:22:29Z

lightning/src/chain/mod.rs

-	fn register_output(&self, outpoint: &OutPoint, script_pubkey: &Script);
+	///
+	/// Optionally, returns any transaction dependent on the output. This is useful for Electrum
+	/// clients to facilitate registering in-block descendants.


Did you decide against including a block hash to limit returned results to just the current block? Presumably there will be more docs here with an example or such.

TheBlueMatt · 2021-03-12T22:42:27Z

Also, should register_output take a BlockHash as discussed?

Oh, just saw this. I think so. I think we may want some kind of ElectrumClientWrapper struct that can map script_pubkey requests to a queue of transactions to connect and converts them to some kind of block events. In such a case, a block hash would be useful.

jkczyz · 2021-03-13T01:35:34Z

Also, should register_output take a BlockHash as discussed?

Oh, just saw this. I think so. I think we may want some kind of ElectrumClientWrapper struct that can map script_pubkey requests to a queue of transactions to connect and converts them to some kind of block events. In such a case, a block hash would be useful.

Hmmm... so one thing that we didn't consider is that register_output is called by watch_channel (via load_outputs_to_watch) for the funding output. However, we don't have a block hash there because the funding transaction may not have been confirmed on chain yet. It's also called there for any other outputs to watch that had been serialized with the ChannelMonitor.

I suppose I can make it an Option<BlockHash> for these cases or maybe a separate function would be better? Let me know if you have a strong preference one way or the other.

TheBlueMatt · 2021-03-15T16:35:10Z

I suppose I can make it an Option for these cases or maybe a separate function would be better? Let me know if you have a strong preference one way or the other.

Yea, I don't have a strong preference, though I agree that kinda sucks. This may be a good candidate for a default-implementation if you go the multi-function approach.

ariard

It occurred to me that register_output could only return one transaction, since returning more would be a double spend. Does that sound right?

Yes this sounds right to me, unless you have some buggy chain::Filter implem not disconnecting block and corresponding outputs sanely during a reorg and the output being spent differently.

ariard · 2021-03-16T14:14:22Z

lightning/src/chain/chainmonitor.rs

+							outpoint: OutPoint { txid, index: *idx as u16 },
+							script_pubkey: output.script_pubkey.clone(),
+						};
+						if let Some(tx) = chain_source.register_output(output) {


This is where I've a doubt about the operations of an Electrum client. At this point, if you don't have locally the in-block descendants what should you do ?

a) the local Electrum client should emit new queries to its Electrum servers and return None ?

b) the local Electrum client when learning about the spent of a register output should immediately subscribe spent outputs, and thus learn descendants asynchronously from register_output caller ?

The Electrum client needs to query the server for dependent transactions synchronously and block until it gets a response back. Otherwise, the user would need to call block_connected again for the same header, which we'd like to avoid.

Gotcha, option c) emit new queries and block until it gets a response back. I think this query-and-block requirement should be documented around chain::Filter::register_output ?

Also, I think documentation should note that query to a trusted-but-not-too-much server should be on a time-out, otherwise that's a vector to stale operations of an Electrum client.

Hmmm... currently the return value is Option<Transaction>, so there's no mechanism to timeout. What do you propose should happen in that scenario?

I would suggest to return a None, but I think the timeout should be implemented by the Electrum local client, not by us. We should just make this point clear in chainFilter implementation documentation imo.

I think <30s is enough to consider the output spent isn't known by the Electrum server and that we should return.

Agreed that is shouldn't be us. But returning None doesn't allow us to differentiate between "Electrum doesn't know" and "Electrum is unresponsive". What should the user do in case of the latter?

lightning/src/chain/chainmonitor.rs

ariard · 2021-03-16T14:20:36Z

lightning/src/chain/mod.rs

+	/// Optionally, when `output.block_hash` is set, should return any transaction spending the
+	/// output that is found in the corresponding block at the given index.
+	///
+	/// This return value is useful for Electrum clients in order to supply in-block descendant


I don't understand how the Electrum client is leveraging block hash presence. Do you assume it stores the in-block descendant, even before register_output caller ask for them ?

Otherwise, I don't think you need block hash to register already-confirmed scriptpubkeys to an Electrum server API, see https://electrumx.readthedocs.io/en/latest/protocol-methods.html

The purpose of the block_hash is to limit transactions to a given block. Essentially:

user calls block_connected for block X with a subset of txdata supplied by Electrum

register_output indicates there is a new output in txdata to watch

Electrum client needs to determine if that output was spent in block X

It should return None if not spent in block X even if spent in some later block Electrum knows about.

jkczyz

Let me know if my responses seem feasible from your understanding of Electrum.

jkczyz · 2021-03-16T15:43:40Z

lightning/src/chain/chainmonitor.rs

+							outpoint: OutPoint { txid, index: *idx as u16 },
+							script_pubkey: output.script_pubkey.clone(),
+						};
+						if let Some(tx) = chain_source.register_output(output) {


The Electrum client needs to query the server for dependent transactions synchronously and block until it gets a response back. Otherwise, the user would need to call block_connected again for the same header, which we'd like to avoid.

lightning/src/chain/chainmonitor.rs

jkczyz · 2021-03-16T15:58:12Z

lightning/src/chain/mod.rs

+	/// Optionally, when `output.block_hash` is set, should return any transaction spending the
+	/// output that is found in the corresponding block at the given index.
+	///
+	/// This return value is useful for Electrum clients in order to supply in-block descendant


The purpose of the block_hash is to limit transactions to a given block. Essentially:

user calls block_connected for block X with a subset of txdata supplied by Electrum

register_output indicates there is a new output in txdata to watch

Electrum client needs to determine if that output was spent in block X

It should return None if not spent in block X even if spent in some later block Electrum knows about.

Electrum clients primarily operate by subscribing to notifications of transactions by script pubkeys. Therefore, they will send filtered transaction data without including dependent transactions. Outputs for such transactions must be explicitly registered with these clients. Therefore, upon block_connected, provide a mechanism for an Electrum- backed chain::Filter to return new transaction data to scan.

jkczyz · 2021-03-21T05:43:16Z

Ok, added a test for processing dependent transactions returned by chain::Filter::register_output. I couldn't find a way to test the deduplication logic. Let me know if there's a simple way to trigger this scenario. Namely, two registered outputs are spent by the same transaction.

lightning/src/chain/mod.rs

valentinewallace

Not sure if I'm the best person to review, but I think this code looks reasonable :) modulo ideally testing deduplication as @jkczyz already pointed out. Also, I think ideally it'd be tested with Electrum before merge (has Overtorment tested it out?).

lightning/src/chain/chainmonitor.rs

valentinewallace · 2021-03-23T18:05:01Z

lightning/src/chain/chainmonitor.rs

+
+		// Notify nodes[1] that nodes[0]'s revoked commitment transaction was mined. The chain
+		// source should return the dependent HTLC transaction when the HTLC output is registered.
+		mine_transaction(&nodes[1], &commitment_tx);


I double checked that "expectations" are registered and then cleared before and after this tx is mined

ariard

Overall SGTM, minor points

lightning/src/chain/chainmonitor.rs

ariard · 2021-03-24T14:33:55Z

lightning/src/chain/chainmonitor.rs

+							outpoint: OutPoint { txid, index: *idx as u16 },
+							script_pubkey: output.script_pubkey.clone(),
+						};
+						if let Some(tx) = chain_source.register_output(output) {


Gotcha, option c) emit new queries and block until it gets a response back. I think this query-and-block requirement should be documented around chain::Filter::register_output ?

Also, I think documentation should note that query to a trusted-but-not-too-much server should be on a time-out, otherwise that's a vector to stale operations of an Electrum client.

lightning/src/chain/chainmonitor.rs

lightning/src/chain/mod.rs

ariard · 2021-03-24T14:49:18Z

lightning/src/chain/chainmonitor.rs

+		// Recursively call for any dependent transactions that were identified by the chain source.
+		if !dependent_txdata.is_empty() {
+			dependent_txdata.sort_unstable_by_key(|(index, _tx)| *index);
+			dependent_txdata.dedup_by_key(|(index, _tx)| *index);


IIRC, ChannelMonitor::block_connected will never return a registration request for outputs spent by transactions it has just been called to parse ?

From where duplication could have been introduced ? Though I agree to keep it as belt-and-suspender.

IIRC, ChannelMonitor::block_connected will never return a registration request for outputs spent by transactions it has just been called to parse ?

I'm not sure I follow. Could you explain why it will not do so? If fact, doesn't my test demonstrate that it does? (i.e., a commitment transaction is processed and its outputs are registered)

From where duplication could have been introduced ? Though I agree to keep it as belt-and-suspender.

The duplication may occur across channel monitors. For instance, two channel monitors register two different outputs which happen to be spent by the same in-block descendent transaction. So each call to register_output should return this transaction, which would need to be de-duped.

Would this ever happen in practice? Maybe not? But it is not an impossible scenario, right?

If fact, doesn't my test demonstrate that it does? (i.e., a commitment transaction is processed and its outputs are registered)

I meant a channel monitor won't ask to register outputs spent by the processed commitment transactions, only outputs for the next stage of spending ? Those outputs spent are already registered, otherwise we won't have received a transaction spending them.

Would this ever happen in practice? Maybe not? But it is not an impossible scenario, right?

I don't believe this can happen right now because we don't coalesce justice transactions across channels. Although thought about doing it for fee optimizations a while back but not worthy the complexity.

Beyond, I'm also pretty sure that dual-funding/splicing will enable that kind of scenario where channel monitors register different outputs which happen to be spent by the same in-block descendent transaction. A splicing transaction could lawfully spend two funding outpoints.

AFAICT, I think those scenarios aren't possible for now, but really likely in the future so better to keep the dedup ?

I meant a channel monitor won't ask to register outputs spent by the processed commitment transactions, only outputs for the next stage of spending ? Those outputs spent are already registered, otherwise we won't have received a transaction spending them.

Ah, right. Sorry, I misread your message.

I don't believe this can happen right now because we don't coalesce justice transactions across channels. Although thought about doing it for fee optimizations a while back but not worthy the complexity.

Beyond, I'm also pretty sure that dual-funding/splicing will enable that kind of scenario where channel monitors register different outputs which happen to be spent by the same in-block descendent transaction. A splicing transaction could lawfully spend two funding outpoints.

AFAICT, I think those scenarios aren't possible for now, but really likely in the future so better to keep the dedup ?

Gotcha, will keep as is. Though note that these transactions are published by the counterparty, which may be another implementation.

For a de-dup test, I suppose if I set up two channels then I may be able to form a transaction that spends from both commitment transaction's outputs.

For a de-dup test, I suppose if I set up two channels then I may be able to form a transaction that spends from both commitment transaction's outputs.

Actually, I think one channel would be sufficient here. A transaction spending the commitment transaction's to_local output and the HTLC transaction's output should suffice. The Electrum client would return this twice (once for each call to register_output), which would trigger the de-dup logic.

Is there an easy way to form such a transaction using our API?

ariard · 2021-03-25T16:37:02Z

Code Review ACK 46f5bb6 modulo squash.

We might have to fix chain::Filter documentation, can be done later when we've landed all block-connection API-related PRs.

lightning/src/chain/mod.rs

TheBlueMatt · 2021-03-26T22:40:01Z

lightning/src/chain/mod.rs

+	fn register_output(&self, output: WatchedOutput) -> Option<(usize, Transaction)>;
+}
+
+/// A transaction output watched by a [`ChannelMonitor`] for spends on-chain.


Given the persistent confusion all of us seem to have, would be good if this mentioned "spends" more prominently, or maybe explicitly - "Indicates any transaction spending the outpoint described here..."

Yeah, it is a bit vague. Rewrote to be more explicit on how this is used.

TheBlueMatt · 2021-03-26T22:42:09Z

Two super minor comments, feel free to ignore them, but either way needs a squash.

When registering a watched transaction output, any in-block descendant transactions spending the output must be supplied. Give the block hash when registering such outputs such that this is possible. Otherwise, spends from other blocks may be returned inadvertently.

Electrum clients will only provide transaction data for outputs that have been explicitly registered. Hence, upon registering new outputs, recursively register any outputs to watch contained within dependent transactions from the same block.

Add a method to TestChainSource to test chain::Filter expectations. This is limited to register_output, allowing tests to assert that the method was called with a specific output and dictate what the return value is. Multiple expectations are checked in the order in which they were added. Failure occurs if a call doesn't match the next expectation or if there are unsatisfied expectations. If not expectations are added, then no calls are checked.

chain::Filter::register_output may return an in-block dependent transaction that spends the output. Test the scenario where the txdata given to ChainMonitor::block_connected includes a commitment transaction whose HTLC output is spent in the same block but not included in txdata. Instead, it is returned by chain::Filter::register_output when given the commitment transaction's HTLC output. This is a common scenario for Electrum clients, which provided filtered txdata.

TheBlueMatt · 2021-03-28T19:46:49Z

Only diff from Antoine's ack are simple doc fixups.

jkczyz force-pushed the 2021-03-rescan-logic branch from 5f9bb0d to eb21a87 Compare March 12, 2021 18:50

jkczyz requested review from TheBlueMatt and ariard March 12, 2021 18:50

TheBlueMatt reviewed Mar 12, 2021

View reviewed changes

jkczyz force-pushed the 2021-03-rescan-logic branch from eb21a87 to d292d00 Compare March 16, 2021 01:10

ariard reviewed Mar 16, 2021

View reviewed changes

jkczyz commented Mar 16, 2021

View reviewed changes

jkczyz force-pushed the 2021-03-rescan-logic branch from d292d00 to 83b341d Compare March 21, 2021 05:05

jkczyz marked this pull request as ready for review March 21, 2021 05:37

jkczyz force-pushed the 2021-03-rescan-logic branch from 83b341d to ce1bc8e Compare March 21, 2021 05:49

jkczyz commented Mar 21, 2021

View reviewed changes

lightning/src/chain/mod.rs Outdated Show resolved Hide resolved

valentinewallace reviewed Mar 23, 2021

View reviewed changes

ariard reviewed Mar 24, 2021

View reviewed changes

jkczyz force-pushed the 2021-03-rescan-logic branch from ce1bc8e to 46f5bb6 Compare March 25, 2021 04:30

TheBlueMatt approved these changes Mar 26, 2021

View reviewed changes

jkczyz force-pushed the 2021-03-rescan-logic branch from 46f5bb6 to 0901def Compare March 27, 2021 22:19

jkczyz added 4 commits March 27, 2021 18:20

jkczyz force-pushed the 2021-03-rescan-logic branch from 0901def to d8d9eaf Compare March 27, 2021 22:22

TheBlueMatt approved these changes Mar 28, 2021

View reviewed changes

TheBlueMatt merged commit 6fcac8b into lightningdevkit:main Mar 28, 2021

Rescan dependent transactions in ChainMonitor #840

Rescan dependent transactions in ChainMonitor #840

Conversation

jkczyz commented Mar 12, 2021

Uh oh!

jkczyz commented Mar 12, 2021

Uh oh!

codecov bot commented Mar 12, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

jkczyz commented Mar 12, 2021

Uh oh!

TheBlueMatt left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt commented Mar 12, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jkczyz commented Mar 13, 2021

Uh oh!

TheBlueMatt commented Mar 15, 2021

Uh oh!

ariard left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jkczyz left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jkczyz commented Mar 21, 2021

Uh oh!

Uh oh!

valentinewallace left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ariard left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ariard Mar 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Mar 12, 2021 •

edited

Loading

TheBlueMatt commented Mar 12, 2021 •

edited

Loading

ariard Mar 25, 2021 •

edited

Loading

ariard commented Mar 25, 2021 •

edited

Loading