-
Notifications
You must be signed in to change notification settings - Fork 411
Rescan dependent transactions in ChainMonitor #840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rescan dependent transactions in ChainMonitor #840
Conversation
It occurred to me that |
Codecov Report
@@ Coverage Diff @@
## main #840 +/- ##
==========================================
- Coverage 90.59% 90.59% -0.01%
==========================================
Files 51 51
Lines 27109 27193 +84
==========================================
+ Hits 24560 24636 +76
- Misses 2549 2557 +8
Continue to review full report at Codecov.
|
5f9bb0d
to
eb21a87
Compare
Updated with this change. @TheBlueMatt If this looks right to you, I can start writing some tests. Also, should |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, will take some deep review time on it when you're ready, though.
lightning/src/chain/mod.rs
Outdated
fn register_output(&self, outpoint: &OutPoint, script_pubkey: &Script); | ||
/// | ||
/// Optionally, returns any transaction dependent on the output. This is useful for Electrum | ||
/// clients to facilitate registering in-block descendants. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you decide against including a block hash to limit returned results to just the current block? Presumably there will be more docs here with an example or such.
Oh, just saw this. I think so. I think we may want some kind of ElectrumClientWrapper struct that can map script_pubkey requests to a queue of transactions to connect and converts them to some kind of block events. In such a case, a block hash would be useful. |
Hmmm... so one thing that we didn't consider is that I suppose I can make it an |
Yea, I don't have a strong preference, though I agree that kinda sucks. This may be a good candidate for a default-implementation if you go the multi-function approach. |
eb21a87
to
d292d00
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurred to me that register_output could only return one transaction, since returning more would be a double spend. Does that sound right?
Yes this sounds right to me, unless you have some buggy chain::Filter
implem not disconnecting block and corresponding outputs sanely during a reorg and the output being spent differently.
outpoint: OutPoint { txid, index: *idx as u16 }, | ||
script_pubkey: output.script_pubkey.clone(), | ||
}; | ||
if let Some(tx) = chain_source.register_output(output) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where I've a doubt about the operations of an Electrum client. At this point, if you don't have locally the in-block descendants what should you do ?
- a) the local Electrum client should emit new queries to its Electrum servers and return None ?
- b) the local Electrum client when learning about the spent of a register output should immediately subscribe spent outputs, and thus learn descendants asynchronously from
register_output
caller ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Electrum client needs to query the server for dependent transactions synchronously and block until it gets a response back. Otherwise, the user would need to call block_connected
again for the same header, which we'd like to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, option c) emit new queries and block until it gets a response back. I think this query-and-block requirement should be documented around chain::Filter::register_output
?
Also, I think documentation should note that query to a trusted-but-not-too-much server should be on a time-out, otherwise that's a vector to stale operations of an Electrum client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... currently the return value is Option<Transaction>
, so there's no mechanism to timeout. What do you propose should happen in that scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to return a None, but I think the timeout should be implemented by the Electrum local client, not by us. We should just make this point clear in chainFilter
implementation documentation imo.
I think <30s is enough to consider the output spent isn't known by the Electrum server and that we should return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that is shouldn't be us. But returning None
doesn't allow us to differentiate between "Electrum doesn't know" and "Electrum is unresponsive". What should the user do in case of the latter?
/// Optionally, when `output.block_hash` is set, should return any transaction spending the | ||
/// output that is found in the corresponding block at the given index. | ||
/// | ||
/// This return value is useful for Electrum clients in order to supply in-block descendant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how the Electrum client is leveraging block hash presence. Do you assume it stores the in-block descendant, even before register_output
caller ask for them ?
Otherwise, I don't think you need block hash to register already-confirmed scriptpubkeys to an Electrum server API, see https://electrumx.readthedocs.io/en/latest/protocol-methods.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of the block_hash
is to limit transactions to a given block. Essentially:
- user calls
block_connected
for block X with a subset oftxdata
supplied by Electrum register_output
indicates there is a new output intxdata
to watch- Electrum client needs to determine if that output was spent in block X
It should return None
if not spent in block X even if spent in some later block Electrum knows about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if my responses seem feasible from your understanding of Electrum.
outpoint: OutPoint { txid, index: *idx as u16 }, | ||
script_pubkey: output.script_pubkey.clone(), | ||
}; | ||
if let Some(tx) = chain_source.register_output(output) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Electrum client needs to query the server for dependent transactions synchronously and block until it gets a response back. Otherwise, the user would need to call block_connected
again for the same header, which we'd like to avoid.
/// Optionally, when `output.block_hash` is set, should return any transaction spending the | ||
/// output that is found in the corresponding block at the given index. | ||
/// | ||
/// This return value is useful for Electrum clients in order to supply in-block descendant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of the block_hash
is to limit transactions to a given block. Essentially:
- user calls
block_connected
for block X with a subset oftxdata
supplied by Electrum register_output
indicates there is a new output intxdata
to watch- Electrum client needs to determine if that output was spent in block X
It should return None
if not spent in block X even if spent in some later block Electrum knows about.
Electrum clients primarily operate by subscribing to notifications of transactions by script pubkeys. Therefore, they will send filtered transaction data without including dependent transactions. Outputs for such transactions must be explicitly registered with these clients. Therefore, upon block_connected, provide a mechanism for an Electrum- backed chain::Filter to return new transaction data to scan.
d292d00
to
83b341d
Compare
Ok, added a test for processing dependent transactions returned by |
83b341d
to
ce1bc8e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I'm the best person to review, but I think this code looks reasonable :) modulo ideally testing deduplication as @jkczyz already pointed out. Also, I think ideally it'd be tested with Electrum before merge (has Overtorment tested it out?).
|
||
// Notify nodes[1] that nodes[0]'s revoked commitment transaction was mined. The chain | ||
// source should return the dependent HTLC transaction when the HTLC output is registered. | ||
mine_transaction(&nodes[1], &commitment_tx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double checked that "expectations" are registered and then cleared before and after this tx is mined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall SGTM, minor points
outpoint: OutPoint { txid, index: *idx as u16 }, | ||
script_pubkey: output.script_pubkey.clone(), | ||
}; | ||
if let Some(tx) = chain_source.register_output(output) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, option c) emit new queries and block until it gets a response back. I think this query-and-block requirement should be documented around chain::Filter::register_output
?
Also, I think documentation should note that query to a trusted-but-not-too-much server should be on a time-out, otherwise that's a vector to stale operations of an Electrum client.
// Recursively call for any dependent transactions that were identified by the chain source. | ||
if !dependent_txdata.is_empty() { | ||
dependent_txdata.sort_unstable_by_key(|(index, _tx)| *index); | ||
dependent_txdata.dedup_by_key(|(index, _tx)| *index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, ChannelMonitor::block_connected
will never return a registration request for outputs spent by transactions it has just been called to parse ?
From where duplication could have been introduced ? Though I agree to keep it as belt-and-suspender.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC,
ChannelMonitor::block_connected
will never return a registration request for outputs spent by transactions it has just been called to parse ?
I'm not sure I follow. Could you explain why it will not do so? If fact, doesn't my test demonstrate that it does? (i.e., a commitment transaction is processed and its outputs are registered)
From where duplication could have been introduced ? Though I agree to keep it as belt-and-suspender.
The duplication may occur across channel monitors. For instance, two channel monitors register two different outputs which happen to be spent by the same in-block descendent transaction. So each call to register_output
should return this transaction, which would need to be de-duped.
Would this ever happen in practice? Maybe not? But it is not an impossible scenario, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If fact, doesn't my test demonstrate that it does? (i.e., a commitment transaction is processed and its outputs are registered)
I meant a channel monitor won't ask to register outputs spent by the processed commitment transactions, only outputs for the next stage of spending ? Those outputs spent are already registered, otherwise we won't have received a transaction spending them.
Would this ever happen in practice? Maybe not? But it is not an impossible scenario, right?
I don't believe this can happen right now because we don't coalesce justice transactions across channels. Although thought about doing it for fee optimizations a while back but not worthy the complexity.
Beyond, I'm also pretty sure that dual-funding/splicing will enable that kind of scenario where channel monitors register different outputs which happen to be spent by the same in-block descendent transaction. A splicing transaction could lawfully spend two funding outpoints.
AFAICT, I think those scenarios aren't possible for now, but really likely in the future so better to keep the dedup ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant a channel monitor won't ask to register outputs spent by the processed commitment transactions, only outputs for the next stage of spending ? Those outputs spent are already registered, otherwise we won't have received a transaction spending them.
Ah, right. Sorry, I misread your message.
I don't believe this can happen right now because we don't coalesce justice transactions across channels. Although thought about doing it for fee optimizations a while back but not worthy the complexity.
Beyond, I'm also pretty sure that dual-funding/splicing will enable that kind of scenario where channel monitors register different outputs which happen to be spent by the same in-block descendent transaction. A splicing transaction could lawfully spend two funding outpoints.
AFAICT, I think those scenarios aren't possible for now, but really likely in the future so better to keep the dedup ?
Gotcha, will keep as is. Though note that these transactions are published by the counterparty, which may be another implementation.
For a de-dup test, I suppose if I set up two channels then I may be able to form a transaction that spends from both commitment transaction's outputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a de-dup test, I suppose if I set up two channels then I may be able to form a transaction that spends from both commitment transaction's outputs.
Actually, I think one channel would be sufficient here. A transaction spending the commitment transaction's to_local
output and the HTLC transaction's output should suffice. The Electrum client would return this twice (once for each call to register_output
), which would trigger the de-dup logic.
Is there an easy way to form such a transaction using our API?
ce1bc8e
to
46f5bb6
Compare
Code Review ACK 46f5bb6 modulo squash. We might have to fix |
fn register_output(&self, output: WatchedOutput) -> Option<(usize, Transaction)>; | ||
} | ||
|
||
/// A transaction output watched by a [`ChannelMonitor`] for spends on-chain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the persistent confusion all of us seem to have, would be good if this mentioned "spends" more prominently, or maybe explicitly - "Indicates any transaction spending the outpoint described here..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it is a bit vague. Rewrote to be more explicit on how this is used.
Two super minor comments, feel free to ignore them, but either way needs a squash. |
46f5bb6
to
0901def
Compare
When registering a watched transaction output, any in-block descendant transactions spending the output must be supplied. Give the block hash when registering such outputs such that this is possible. Otherwise, spends from other blocks may be returned inadvertently.
Electrum clients will only provide transaction data for outputs that have been explicitly registered. Hence, upon registering new outputs, recursively register any outputs to watch contained within dependent transactions from the same block.
Add a method to TestChainSource to test chain::Filter expectations. This is limited to register_output, allowing tests to assert that the method was called with a specific output and dictate what the return value is. Multiple expectations are checked in the order in which they were added. Failure occurs if a call doesn't match the next expectation or if there are unsatisfied expectations. If not expectations are added, then no calls are checked.
chain::Filter::register_output may return an in-block dependent transaction that spends the output. Test the scenario where the txdata given to ChainMonitor::block_connected includes a commitment transaction whose HTLC output is spent in the same block but not included in txdata. Instead, it is returned by chain::Filter::register_output when given the commitment transaction's HTLC output. This is a common scenario for Electrum clients, which provided filtered txdata.
0901def
to
d8d9eaf
Compare
Only diff from Antoine's ack are simple doc fixups. |
Electrum clients primarily operate by subscribing to notifications of transactions by script pubkeys. Therefore, they will send filtered transaction data without including dependent transactions. Outputs for such dependent transactions must be explicitly registered with these clients.
This PR updates
chain::Filter::register_output
to return dependent transaction data asOption<TransactionData>
, allowingChainMonitor
to register more outputs that are found within them.