Skip to content

(4/4) Merge upstream/master #694

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 104 commits into from

Conversation

julianknutsen
Copy link

@julianknutsen julianknutsen commented Sep 15, 2020

PRs:
(1/4) Enum Dispatch NOISE State Machine & Unit Tests
(2/4) PeerManager Unit Tests and Refactoring
(3/4) Conduit fixups from 494
(4/4) Merge upstream/master

Hopefully, future PRs won't be this stale, and rebase can always prevail...

Mostly auto-merge except for the following action items:

4395b92 - Added licensing to all files in ln/peers/
4bb5955 - LN_MAX_MSG_LEN. Tests and behavior have been moved to encryption.rs
43eed8d - Remove thread_rng() from PeerManager tests
00d063d - Fuzz test log message

…r, and only expose old peer_channel_encryptor.rs to fuzz tests
fix last lint doc issue?

make the &Some matches explicit for Rust 1.22

appease Rust 1.22 some more with ampersandery

appease Rust 1.22 by using byte_utils for endianness functionality

appease Rust 1.22 by using byte_utils and ref in match arms

experimenting some more with ref matching for Rust 1.22

might Rust 1.22 finally work? Rearranged a lot of borrowing locations and macro behavior and definition locations

fix bug that was kindly caught by fuzz tests

fix fuzz test

improve error messaging

the different rust versions are a balancing act, and I can't juggle
…stants, and type definitions

respond to Jeff's comments and add node pubkeys to map

simplify review by means of 3 scopes

adjust scope indentation

resolve merge conflicts

respond to more of Jeff's comments

constantify key rotation index for conduit
… handshake method `process_act` across both call sites from peer_handler.rs.
# Conflicts:
#	lightning/src/ln/mod.rs
…on to the process_act method, which now also returns an enum of the next act instead of a Vec<u8>.
# Conflicts:
#	lightning/src/ln/peer_handler.rs
…orrow splits). Extract handshake response method in peer handler into separate method. Create iterator pattern for decryptor.
# Conflicts:
#	fuzz/src/full_stack.rs
#	lightning/src/ln/peers/handler.rs
Use the newly abstracted Transport layer to write unit tests for most of
the PeerManager functionality.

This uses dependency inversion with the PeerManager to pass in a
Transport test double that is configured to exercise the interesting
code paths. To enable this, a new struct PeerManagerImpl has been
created that has all of the original functionality of PeerManager, but
includes a type parameter for the Transport implementation that should
be used.

To keep this test-feature hidden from public docs, the
PeerManager struct is now just a shim that delegates all the calls to
PeerManagerImpl. This provides a nice balance of a clean public
interface with the features needed to create isolated tests.

The tests make use of the Spy and Stub test patterns to control input state and
validate the output state.
Pass in all the in/out parameters individually instead of taking
everything out of peer.

The important thing to notice here is that the pending_outbound_buffer
parameter MUST be an OutboundQueue and CANNOT be an 'impl PayloadQueuer'.
This demonstrates that the read_event() function has the ability to
flush() since it can use the SocketDescriptorFlusher interface.

This should be illegal due to the invariant that read_event() never
calls back into the SocketDescriptor.

This is one of the motivations of using trait bounds in argument
position as a defensive programing technique to prohibit functions from
taking actions that shouldn't be allowed even if they have access to the
object.

Future patches will fix this bug and ensure the type system enforces
this bug from happening again.

This is a short-lived destructure that will be recombined once the
dependencies are cleaned up.
Per the external documentation, there is an invariant that read_event()
will not call back into the SocketDescriptor, but the actual code
flushed the outbound queue.

Fix the bug and use the type system to guarantee this behavior in
the future. By changing the function parameters to use `impl
PayloadQueuer` the function will only have access to functions on that
interface throughout the execution, but still be able to pass the
OutboundQueue object since it satisfies the trait bounds.

Functions such as enqueue_message() also take a `impl PayloadQueuer` so
they can be called from that context, but functions that need the
SocketDescriptorFlusher interface will fail at compile-time and point to
the issue before any tests need to be run.

This also fixes up tests and the tokio implementation that did not
implement the proper behavior and relied on read_event() calling
send_data().
Introduce the MessageQueuer interface used by Transport to queue
messages for send and use it in all the existing places where messages
are queued.
Many of the is_connected() checks were duplicative due to the macro
continuing out of the match statement if the peer referenced by the
node_id had not seen an Init message yet.

Remove the macro in favor of a function on PeerHolder that will return
an Option<> and use it instead.

Split out a new helper function Peer::is_initialized() that will return
true if the peer has seen an Init message.
Encapsulate the Peer information that is only valid after an Init
message has been seen. This makes the accesses to sync_status, ping
information, and their_features cleaner w.r.t. the Peer initialization
state.
The previous implementation would send a peer_disconnect callback for
any error after the NOISE handshake completed, but only send a
peer_connected callback after the Init message was received. This could
lead to a dangling peer_disconnected callback that didn't have a paired
peer_connected.

This patch cleans up the state differentiation to match the following
cases:

unconnected (!transport.is_connected() && peer.post_init_info.is_none())
* Know about the peer, but are still in the process of the NOISE
handshake

connected (transport.is_connected() && peer.post_init_info.is_none())
* NOISE handshake completed, but haven't received an Init message

initialized (transport.is_connected() && peer.post_init_info.is_some())
* The NOISE handshake has completed and the Init message has been
received and processed

With the 3 conceptual states, the read_event() path now only inserts
a Peer into the node_id_to_descriptor map after a Peer enters the
initialized state.

This fixes the asymmetry between peer_disconnected & peer_connected as
well as simplifies the disconnect code.
Now that the Init handling has been moved and the dependencies are more
clear, deduplicate the parameters of the read_event() and helper
functions to make use of Peer.

The overall pattern remains the same, read_event() does the locking and
passes in the separate items (peer, peers_needing_send,
node_id_to_descriptor) to do_read_event().

The handle_message() path is also cleaned up now that post_init_state is
guaranteed to be valid at that point.
Now that the locking and destructuring is done in read_event(), the
workarounds for the pre-NLL borrow checker can go away.
This patch expands the PeerHolder API allowing for encapsulation of
the peer state from the code that needs to iterate over initialized
peers. This cleans up peer iteration/removal allowing for a much
simpler design.

1) Introduce new APIs for PeerHolder:
* initialized_peers_mut()
* initialized_peer_node_ids()
* remove_peer_by_descriptor()

2) Clean up event handler iteration using new APIs
3) Unify the timer_tick and DisconnectPeer event
   disconnect path using new APIs
4) Convert get_peer_node_ids() to use new API
The broadcast event handling code did not catch Ok(false) returned from
the route handler when deciding whether or not to broadcast messages.
Instead, it only checked if the return value was an Error.

Fix it up and enable the regression tests.
Use a separate lock to generate the SecretKey instead of overloading the
PeerHolder mutex.

Refactoring PeerManager removed this hidden constraint and this should
make it more robust in the future.
s/fill_message_queue_with_sync/fill_outbound_queue_with_sync/
s/queue_init_message/enqueue_init_message/
Motivated by lightningdevkit#456, remove the
peers_needing_send set in favor of just scanning the peers during
process_events() and attempting to send data for those peers that have items in
their outbound queue or pending sync items to be sent out.
This gets rid of the complexity required to handle Iterators that
return errors and makes way for fewer copies in the decryption path.
Previously, all input data was written to the read buffer before
any decryption was attempted. This patch will use the input data solely in
the event that there is no previous data in the internal read buffer.

This also converts all tests to use the public interface instead of
the previously used decrypt_single_message

This was design feedback from the original 494 review.
This makes the process_act() interface a bit cleaner on a successfully
completed handshake.
The public interface was in a half-state with some callers referencing
the Decryptor directly for message iteration and others using the
public interface for the encryption path. This patch moves everything to
using the Encryptor and Decryptor directly.

This is motivated by feedback from 494, that recommended the objects
are split up but still have common functions for the key rotation.
Now that the CompletedHandshakeInfo exists to pass the relevant pieces
out of the handshake code and all users go to the Encryptor and
Decryptor directly, this is no longer needed.
Clean up the loose comments, test names, and variables names that still
referred to conduit now that it has been destructured.
Mostly auto-merge except for the following action items:

4395b92 - Added licensing to all files in ln/peers/
4bb5955 - LN_MAX_MSG_LEN. Tests and behavior have been moved to encryption.rs
43eed8d - Remove thread_rng() from PeerManager tests
00d063d - Update fuzz testing log messages
@codecov
Copy link

codecov bot commented Sep 15, 2020

Codecov Report

Merging #694 into main will increase coverage by 1.58%.
The diff coverage is 91.12%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #694      +/-   ##
==========================================
+ Coverage   91.92%   93.51%   +1.58%     
==========================================
  Files          35       43       +8     
  Lines       20082    22133    +2051     
==========================================
+ Hits        18461    20697    +2236     
+ Misses       1621     1436     -185     
Impacted Files Coverage Δ
lightning/src/ln/peers/handler.rs 97.25% <ø> (ø)
lightning/src/ln/wire.rs 89.71% <ø> (+21.71%) ⬆️
lightning/src/ln/features.rs 88.99% <50.00%> (-0.38%) ⬇️
lightning/src/util/test_utils.rs 75.13% <61.72%> (-9.87%) ⬇️
lightning/src/ln/peers/test_util.rs 89.10% <89.10%> (ø)
lightning/src/ln/peers/transport.rs 92.30% <92.30%> (ø)
lightning/src/ln/peers/handshake/acts.rs 94.64% <94.64%> (ø)
lightning/src/ln/peers/encryption.rs 94.84% <94.84%> (ø)
lightning/src/ln/peers/chacha.rs 95.65% <95.65%> (ø)
lightning/src/ln/peers/handshake/mod.rs 95.91% <95.91%> (ø)
... and 18 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 343aacc...d2859fa. Read the comment docs.

@julianknutsen
Copy link
Author

Removed in favor of squashing 494 up front.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants