Skip to content

Add libp2p-connection-tracker for simplified peer connection state management #6046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

diegomrsantos
Copy link
Contributor

@diegomrsantos diegomrsantos commented Jun 4, 2025

Description

This PR introduces a new libp2p-connection-tracker crate that provides a lightweight NetworkBehaviour for tracking connected peers. The primary motivation is to eliminate the need for manual connection state management in application-level behaviours.

Notes & open questions

Problem

Currently, behaviours that need to track connected peers must manually maintain connection state using patterns like:

pub struct PeerManager {
    connected: HashSet<PeerId>, // Manual tracking
    // ...
}

impl NetworkBehaviour for PeerManager {
    fn on_swarm_event(&mut self, event: FromSwarm) {
        match event {
            FromSwarm::ConnectionEstablished(ConnectionEstablished { peer_id, .. }) => {
                self.connected.insert(peer_id); // Manual state management
            }
            FromSwarm::ConnectionClosed(ConnectionClosed { peer_id, remaining_established, .. }) => {
                if remaining_established == 0 {
                    self.connected.remove(&peer_id); // Manual cleanup
                }
            }
            _ => {}
        }
        // ... delegate to other behaviours
    }
}

This approach is error-prone and requires each behaviour to correctly handle connection lifecycle events, multiple connections per peer, and state synchronization.

Solution

The new connection-tracker behaviour provides a composable solution following libp2p's architectural patterns:

#[derive(NetworkBehaviour)]
#[behaviour(prelude = "libp2p_swarm::derive_prelude")]
struct PeerManager {
    connection_tracker: libp2p_connection_tracker::Behaviour,
    peer_store: peer_store::Behaviour<MemoryStore<Enr>>,
    // ... other behaviours
}

impl PeerManager {
    fn connected_count(&self) -> usize {
        self.connection_tracker.connected_count()
    }
    
    fn is_connected(&self, peer: &PeerId) -> bool {
        self.connection_tracker.is_connected(peer)
    }

    fn connected_peers(&self) -> impl Iterator<Item = &PeerId> {
        self.connection_tracker.connected_peers()
    }

}

Design Choices

  1. Minimal Scope: Only tracks connected peers to start small and avoid complexity
  2. Composable: Uses the NetworkBehaviour derive macro for seamless integration
  3. Event-Driven: Emits PeerConnected/PeerDisconnected events for reactive programming
  4. Zero Configuration: Works out of the box with sensible defaults
  5. Memory Efficient: Uses HashMap<PeerId, HashSet> for optimal storage
  6. Multiple Connections: Correctly handles multiple connections per peer

Implementation

store.rs: Simple storage layer with HashMap<PeerId, HashSet<ConnectionId>>
behaviour.rs: NetworkBehaviour implementation using dummy::ConnectionHandler
lib.rs: Public API and event definitions

The implementation follows the hierarchical state machine pattern and coding guidelines used throughout rust-libp2p.

Future Extensions

This foundation enables future enhancements like:

  • Recently disconnected peers cache (similar to Lighthouse's PeerDB)
  • Banned peers management
  • Connection quality metrics
  • Peer scoring support

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@elenaf9
Copy link
Member

elenaf9 commented Jun 4, 2025

Thanks for the PR!
Just wondering: why is Swarm::connected_peers not sufficient?

@diegomrsantos
Copy link
Contributor Author

Thanks for the PR! Just wondering: why is Swarm::connected_peers not sufficient?

Because, unfortunately, it's not available inside a Behavior. If we wish to split a feature into multiple behaviors that compose, it's not possible to use the swarm in any of them. Let me know if my understanding is correct.

@elenaf9
Copy link
Member

elenaf9 commented Jun 4, 2025

Thanks for the PR! Just wondering: why is Swarm::connected_peers not sufficient?

Because, unfortunately, it's not available inside a Behavior. If we wish to split a feature into multiple behaviors that compose, it's not possible to use the swarm in any of them. Let me know if my understanding is correct.

Yes you're right. But from my own experience, most behaviors want to track some custom state for each connected node anyway.
That said, I am not really against it if you think it's useful for lighthouse.
I am just wondering if we should then just make it part of the existing peer-store? It's 300 lines of code right now that could be added as <20 lines in the peer-store because we need to handle the necessary event their anyway, and I would think that many users want to use them together anyway.

@diegomrsantos
Copy link
Contributor Author

Thanks for the PR! Just wondering: why is Swarm::connected_peers not sufficient?

Because, unfortunately, it's not available inside a Behavior. If we wish to split a feature into multiple behaviors that compose, it's not possible to use the swarm in any of them. Let me know if my understanding is correct.

Yes you're right. But from my own experience, most behaviors want to track some custom state for each connected node anyway. That said, I am not really against it if you think it's useful for lighthouse. I am just wondering if we should then just make it part of the existing peer-store? It's 300 lines of code right now that could be added as <20 lines in the peer-store because we need to handle the necessary event their anyway, and I would think that many users want to use them together anyway.

Yes, I considered this. But, currently, the PeerStore handles only addresses and custom data. I see it as a storage where it's possible to store data about peers. On the other hand, the responsibility of the Behavior in this PR would be to track what peers are connected, disconnected, and banned. It could grow considerably over time as it becomes more feature-rich. The idea was to start small to validate the idea and get feedback. That being said, this behaviour could also be composed with the PeerStore. We could even introduce a PeerManager that is composed of those two components. What do you think?

@diegomrsantos
Copy link
Contributor Author

I can try to remove the behavior part and make it just a component that is part of the PeerStore.

@elenaf9
Copy link
Member

elenaf9 commented Jun 5, 2025

But, currently, the PeerStore handles only addresses and custom data. I see it as a storage where it's possible to store data about peers.

I would consider PeerStore more broadly as a network behavior that does general, basic tracking about connected peers and per-peer data, which most behaviors and users would want to have.
Initially we even had connected_peers in it, but got rid of it in favor of just using Swarm::connected_peers, because we mostly looked at it from the perspective of a user that is using the behavior through the swarm.

On the other hand I do like having small, composable behaviors. I am just afraid that the number of behaviors could explode rather quickly if we add a separate behavior for every minor thing (it also results in larger BehaviorEvent enums, error types, etc). And if, as you mentioned, eventually more features should be added anyway, then I'd rather also merge it with the PeerStore.
cc @jxs wdyt?

the responsibility of the Behavior in this PR would be to track what peers are connected, disconnected, and banned.

Side note: we already have libp2p-allow-block-list to track banning of peers.

@diegomrsantos
Copy link
Contributor Author

diegomrsantos commented Jun 5, 2025

I would consider PeerStore more broadly as a network behavior that does general, basic tracking about connected peers and per-peer data, which most behaviors and users would want to have.

IMO PeerManager would be a more descriptive name for that. The PeerStore could keep its current responsibilities and would be a component inside it. Wdyt?

@diegomrsantos
Copy link
Contributor Author

diegomrsantos commented Jun 5, 2025

Side note: we already have libp2p-allow-block-list to track banning of peers.

Thanks for pointing this out. Could it be used for what is described below?

Banned Peers - There is also a cache of about 500ish most recent banned peers. When a peer gets banned, we want to keep track of them (and their IP) so that we can prevent future connections. This will prevent a banned peer from reconnected over a time period (scoring determines how long, will talk about that later).

This PR is in the context of sigp/anchor#135

In the current PR I just wanted to draft an initial idea to move this to libp2p

@jxs
Copy link
Member

jxs commented Jun 5, 2025

Hi Elena,
I had an off band conversation with Diego before reading your comments where I overall stated the same,
we already have Swarm::get_connected_peers and most Behaviors have their own peer list to track custom data.

I also agree with your vision for the PeerStore and think that if what anchor needs cannot be done at the application level (instead of Behaviour level) I'd rather have PeerStore maintain its own state of connected peers as gossipsub and kademlia already do, but first I'd like to assert that we are not creating redundancy for the same needs, and if we introduce connected_peers function on PeerStore we still need Swarm::get_connected_peers.

IMO PeerManager would be a more descriptive name for that. The PeerStore could keep its current responsibilities and would be a component inside it. Wdyt?

From my experience a PeerManager is different for every use case, anchor will be different than lighthouse and they will be both compose by a PeerStore an allow-block-list a connection-limits etc.
This to say that I don't think we are able to generalize a PeerManager that is general enough to suffice all the needs

Thanks for pointing this out. Could it be used for what is described below?

Banned Peers - There is also a cache of about 500ish most recent banned peers. When a peer gets banned, we want to keep track of them (and their IP) so that we can prevent future connections. This will prevent a banned peer from reconnected over a time period (scoring determines how long, will talk about that later).

yes, only the timings of banning are missing, which we can implement

@dariusc93
Copy link
Member

Thanks for the PR!

Im not 100% sure about this since as @elenaf9 mentioned that most behaviours would want to track the connection state of each peer for explicit reasons. However, if we were to have this, wouldn't it be better for it to be apart of libp2p_swarm::behaviour as some utility that would track connection events from NetworkBehaviour::on_swarm_event similar to how we have some for tracking listening and external addresses and use that within a specific behaviour instead of having it be its own behaviour?

@diegomrsantos
Copy link
Contributor Author

diegomrsantos commented Jun 5, 2025

Thanks a lot, everyone, for the feedback! I removed all the NetworkBehavior from the connection_store and made it a component inside the PeerStore. I'm not sure what to do with the events yet, so they are commented out for now. Please let me know what you think.

Copy link
Member

@elenaf9 elenaf9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

IMO we should just do the tracking of connections directly in the Behavior instead of only in the MemoryStore. That way also other Store implementations can use it.

Comment on lines +9 to +19
PeerConnected {
peer_id: PeerId,
connection_id: ConnectionId,
endpoint: ConnectedPoint,
},

/// A peer disconnected (last connection closed).
PeerDisconnected {
peer_id: PeerId,
connection_id: ConnectionId,
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need these events. We already have the swarm-events for established and closed connections, that include the info how many other connections to that peer exist.

Comment on lines +241 to +243
let is_first_connection = self
.connection_store
.connection_established(peer_id, connection_id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just check here if ConnectionEstablished { other_established, .. } is 0.

}) => {
trace!(%peer_id, ?connection_id, remaining_established, "Connection closed");

let is_last_connection = self.connection_store.connection_closed(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, just check if remaining_established == 0.

Comment on lines +22 to +27
/// Simple storage for connected peers.
#[derive(Debug, Default)]
pub struct ConnectionStore {
/// Currently connected peers with their connection IDs
connected: HashMap<PeerId, HashSet<ConnectionId>>,
}
Copy link
Member

@elenaf9 elenaf9 Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs to be a separate module, or even own structure, given that all it does is just adding and removing peers from a hashmap.
Why not just embed the logic directly in Behavior?

@elenaf9
Copy link
Member

elenaf9 commented Jun 5, 2025

but first I'd like to assert that we are not creating redundancy for the same needs, and if we introduce connected_peers function on PeerStore we still need Swarm::get_connected_peers.

@jxs yes I agree, I am also not super happy with the fact that we now duplicate swarm logic.
Still, I would argue that we should also keep Swarm::get_connected_peers because not everyone uses the PeerStore. And maybe the duplication is tolerable here given that it's very little code. Still, alternative ideas/ suggestions on how to best avoid any duplication would be great!

@jxs
Copy link
Member

jxs commented Jun 5, 2025

@jxs yes I agree, I am also not super happy with the fact that we now duplicate swarm logic.
Still, I would argue that we should also keep Swarm::get_connected_peers because not everyone uses the PeerStore. And maybe the duplication is tolerable here given that it's very little code. Still, alternative ideas/ suggestions on how to best avoid any duplication would be great!

Yeah agree with you Elena, with this being part of the PeerStore it's different than a whole Behaviour just for the connected peers and doesn't introduce the confusion the latter would.

@diegomrsantos
Copy link
Contributor Author

I'm still on the fence if this should be part of the PeerStore. In nim-libp2, it's handled by different concepts https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/peerstore.nim and https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/connmanager.nim

@diegomrsantos
Copy link
Contributor Author

From my experience a PeerManager is different for every use case, anchor will be different than lighthouse and they will be both compose by a PeerStore an allow-block-list a connection-limits etc.
This to say that I don't think we are able to generalize a PeerManager that is general enough to suffice all the needs

It's not necessarily about creating a PeerManager that is general enough to satisfy all the needs. The one in libp2p could have basic features that are expected to be part of all or most clients. It could be flexible enough to be composable with more specific ones like AnchorPeerManager or a LighthousePeerManager.

I personally wouldn't expect connection management in a module called PeerStore.

@elenaf9
Copy link
Member

elenaf9 commented Jun 5, 2025

I personally wouldn't expect connection management in a module called PeerStore.

It's not managing connections, it's just tracking connected peers 🙂 .

I'm still on the fence if this should be part of the PeerStore. In nim-libp2, it's handled by different concepts https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/peerstore.nim and https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/connmanager.nim

Please correct me if I am wrong, I am not really familiar with nim-libp2p, but it seems like the connmanager in nim does a lot more things than just tracking the connected peers (e.g. getting a new stream on a connection, reacting to new connections, etc.). Those are all solved in rust-libp2p in the swarm, so I am not sure if we can really compare this?

@drHuangMHT
Copy link
Contributor

I think #6012 can be an interesting idea, although the maintainers deemed it unnecessary.

@diegomrsantos
Copy link
Contributor Author

diegomrsantos commented Jun 9, 2025

@diegomrsantos
Copy link
Contributor Author

Then we do something like below instead of keeping an additional data structure:

    /// Gives the ids and info of all known connected peers.
    pub fn connected_peers(&self) -> impl Iterator<Item = (&PeerId, &PeerInfo<E>)> {
        self.peers.iter().filter(|(_, info)| info.is_connected())
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants