Add libp2p-connection-tracker for simplified peer connection state management #6046

diegomrsantos · 2025-06-04T14:53:22Z

Description

This PR introduces a new libp2p-connection-tracker crate that provides a lightweight NetworkBehaviour for tracking connected peers. The primary motivation is to eliminate the need for manual connection state management in application-level behaviours.

Notes & open questions

Problem

Currently, behaviours that need to track connected peers must manually maintain connection state using patterns like:

pub struct PeerManager {
    connected: HashSet<PeerId>, // Manual tracking
    // ...
}

impl NetworkBehaviour for PeerManager {
    fn on_swarm_event(&mut self, event: FromSwarm) {
        match event {
            FromSwarm::ConnectionEstablished(ConnectionEstablished { peer_id, .. }) => {
                self.connected.insert(peer_id); // Manual state management
            }
            FromSwarm::ConnectionClosed(ConnectionClosed { peer_id, remaining_established, .. }) => {
                if remaining_established == 0 {
                    self.connected.remove(&peer_id); // Manual cleanup
                }
            }
            _ => {}
        }
        // ... delegate to other behaviours
    }
}

This approach is error-prone and requires each behaviour to correctly handle connection lifecycle events, multiple connections per peer, and state synchronization.

Solution

The new connection-tracker behaviour provides a composable solution following libp2p's architectural patterns:

#[derive(NetworkBehaviour)]
#[behaviour(prelude = "libp2p_swarm::derive_prelude")]
struct PeerManager {
    connection_tracker: libp2p_connection_tracker::Behaviour,
    peer_store: peer_store::Behaviour<MemoryStore<Enr>>,
    // ... other behaviours
}

impl PeerManager {
    fn connected_count(&self) -> usize {
        self.connection_tracker.connected_count()
    }
    
    fn is_connected(&self, peer: &PeerId) -> bool {
        self.connection_tracker.is_connected(peer)
    }

    fn connected_peers(&self) -> impl Iterator<Item = &PeerId> {
        self.connection_tracker.connected_peers()
    }

}

Design Choices

Minimal Scope: Only tracks connected peers to start small and avoid complexity
Composable: Uses the NetworkBehaviour derive macro for seamless integration
Event-Driven: Emits PeerConnected/PeerDisconnected events for reactive programming
Zero Configuration: Works out of the box with sensible defaults
Memory Efficient: Uses HashMap<PeerId, HashSet<ConnectionId>> for compact storage
Multiple Connections: Correctly handles multiple connections per peer

Implementation

store.rs: Simple storage layer with HashMap<PeerId, HashSet<ConnectionId>>
behaviour.rs: NetworkBehaviour implementation using dummy::ConnectionHandler
lib.rs: Public API and event definitions

The implementation follows the hierarchical state machine pattern and coding guidelines used throughout rust-libp2p.
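A minimal sketch of the storage layer described above, using only the standard library. `PeerId` and `ConnectionId` here are simplified stand-ins for the libp2p types, and the method bodies are an assumption about how such a store could behave, not the code in this PR.

```rust
use std::collections::{HashMap, HashSet};

// Stand-ins for libp2p's `PeerId` and `ConnectionId` (assumption: the real
// types are likewise hashable and comparable).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PeerId(pub u64);
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ConnectionId(pub usize);

/// Tracks the open connections of every connected peer.
#[derive(Debug, Default)]
pub struct ConnectionStore {
    connected: HashMap<PeerId, HashSet<ConnectionId>>,
}

impl ConnectionStore {
    /// Records a new connection. Returns `true` if it is the peer's first.
    pub fn connection_established(&mut self, peer: PeerId, conn: ConnectionId) -> bool {
        let conns = self.connected.entry(peer).or_default();
        let is_first = conns.is_empty();
        conns.insert(conn);
        is_first
    }

    /// Forgets a connection. Returns `true` if it was the peer's last.
    pub fn connection_closed(&mut self, peer: &PeerId, conn: &ConnectionId) -> bool {
        let Some(conns) = self.connected.get_mut(peer) else {
            return false; // unknown peer: nothing to do
        };
        conns.remove(conn);
        if conns.is_empty() {
            // Drop the entry so `is_connected` stays a plain key lookup.
            self.connected.remove(peer);
            true
        } else {
            false
        }
    }

    pub fn is_connected(&self, peer: &PeerId) -> bool {
        self.connected.contains_key(peer)
    }

    pub fn connected_count(&self) -> usize {
        self.connected.len()
    }
}
```

Removing a peer's entry as soon as its last connection closes keeps `connected_count` equal to the number of peers with at least one open connection.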

Future Extensions

This foundation enables future enhancements like:

Recently disconnected peers cache (similar to Lighthouse's PeerDB)
Banned peers management
Connection quality metrics
Peer scoring support

Change checklist

I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
A changelog entry has been made in the appropriate crates

elenaf9 · 2025-06-04T17:26:48Z

Thanks for the PR!
Just wondering: why is Swarm::connected_peers not sufficient?

diegomrsantos · 2025-06-04T17:29:49Z

Thanks for the PR! Just wondering: why is Swarm::connected_peers not sufficient?

Because, unfortunately, it's not available inside a Behavior. If we wish to split a feature into multiple behaviors that compose, it's not possible to use the swarm in any of them. Let me know if my understanding is correct.

elenaf9 · 2025-06-04T18:09:01Z

Thanks for the PR! Just wondering: why is Swarm::connected_peers not sufficient?

Because, unfortunately, it's not available inside a Behavior. If we wish to split a feature into multiple behaviors that compose, it's not possible to use the swarm in any of them. Let me know if my understanding is correct.

Yes you're right. But from my own experience, most behaviors want to track some custom state for each connected node anyway.
That said, I am not really against it if you think it's useful for lighthouse.
I am just wondering if we should then just make it part of the existing peer-store? It's 300 lines of code right now that could be added as <20 lines in the peer-store, because we need to handle the necessary events there anyway, and I would think that many users want to use them together anyway.

diegomrsantos · 2025-06-04T18:42:05Z

Thanks for the PR! Just wondering: why is Swarm::connected_peers not sufficient?

Because, unfortunately, it's not available inside a Behavior. If we wish to split a feature into multiple behaviors that compose, it's not possible to use the swarm in any of them. Let me know if my understanding is correct.

Yes you're right. But from my own experience, most behaviors want to track some custom state for each connected node anyway. That said, I am not really against it if you think it's useful for lighthouse. I am just wondering if we should then just make it part of the existing peer-store? It's 300 lines of code right now that could be added as <20 lines in the peer-store, because we need to handle the necessary events there anyway, and I would think that many users want to use them together anyway.

Yes, I considered this. But, currently, the PeerStore handles only addresses and custom data. I see it as a storage where it's possible to store data about peers. On the other hand, the responsibility of the Behavior in this PR would be to track what peers are connected, disconnected, and banned. It could grow considerably over time as it becomes more feature-rich. The idea was to start small to validate the idea and get feedback. That being said, this behaviour could also be composed with the PeerStore. We could even introduce a PeerManager that is composed of those two components. What do you think?

diegomrsantos · 2025-06-04T19:09:28Z

I can try to remove the behavior part and make it just a component that is part of the PeerStore.

elenaf9 · 2025-06-05T07:01:22Z

But, currently, the PeerStore handles only addresses and custom data. I see it as a storage where it's possible to store data about peers.

I would consider PeerStore more broadly as a network behavior that does general, basic tracking about connected peers and per-peer data, which most behaviors and users would want to have.
Initially we even had connected_peers in it, but got rid of it in favor of just using Swarm::connected_peers, because we mostly looked at it from the perspective of a user that is using the behavior through the swarm.

On the other hand I do like having small, composable behaviors. I am just afraid that the number of behaviors could explode rather quickly if we add a separate behavior for every minor thing (it also results in larger BehaviorEvent enums, error types, etc). And if, as you mentioned, eventually more features should be added anyway, then I'd rather also merge it with the PeerStore.
cc @jxs wdyt?

the responsibility of the Behavior in this PR would be to track what peers are connected, disconnected, and banned.

Side note: we already have libp2p-allow-block-list to track banning of peers.

diegomrsantos · 2025-06-05T09:33:49Z

I would consider PeerStore more broadly as a network behavior that does general, basic tracking about connected peers and per-peer data, which most behaviors and users would want to have.

IMO PeerManager would be a more descriptive name for that. The PeerStore could keep its current responsibilities and would be a component inside it. Wdyt?

diegomrsantos · 2025-06-05T10:51:33Z

Side note: we already have libp2p-allow-block-list to track banning of peers.

Thanks for pointing this out. Could it be used for what is described below?

Banned Peers - There is also a cache of about 500ish most recent banned peers. When a peer gets banned, we want to keep track of them (and their IP) so that we can prevent future connections. This will prevent a banned peer from reconnected over a time period (scoring determines how long, will talk about that later).

This PR is in the context of sigp/anchor#135

In the current PR I just wanted to draft an initial idea to move this to libp2p

jxs · 2025-06-05T12:15:38Z

Hi Elena,
I had an out-of-band conversation with Diego before reading your comments, where I said much the same:
we already have Swarm::get_connected_peers and most Behaviors have their own peer list to track custom data.

I also agree with your vision for the PeerStore and think that if what anchor needs cannot be done at the application level (instead of Behaviour level) I'd rather have PeerStore maintain its own state of connected peers as gossipsub and kademlia already do, but first I'd like to assert that we are not creating redundancy for the same needs, and if we introduce connected_peers function on PeerStore we still need Swarm::get_connected_peers.

IMO PeerManager would be a more descriptive name for that. The PeerStore could keep its current responsibilities and would be a component inside it. Wdyt?

From my experience a PeerManager is different for every use case: anchor's will be different from lighthouse's, and both will be composed of a PeerStore, an allow-block-list, a connection-limits, etc.
This is to say that I don't think we can build a PeerManager that is general enough to cover all the needs.

Thanks for pointing this out. Could it be used for what is described below?

Banned Peers - There is also a cache of about 500ish most recent banned peers. When a peer gets banned, we want to keep track of them (and their IP) so that we can prevent future connections. This will prevent a banned peer from reconnected over a time period (scoring determines how long, will talk about that later).

yes, only the timings of banning are missing, which we can implement
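To make the missing piece concrete, here is a rough sketch of time-limited bans. `TimedBans` is a hypothetical type written against the standard library only; it is not the libp2p-allow-block-list API, and `PeerId` is a simplified stand-in. The current time is passed in explicitly so the logic can be exercised without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Stand-in for libp2p's `PeerId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PeerId(pub u64);

/// Hypothetical ban list where every ban carries an expiry time.
#[derive(Debug, Default)]
pub struct TimedBans {
    expires_at: HashMap<PeerId, Instant>,
}

impl TimedBans {
    /// Bans `peer` for `duration`, counted from `now`.
    pub fn ban(&mut self, peer: PeerId, now: Instant, duration: Duration) {
        self.expires_at.insert(peer, now + duration);
    }

    /// A peer is banned while `now` is strictly before its expiry.
    pub fn is_banned(&self, peer: &PeerId, now: Instant) -> bool {
        self.expires_at.get(peer).map_or(false, |expiry| now < *expiry)
    }

    /// Drops every ban that has expired by `now`.
    pub fn prune(&mut self, now: Instant) {
        self.expires_at.retain(|_, expiry| now < *expiry);
    }
}
```

A behaviour built on this would presumably consult `is_banned` when a connection is being established and call `prune` periodically; how the ban duration is derived from peer scoring is left open here.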

dariusc93 · 2025-06-05T12:16:10Z

Thanks for the PR!

I'm not 100% sure about this since, as @elenaf9 mentioned, most behaviours would want to track the connection state of each peer for explicit reasons. However, if we were to have this, wouldn't it be better for it to be a part of libp2p_swarm::behaviour as a utility that tracks connection events from NetworkBehaviour::on_swarm_event, similar to the ones we have for tracking listening and external addresses, and to use that within a specific behaviour instead of having it be its own behaviour?
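A sketch of what such a utility might look like. `ConnectedPeers` is a hypothetical name and does not exist in libp2p-swarm; `PeerId` and `FromSwarm` below are simplified stand-ins reduced to the fields this sketch needs. It relies on the `other_established` and `remaining_established` counters carried by the swarm events, so no per-connection bookkeeping is required.

```rust
use std::collections::HashSet;

// Stand-in for libp2p's `PeerId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PeerId(pub u64);

// Simplified stand-in for the swarm's `FromSwarm` event, reduced to the two
// variants and the fields used below.
pub enum FromSwarm {
    ConnectionEstablished { peer_id: PeerId, other_established: usize },
    ConnectionClosed { peer_id: PeerId, remaining_established: usize },
}

/// Hypothetical helper that a behaviour could embed as a field.
#[derive(Debug, Default)]
pub struct ConnectedPeers {
    peers: HashSet<PeerId>,
}

impl ConnectedPeers {
    /// Feeds a swarm event in. Returns `true` if the set of connected peers changed.
    pub fn on_swarm_event(&mut self, event: &FromSwarm) -> bool {
        match event {
            // First connection to this peer: it becomes connected.
            FromSwarm::ConnectionEstablished { peer_id, other_established }
                if *other_established == 0 => self.peers.insert(*peer_id),
            // Last connection to this peer closed: it becomes disconnected.
            FromSwarm::ConnectionClosed { peer_id, remaining_established }
                if *remaining_established == 0 => self.peers.remove(peer_id),
            _ => false,
        }
    }

    pub fn is_connected(&self, peer: &PeerId) -> bool {
        self.peers.contains(peer)
    }

    pub fn iter(&self) -> impl Iterator<Item = &PeerId> {
        self.peers.iter()
    }
}
```

A behaviour would forward each event from its own `on_swarm_event` to the helper and could use the returned flag to decide whether to react.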

diegomrsantos · 2025-06-05T13:19:35Z

Thanks a lot, everyone, for the feedback! I removed all the NetworkBehavior from the connection_store and made it a component inside the PeerStore. I'm not sure what to do with the events yet, so they are commented out for now. Please let me know what you think.

elenaf9

Thanks!

IMO we should just do the tracking of connections directly in the Behavior instead of only in the MemoryStore. That way also other Store implementations can use it.

elenaf9 · 2025-06-05T15:07:57Z

misc/peer-store/src/connection_store.rs

+    PeerConnected {
+        peer_id: PeerId,
+        connection_id: ConnectionId,
+        endpoint: ConnectedPoint,
+    },
+
+    /// A peer disconnected (last connection closed).
+    PeerDisconnected {
+        peer_id: PeerId,
+        connection_id: ConnectionId,
+    },


I don't think we need these events. We already have the swarm-events for established and closed connections, that include the info how many other connections to that peer exist.

elenaf9 · 2025-06-05T15:09:11Z

misc/peer-store/src/memory_store.rs

+                let is_first_connection = self
+                    .connection_store
+                    .connection_established(peer_id, connection_id);


We can just check here if ConnectionEstablished { other_established, .. } is 0.

elenaf9 · 2025-06-05T15:09:36Z

misc/peer-store/src/memory_store.rs

+            }) => {
+                trace!(%peer_id, ?connection_id, remaining_established, "Connection closed");
+
+                let is_last_connection = self.connection_store.connection_closed(


same here, just check if remaining_established == 0.

elenaf9 · 2025-06-05T15:14:23Z

misc/peer-store/src/connection_store.rs

+/// Simple storage for connected peers.
+#[derive(Debug, Default)]
+pub struct ConnectionStore {
+    /// Currently connected peers with their connection IDs
+    connected: HashMap<PeerId, HashSet<ConnectionId>>,
+}


I don't think this needs to be a separate module, or even own structure, given that all it does is just adding and removing peers from a hashmap.
Why not just embed the logic directly in Behavior?

elenaf9 · 2025-06-05T15:18:40Z

but first I'd like to assert that we are not creating redundancy for the same needs, and if we introduce connected_peers function on PeerStore we still need Swarm::get_connected_peers.

@jxs yes I agree, I am also not super happy with the fact that we now duplicate swarm logic.
Still, I would argue that we should also keep Swarm::get_connected_peers because not everyone uses the PeerStore. And maybe the duplication is tolerable here given that it's very little code. Still, alternative ideas/ suggestions on how to best avoid any duplication would be great!

jxs · 2025-06-05T15:22:12Z

@jxs yes I agree, I am also not super happy with the fact that we now duplicate swarm logic.
Still, I would argue that we should also keep Swarm::get_connected_peers because not everyone uses the PeerStore. And maybe the duplication is tolerable here given that it's very little code. Still, alternative ideas/ suggestions on how to best avoid any duplication would be great!

Yeah agree with you Elena, with this being part of the PeerStore it's different than a whole Behaviour just for the connected peers and doesn't introduce the confusion the latter would.

diegomrsantos · 2025-06-05T15:45:02Z

I'm still on the fence about whether this should be part of the PeerStore. In nim-libp2p, it's handled by different concepts: https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/peerstore.nim and https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/connmanager.nim

diegomrsantos · 2025-06-05T15:50:18Z

From my experience a PeerManager is different for every use case: anchor's will be different from lighthouse's, and both will be composed of a PeerStore, an allow-block-list, a connection-limits, etc.
This is to say that I don't think we can build a PeerManager that is general enough to cover all the needs.

It's not necessarily about creating a PeerManager that is general enough to satisfy all the needs. The one in libp2p could have basic features that are expected to be part of all or most clients. It could be flexible enough to be composable with more specific ones like AnchorPeerManager or a LighthousePeerManager.

I personally wouldn't expect connection management in a module called PeerStore.

elenaf9 · 2025-06-05T16:32:50Z

I personally wouldn't expect connection management in a module called PeerStore.

It's not managing connections, it's just tracking connected peers 🙂 .

I'm still on the fence about whether this should be part of the PeerStore. In nim-libp2p, it's handled by different concepts: https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/peerstore.nim and https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/connmanager.nim

Please correct me if I am wrong, I am not really familiar with nim-libp2p, but it seems like the connmanager in nim does a lot more things than just tracking the connected peers (e.g. getting a new stream on a connection, reacting to new connections, etc.). Those are all solved in rust-libp2p in the swarm, so I am not sure if we can really compare this?

drHuangMHT · 2025-06-06T02:27:45Z

I think #6012 can be an interesting idea, although the maintainers deemed it unnecessary.

diegomrsantos · 2025-06-09T12:55:55Z

How about we add a PeerConnectionStatus to PeerRecord like in Lighthouse https://github.com/sigp/lighthouse/blob/bde0f1ef0b29608b6cf4e447987843a2b249e433/beacon_node/lighthouse_network/src/peer_manager/peerdb/peer_info.rs#L29

diegomrsantos · 2025-06-09T12:58:32Z

Then we do something like below instead of keeping an additional data structure:

    /// Gives the ids and info of all known connected peers.
    pub fn connected_peers(&self) -> impl Iterator<Item = (&PeerId, &PeerInfo<E>)> {
        self.peers.iter().filter(|(_, info)| info.is_connected())
    }
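For illustration, a self-contained sketch of that idea. The `PeerConnectionStatus` enum below is a hypothetical, much reduced version of the Lighthouse one, and `PeerRecord` and `Store` are stand-ins rather than the peer-store crate's actual types.

```rust
use std::collections::HashMap;

// Stand-in for libp2p's `PeerId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PeerId(pub u64);

/// Hypothetical, reduced connection status.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum PeerConnectionStatus {
    Connected { connections: usize },
    #[default]
    Disconnected,
}

/// Stand-in record; addresses and custom data would live alongside `status`.
#[derive(Debug, Default)]
pub struct PeerRecord {
    pub status: PeerConnectionStatus,
}

impl PeerRecord {
    pub fn is_connected(&self) -> bool {
        matches!(self.status, PeerConnectionStatus::Connected { .. })
    }
}

#[derive(Debug, Default)]
pub struct Store {
    peers: HashMap<PeerId, PeerRecord>,
}

impl Store {
    /// Creates the record if needed and overwrites its status.
    pub fn set_status(&mut self, peer: PeerId, status: PeerConnectionStatus) {
        self.peers.entry(peer).or_default().status = status;
    }

    /// Connected peers are derived from the records, so no second map is kept.
    pub fn connected_peers(&self) -> impl Iterator<Item = (&PeerId, &PeerRecord)> {
        self.peers.iter().filter(|(_, record)| record.is_connected())
    }
}
```

The trade-off is that `connected_peers` walks every known record, including disconnected ones, instead of reading a dedicated set.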

introduce connection tracker

74197a5

diegomrsantos force-pushed the connection-tracker branch from 2725dca to 74197a5 Compare June 4, 2025 15:01

Merge branch 'master' into connection-tracker

0853f3a

make connection_store a PeerStore component

e3e74c0

diegomrsantos force-pushed the connection-tracker branch from d1bd05a to 31873db Compare June 5, 2025 13:24

fmt and clippy

e6048c8

diegomrsantos force-pushed the connection-tracker branch from 31873db to e6048c8 Compare June 5, 2025 13:28

elenaf9 reviewed Jun 5, 2025


Merge branch 'master' into connection-tracker

20b58ae
