
Fix gossipsub race condition for heartbeat #188

Merged
vyzo merged 1 commit into libp2p:master from quorumcontrol:heartbeat-race
May 28, 2019


@brandonwestcott

While using gossipsub on a system with a number of ephemeral peers, we noticed that messages would occasionally fail to route to their intended targets, even though a subscribe had been received from them.

As a basic example, take two connected peers, PeerA and PeerB. PeerA is not in the mesh for TopicA, but has seen the topic and gossiped it before to PeerC, which is now disconnected and no longer subscribed.

When PeerB issues a Subscribe request for TopicA, PeerA adds PeerB to the PubSub.topics map:

go-libp2p-pubsub/pubsub.go

Lines 566 to 573 in 49274b0

	if subopt.GetSubscribe() {
		tmap, ok := p.topics[t]
		if !ok {
			tmap = make(map[peer.ID]struct{})
			p.topics[t] = tmap
		}
		tmap[rpc.from] = struct{}{}

However, that is not the map PeerA uses to publish messages back out when it is not in the mesh:

	gmap, ok = gs.fanout[topic]
	if !ok {
		// we don't have any, pick some
		peers := gs.getPeers(topic, GossipSubD, func(peer.ID) bool { return true })
		if len(peers) > 0 {
			gmap = peerListToMap(peers)
			gs.fanout[topic] = gmap
		}
	}

Line 233 there reads from GossipSubRouter.fanout, which is only updated during the heartbeat process:

	for topic, peers := range gs.fanout {
		// check whether our peers are still in the topic
		for p := range peers {
			_, ok := gs.p.topics[topic][p]
			if !ok {
				delete(peers, p)
			}
		}

Between heartbeats, which run one second apart, these two maps can be out of date with each other, which is to be expected. However, the check

	if !ok {

only works as a fallback the first time the topic is looked up. If the fanout map for the topic exists but is empty, because all of its other peers (PeerC here) have unsubscribed, the code does not fall back to getPeers.

This PR changes the !ok conditional to also cover the empty-map case, so that the fallback still happens. The other option is to have the heartbeat process clear the topic's key, so that the lookup reports it as missing, instead of leaving an empty map behind.
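Both options can be sketched with plain maps. This is an illustration under stated assumptions, not the merged diff: peer IDs are strings, peerSet, pickFanoutFixed and pruneAndDrop are hypothetical names, and peer selection is reduced to copying every subscriber. Note that in Go, assigning nil to a map key keeps the key present, so the second option uses delete to make the lookup report the key as missing.

```go
package main

import "fmt"

// peerSet is a simplified stand-in for map[peer.ID]struct{} (assumption: string IDs).
type peerSet map[string]struct{}

// pickFanoutFixed sketches option 1: treat an empty fanout entry like a missing one.
func pickFanoutFixed(fanout, topics map[string]peerSet, topic string) peerSet {
	gmap, ok := fanout[topic]
	if !ok || len(gmap) == 0 {
		gmap = make(peerSet)
		for p := range topics[topic] {
			gmap[p] = struct{}{}
		}
		if len(gmap) > 0 {
			fanout[topic] = gmap
		}
	}
	return gmap
}

// pruneAndDrop sketches option 2: the heartbeat removes the topic key
// once its peer set becomes empty, so a later lookup yields ok == false.
func pruneAndDrop(fanout, topics map[string]peerSet) {
	for topic, peers := range fanout {
		for p := range peers {
			if _, ok := topics[topic][p]; !ok {
				delete(peers, p)
			}
		}
		if len(peers) == 0 {
			delete(fanout, topic)
		}
	}
}

func main() {
	topics := map[string]peerSet{"TopicA": {"PeerB": {}}}

	// Option 1: a stale, empty fanout entry no longer blocks the fallback.
	fanout := map[string]peerSet{"TopicA": {}}
	fmt.Println("targets after fix:", len(pickFanoutFixed(fanout, topics, "TopicA")))

	// Option 2: pruning the last peer also removes the key.
	fanout = map[string]peerSet{"TopicA": {"PeerC": {}}}
	pruneAndDrop(fanout, topics)
	_, ok := fanout["TopicA"]
	fmt.Println("key still present:", ok)
}
```

The first option keeps the change local to the publish path; the second keeps the publish path untouched but relies on every code path that empties a fanout set also removing the key.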

@vyzo (Collaborator) left a comment


thank you!

@vyzo merged commit 4221a39 into libp2p:master on May 28, 2019