Simulate DONT_HAVE for older peers by dirkmc · Pull Request #248 · ipfs/go-bitswap

dirkmc · 2020-01-31T20:43:33Z

When a want-block sent to a peer running an older version of Bitswap times out, simulate receiving a DONT_HAVE from that peer for the CID.

This PR addresses one of the two cases described in #244. It does not address timeouts caused by an overloaded peer or by network issues.

Sample benchmark output:

$ go test . -run a -bench BenchmarkFetchFromOldBitswap
goos: darwin
goarch: amd64
pkg: github.com/ipfs/go-bitswap
BenchmarkFetchFromOldBitswap/3Nodes-Overlap3-OneAtATime-8         	       1	25244830917 ns/op
BenchmarkFetchFromOldBitswap/3Nodes-AllToAll-OneAtATime-8         	1000000000	         0.230 ns/op
BenchmarkFetchFromOldBitswap/3Nodes-Overlap3-AllConcurrent-8      	1000000000	         0.0285 ns/op
BenchmarkFetchFromOldBitswap/3Nodes-Overlap3-OneAtATime (1 runs / 25.24s):  25.243s: sent 25, recv 11, dups 1 / 11
BenchmarkFetchFromOldBitswap/3Nodes-AllToAll-OneAtATime (14 runs / 3.24s):  0.231s: sent 20, recv 11, dups 1 / 11
BenchmarkFetchFromOldBitswap/3Nodes-Overlap3-AllConcurrent (7 runs / 0.17s): 0.024s: sent 2, recv 6, dups 4 / 14
PASS
ok  	github.com/ipfs/go-bitswap	29.755s

internal/messagequeue/messagequeue.go

Stebalien · 2020-02-03T05:04:47Z

(assuming you don't want a full review as this is marked as a draft)

internal/messagequeue/messagequeue.go

internal/messagequeue/mqpinger.go

Stebalien

I'd like to simplify the control flow before merging this. We're launching goroutines from goroutines from goroutines, which makes the control flow hard to reason about.

bitswap.go

internal/messagequeue/donthavetimeoutmgr.go

dirkmc · 2020-02-11T22:50:08Z

I wanted to test how this branch would perform when fetching from seeds that have a subset of the data (instead of both seeds having all of the data) and compare it to master and to old Bitswap (that doesn't support HAVE / DONT_HAVE messages).

These results are for one leech fetching files of various sizes from two seeds, where each seed has 2/3 of the leaf blocks in the file:

- master-from-2master: leech (master) fetching from two seeds (master).
  Note that master now supports HAVE / DONT_HAVE messages.
- old-from-2old: leech (old) fetching from two seeds (old).
  Note that old Bitswap does not support HAVE / DONT_HAVE messages.
- sim-dont-have-from-2old: leech (sim-dont-have) fetching from two seeds (old).
  The leech (sim-dont-have) is this branch, i.e. Bitswap simulating DONT_HAVE with timeouts.

Interestingly, the results for old Bitswap are very inconsistent: if you look closely, you can see orange data points (representing old-from-2old) right near the top of the graph (behind the key) for file size 32MB. I ran this simulation twice and got the same anomaly with old Bitswap for 32MB files.

Conversely, this branch (which simulates DONT_HAVE with timeouts) produces more consistent results, with somewhat worse performance for larger file sizes.

Stebalien

Ok, I'm mostly down to coding nits at this point. I feel like there should be some way to simplify this, but I can't think of any.

My only real remaining concern is the want, cancel, want race. It'll be fine, but if we keep doing that, we'll stall a bit because we'll keep delaying the want.

internal/messagequeue/donthavetimeoutmgr.go

dirkmc · 2020-02-12T17:09:58Z

I've resolved the want-cancel-want case in the way that you suggested, and resolved the nits.

Thanks for all the detailed coding suggestions and explanations ❤️
Personally, I appreciate all suggestions, including nits; I always feel like I learn something.

Stebalien

One small bug.

internal/messagequeue/donthavetimeoutmgr.go

Stebalien · 2020-02-12T21:28:22Z

🎉

dirkmc added 2 commits January 31, 2020 15:40

fix: simulate DONT_HAVE for older peers

ccaea17

fix: lint

c35d90d

dirkmc commented Jan 31, 2020

fix: message queue DONT_HAVE timer bug

2a57572

Stebalien reviewed Feb 3, 2020

feat: add message queue pinger

014348b

Stebalien mentioned this pull request Feb 5, 2020

Timeout Management #244

Closed

dirkmc added 2 commits February 5, 2020 12:53

fix: a couple of bugs in the message queue pinger

5388e73

fix: lint

27505d2

dirkmc marked this pull request as ready for review February 5, 2020 19:22

Stebalien reviewed Feb 7, 2020

refactor: simplify DONT_HAVE timeout management

574b577

dirkmc requested a review from Stebalien February 10, 2020 15:41

Stebalien suggested changes Feb 11, 2020

refactor: optimize DONT_HAVE timeout manager

c56d552

Stebalien suggested changes Feb 12, 2020

fix: handle want-cancel-want case in DONT_HAVE timeout manager

4a1a073

test: add case to dont have timeout manager for add-cancel-add

8d1d5ff

Stebalien suggested changes Feb 12, 2020

fix: bug with iterating over DONT_HAVE timeout manager queue

1a16a42

dirkmc force-pushed the fix/dont-have-timeout branch from e61c8c4 to 1a16a42 Compare February 12, 2020 21:12

Stebalien approved these changes Feb 12, 2020

dirkmc merged commit 20be084 into master Feb 12, 2020

dirkmc deleted the fix/dont-have-timeout branch February 12, 2020 21:26

Stebalien mentioned this pull request Feb 17, 2020

Disassociate RT membership from connectivity libp2p/go-libp2p-kbucket#50

Merged
