|
(assuming you don't want a full review as this is marked as a draft) |
Stebalien
left a comment
I'd like to simplify the control flow before merging this. We're launching goroutines from goroutines from goroutines, which makes the control flow hard to reason about.
|
I wanted to test how this branch performs when each seed has only a subset of the data (rather than both seeds having all of it), and compare it against master and against old Bitswap (which doesn't support HAVE / DONT_HAVE messages). These results are for one leech fetching files of various sizes from two seeds, where each seed has 2/3 of the leaf blocks in the file:
Interestingly, the results for old Bitswap are very inconsistent: if you look closely you can see orange data points (representing old-from-2old) right near the top of the graph (behind the key) for the 32MB file size. I ran this simulation twice and got the same anomaly with old Bitswap for 32MB files. By contrast, this branch (which simulates DONT_HAVE with timeouts) produces more consistent results, with somewhat worse performance for larger file sizes. |
Stebalien
left a comment
Ok, I'm mostly down to coding nits at this point. I feel like there should be some way to simplify this, but I can't think of any.
My only real remaining concern is the want-cancel-want race. It'll be fine, but if we keep hitting it, we'll stall a bit because each repeat delays the want again.
|
I've resolved the want-cancel-want in the way that you suggested, and resolved the nits. Thanks for all the detailed coding suggestions and explanations ❤️ |
e61c8c4 to 1a16a42
|
🎉 |

When sending a want-block to a peer running an older version of Bitswap, if there is a timeout then simulate receiving a DONT_HAVE from that peer for the CID.
This PR addresses one of the two cases described in #244. It does not address timeouts caused by a peer being overloaded or network issues.
Sample benchmark output: