Download snapshot through Parity's warp protocol #4622
|
For now there's only Parity's doc: https://github.com/paritytech/parity/wiki/Warp-Sync
52c6019 to 2444532
eth/main.cpp
Outdated
| << " --to <n> Export only to block n (inclusive); n may be a decimal, a '0x' prefixed hash, or 'latest'.\n" | ||
| << " --only <n> Equivalent to --export-from n --export-to n.\n" | ||
| << " --dont-check Prevent checking some block aspects. Faster importing, but to apply only when the data is known to be valid.\n\n" | ||
| << " --download-snapshot <path> Download Parity Warp snapshot data to the specified path." << endl |
I spent so much time removing endls.
Sorry I'll fix this 😆
Anyway, I hope this will go away after #4597
libethereum/WarpHostCapability.h
Outdated
| namespace eth | ||
| { | ||
|
|
||
| class WarpHostCapability: public p2p::HostCapability<WarpPeerCapability>, Worker |
Since this inherits Worker, there should be a destructor that calls terminate() (bad design it is...).
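To illustrate the point about the destructor, here is a minimal sketch. The `Worker` below is a hypothetical, simplified stand-in written for this example (it is not the project's class, and `WarpWorker`, `ticks()` and `demoTicks()` are made-up names). It only shows why the derived class has to stop the thread itself: by the time the base destructor runs, the derived members are already gone.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a Worker base class (assumption, not the real one):
// it owns a thread that keeps calling the virtual doWork() until terminate().
class Worker
{
public:
    // Does not join on its own: if the thread were still running here,
    // derived members would already be destroyed while doWork() uses them.
    virtual ~Worker() = default;

protected:
    void startWorking()
    {
        m_running = true;
        m_thread = std::thread([this] {
            while (m_running)
            {
                doWork();
                std::this_thread::yield();
            }
        });
    }

    // Stops the loop and waits for the worker thread to finish.
    void terminate()
    {
        m_running = false;
        if (m_thread.joinable())
            m_thread.join();
    }

private:
    virtual void doWork() = 0;

    std::atomic<bool> m_running{false};
    std::thread m_thread;
};

// Hypothetical derived class: the destructor calls terminate() so the thread
// is joined before m_ticks (used by doWork()) is destroyed.
class WarpWorker : private Worker
{
public:
    WarpWorker() { startWorking(); }
    ~WarpWorker() override { terminate(); }

    int ticks() const { return m_ticks.load(); }

private:
    void doWork() override { ++m_ticks; }

    std::atomic<int> m_ticks{0};
};

// Runs the worker until it has done some work, then destroys it cleanly.
int demoTicks()
{
    int observed = 0;
    {
        WarpWorker worker;
        while (worker.ticks() == 0)
            std::this_thread::yield();
        observed = worker.ticks();
    }  // ~WarpWorker() joins the thread here
    assert(observed > 0);
    return observed;
}
```

Without the `terminate()` call in `~WarpWorker()`, this stand-in would destroy a joinable `std::thread`, which ends the program via `std::terminate`.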
1d35ec8 to cfdc66f
f8bc9a9 to 3216276
|
Using Fiber sounds like an interesting experiment. Fibers seem to work nicely together with boost.asio.
Codecov Report
@@ Coverage Diff @@
## develop #4622 +/- ##
===========================================
- Coverage 60.63% 59.86% -0.78%
===========================================
Files 348 348
Lines 27425 27615 +190
Branches 2857 2887 +30
===========================================
- Hits 16630 16531 -99
- Misses 9800 10113 +313
+ Partials 995 971 -24 |
|
Another relevant talk: https://www.youtube.com/watch?v=mCD6VLVS_y4 |
02ff5ef to 769e75f
5284451 to 8778947
|
Looks like boost.fiber from hunter now fails to build on GCC 4.8 (including Travis) but builds on GCC 5 |
|
Might be related: boostorg/fiber#121, but that should be fixed in boost 1.66, which is used here
|
Looks like boost.fiber starting with boost 1.66 requires |
|
I pinned the boost version to 1.65.1 for now; it seems to work fine on GCC 4.8
|
I don't see a problem with going with 1.65.1 for now and then skipping 1.66.
Switch to using fiber::buffered_channel for free peers, because it's lock-free when pushing below capacity.
…hot (snapshotStorage is nullptr then)
…ether it is suitable for snapshot download - Request DAO fork block from the peer to understand whether it's on the right side of the fork - Explicitly handle peer disconnect
a3326e7 to 8d3f1ad
|
I might add some tests/minor tweaks, but overall it's ready to be reviewed I think. |
|
A couple of disadvantages of using fibers that I see, to be fair:
|
eth/main.cpp
Outdated
| ("only", po::value<string>()->value_name("<n>"), "Equivalent to --export-from n --export-to n.") | ||
| ("format", po::value<string>()->value_name("<binary/hex/human>"), "Set export format.") | ||
| ("dont-check", "Prevent checking some block aspects. Faster importing, but to apply only when the data is known to be valid.") | ||
| ("download-snapshot", po::value<string>()->value_name("<path>"), "Download Parity Warp Sync snapshot data to the specified path.") |
I believe you can provide the storage like this: po::value(&snapshotPath). Then there is no need for if (vm.count("download-snapshot")).
libethashseal/EthashClient.h
Outdated
| /// Trivial forwarding constructor. | ||
| EthashClient(ChainParams const& _params, int _networkID, p2p::Host* _host, | ||
| std::shared_ptr<GasPricer> _gpForAdoption, | ||
| boost::filesystem::path const& _dbPath = boost::filesystem::path(), |
|
Issues fixed, I'll do the changes around reformatting |
|
Let's ship it. |
|
|
This is the solution using boost.fiber
Motivation for using boost.fiber
The approach that the BlockChainSync class takes to deal with the asynchronous p2p nature of the sync looks like an anti-pattern that led to it becoming utterly unmaintainable. That approach could be described as having several network message handlers in one class, saving info about the current stage of the sync in a bunch of member variables in the class, and branching a lot in each handler depending on the current state of those member variables. This way the essence of the sync algorithm itself gets spread out over a lot of different parts of the class; it's hard to keep track of what's currently going on and what should happen next.
I am looking for a better approach that can elegantly express what we want to achieve. Fibers allow us here to have a "main algorithm" in one fiber that gets interrupted each time we need to get some data asynchronously, then gets resumed from the place we left off once we have the data. So the main algorithm looks like it's synchronous.
The data itself is sent to the fiber through buffered_channel, a producer/consumer kind of data structure, where the fiber gets "blocked" if it tries to get data not yet pushed to the channel.
To some extent I think this is similar to goroutines and channels in Go.
Also, switching context between fibers is very cheap, unlike with threads, and it all happens in a single thread, so there is no need to deal with thread-safety, possible races, mutexes, etc.
This is still kind of an experiment to see where it goes; I myself am not fully convinced that fibers are the best solution for this.
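A rough sketch of the shape described above, for illustration only. It does not use boost.fiber: `BlockingChannel` is a hypothetical thread-based stand-in (std::mutex plus std::condition_variable) that mimics only the blocking push/pop semantics of a buffered channel, and `runSync`, `demoSync` and the message strings are made up for this example. With fibers the same "main algorithm" would be suspended and resumed inside a single thread, and the locking shown here would not be needed.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded channel: pop() blocks while empty, push() blocks while full.
template <class T>
class BlockingChannel
{
public:
    explicit BlockingChannel(std::size_t capacity) : m_capacity(capacity) {}

    void push(T value)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notFull.wait(lock, [this] { return m_queue.size() < m_capacity; });
        m_queue.push(std::move(value));
        m_notEmpty.notify_one();
    }

    T pop()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notEmpty.wait(lock, [this] { return !m_queue.empty(); });
        T value = std::move(m_queue.front());
        m_queue.pop();
        m_notFull.notify_one();
        return value;
    }

private:
    std::size_t m_capacity;
    std::mutex m_mutex;
    std::condition_variable m_notEmpty;
    std::condition_variable m_notFull;
    std::queue<T> m_queue;
};

// The "main algorithm" reads top to bottom like synchronous code; each pop()
// waits until a (simulated) network handler has pushed the response.
std::vector<std::string> runSync(BlockingChannel<std::string>& responses)
{
    std::vector<std::string> log;
    log.push_back("manifest: " + responses.pop());  // step 1: wait for the manifest
    log.push_back("chunk: " + responses.pop());     // step 2: wait for the first chunk
    log.push_back("chunk: " + responses.pop());     // step 3: wait for the second chunk
    return log;
}

// Simulates the network side pushing responses as they arrive.
std::vector<std::string> demoSync()
{
    BlockingChannel<std::string> responses(2);
    std::thread network([&responses] {
        responses.push("m0");
        responses.push("c0");
        responses.push("c1");
    });
    std::vector<std::string> log = runSync(responses);
    network.join();
    assert(log.size() == 3);
    return log;
}
```

The point of the sketch is that `runSync` keeps the whole sequence in one place, instead of spreading it across message handlers that branch on member variables.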
Current status:
--peerset required:
Still to be done:
Without --peerset we connect to Ethereum Classic nodes very often and happily download the snapshot for the Classic chain. To deal with this we probably need to do the DAO challenge first, similar to full sync.
fix macOS build failure because of unused SnapshotLog::debug - this doesn't make sense to me yet (looks like I've already found a workaround)
doing import https://github.com/ethereum/cpp-ethereum/blob/04e8ca5afe785cafe18905d6f0f032cb1d869c7a/libethereum/SnapshotImporter.cpp#L45