This repository was archived by the owner on Oct 28, 2021. It is now read-only.

Propagate blocks after PoW validation #5713

Merged: halfalicious merged 9 commits into master from propagate-blocks-pow on Aug 23, 2019

Conversation

halfalicious (Contributor) commented Aug 3, 2019

Fix #5371

Propagate new blocks after they've been imported into the block queue and validated. This gets blocks sent out sooner than waiting until they've been imported into the blockchain, which reduces the uncle rate.

halfalicious force-pushed the propagate-blocks-pow branch 2 times, most recently from f74ec50 to 7093a31, on August 3, 2019 21:18
codecov-io commented Aug 3, 2019

Codecov Report

Merging #5713 into master will decrease coverage by <.01%.
The diff coverage is 27.82%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5713      +/-   ##
==========================================
- Coverage   63.23%   63.23%   -0.01%     
==========================================
  Files         353      353              
  Lines       30245    30276      +31     
  Branches     3387     3390       +3     
==========================================
+ Hits        19126    19144      +18     
- Misses       9887     9898      +11     
- Partials     1232     1234       +2

halfalicious force-pushed the propagate-blocks-pow branch from 1e30b99 to db277ea on August 3, 2019 22:26
halfalicious (author) commented:

Currently we propagate all blocks that are imported into the block queue and pass validation; we should only propagate new blocks.

halfalicious force-pushed the propagate-blocks-pow branch 2 times, most recently from b28a8b6 to e390d3c, on August 3, 2019 22:56
halfalicious (author) commented Aug 3, 2019

> Currently we propagate all blocks that are imported into the block queue and pass validation; we should only propagate new blocks.

At the very least, we should check whether we're currently syncing in EthereumCapability::propagateBlocks and return early if we are (the presumption being that if we're syncing, we're still trying to reach the chain tip and haven't yet started receiving new blocks or mining):

void EthereumCapability::propagateBlocks(std::shared_ptr<VerifiedBlocks const> const& _blocks)
{
    if (_blocks->empty())
        return;

Note that we have similar logic in EthereumCapability::doBackgroundWork:

void EthereumCapability::doBackgroundWork()
{
    ensureInitialised();
    auto h = m_chain.currentHash();
    // If we've finished our initial sync (including getting all the blocks into the chain so as to
    // reduce invalid transactions), start trading transactions & block hashes
    if (!isSyncing() && m_chain.isKnown(m_latestBlockHashSent))
    {

halfalicious force-pushed the propagate-blocks-pow branch 2 times, most recently from 5d2ac04 to 9ab9964, on August 4, 2019 23:58
halfalicious (author) commented:

I had to do some manual reformatting of the code since I haven't gotten git clang-format working again via the command line after reinstalling my OS, so apologies if there are some unnecessary formatting changes in this PR.

Timer t;
tie(ir, m_syncBlockQueue, count) = bc().sync(m_bq, m_stateDB, m_syncAmount);

shared_ptr<VerifiedBlocks> verifiedBlocks = make_shared<VerifiedBlocks>();
halfalicious (author) commented Aug 5, 2019:
shared_ptr is required here since we propagate the blocks on another thread (the network thread), so we need to ensure they're kept alive until they can be sent out. We can't just create a copy of the blocks since the copy ctor is deleted.

halfalicious force-pushed the propagate-blocks-pow branch 4 times, most recently from f7f9b9e to c8d5b0c, on August 5, 2019 00:49
halfalicious (author) commented Aug 13, 2019

Received some internal RLPX handshake errors during local testing:

aleth, a C++ Ethereum client
INFO  08-12 20:41:00 main net    Id: ##90a0431e…
INFO  08-12 20:41:00 main net    ENR: [ seq=1 id=v4 key=0290a043… tcp=30303 udp=30303 ]
aleth 1.7.0-alpha.1-20+commit.b4e4fb1a.dirty
INFO  08-12 20:41:41 p2p  info   UPnP device not found.
INFO  08-12 20:41:41 p2p  net    Active peer count: 0
INFO  08-12 20:41:41 p2p  net    Looking for peers...
Node ID: enode://90a0431e893c8fa9a0697b51e5876668f099e0fd7fd268bd0021e8d9125c1805bfac7d0e44613c7fce8275406b8126a4f299990ec8df9b1859e6b7f4e123352a@127.0.0.1:30303
INFO  08-12 20:41:41 main rpc    JSON-RPC socket path: \\.\pipe\\geth.ipc
JSONRPC Admin Session Key: hUlrVuzBFoo=
INFO  08-12 20:41:41 p2p  net    ENR updated: [ seq=2 id=v4 key=0290a043… ip=71.227.153.208 tcp=30303 udp=30303 ]
INFO  08-12 20:41:42 p2p  sync   Starting full sync
ERROR 08-12 20:41:54 p2p  rlpx   egress Internal error in handshake: RLPXFrameCoder disappeared (##5a85cb59…@35.181.105.86:30303)
ERROR 08-12 20:41:54 p2p  rlpx   egress Internal error in handshake: RLPXFrameCoder disappeared (##e31b2dbe…@40.115.25.33:30303)
ERROR 08-12 20:41:54 p2p  rlpx   egress Internal error in handshake: RLPXFrameCoder disappeared (##5fb5c839…@188.214.30.221:30415)                                                                                                            

These appear to be transient; I'll dig deeper to see whether they're related to my changes (I don't think they are, since I didn't touch anything related to handshakes, but I haven't seen these errors before).

[Edit] I haven't been able to replicate this - @gumb0 : Have you seen this before?

halfalicious (author) commented:

Rebased

halfalicious (author) commented:

> Received some internal RLPX handshake errors during local testing: […]
>
> These appear to be transient, will dig deeper to see if they are related to my changes.
>
> [Edit] I haven't been able to replicate this - @gumb0: Have you seen this before?

I investigated and I'm fairly confident that my changes aren't causing this issue. It looks like this error can happen if writing the hello packet during the RLPX handshake fails. We use the RLPXFrameCoder to read and write frames after exchanging the auths/acks with the key material, and we start using it to queue a write of our hello packet, followed by queueing a read of the incoming hello packet. If the write fails, we call transition with a boost error code, which calls into error, which in turn calls into cancel, destroying the RLPXFrameCoder instance. When the read handler is finally executed - assuming it has no boost error code - the if (!m_io) check is true, which results in the error message being logged:

ba::async_read(m_socket->ref(),
    boost::asio::buffer(m_handshakeInBuffer, handshakeSizeBytes),
    [this, self](boost::system::error_code ec, std::size_t) {
        if (ec)
            transition(ec);
        else
        {
            if (!m_io)
            {
                LOG(m_errorLogger)
                    << "Internal error in handshake: RLPXFrameCoder disappeared ("
                    << m_remote << ")";
                m_failureReason = HandshakeFailureReason::InternalError;
                m_nextState = Error;
                transition();
                return;
            }

halfalicious changed the title from "[WIP] Propagate blocks after PoW validation" to "Propagate blocks after PoW validation" on Aug 14, 2019
halfalicious requested review from chfast and gumb0, and removed the request for gumb0, on August 14, 2019 03:34
halfalicious (author) commented Aug 14, 2019

Potential optimization: EthereumCapability::propagateNewBlocks assumes that if a peer doesn't know about the most recent block in the set of new blocks, it doesn't know about any of them, which results in the whole set being sent to that peer (this is existing logic):

auto const peersWithoutBlock = selectPeers(
    [&](EthereumPeer const& _peer) { return !_peer.isBlockKnown(latestHash); });
auto const peersToSendNumber =
    std::max<std::size_t>(c_minBlockBroadcastPeers, std::sqrt(m_peers.size()));
std::vector<NodeID> const peersToSend = randomPeers(peersWithoutBlock, peersToSendNumber);
for (NodeID const& peerID : peersToSend)
    for (auto const& b : *_newBlocks)
    {
        RLPStream ts;
        m_host->prep(peerID, name(), ts, NewBlockPacket, 2)
            .appendRaw(b.blockData, 1)
            .append(b.verified.info.difficulty());
        auto itPeer = m_peers.find(peerID);
        if (itPeer != m_peers.end())
        {
            m_host->sealAndSend(peerID, ts);
            // We don't want to send new block hashes to these same peers
            itPeer->second.markBlockAsKnown(b.verified.info.hash());
        }
    }

I think this assumption is incorrect - we should check if a peer knows about a block on a block-by-block basis and only send new blocks to peers who don't know about them.

@gumb0 / @chfast : Thoughts?

halfalicious (author) commented:

Rebased (to address CHANGELOG conflict) and addressed PR feedback

halfalicious force-pushed the propagate-blocks-pow branch 2 times, most recently from 2eb39cb to c66fa7f, on August 17, 2019 21:15
halfalicious (author) commented:

cc @gumb0

Timer t;
tie(ir, m_syncBlockQueue, count) = bc().sync(m_bq, m_stateDB, m_syncAmount);

std::shared_ptr<VerifiedBlocks> verifiedBlocks = std::make_shared<VerifiedBlocks>();
A reviewer (Member) commented:

Could you add a comment explaining why we need a shared_ptr (we need to keep these blocks around until they're later processed on the network thread)?

halfalicious (author) replied:
Ah, good suggestion - I had already left a comment in this PR, but it would be nice to also have a comment in the code.

Propagate new blocks after we've validated their PoW. PoW (and other
checks) are done as part of importation into the block queue, so we
remove blocks from the queue once they've been validated, propagate
them to the appropriate peers, then import them into the blockchain.
Don't send new blocks if we're syncing or we're too far behind. This
new code follows the logic in EthereumCapability::maintainBlockHashes.

Also make some global constants constexpr and rename some constants for
consistency reasons.
We perform this check on the network thread because it turns out that std::atomic only supports types with noexcept ctors (i.e. not h256), and using a mutex to synchronize access to the latest-block-sent info seems like overkill.
Fix bug in randomPeers function
Fix comment formatting, return bad block hashes in BlockChain::sync() via existing tuple
rather than via out parameter, remove unused
EthereumCapability::randomPartitionPeers, optimize
EthereumCapability::randomPeers
halfalicious (author) commented:

Rebased

Fix some bugs:
* Only shuffle the peers list in randomPeers if a subset of
  them is being returned
* Update the hash of the latest block sent in EthereumCapability if
  we're too far behind - otherwise we'll always be behind and therefore
  never send any new blocks or new block hashes.

Also, change the CHANGELOG entry from "added" to "changed" (since this
isn't a new feature but a modification of existing behavior)
gumb0 (Member) commented Aug 23, 2019

> I think this assumption is incorrect - we should check if a peer knows about a block on a block-by-block basis and only send new blocks to peers who don't know about them.

Yeah, I think that would be reasonable, but for now I'd avoid further complicating this area of the code. It's difficult to test, and I'm not even sure whether the current logic works as expected.

@halfalicious halfalicious merged commit 5f3d000 into master Aug 23, 2019
Linked issue: Propagate new blocks after PoW check

3 participants