Description
Specification
In order to segregate nodes of different networks using the ClaimNetworkAccess tokens defined in #779, nodes need logic to refuse RPC calls from nodes that are out-of-network. Some RPC calls will need to be whitelisted to enable a level of out-of-network negotiation, such as authenticating into the network or requesting access.
Currently a NodeConnection has two stages: the creating stage, while you are awaiting the createNodeConnection factory, and the connected stage, when the NodeConnection has fully connected and the creation resolves. The problem is that we need to be able to authenticate without allowing most RPC calls and nominal traffic.
To this end we need a connected but unauthenticated state where the NodeConnection is fully operational but rejects RPC calls rather than serving them. NodeConnection.createNodeConnection should negotiate its access to the network before resolving with the completed connection. This means the connecting and authenticating states are hidden inside the creation of the NodeConnection, so there should be minimal changes to how the NodeConnectionManager handles these connections. However, the NodeConnection will need separate connected, authenticated and created events to reflect these stages.
- Connection event should be recorded by the audit domain as a connection
- Authenticated event should be recorded by the audit domain as well. Probably a new authenticated event. There will be a distinct forward and reverse authentication event.
- Not actually sure about the created event. It fills the same role as authenticated, so it is subject to discussion.
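The stages and events above can be sketched as a small state machine. This is only an illustration under assumptions: the class name, state names and event names below are hypothetical and are not taken from the existing NodeConnection code. It shows the created event firing only once both the forward and the reverse authentication have been recorded.

```typescript
// Hypothetical names throughout; this is a sketch, not the Polykey API.
type ConnectionState = 'connecting' | 'authenticating' | 'authenticated';
type ConnectionEvent =
  | 'connected'
  | 'authenticatedForward'
  | 'authenticatedReverse'
  | 'created';

class ConnectionLifecycle {
  private state: ConnectionState = 'connecting';
  private forwardDone = false;
  private reverseDone = false;
  // Events in the order they were emitted, e.g. for the audit domain.
  public readonly events: ConnectionEvent[] = [];

  public currentState(): ConnectionState {
    return this.state;
  }

  // The transport is up, but no nominal RPC traffic is allowed yet.
  public transportConnected(): void {
    if (this.state !== 'connecting') {
      throw new Error(`Cannot connect from state ${this.state}`);
    }
    this.state = 'authenticating';
    this.events.push('connected');
  }

  // Our authentication call to the peer succeeded.
  public forwardAuthenticated(): void {
    this.requireAuthenticating();
    this.forwardDone = true;
    this.events.push('authenticatedForward');
    this.completeIfBothDone();
  }

  // The peer's authentication call to us succeeded.
  public reverseAuthenticated(): void {
    this.requireAuthenticating();
    this.reverseDone = true;
    this.events.push('authenticatedReverse');
    this.completeIfBothDone();
  }

  private requireAuthenticating(): void {
    if (this.state !== 'authenticating') {
      throw new Error(`Cannot authenticate from state ${this.state}`);
    }
  }

  // Creation only completes once BOTH directions have authenticated.
  private completeIfBothDone(): void {
    if (this.forwardDone && this.reverseDone) {
      this.state = 'authenticated';
      this.events.push('created');
    }
  }
}
```

In this sketch the created event is redundant with the second authenticated event, which mirrors the open question above about whether a separate creation event is needed.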
While in the authenticating state, all non-whitelisted RPC calls need to be rejected outright. The NodeConnection shouldn't be available to make these calls at this stage anyway, but we still need to be secure about it. To avoid adding logic to every RPC handler, we should apply this rejection logic in the middleware. The middleware will need to refer back to the NodeConnection somehow to assess the authenticated state.
To authenticate, the NodeConnection needs to make an authentication RPC call and provide a valid network token. It's up to the handler of this call to reject the authentication with an error and kill the NodeConnection if it fails. Note that this needs to be symmetric in the forward and reverse directions: BOTH sides of the connection need to fully authenticate. This opens us up to annoying race conditions, so we need to be extra careful here. The handshake has been described within #779.
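A minimal sketch of the middleware check, assuming the middleware can see the method name and some view of the connection's authentication state. The handler name `nodesAuthenticateConnection`, the `AuthenticationView` interface and `guardRpcCall` are all hypothetical; how the real middleware reaches the NodeConnection is still open.

```typescript
// Hypothetical handler name; the real whitelist would live next to the RPC manifest.
const whitelistedMethods: ReadonlySet<string> = new Set<string>([
  'nodesAuthenticateConnection',
]);

// The minimal view of a connection that the middleware needs (an assumption).
interface AuthenticationView {
  isAuthenticated(): boolean;
}

type GuardDecision = { allow: true } | { allow: false; reason: string };

// Decide whether an incoming RPC call may reach its handler.
function guardRpcCall(
  method: string,
  connection: AuthenticationView,
): GuardDecision {
  // Whitelisted calls are always allowed so that authentication can happen.
  if (whitelistedMethods.has(method)) return { allow: true };
  // Everything else requires a fully authenticated connection.
  if (connection.isAuthenticated()) return { allow: true };
  return {
    allow: false,
    reason: `RPC method ${method} rejected: connection is not authenticated`,
  };
}
```

When the decision is a rejection, the middleware would kill the stream involved instead of forwarding the call, so individual handlers never need to check the state themselves.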
It is extremely important that the following conditions are met.
- A NodeConnection is only fully created if it fully authenticates.
- The NodeConnection is only added to the NodeConnectionManager's connection map if it fully authenticates.
- The connection details are only added to the NodeGraph if it fully authenticates.
- No non-whitelisted RPC calls are allowed or made during the Authenticating state.
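One way to picture these conditions is a single admission step that waits for both directions before touching any shared state. The sketch below is an assumption-laden illustration: `PendingConnection`, `admitConnection` and the use of a plain `Map` and `Set` in place of the connection map and NodeGraph are hypothetical stand-ins.

```typescript
// Hypothetical shape of a connection that is connected but not yet authenticated.
interface PendingConnection {
  nodeId: string;
  // Forward direction: our authentication call to the peer's handler.
  authenticateForward(): Promise<void>;
  // Reverse direction: settles when the peer's call to our handler succeeds or fails.
  reverseAuthenticated: Promise<void>;
  destroy(): Promise<void>;
}

// Record the connection only after BOTH directions have authenticated.
async function admitConnection(
  conn: PendingConnection,
  connections: Map<string, PendingConnection>, // stand-in for the connection map
  nodeGraph: Set<string>, // stand-in for the NodeGraph
): Promise<boolean> {
  try {
    // Waiting on both promises together avoids depending on which side finishes first.
    await Promise.all([conn.authenticateForward(), conn.reverseAuthenticated]);
  } catch {
    // Either direction failing kills the connection; nothing is recorded.
    await conn.destroy();
    return false;
  }
  connections.set(conn.nodeId, conn);
  nodeGraph.add(conn.nodeId);
  return true;
}
```

Because the map and graph are only written after `Promise.all` resolves, a failure in either direction leaves no trace of the connection.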
Additional context
Related: #779 - Defines the network access tokens and how they are verified.
Related: #770 - Parent issue
Tasks
- Add an authentication state to the NodeConnection.
- Add middleware logic to RPC to only allow non-whitelisted RPC calls to be handled if we are in the authenticated state. Otherwise kill the stream involved.
- Add an RPC handler for authenticating the connection's access to the network. This will resolve with no message, or throw an error explaining why it failed to authenticate. This should be whitelisted.
- Review and expand on the creation events to include separate connection, authentication and creation events.
- NodeConnection must switch to the authenticated state and fully create only after both the authentication handler and the authentication call succeed and resolve.
- We need to make sure timeouts work here. There are two levels of timeouts. It's up to the handler side to kill the connection if authentication fails, but the calling side can time out and kill the connection as a fail-safe. We also need an overall timeout for the whole process, so if it fails to authenticate fully before the timeout then we just give up and kill the connection.
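The two timeout levels in the last task could look roughly like the following. This is a sketch under assumptions: `withTimeout`, `authenticateWithDeadlines` and their parameters are hypothetical names, and plain timers are used here in place of whatever timer or cancellation utilities the codebase already has.

```typescript
// Reject if `task` has not settled within `ms` milliseconds.
function withTimeout<T>(task: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`));
    }, ms);
    task.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Hypothetical orchestration of both timeout levels on the calling side.
async function authenticateWithDeadlines(
  forwardCall: () => Promise<void>, // our authentication RPC call
  reverseAuthenticated: Promise<void>, // settles when the peer's call to us completes
  killConnection: () => void,
  callTimeoutMs: number,
  overallTimeoutMs: number,
): Promise<void> {
  // Level 1: fail-safe timeout on our own call, in case the handler side never answers.
  const forward = withTimeout(forwardCall(), callTimeoutMs, 'Forward authentication');
  try {
    // Level 2: overall deadline covering both directions of the handshake.
    await withTimeout(
      Promise.all([forward, reverseAuthenticated]),
      overallTimeoutMs,
      'Connection authentication',
    );
  } catch (e) {
    // Any failure or timeout means we give up and kill the connection once.
    killConnection();
    throw e;
  }
}
```

Killing the connection in a single `catch` keeps the cleanup in one place regardless of which deadline fired or which side failed.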