Skip to content
This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Implementing SSRS – Substrate Simple Remote Signer protocol#7365

Closed
gnunicorn wants to merge 20 commits intomasterfrom
ben-remote-signer
Closed

Implementing SSRS – Substrate Simple Remote Signer protocol#7365
gnunicorn wants to merge 20 commits intomasterfrom
ben-remote-signer

Conversation

@gnunicorn
Copy link
Copy Markdown
Contributor

@gnunicorn gnunicorn commented Oct 20, 2020

Implementation of the jsonrpc remote keystore over the http/ws protocol as the substrate keystore, including a separate example server implementation.

To Do:

  • docs
  • Metadata, License
  • hide behind non-default feature-flag
  • implement remaining (currently unused) functions on server

Note: test had ecdsa support, but was removed because of #7490

@github-actions github-actions Bot added the A3-in_progress Pull request is in progress. No review needed at this stage. label Oct 20, 2020
@gnunicorn gnunicorn changed the title Remote keystore interface, Client & Server Implementing SSRS – Substrate Simple Remote Signer protocol Oct 21, 2020
@NikVolf
Copy link
Copy Markdown
Contributor

NikVolf commented Oct 21, 2020

over the http/ws

Shouldn't it be https/wss? How do you secure the transport anyway?

@NikVolf NikVolf added the D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited label Oct 21, 2020
@gnunicorn
Copy link
Copy Markdown
Contributor Author

Shouldn't it be https/wss? How do you secure the transport anyway?

This is using regular jsonrpc_client_transports, which only support http and ws afaics.

@NikVolf
Copy link
Copy Markdown
Contributor

NikVolf commented Oct 21, 2020

Shouldn't it be https/wss? How do you secure the transport anyway?

This is using regular jsonrpc_client_transports, which only support http and ws afaics.

it is not a secure transport, especially for this task

@gnunicorn
Copy link
Copy Markdown
Contributor Author

gnunicorn commented Oct 21, 2020

it is not a secure transport, especially for this task

That is correct, and no one said it was. The expected environment for this until now, was to have a secured network, like run in a VPN and firewalled sub-network and/or expect the connection to be tunneled through external means (like SSH).

I see though that the regular user might not realize they have to do that, so I could see that we might want hide that this behind a feature flag you'd have to explicitly activate until we have a secure-default-way of doing it instead. WDYT?

@NikVolf
Copy link
Copy Markdown
Contributor

NikVolf commented Oct 21, 2020

it is not a secure transport, especially for this task

That is correct, and no one said it was. The expected environment for this until now, was to have a secured network, like run in a VPN and firewalled sub-network and/or expect the connection to be tunneled through external means (like SSH).

I see though that the regular user might not realize they have to do that, so I could see that we might want hide that this behind a feature flag you'd have to explicitly activate until we have a secure-default-way of doing it instead. WDYT?

Signer will be custom software implemented anyway. So I don't think there is much sense in using non-secure transport at all, when there are secure alternatives (for example, noise). If it is properly used, there will be no need in tunnelling and stuff. Noise has implementation in many languages if that is what required - even in pure C.

JSON-RPC overhead need is also not clear to me, but I am not sure if this is critical. How much signing happens on validator node per second/minute?

@gnunicorn
Copy link
Copy Markdown
Contributor Author

@NikVolf the goal was to get something out of the door quickly with the least developer overhead from our side to allow others to start experimenting with the workflow and use case – JSONRPC was all there and quick to get up and running, that's the only reason it was used. The way it stands, there are known security problems (like, the server still needs to have some knowledge of chain-info otherwise they can't securely sign VRFs, but that also means they have some connection to a chain or chain state, which requires internet to stay up to date) and performance issues (https and JSON being two things that could be replaced with more efficient things instead, but also that signing is mostly sync for in substrate still and the example and known others approaches also follow linear service patterns – like TEE can only do one signing at a time).

@burdges
Copy link
Copy Markdown

burdges commented Oct 21, 2020

I've a stupid question: I'd think half the tools for doing JSON RPC should do TLS, no?

@gnunicorn gnunicorn marked this pull request as ready for review November 3, 2020 09:25
@github-actions github-actions Bot added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Nov 3, 2020
@gnunicorn gnunicorn added B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Nov 3, 2020
@kirushik kirushik removed the D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited label Nov 11, 2020
@kirushik
Copy link
Copy Markdown

After conversation with @gnunicorn I've removed the needsaudit flag; our current SSRS implementation is for experimentation only, hidden behind a feature flag and should under no circumstances be used in anything production-related.

@gnunicorn is extending the readme to explicitly state the above.

(This shouldn't stop us from raising and documenting security-related issues with it, though — it's just by far not in shape to sustain any professional attention yet, and probably never will in its current form.)

@bkchr
Copy link
Copy Markdown
Member

bkchr commented Nov 11, 2020

After conversation with @gnunicorn I've removed the needsaudit flag; our current SSRS implementation is for experimentation only, hidden behind a feature flag and should under no circumstances be used in anything production-related.

@gnunicorn is extending the readme to explicitly state the above.

(This shouldn't stop us from raising and documenting security-related issues with it, though — it's just by far not in shape to sustain any professional attention yet, and probably never will in its current form.)

Maybe we should than put this into its own repo?

And what is the path we continue here? I thought this issue is soo pressing that we can deprecate sentries?

@kirushik
Copy link
Copy Markdown

And what is the path we continue here?

I must admit, I'm a bit confused here too.
I'm talking to @gnunicorn in private to better understand what should and what shouldn't go into our codebase/pass the review here.

@gnunicorn gnunicorn marked this pull request as draft November 13, 2020 09:52
@gnunicorn
Copy link
Copy Markdown
Contributor Author

gnunicorn commented Nov 13, 2020

Converted this back to draft to more clearly signal:

  • we don't the intend to merge or ship this as-is
  • this is an example implementation showing how in the new API other remote-protocols can easily be integrated
  • this is a highly insecure-by-default-example and before any of it could be used, must be given appropriate documentation

Unfortunately, no other, more secure protocol has been proposed until now, which leaves us with a couple of path to go ahead, but not decided on any yet:

  • leave this as is and wait for other protocols to come around
  • replace the given transport with an at least TLS-wrapped one and probably some further hardening, like certificate pinning or off-protocol key-exchange-requirement – none of which is currently supported by the libraries used
  • provide a generic protocol-plugging-mechanism and move the specific implementation into a example/experimental-named repository to make clear this is more about showing the API then the specific protocol

@burdges
Copy link
Copy Markdown

burdges commented Nov 13, 2020

I'd assumed remote signers were vaguely HSM-like with special hardware connections, like a second ethernet card or USB connection. Yes, I think an internet remote signer wants a secure handshake like TLS but with fixed transport keys and IP addresses for both sides, so no CAs or DNS, but even then..

Interestingly, if you look at Noise IK and KK then you'll observe the difference is sending the initiators long term secret inside the first flow, encrypted only by an ephemeral key exchange.

IK:
  <- s
  ...
  -> e, es, s, ss
  <- e, ee, se,

KK:
  -> s
  <- s
  ...
  -> e, es, ss
  <- e, ee, se,

As the initiator knows the respondents long-term secret, one could've a hybrid handshake between IK and KK in which foreign initiators reveal their identity, which permits whitelisting on the first flow, but special distinguished initiators hide their identity against an adversary who compromises the respondents' IP address but not the respondent machine. In other words, your remote signer can open connections that look exactly like normal node connections, but avoids immediately exposing your remote singer's IP address against BGP attacks, etc. This is probably useless since a signer's messages look different anyways, but technically the traffic observing attacker looks different form the BGP attacker.

@redzsina
Copy link
Copy Markdown

For now, we consider our review done for this PR - please refer to issue 21 in the internal tracker.

)
>, ServiceError> {
if config.keystore_remotes.len() > 1 {
return Err(ServiceError::Other(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, why is it remotes and not remote?

// FIXME: here would the concrete keystore been build,
// must return a concrete type (NOT `LocalKeystore`) that
// implements `CryptoStore` and `SyncCryptoStore`
Err("Remote Keystore not supported.")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: How would this be extended? do i have to fork substrate in this case? or is this crate node-template used to generate a new node implementation for the user?

})
}

/// Create a local keystore in memory.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't reflect what's happening inside the method


/// A remote based keystore that is either memory-based or filesystem-based.
pub struct RemoteKeystore {
client: RwLock<Option<Client>>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we want to read/write lock on a remote client? IMO, locking should only happen on the server side.

}
}

impl<Store: CryptoStore + 'static> Stream for KeystoreReceiver<Store> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be less verbose to have a method such as:

async fn run(mut self) {
    loop {
        request = self.receiver.next().await;
        self.process_request(request)
    }
}


fn sr25519_public_keys(&self, id: KeyTypeId) -> BoxFuture<Vec<sr25519::Public>> {
let receiver = self.send_request(RequestMethod::Sr25519PublicKeys(id));
Box::new(receiver.map(|e| match e {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: e here implies error but i think response here or result might better imply what the value is.

impl crate::RemoteSignerApi for GenericRemoteSignerServer {

fn sr25519_public_keys(&self, id: KeyTypeId) -> BoxFuture<Vec<sr25519::Public>> {
let receiver = self.send_request(RequestMethod::Sr25519PublicKeys(id));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up to my above suggestion to use async fn, here send_request could be replaced with a direct call to the client's sr25519_public_keys which we can return as a boxed future without .await.

Comment thread client/service/src/builder.rs Outdated

enum KeystoreContainerInner {
Local(Arc<LocalKeystore>)
trait AsCryptoRef {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: can this be renamed to AsCryptoStoreRef?

Comment thread client/service/src/builder.rs Outdated

struct DoubleWrap<T>(Arc<T>);

impl<T> AsCryptoRef for DoubleWrap<T> where T: CryptoStore + SyncCryptoStore + 'static {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: What happens if we get rid of DoubleWrap and implement AsCryptoRef on Arc<T> directly? the below methods could return self.clone() in that case and the code would be a bit simpler.

I have modified this snippet you showed me earlier to do this:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5d6a9deb6c38c00cab9d0198b95d690c

Comment thread client/service/src/builder.rs Outdated
/// Construct and hold different layers of Keystore wrappers
pub struct KeystoreContainer(KeystoreContainerInner);
pub struct KeystoreContainer {
remotes: Vec<Box<dyn AsCryptoRef>>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we plan to support multiple remotes, it would be nice to document the use case.

@gnunicorn
Copy link
Copy Markdown
Contributor Author

superseeded by #7628 .

@gnunicorn gnunicorn closed this Nov 30, 2020
@bkchr bkchr deleted the ben-remote-signer branch November 30, 2020 12:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants