Implement websocket support in Dropshot as an Extractor and #[channel] macro#403
Conversation
| description: None, | ||
| tags: vec![], | ||
| paginated: func_parameters.paginated, | ||
| extension_mode: func_parameters.extension_mode, |
There was a problem hiding this comment.
So this means a given endpoint can only be in one mode at a time, right? That makes sense for pagination and websockets, since they're mutually exclusive. I have no idea what other kinds of extensions we might dream up, but is it possible we could want to use more than one at a time?
There was a problem hiding this comment.
yeah - i suppose if such a thing comes to pass, we could change to be something more sophisticated than a bare enum, but i didn't want to try to anticipate unknown unknowns yet, just represent that pagination and websockets are mutually exclusive extensions
| }] | ||
| async fn example_api_websocket_counter( | ||
| _rqctx: Arc<RequestContext<()>>, | ||
| ws_upg: WebsocketUpgrade, |
20def1d to
7d64f54
Compare
6678f55 to
d438904
Compare
davepacheco
left a comment
There was a problem hiding this comment.
Cool! I had a few questions.
Would you mind updating the changelog to describe the new feature?
| extension_mode = match (extension_mode, metadata.extension_mode) { | ||
| (ExtensionMode::None, x) | (x, ExtensionMode::None) => x, | ||
| (x, y) if x != y => { | ||
| panic!("incompatible extension modes in tuple: {:?} != {:?}", x, y); |
There was a problem hiding this comment.
What does this condition mean?
There was a problem hiding this comment.
it means the two tuple elements were indicators of different non-None extension modes (i.e. Paginated and Websocket at this point), in either order, which we currently take to be mutually exclusive things
| * The consumer of this *must* call [WebsocketUpgrade::handle] (or if they do | ||
| * not wish to upgrade a given connection, [WebsocketUpgrade::do_not_upgrade]) | ||
| * on the owned instance in the endpoint's fn, or else WebsocketUpgrade's | ||
| * implementation of [Drop::drop] will panic. |
There was a problem hiding this comment.
Why's that? (why not implicitly call do_not_upgrade() on drop, say?)
It looks like it's a non-fatal error to handle the upgrade more than once (we produce an error) but it's fatal to not handle it at all. I don't have a strong idea here but I just wonder if there's a way to make it harder to accidentally panic in production. (One thought I had was that WebsocketUpgrade could be parametrized by an impl of some trait that would allow us to immediately and automatically invoke it exactly once for you...but then I'm not sure what the body of the function would do. Also, I'm not sure if it's desirable to be able to do stuff or look at the request before the upgrade -- presumably it is (e.g., authentication). Then I thought this could be a different kind of endpoint...but then I worry I'm overthinking it.)
There was a problem hiding this comment.
for the interface to be more discoverable (because .handle() has to be called for any of this to work; this way nobody's wondering why their connection isn't getting upgraded, even if they weren't originally clear on the docs) and less prone to incorrect use (e.g. accidental inclusion of a WebsocketUpgrade in a param list due to copy/paste errors, causing us to eventually generate an confusing/incorrect client from the resulting schema)
if we're very concerned about panics making it into production, we can make it be an error log in Drop instead, but that's less likely to be caught at development time -- maybe panic if debug assertions are on, else log?
| #[derive(Debug)] | ||
| pub struct WebsocketUpgrade(Option<WebsocketUpgradeInner>); | ||
|
|
||
| /** |
There was a problem hiding this comment.
Did you mean to put some docs here and on WebsocketUpgradeResponse?
There was a problem hiding this comment.
🤦♀️ sure did! thanks
ahl
left a comment
There was a problem hiding this comment.
Are other extractors permitted alongside WebsocketUpgrade? Or must it be the lone extractor?
What does this look like in the OpenAPI output? The OpenAPI spec folks have pretty much declared Websockets as "not HTTP APIs" and therefore out of scope.
I vaguely had intended to use AsyncAPI to describe websocket interfaces, both the channels and the messages. AsyncAPI and OpenAPI are at least sort of compatible in that I think we could produce a single JSON file with both descriptions in there.
Does it make sense to use the same endpoint macro with the constraints that the method must be a GET, the WebsocketUpgrade must be present, and the return type is Result<Response<Body>, HttpError>? Or might we want some other interface such as #[channel { path = ... } ] or #[endpoint { method = WEBSOCKET, ...?
| } | ||
|
|
||
| /** | ||
| * Dropshot/Progenitor features used by endpoints which are not a part of the base OpenAPI spec. |
There was a problem hiding this comment.
per our discussion on matrix, i'll file some issues after this merges about turning this AsyncAPI support (and perhaps coming up with a general.. philosophy? on how to approach other extensions, like pagination or anything we come up with in the future that's outside of the scope of AsyncAPI in particular)
| */ | ||
| #[derive(Copy, Clone, Debug, Default, Eq, PartialEq)] | ||
| pub enum ExtensionMode { | ||
| #[default] |
There was a problem hiding this comment.
Do we have a MSRV for dropshot? This is quite recent, no?
There was a problem hiding this comment.
i actually changed to this from a proper bare impl Default after an earlier PR comment pointed out that the rust-toolchain.toml for this project declares rust 1.62 - simple enough to revert to what i originally had if it's a concern at all
There was a problem hiding this comment.
I should have clarified. I don't know if we have an MSRV! Yes, I know we specify the toolchain; that's at least in part to ensure repeatable builds as we have several tests for the endpoint macros that are a bit.. finicky with regard to version dependence.
I love #[default]; I don't know if using it would be annoying for our internal consumers (or maybe for external consumers; 👋 I don't know if you exist, but if you do, we love you!)
There was a problem hiding this comment.
I did mention the derive Default but we don't have an MSRV I could find so depends on how conservative we wanna be here I guess.
In terms of our own internal consumers, of the ones I know of:
- propolis lists an MSRV of 1.62
- omicron is pinned on an old nightly (due to unstable features): nightly-2022-04-27
- crucible is also pinned on an even older nightly: nightly-2021-11-24
- buildomat doesn't have a rust-toolchain or note about version
- cio is pinned on an old nightly: nightly-2022-03-13
There was a problem hiding this comment.
Do the old nightly's support #[default]? If so I say we go ahead and keep it. If not... I guess maybe roll them forward first?
There was a problem hiding this comment.
i think i'm not quite as inclined to have us drag everything forward for something as trivial as some very small syntax sugar (vs. say, implementing an actual feature in such a way that otherwise wouldn't have been possible), but i wouldn't say no if other folks felt strongly otherwise
There was a problem hiding this comment.
Yea, I don't think that that should hold this up.
| async fn my_ws_endpoint( | ||
| rqctx: std::sync::Arc<dropshot::RequestContext<()>>, | ||
| websock: dropshot::WebsocketUpgrade, | ||
| ) -> Result<http::Response<hyper::Body>, dropshot::HttpError> { |
There was a problem hiding this comment.
Is it always the case that the Result::Ok response type must be Response<Body>? If so, do we check that?
There was a problem hiding this comment.
we don't explicitly check for it in proc-macro-land, but in the latest patch i did introduce some friendlier type aliases for what's going on - and the user need not concern themselves with the Result<Response<Body>, HttpError> one any more; that's all abstracted away by #[channel] and all their socket-handling function has to return is a very generic Result<(), Box<dyn Error+Send+Sync+'static> (chosen so we can log their errors, if any, and let them use the ergonomic ?)
| * [`tokio::io::AsyncWrite`] type (e.g. `tokio_tungstenite`). | ||
| * | ||
| * ``` | ||
| #[dropshot::endpoint { method = GET, path = "/my/ws/endpoint" }] |
There was a problem hiding this comment.
it seems like this always needs to be a GET request. Do we check that somewhere?
There was a problem hiding this comment.
i think this is resolved by the new #[channel...] approach, as we don't let method be specified at all in that macro (it's always GET for protocol=WEBSOCKETS)
| pub use server::ServerContext; | ||
| pub use server::{HttpServer, HttpServerStarter}; | ||
| pub use websocket::HyperUpgraded; | ||
| pub use websocket::SchemaWebsocketParams; |
They shouldn't be mutually exclusive (or else we wouldn't be able to do much useful with it, like say addressing a specific org/proj/inst in nexus by
It looks like a GET endpoint with a {
"/organizations/{organization_name}/projects/{project_name}/instances/{instance_name}/serial-console": {
"get": {
"tags": ["instances"],
"summary": "Connect to an instance's serial console stream",
"operationId": "instance_serial_console",
"parameters": [
{ "in": "path", "name": "instance_name", "required": true,
"schema": { "$ref": "#/components/schemas/Name" }, "style": "simple" },
{ "in": "path", "name": "organization_name", "required": true,
"schema": { "$ref": "#/components/schemas/Name" }, "style": "simple" },
{ "in": "path", "name": "project_name", "required": true,
"schema": { "$ref": "#/components/schemas/Name" }, "style": "simple" }
],
"responses": { "default": { "description": "", "content": { "*/*": {"schema": {}}}}},
"x-dropshot-websocket": true
}
}
}
Would it be reasonable (in the interest of providing a websockets interface in the nearer term for serial & VNC API work) to punt on the full extent of that for the moment with the "x-dropshot-websocket" sentinel in an OpenAPI path's "get" method, and then follow up by implementing the proper AsyncAPI version? For the use cases I'm thinking of, AsyncAPI's specifications are a bit heavier weight than we need, but that might be a blessing in this regard - I think we often want to forward other protocols through raw binary messages, so most of the work that I think this would end up unblocking would have a fairly trivial expression as far as just annotating everywhere it's used with an effective "yep, it's binary" in such a way that our macro could accept before having a "real" AsyncAPI implementation backing it.
I did have the thought of perhaps making I also had the use case of "what if we want to leave a path open for hybrid GET endpoints requests that may or may not be upgraded depending on how it's accessed," but I am increasingly unsure this use case would be worth the additional complexity.
|
Is it compatible with
Cool!
Absolutely, and I apologize if I seemed to be implying otherwise. I don't want to stand in the way of urgent progress; I also want to have line of sight to how we want this to work in the future.
Neat. I haven't used AsyncAPI. I inferred that it lets you specify the protocol such that e.g. we could generate a client that had the messages as Rust types... but that's just imagineering.
Should we validate at least that folks are using a
That's a neat idea for a follow-on. Thanks for the thorough discussion! |
|
so the little "panic in Drop if it's not handled" scheme is kinda invalidated when any of the other extractors fail to parse from the request - it never reaches the user's code for them to call |
845269d to
5ef073a
Compare
This allows endpoints meant for handling continuous websocket streams to be easily constructed by providing an async handler that accepts the upgraded websocket connection as an argument. This change also refactors the previous `paginated` bool to instead be a dedicated type describing any 'extension' modes we may add atop the basic OpenAPI.
| */ | ||
| pub struct ExtractorMetadata { | ||
| pub paginated: bool, | ||
| pub extension_mode: ExtensionMode, |
There was a problem hiding this comment.
@lifning it looks like this was a breaking change:
error[E0560]: struct `dropshot::ExtractorMetadata` has no field named `paginated`
--> nexus/src/authn/external/cookies.rs:51:29
|
51 | ExtractorMetadata { paginated: false, parameters: vec![] }
| ^^^^^^^^^ `dropshot::ExtractorMetadata` does not have this field
|
= note: available fields are: `extension_mode`, `parameters`
Could you please update the changelog?
rel: oxidecomputer/dropshot#403 Generated methods return a `WebsocketReactants` type that at present can only be unwrapped into its inner `http::Request` and `tokio::net::TcpStream` for the purpose of implementing against the raw websocket connection, but may later be extended as a generic to allow higher-level channel message definitions (rel: oxidecomputer/dropshot#429) The returned `Request` is an HTTP request containing the client's part of the handshake for establishing a Websocket, and the `TcpStream` is a raw TCP (non-Web-) socket. The consumer of the raw `into_request_and_tcp_stream` interface is expected to send the HTTP request over the TCP socket, i.e. by providing them to a websocket implementation such as `tokio_tungstenite::client_async(Request, TcpStream)`.
rel: oxidecomputer/dropshot#403 Generated methods return a `WebsocketReactants` type that at present can only be unwrapped into its inner `http::Request` and `tokio::net::TcpStream` for the purpose of implementing against the raw websocket connection, but may later be extended as a generic to allow higher-level channel message definitions (rel: oxidecomputer/dropshot#429) The returned `Request` is an HTTP request containing the client's part of the handshake for establishing a Websocket, and the `TcpStream` is a raw TCP (non-Web-) socket. The consumer of the raw `into_request_and_tcp_stream` interface is expected to send the HTTP request over the TCP socket, i.e. by providing them to a websocket implementation such as `tokio_tungstenite::client_async(Request, TcpStream)`.
This moves the websocket-related boilerplate into the code generated by Dropshot's `#[channel]` annotation macro. (rel: oxidecomputer/dropshot#403 )
This moves the websocket-related boilerplate into the code generated by Dropshot's `#[channel]` annotation macro. (rel: oxidecomputer/dropshot#403 )
This moves the websocket-related boilerplate into the code generated by Dropshot's `#[channel]` annotation macro. (rel: oxidecomputer/dropshot#403 )
This moves the websocket-related boilerplate into the code generated by Dropshot's `#[channel]` annotation macro. (rel: oxidecomputer/dropshot#403 )
This moves the websocket-related boilerplate into the code generated by Dropshot's `#[channel]` annotation macro. (rel: oxidecomputer/dropshot#403 )
This moves the websocket-related boilerplate into the code generated by Dropshot's `#[channel]` annotation macro. (rel: oxidecomputer/dropshot#403 )
This moves the websocket-related boilerplate into the code generated by Dropshot's `#[channel]` annotation macro. (rel: oxidecomputer/dropshot#403 )
This allows endpoints meant for handling continuous websocket streams to be easily constructed by providing an async handler to an extractor parameter in the endpoint function definition.
This change also refactors the previous
paginatedbool to instead be a dedicated type describing whether an endpoint uses pagination or websockets Dropshot-specific extensions, aspirationally leaving room for describing any more extension modes we may add atop the basic OpenAPI.