MSC3886: Simple client rendezvous capability #3886
> `@@ -0,0 +1,236 @@`
> `# MSC3886: Simple client rendezvous capability`

There was a problem hiding this comment.
On one hand, this is a really simple and elegant standalone function. On the other hand, I'm a bit worried that it duplicates the semantics of the to-device API (i.e. basic store & forward between devices), albeit with short-polling rather than long-polling.
I wonder how bad it would be if we opened up to-device messages to guests, and used the existing APIs for rendezvous? So a new device would go and /login as a guest to get a temporary access token, and then publish its device ID & HS url in its QR code to let another device rendezvous with it.
My only reason for proposing this is to avoid having two store-and-forward APIs which look suspiciously similar, but have different semantics (short/long poll), and so require more code for client implementors.
There was a problem hiding this comment.
Understood. I'll work up an alternative based on to-device messages and see how that feels.
There was a problem hiding this comment.
I have started some discussion on the to-device based alternative as part of #3903
There was a problem hiding this comment.
> I wonder how bad it would be if we opened up to-device messages to guests, and used the existing APIs for rendezvous? So a new device would go and /login as a guest to get a temporary access token, and then publish its device ID & HS url in its QR code to let another device rendezvous with it.

ugh, the complexity of this feels horrible to me.
> My only reason for proposing this is to avoid having two store-and-forward APIs which look suspiciously similar, but have different semantics (short/long poll), and so require more code for client implementors.

Sure, having two store-and-forward APIs is rather less than ideal, but this one is so simple and easy to use that I don't really buy that it's a meaningful amount of extra code for clients compared to having to grab a temporary access token and then start /syncing.
For me, the simplicity of this proposal outweighs the fact it looks a bit like to-device messaging. (Or even matrix rooms, if you squint hard enough and invent "ephemeral rooms".)
The only thing I'd say here is that it would be good if the "Alternatives" section in this MSC said something about this idea (even if it's just a link to MSC3903's alternatives section).
There was a problem hiding this comment.
I think I was broadly coming to a similar conclusion to Rich. Adding guest access to to-device feels about as complex as this separate impl.
There was a problem hiding this comment.
Given the above, it appears we've settled on using a new channel rather than exposing to-device to guests. @matrix-org/spec-core-team if you disagree then please raise comments :)
This was proposed as ready for FCP by the author, however it appears to have some conflicts with to-device messaging. For now, I'm going to put it into "needs technical design review". This doesn't mean it's not ready for FCP, just that the review's tone should take a slightly different shape. @matrix-org/spec-core-team please review this for technical design.
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
> `R->>-B: 202 Accepted`
>
> `Note over A,B: Rendezvous now established`

There was a problem hiding this comment.
My understanding of this is that A and B take turns writing to the same rendezvous URI until they're done. So when it's B's turn to write, A keeps polling (using the ETag) until the server says the data has changed, and vice versa.
What happens if B tries to write, but gets some sort of network error, or an error from a proxy? If the server got B's data, but B received a network error, then it seems to me what could happen is:
- A receives B's data, thinks it's now their turn to send data, so sends their data and gets a new ETag
- B retries the request, overwriting A's data (and never receiving it)
- A polls for new data, using the new ETag
- since B overwrote A's data, the data doesn't match the ETag, so A gets the data B sent, again
So B will miss a message from A, and A will get a duplicate message.
There was a problem hiding this comment.
Perhaps we could mitigate this by using an RFC 7232 `If-Match` on the PUT requests?
There was a problem hiding this comment.
ISTM that every PUT should be required to cite a previous ETag so that the rendezvous server can enforce a linear ordering. (The initial ETag is included in the POST and GET response, so both A and B should be fully aware of it.)
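To make the `If-Match` idea concrete, here is a minimal sketch of how a rendezvous server might enforce it. `RendezvousPoint`, its counter-based ETags and the `put` signature are assumptions made up for illustration, not text from the MSC; the 202/412 status codes follow the responses discussed in this PR. The script replays the retry scenario described above.

```python
import itertools
from typing import Optional, Tuple


class RendezvousPoint:
    """In-memory stand-in for one rendezvous URI (hypothetical helper)."""

    def __init__(self, payload: bytes):
        self._versions = itertools.count(1)
        self.payload = payload
        self.etag = self._next_etag()

    def _next_etag(self) -> str:
        # Assumption: the ETag comes from a counter, so every accepted
        # write gets a fresh tag.
        return f'"{next(self._versions)}"'

    def put(self, payload: bytes, if_match: Optional[str]) -> Tuple[int, str]:
        """Return (HTTP status, current ETag)."""
        if if_match != self.etag:
            # The writer has not seen the latest payload: refuse to overwrite.
            return 412, self.etag
        self.payload = payload
        self.etag = self._next_etag()
        return 202, self.etag


point = RendezvousPoint(b"hello from A")
initial = point.etag

# B's write reaches the server, but B only sees a network error.
status_b, etag_b = point.put(b"B1", if_match=initial)
# A polls, receives B1, and replies citing the ETag it just saw.
status_a, etag_a = point.put(b"A2", if_match=etag_b)
# B retries with the stale ETag; the server rejects instead of overwriting.
status_retry, current = point.put(b"B1", if_match=initial)

print(status_b, status_a, status_retry, point.payload)
```

With the precondition in place, B's retry fails with 412 and A's payload survives, so B can GET the current payload rather than clobber it.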
> HTTP request headers:
>
> - `If-None-Match` - optional, as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.if-none-match) server will only return data if given ETag does not match

There was a problem hiding this comment.
Might be nice for servers to have the option to delay responding until they get content that doesn't match the ETag, so we can do long-polling.
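A rough sketch of what that optional long-polling could look like on the server side, using `asyncio`. `LongPollPoint`, the version-counter ETag and the `timeout` parameter are all hypothetical; the MSC text quoted here only defines the `If-None-Match` short-poll behaviour.

```python
import asyncio
from typing import Optional, Tuple


class LongPollPoint:
    """Hypothetical rendezvous point whose GET can wait for a change."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self.version = 1
        self._changed = asyncio.Condition()

    @property
    def etag(self) -> str:
        return f'"{self.version}"'

    async def put(self, payload: bytes) -> str:
        async with self._changed:
            self.payload = payload
            self.version += 1
            self._changed.notify_all()  # wake any waiting GETs
            return self.etag

    async def get(
        self, if_none_match: Optional[str], timeout: float
    ) -> Tuple[int, Optional[bytes], str]:
        async with self._changed:
            try:
                # Hold the request until the ETag differs, or give up.
                await asyncio.wait_for(
                    self._changed.wait_for(lambda: self.etag != if_none_match),
                    timeout,
                )
            except asyncio.TimeoutError:
                return 304, None, self.etag
            return 200, self.payload, self.etag


async def demo() -> Tuple[int, Optional[bytes], int]:
    point = LongPollPoint(b"from A")
    seen = point.etag

    async def writer() -> None:
        await asyncio.sleep(0.05)
        await point.put(b"from B")

    writer_task = asyncio.create_task(writer())
    status, body, etag = await point.get(if_none_match=seen, timeout=1.0)
    await writer_task
    # Nothing changes after this point, so the next poll times out with 304.
    idle_status, _, _ = await point.get(if_none_match=etag, timeout=0.05)
    return status, body, idle_status


result = asyncio.run(demo())
print(result)
```

A server that does not want to hold connections open could pass `timeout=0`-like behaviour and fall back to plain short-polling, so clients would not need to know which mode is in use.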
> Because this is an entirely new set of functionality it should not cause issue with any existing Matrix functions or capabilities.
>
> The proposed protocol requires the devices to have IP connectivity to the server which might not be the case in P2P scenarios.

There was a problem hiding this comment.
One potential issue here is that if A sends a message to B, then waits for a message from B using the ETag, but the message that B sends to A happens to be exactly the same as the message that A sent, then A will get the 304 Not Modified response, and never realize that B sent a message. So anything built on top of this needs to ensure that a message is never identical to the preceding message.
There was a problem hiding this comment.
Is this still a problem? From the current text, it sounds like 304 Not Modified will only be returned when a matching ETag is supplied in an `If-None-Match`. Given the resolution of this thread, clients will have to supply a previous ETag when doing a PUT, which means we no longer have to rely on the sameness of the content to decide whether the content has been modified. That is, a PUT request that specifies the previous ETag A should be regarded as altering the payload at A, even if the payload is unchanged, and therefore, it should be assigned a new ETag.
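The difference between the two ETag strategies can be shown in a few lines. This is only an illustration: `content_etag` (a SHA-256 of the payload) and the counter-style tags are assumed schemes, and the MSC text quoted here does not say how servers derive ETags.

```python
import hashlib


def content_etag(payload: bytes) -> str:
    # Assumed scheme 1: the ETag is a hash of the payload bytes.
    return '"' + hashlib.sha256(payload).hexdigest() + '"'


def poll_status(current_etag: str, if_none_match: str) -> int:
    # What a GET with If-None-Match would return.
    return 304 if current_etag == if_none_match else 200


message = b'{"type":"ack"}'

# A writes `message`, then B happens to write identical bytes.
etag_after_a = content_etag(message)
etag_after_b = content_etag(message)
content_result = poll_status(etag_after_b, if_none_match=etag_after_a)

# Assumed scheme 2: every accepted PUT is assigned a new tag.
version_after_a, version_after_b = '"1"', '"2"'
versioned_result = poll_status(version_after_b, if_none_match=version_after_a)

print(content_result, versioned_result)
```

Under the content-hash scheme A's poll returns 304 and B's message goes unnoticed; with a tag per write the same poll returns 200.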
turt2live left a comment
There was a problem hiding this comment.
Overall, this MSC seems like a good approach for the problem. I've left some comments about things that should be clarified before the MSC can go up for FCP.
> - the user ID
> - facilitation of issuing a new access token
> - device ID for end-to-end encryption
> - device keys for end-to-end encryption

There was a problem hiding this comment.
We shouldn't be sending (private) device keys over the wire like this. They should be generated by the new device, which may be the device ID given, but not transmitted over the wire.
> - any data up to maximum size allowed by the server
>
> HTTP response codes:

There was a problem hiding this comment.
These also need to have Matrix error codes to go with them please
There was a problem hiding this comment.
See aee7d81. I have proposed M_DIRTY_WRITE for the HTTP 412 Precondition Failed case where Alice attempts to write but hasn't seen Bob's latest write.
There was a problem hiding this comment.
Although I self-bikeshedded a different name in 3fecfcd.
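For reference, a sketch of what a 412 response carrying a Matrix error code might look like. The `errcode`/`error` JSON shape is the standard Matrix error body; `M_DIRTY_WRITE` is the name first proposed above (later renamed in the commit mentioned, which is not reproduced here), and the `matrix_error` helper and the message text are hypothetical.

```python
import json
from typing import Dict, Tuple


def matrix_error(
    status: int, errcode: str, message: str, etag: str
) -> Tuple[int, Dict[str, str], bytes]:
    """Build (status, headers, body) for an error response (hypothetical helper)."""
    body = json.dumps({"errcode": errcode, "error": message}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # The current ETag lets the client see what it missed.
        "ETag": etag,
    }
    return status, headers, body


status, headers, body = matrix_error(
    412,
    "M_DIRTY_WRITE",  # name as first proposed in this thread
    "The supplied ETag does not match the current payload",
    etag='"7"',
)
print(status, headers["ETag"], body.decode("utf-8"))
```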
> - `Location` - required, the allocated rendezvous URI which can be on a different server
> - `X-Max-Bytes` - required, the maximum allowed bytes for the payload

There was a problem hiding this comment.
Why are these headers and not response body parameters?
There was a problem hiding this comment.
On reflection I would agree that if it is going to be part of the C-S API then it would make sense to consider consistency with the rest of the C-S API where headers are not used.
There was a problem hiding this comment.
Are we concerned with just these two headers? Or do we want all of the response and request headers to be expressed via HTTP bodies?
There was a problem hiding this comment.
The response body for GET ought to be just the payload itself. Since the payload data is an arbitrary byte sequence, it would be painful to embed this in JSON. Therefore I would encourage the current GET response headers (Content-Type, ETag, Expires, Last-Modified) to continue to be expressed via headers. For consistency it makes sense to do so in all other responses.
For POST this leaves the two highlighted headers: Location and X-Max-Bytes. We could present them as a JSON-encoded body, but it would seem odd to spread the POST response metadata in two places without any meaningful distinction to justify it. My vote would be to leave things as they are. But I neither have a vote, nor any strong opinions.
There was a problem hiding this comment.
> The response body for GET ought to be just the payload itself. Since the payload data is an arbitrary byte sequence, it would be painful to embed this in JSON.

We could base64 encode the payload, but I tend to agree. Not all things need to be shoehorned through JSON.
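A quick way to see the cost of the base64 route: the sketch below wraps an arbitrary binary payload in a JSON body and compares sizes. The `payload` field name and the 1024-byte sample are assumptions for illustration only.

```python
import base64
import json

# 1024 arbitrary bytes, including values that are not valid UTF-8 text.
payload = bytes(range(256)) * 4

# Raw transfer: the HTTP body is just the payload.
raw_len = len(payload)

# JSON transfer: the payload has to be base64-encoded first.
encoded = base64.b64encode(payload).decode("ascii")
json_body = json.dumps({"payload": encoded}).encode("utf-8")
wrapped_len = len(json_body)

# The receiver must undo both layers to recover the bytes.
recovered = base64.b64decode(json.loads(json_body)["payload"])

print(raw_len, len(encoded), wrapped_len)
```

Base64 expands every 3 bytes to 4 characters, so a server-side limit such as `X-Max-Bytes` would buy roughly a quarter less payload if the body were JSON-wrapped.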
> - any data up to maximum size allowed by the server
>
> HTTP response codes:

There was a problem hiding this comment.
Matrix error codes here too please, and probably a 400 definition to cover missing headers and such
There was a problem hiding this comment.
(comment applies throughout remainder of proposal)
> HTTP response headers for `201 Created`:
>
> - `Location` - required, the allocated rendezvous URI which can be on a different server

There was a problem hiding this comment.
Presumably this URI needs to be unguessable, to prevent attackers from guessing it and impersonating the intended recipient?
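One way a server could satisfy this, sketched with Python's `secrets` module. The `/rendezvous/` path prefix and the choice of 32 random bytes are assumptions, not values from the MSC.

```python
import secrets

# Hypothetical path prefix; the MSC leaves the URI up to the server.
PREFIX = "/rendezvous/"


def new_rendezvous_path() -> str:
    # 32 bytes from the OS CSPRNG, encoded as URL-safe base64 text.
    return PREFIX + secrets.token_urlsafe(32)


first = new_rendezvous_path()
second = new_rendezvous_path()
print(first)
```

With 256 bits of randomness in the path, enumerating live rendezvous URIs is not practical, though the URI still has to be kept out of logs and referrers for that to hold.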
> HTTP response headers for `202 Accepted` and `412 Precondition Failed`:
>
> - `ETag` - required, ETag for the current payload at the rendezvous point as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.etag)
> - `Expires` - required, the expiry time of the rendezvous as per [RFC7234](https://httpwg.org/specs/rfc7234.html#header.expires)

There was a problem hiding this comment.
Is the intention that the expiry time is incremented every time the rendezvous payload is updated?
"Dirty write" refers to a specific SQL phenomenon regarding transaction isolation, see e.g. https://blog.acolyer.org/2016/02/24/a-critique-of-ansi-sql-isolation-levels/ Instead, prefer the word "concurrent".
Co-authored-by: Denis Kasak <dkasak@termina.org.uk>
It is proposed that MSC4108 supersedes this MSC.
Closing this PR as #4108 is now ready for review.
Rendered
307redirect: Implementation of HTTP 307 response for MSC3886 POST endpoint synapse#14018