Changes from 11 commits
392 changes: 392 additions & 0 deletions proposals/2477-user-defined-ephemeral-events.md
@@ -0,0 +1,392 @@
# MSC2477: User-defined ephemeral events in rooms

Matrix currently handles the transfer of data in the form of Persistent- as well as Ephemeral Data
Units, both of which follow the same general design in the way they're encoded and transferred over
both the Client-Server and Server-Server API.

Currently, users are only able to provide their own event types and data in the case of
persistent data, in the form of state events as well as messages / timeline events.
The sending of ephemeral data by clients, on the other hand, is currently limited to typing
notifications, event receipts, and presence updates, which greatly limits the potential usefulness
of ephemeral events as a general mechanism for transferring short-lived data.

Therefore, this proposal suggests extending both the Client-Server and Server-Server APIs to allow
users to transfer arbitrary ephemeral data types and content into rooms in which they have the right
to do so.


## Proposal

The proposed change is to add support for users to provide their own data types and content, in a
similar manner to the already existing support for users to send their own types of persistent data.

Note though that this proposal does not include any support for sending user-defined ephemeral
events which are not explicitly bound to rooms, like the global `m.presence` event.


Examples of how this feature could be used include regular status updates for a user-requested
long-lived task, which a bot might have started for a received event, or perhaps a GPS live-location
feature, where participating clients would regularly post their current location relative to a
persistent geo-URI event. Such a feature could be used for organizing meetups, or for actively
tracking the locations of vehicles in an autonomous fleet, maybe along with persistent messages
posted at a lower rate to generate a timeline.

The example that will be used throughout this proposal is an ephemeral data object that tracks
the current status of a user-requested 3D print, with some basic printer- and print-status
information being sent every few seconds to a control room, including a reference to the event that
the status refers to - which the client could use to render a progress bar or some other
graphic.

### Addition of an ephemeral event sending endpoint to the Client-Server API

The addition to the CS API is the endpoint
`PUT /_matrix/client/v3/rooms/{roomId}/ephemeral/{eventType}/{txnId}`, which would act in an almost
identical manner to the event sending endpoint that is already present.
An example of how an update might be posted using the new endpoint;

```
PUT /_matrix/client/v3/rooms/%21636q39766251%3Aexample.com/ephemeral/com.example.3dprint/19914 HTTP/1.1
Content-Type: application/json

{
  "print_event_id": "$E2RPcyuMUiXyDkQ02ASEbFxcJ4wFNrt5JVgov0wrqWo",
  "printer_id": 10,
  "status": {
    "hotend_c": 181.4,
    "bed_c": 62.5,
    "position": [54, 275, 87.2]
  },
  "time": {
    "elapsed": 4324,
    "estimated": 7440
  }
}
```
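
As a rough illustration of how a client might build the request path for this endpoint, here is a minimal Python sketch. The helper name `ephemeral_send_path` is hypothetical and not part of the proposal; the only thing it demonstrates is that the room ID, event type and transaction ID each need to be percent-encoded as path segments.

```python
from urllib.parse import quote


def ephemeral_send_path(room_id: str, event_type: str, txn_id: str) -> str:
    """Build the request path for the proposed ephemeral-send endpoint.

    Hypothetical helper, not part of the proposal.
    """
    # safe="" makes quote() escape "/" too; "!" and ":" are escaped by default.
    parts = [quote(part, safe="") for part in (room_id, event_type, txn_id)]
    return "/_matrix/client/v3/rooms/{}/ephemeral/{}/{}".format(*parts)


path = ephemeral_send_path("!636q39766251:example.com", "com.example.3dprint", "19914")
print(path)
```

The printed path matches the one used in the example request above.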

Example of a response;

Status code 200:

As EDUs do not have event IDs, this is an empty JSON object

```json
{}
```

Status code 400:

The user tried to send an `m.*` EDU, instead of using the type-specific endpoint
Member

I'm not sure if we should make this off-limits to anything m.*. In fact, I wonder if we should go the opposite way and transition all room-based ephemeral messages to this endpoint - so read receipts and typing notifications.

At the moment, typing notifications are using:

PUT /_matrix/client/v3/rooms/{roomId}/typing/{userId}

{
  "timeout": 30000,
  "typing": true
}

and read markers are sent with:

POST /_matrix/client/v3/rooms/{roomId}/read_markers

{
  "m.fully_read": "$somewhere:example.org",
  "m.read": "$elsewhere:example.org"
}

The {userId} bit of the typing endpoint is especially useless, and could be cleaned up in the same move.

Presence and to-device messages would be excluded from this, as they are not bound to a room.

Contributor

Read markers seem to be a bit of a special case, as they have an effect on the server to store something quasi-persistent, shouldn't read_markers stay out of this then? or what is the idea behind that?

Contributor Author (@ananace, Nov 30, 2021)

I wrote this MSC on the assumption that the custom EDUs would be a separate beast to the built-in EDUs, but I guess it could just as well be done to instead have the server handle well-defined EDUs in a certain manner. So that you - as a client dev/user - could just post m.typing or m.receipt/m.fully_read/m.read EDUs instead of calling the specific endpoints for them.

Member

> Read markers seem to be a bit of a special case, as they have an effect on the server to store something quasi-persistent, shouldn't read_markers stay out of this then? or what is the idea behind that?

Good point. m.fully_read does not fit this model, as its purpose is to store your read-up-to position, which lives in the current user's room account data. The linked endpoint just happens to allow optionally sending an m.read receipt in the same call.

Instead, we should be thinking about whether to replicate the behaviour of POST /_matrix/client/v3/rooms/{roomId}/receipt/{receiptType}/{eventId} with this then.

With that, I'm not so sure. They do fall under the banner of EDU, but a receipt is always tied to an event ID (e.g. I read up to this event ID). We'd need to include that event ID in the body of the request for m.read or for any other type of receipt. That can work, though I suppose it comes down to how much we'd like to differentiate receipts as their own thing, versus just another type of EDU.


```json
{
  "errcode": "M_UNKNOWN",
  "error": "Cannot send built-in ephemeral types with this endpoint"
}
```

Status code 403:

Power levels forbid the user from sending the attempted EDU

```json
{
  "errcode": "M_FORBIDDEN",
  "error": "You do not have permission to send the EDU"
}
```

User is not in the room

```json
{
  "errcode": "M_FORBIDDEN",
  "error": "You are not a member of the room"
}
```

Status code 429:

The request was rate-limited

```json
{
  "errcode": "M_LIMIT_EXCEEDED",
  "error": "Too many requests",
  "retry_after_ms": 2000
}
```
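
The responses listed above could be produced by a server-side check along the following lines. This is only a sketch: the function name `check_ephemeral_send`, its boolean inputs, and the order in which the checks run are assumptions, and rate limiting (the 429 case) is deliberately left out.

```python
from typing import Any, Dict, Tuple

Response = Tuple[int, Dict[str, Any]]


def check_ephemeral_send(event_type: str, is_member: bool, has_power: bool) -> Response:
    """Return (status, body) for a send attempt.

    Hypothetical helper; the check order is an assumption, not specified above.
    """
    if event_type.startswith("m."):
        # Built-in types are rejected by this endpoint in the proposal as written.
        return 400, {
            "errcode": "M_UNKNOWN",
            "error": "Cannot send built-in ephemeral types with this endpoint",
        }
    if not is_member:
        return 403, {"errcode": "M_FORBIDDEN", "error": "You are not a member of the room"}
    if not has_power:
        return 403, {
            "errcode": "M_FORBIDDEN",
            "error": "You do not have permission to send the EDU",
        }
    # EDUs have no event ID, so a successful response is an empty object.
    return 200, {}


print(check_ephemeral_send("com.example.3dprint", is_member=True, has_power=True))
```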

### Extension of power levels to handle user-defined ephemeral events

As it would be possible for malicious users to use the user-defined events to flood a room with
invisible traffic, increasing the bandwidth usage for all connected servers, this proposal
also suggests extending the power levels to handle ephemeral types as well.

In any room version implementing this MSC, the auth rules concerning ephemeral events in the
`m.room.power_levels` event are;

- `ephemeral` (`{string: integer}`) - A mapping of EDU types to the power-level required to send them
- `ephemeral_default` (`integer`) - The default power-level required to send any EDU not listed in
  the above mapping

These new keys are to function in an identical manner to the already existing `events` and
`events_default` keys. If there is no `ephemeral_default` in the `m.room.power_levels` event, its
value is assumed to be 50. If there is no `ephemeral` in the `m.room.power_levels` event, all types
are considered to be `ephemeral_default`. If there is no `m.room.power_levels` event at all, the
default is 0 - which would then not allow any ephemeral events to be sent.
Member

I found this paragraph a bit difficult to parse. Partly because it's all one sentence. How about:

> These new keys are to function in an identical manner to the already existing events and
> events_default keys. The default value for ephemeral_default is 50, while the default value for the
> ephemeral dictionary would be {} or an empty mapping. This would imply that all ephemeral event types
> would require a power level of at least ephemeral_default to send.

I think the last line about a missing m.room.power_levels state event is not necessary. According to the spec, rooms must contain an m.room.power_levels when created: https://spec.matrix.org/v1.2/client-server-api/#post_matrixclientv3createroom

Contributor Author

Decided to rewrite it in a slightly different manner, should hopefully be more easily parsed now.


It is therefore recommended for servers to include at least the following `ephemeral` configuration
for all newly created rooms of any room version implementing this MSC, to allow for the sending of
the default ephemeral events in Matrix;

```json
{
  "m.receipt": 0,
  "m.typing": 0
}
```
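
To make the lookup concrete, here is a minimal Python sketch of how a server might resolve the required level for an EDU type from the content of an `m.room.power_levels` event. The function names are hypothetical, and the sketch only covers rooms that have a power levels event; the `users` and `users_default` handling follows the existing power level rules.

```python
from typing import Any, Dict

# Default for `ephemeral_default` when the key is absent, as stated in this proposal.
DEFAULT_EPHEMERAL_LEVEL = 50


def required_ephemeral_level(power_levels: Dict[str, Any], edu_type: str) -> int:
    """Level needed to send `edu_type`, falling back to `ephemeral_default`."""
    ephemeral = power_levels.get("ephemeral", {})
    if edu_type in ephemeral:
        return ephemeral[edu_type]
    return power_levels.get("ephemeral_default", DEFAULT_EPHEMERAL_LEVEL)


def user_level(power_levels: Dict[str, Any], user_id: str) -> int:
    """Existing behaviour: per-user level, else `users_default`, else 0."""
    return power_levels.get("users", {}).get(user_id, power_levels.get("users_default", 0))


def may_send_ephemeral(power_levels: Dict[str, Any], user_id: str, edu_type: str) -> bool:
    return user_level(power_levels, user_id) >= required_ephemeral_level(power_levels, edu_type)


# Hypothetical room: the bot is a moderator, built-in EDUs are open to everyone.
content = {
    "users": {"@printbot:example.com": 50},
    "ephemeral": {"m.receipt": 0, "m.typing": 0},
}
print(may_send_ephemeral(content, "@printbot:example.com", "com.example.3dprint"))
print(may_send_ephemeral(content, "@alice:matrix.org", "com.example.3dprint"))
```

With this content the bot may send `com.example.3dprint` updates, while a regular user may only send the two listed built-in types.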

### Extension of the room-specific ephemeral data received in /sync responses

Because the user-defined ephemeral events can't be aggregated and massaged by Synapse in a simple
manner, this MSC instead requires adding a few more fields to the room-specific ephemeral events as
they are encoded in a sync response. The additions in question are;

- `sender` (`string`) - The fully qualified ID of the user that sent the EDU
- `origin_server_ts` (`integer`) - The timestamp in milliseconds on the originating homeserver when
  this event was sent

To reduce the scope of changes required by this proposal, the suggestion is to allow the original
`m.*` events to skip these keys where no value could be easily assigned to them, e.g. typing notices
and read receipts.

```json
{
  "next_batch": "...",
  // ...
  "rooms": {
    "join": {
      "!636q39766251:example.com": {
        // ...
        "timeline": {
          "events": [
            {
              "content": {
                "gcode": "mxc://example.com/GEnfasiifADESSAF",
                "printer": 10
              },
              "type": "com.example.3dprint_request",
              "event_id": "$4CvDieFIFAzSYaykmBObZ2iUhSa5XNEUnC-GQfLl2yc",
              "room_id": "!636q39766251:example.com",
              "sender": "@alice:matrix.org",
              "origin_server_ts": 1432735824653,
              "unsigned": {
                "age": 5558
              }
            },
            {
              "content": {
                "body": "Print of fan_shroud_v5.gcode started on printer 10, ETA is 2h. Stream is available at https://example.com/printers/10.m3u8",
                "com.example.3dprint": {
                  "gcode": "mxc://example.com/GEnfasiifADESSAF",
                  "printer": 10,
                  "video": "https://example.com/printers/10.m3u8",
                  "eta": 7253
                },
                "m.relates_to": {
                  "m.in_reply_to": {
                    "event_id": "$4CvDieFIFAzSYaykmBObZ2iUhSa5XNEUnC-GQfLl2yc"
                  }
                },
                "msgtype": "m.text"
              },
              "origin_server_ts": 1432735825887,
              "sender": "@printbot:example.com",
              "type": "m.room.message",
              "unsigned": {
                "age": 4324
              },
              "event_id": "$E2RPcyuMUiXyDkQ02ASEbFxcJ4wFNrt5JVgov0wrqWo",
              "room_id": "!636q39766251:example.com"
            }
          ]
        },
        "ephemeral": {
          "events": [
            {
              "content": {
                "user_ids": [
                  "@alice:matrix.org",
                  "@bob:example.com"
                ]
              },
              "type": "m.typing",
              "room_id": "!636q39766251:example.com"
Member

room_id isn't part of the spec for ephemeral event bodies down /sync.

Likewise below.

Contributor Author

The spec entry you linked includes an m.typing event in the example response for code 200, which does include room_id in the data;

# ...
    "join": {
      "!726s6s6q:example.com": {
        "account_data": {
          "events": [
            {
              "content": {
                "tags": {
                  "u.work": {
                    "order": 0.9
                  }
                }
              },
              "type": "m.tag"
            },
            {
              "content": {
                "custom_config_key": "custom_config_value"
              },
              "type": "org.example.custom.room.config"
            }
          ]
        },
        "ephemeral": {
          "events": [
            {
              "content": {
                "user_ids": [
                  "@alice:matrix.org",
                  "@bob:example.com"
                ]
              },
              "room_id": "!jEsUZKDJdhlrceRyVU:example.org",
              "type": "m.typing"
            }
          ]
        },
# ...

I think that's where that came from, since I tried to base my examples directly on the spec examples.

Member

I see, thanks for pointing that out! I've made a PR to fix that, as I believe that's a spec bug: #3679.

            },
            {
              "content": {
                "print_event_id": "$E2RPcyuMUiXyDkQ02ASEbFxcJ4wFNrt5JVgov0wrqWo",
                "printer_id": 10,
                "status": {
                  "hotend_c": 181.4,
                  "bed_c": 62.5,
                  "position": [54, 275, 87.2]
                },
                "time": {
                  "elapsed": 4324,
                  "estimated": 7440
                }
              },
              "type": "com.example.3dprint",
              "room_id": "!636q39766251:example.com",
              "sender": "@printbot:example.com",
              "origin_server_ts": 1432735830211
            }
          ]
        }
      }
    }
  }
}
```
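
On the client side, a sync response shaped like the one above could be consumed roughly as follows. This is a sketch only: the function name `user_defined_ephemeral` is hypothetical, and the trimmed `example_sync` dictionary stands in for a real response. Note that `sender` is read with `.get()`, since built-in `m.*` events are allowed to omit it.

```python
from typing import Any, Dict, Iterator, Tuple


def user_defined_ephemeral(sync: Dict[str, Any]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (room_id, event) for every non-`m.*` ephemeral event in joined rooms."""
    joined = sync.get("rooms", {}).get("join", {})
    for room_id, room in joined.items():
        for event in room.get("ephemeral", {}).get("events", []):
            if not event.get("type", "").startswith("m."):
                yield room_id, event


# Trimmed-down stand-in for the sync response shown above.
example_sync = {
    "rooms": {"join": {"!636q39766251:example.com": {"ephemeral": {"events": [
        {"type": "m.typing", "content": {"user_ids": ["@alice:matrix.org"]}},
        {
            "type": "com.example.3dprint",
            "sender": "@printbot:example.com",
            "origin_server_ts": 1432735830211,
            "content": {"time": {"elapsed": 4324, "estimated": 7440}},
        },
    ]}}}}
}

for room_id, event in user_defined_ephemeral(example_sync):
    time_info = event["content"]["time"]
    progress = time_info["elapsed"] / time_info["estimated"]
    # A client could feed `progress` into a progress bar for the referenced print.
    print(room_id, event.get("sender"), f"{progress:.0%}")
```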

### Delivery guarantees

The current ephemeral event system in Matrix is built on a sort-of guaranteed delivery - albeit with
mutation/consolidation - of the events. This might not be desirable with custom ephemeral events, as
they could contain volumes of data that are not as easy to keep around for a guaranteed delivery.

I wonder if it might make more sense to remove the mutation/consolidation expectation?

Contributor Author

Yes, replacing the built-in m.read/m.presence/etc with proper types that aren't munged by the server is indeed something that could be done, but I'm writing this MSC entirely towards the addition of user-defined EDUs and nothing else - leaving the already existing EDU types as they are.


Therefore I suggest that servers are only required to provide best-effort delivery, with the exact
method by which they propagate EDUs - and store them - left up to the implementation.
(Perhaps keeping a max-size ring buffer per room, which would drop old data when necessary.
Or maybe only storing per active sync token, per user, per device.)
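
The per-room ring buffer mentioned above could look something like this minimal Python sketch. The class name `EphemeralBuffer` and the `max_per_room` parameter are hypothetical; the proposal leaves storage entirely up to the implementation.

```python
from collections import defaultdict, deque
from typing import Any, Deque, Dict, List


class EphemeralBuffer:
    """Best-effort store that keeps only the newest EDUs for each room.

    Hypothetical sketch; not a required server design.
    """

    def __init__(self, max_per_room: int = 100) -> None:
        # deque(maxlen=...) silently discards the oldest entry once full.
        self._rooms: Dict[str, Deque[Dict[str, Any]]] = defaultdict(
            lambda: deque(maxlen=max_per_room)
        )

    def add(self, room_id: str, edu: Dict[str, Any]) -> None:
        self._rooms[room_id].append(edu)

    def recent(self, room_id: str) -> List[Dict[str, Any]]:
        # .get() avoids creating an empty buffer for rooms never written to.
        return list(self._rooms.get(room_id, ()))


buffer = EphemeralBuffer(max_per_room=2)
for elapsed in (4322, 4323, 4324):
    buffer.add("!636q39766251:example.com", {"time": {"elapsed": elapsed}})
print(buffer.recent("!636q39766251:example.com"))
```

With a limit of two entries, only the two newest updates remain after the third is added.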

As the messages in question are ephemeral, I think the only guarantee that should be required is that
all users that are online when the message is sent will receive it. Anything above that should be
Member (@kegsay, Sep 16, 2025)

> all users that are online when the message is sent will receive it

This is actually a very high bar! When I was designing MSC4354 I thought to provide similar guarantees, but to meet them requires a lot more work than this proposal is doing. There's two main problems with this proposal:

  • the EDUs as described have no reference point in the DAG, meaning servers won't agree on whether the EDU is authorised, as it depends on what their view of the current room state is. You need something like prev_events to anchor the auth rules.
  • Nothing stops a sender from equivocating and sending different EDUs to different servers, intentionally split-braining the room. This is a harder problem to solve, and ultimately requires non-sending servers to talk to each other to see if they have the same data and if not, to deterministically pick one. But once you do that... you need to persist the data so it can be queried later... and you need to have a succinct way of expressing the data like.. a hash, and oh look- we've reinvented PDUs.

commended, but not required.

How does one define an "online" user?

Contributor Author

I tried to avoid putting any specific requirements/definitions on what the best-effort guarantee should mean, hence why I'm using some more vague terms like "online".
I originally wrote the best-effort definition to be for any user that has an active sync request, but it ended up feeling rather arbitrary (not to mention leaving open the question of what would happen to EDUs that arrive while the user's client is busy processing the sync response), and also didn't really match with some of the user stories I had in mind while writing the MSC.

If I were to write my own Matrix server, with custom EDUs, I'd probably consider a size and time gated buffer for them, where all user sessions with an active sync request are guaranteed to receive the EDU, and any user session that starts a new sync request within the time gate - or before the buffer rolls over - may also receive it.
But that's probably quite different from how other servers would do it.

Member

When I was thinking about how to implement this a year or so ago, I had come up with the idea to let the sender of the EDU choose their delivery guarantee from a pre-defined set. The homeserver could either store it indefinitely until the receiver comes online (think to-device messages), or allow the receiver to miss it if they're not currently syncing (i.e. ephemeral live location events). In the latter case, the homeserver may only hold that data for 1m or so before expiring it. The same would apply to federation if the remote homeserver were down.

Member (@kegsay, Sep 30, 2024)

> I think the only guarantee that should be required is that
> all users that are online when the message is sent will receive it.

This is a weak guarantee which will make a bad UX as if you join a room sending print updates, you need to wait N seconds for it to appear, assuming perfect network connectivity. Having a stronger guarantee of "last write is shown" feels like a better UX for the majority of use cases which want custom EDUs (e.g location sharing). This is bounded data if you replace the EDU based on (sender, type) and have a restricted set of types in your power levels event.


### Extension of the Server-Server spec

As the server-server protocol is currently only designed for transferring the well-defined EDUs that
exist as part of the Matrix communication protocol, this proposal requires some additional fields to
be added to the EDU schema in order to let them transmit the user-specified data untouched - while
still adding source information that is important for the receiving clients.

The fields to add to the EDU schema are;

- `room_id` (`string`) - The fully qualified ID of the room the event was sent in
- `sender` (`string`) - The fully qualified ID of the user that sent the event
- `origin` (`string`) - The `server_name` of the homeserver that created this event
- `origin_server_ts` (`integer`) - Timestamp in milliseconds on origin homeserver when this event
  was created

A user-defined ephemeral event might then look like this when federated;

```json
{
  "content": {
    "print_event_id": "$E2RPcyuMUiXyDkQ02ASEbFxcJ4wFNrt5JVgov0wrqWo",
    "printer_id": 10,
    "status": {
      "hotend_c": 181.4,
      "bed_c": 62.5,
      "position": [54, 275, 87.2]
    },
    "time": {
      "elapsed": 4324,
      "estimated": 7440
    }
  },
  "room_id": "!636q39766251:example.com",
  "sender": "@printbot:example.com",
  "origin": "example.com",
  "origin_server_ts": 1432735830211,
  "edu_type": "com.example.3dprint"
}
```

To reduce the scope of the changes required by this proposal, the suggestion is to allow the original
`m.*` events to skip these keys where no value can be easily assigned to them, e.g. aggregated typing
notices and receipt lists.
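
A receiving server could check incoming EDUs against the extended schema with something like the sketch below. The function name `missing_edu_fields` is hypothetical, and treating every `m.*` type as exempt from the new fields is a simplifying assumption based on the paragraph above.

```python
from typing import Any, Dict, List

# The four fields this proposal adds to the EDU schema.
REQUIRED_FOR_USER_DEFINED = ("room_id", "sender", "origin", "origin_server_ts")


def missing_edu_fields(edu: Dict[str, Any]) -> List[str]:
    """Return the names of expected fields that the EDU lacks.

    Hypothetical helper; built-in `m.*` EDUs are allowed to skip the new fields.
    """
    missing = [key for key in ("edu_type", "content") if key not in edu]
    if not edu.get("edu_type", "").startswith("m."):
        missing += [key for key in REQUIRED_FOR_USER_DEFINED if key not in edu]
    return missing


print(missing_edu_fields({"edu_type": "m.typing", "content": {}}))
print(missing_edu_fields({"edu_type": "com.example.3dprint", "content": {}}))
```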


## Potential issues

Because this change completely redefines how ephemeral events are used, it is likely that some
servers and clients could struggle to handle the new types of data that this proposal
would create. But as the protocol is defined with an extensible transport (JSON), it should not be
difficult - if even necessary - for clients or servers to be modified to support the changes.

Additionally, as ephemeral data is never encoded into room state, there are not as many tools for
admins to handle abuse that occurs through the use of the feature.
The proposal's suggested changes to power levels - and the limitation of what event types can be
sent - should mitigate the potential for abuse of the feature though, as long as admins don't allow
any user-defined ephemeral types to be sent by regular users.
Member

This feels suboptimal because it means the room admin needs to know the entire superset of features of all potential clients which may join the room. Consider:

  • I create a room on a terminal client that has no knowledge of any of these custom EDUs.
  • 6 people join all on clients capable of showing printer status updates.
  • See how sad they become as they can't use the feature.

To resolve this requires a fair bit of social work as either:

  • someone asks the admin to allow the custom type (assuming the client they are using lets them do this)
  • someone asks the admin to make one of the 6 users a mod/admin/to have enough power for their client to automatically (or manually) enable the sending of that custom type.


This change could impact the - currently in review - [MSC2409], necessitating some kind of filter for
what event types would be transferred to appservices. [MSC2487] suggests a possible solution for this
potential problem.


## Alternatives

The ephemeral state that this change would allow to be transferred could just as easily be sent using
self-destructing messages with the help of [MSC1763] or [MSC2228], which would result in a similar
experience for the end user.
Unfortunately, using persistent events in such a manner would add a lot of unnecessary data to the
room DAG, while also increasing both the computational work and the database pressure through the
repeated and rapid insertions and redactions that it would result in.

### Client-Server protocol changes

The additions to ephemeral objects could be expanded to also apply to the normal `m.*` types,
which would reduce the complexity of the spec as there would be no distinction between the built-in
Matrix types and the user-defined types.
This could cause some clients to break though, if they expect the well-defined objects to keep to
their specced forms. Additionally, it might be hard for the server to assign a correct sender and
timestamp to events if they are aggregated from multiple sources - e.g. typing notices and read
receipt lists.
Comment on lines +385 to +388
Member

> This could cause some clients to break though, if they expect the well-defined objects to keep to
> their specced forms.

If I understand correctly, this is referring to adding the sender and origin_server_ts fields to a JSON object. In the past I believe we've considered adding keys to JSON to not break backwards compatibility, as clients should just be pulling out the keys they need anyhow - and these events are not signed or otherwise wholly checked for integrity.

> Additionally, it might be hard for the server to assign a correct sender and
> timestamp to events if they are aggregated from multiple sources - e.g. typing notices and read
> receipt lists.

This would have to be a problem that we solve anyways, regardless of backwards-compatibility periods.

I'm also not clear why it would be difficult to populate these fields for typing and read receipt lists.

Contributor Author

As one example on the second point; m.typing events are aggregated down from the source data into a single event with a list of user_ids that are currently typing. If they are to have correct timestamps and sender data attached to match the EDU type as defined in this MSC, then they'd have to be split apart into separate objects each with their own metadata. I'm not really sure if that's desirable.


### Server-Server protocol changes

Instead of adding potentially optional keys to the EDU schema, the entire object could instead be
embedded into the content key, using an EDU type key that denotes it as a user-defined type.
This would mean a smaller change to the server-server communication, while still allowing a server
module to filter or track events based on their types or origins.

```json
{
  "content": {
    "room_id": "!636q39766251:example.com",
    "sender": "@printbot:example.com",
    "origin": "example.com",
    "origin_server_ts": 1432735830211,
    "type": "com.example.3dprint",
    "content": {
      "print_event_id": "$E2RPcyuMUiXyDkQ02ASEbFxcJ4wFNrt5JVgov0wrqWo",
      "printer_id": 10,
      "status": {
        "hotend_c": 181.4,
        "bed_c": 62.5,
        "position": [54, 275, 87.2]
      },
      "time": {
        "elapsed": 4324,
        "estimated": 7440
      }
    }
  },
  "edu_type": "m.user_defined"
}
```
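
The relationship between this embedded form and the flat form proposed earlier can be sketched as a pair of conversion helpers. Both function names are hypothetical, and the sketch assumes the flat form carries its type under `edu_type` while the embedded form carries it under `type`, as in the examples.

```python
from typing import Any, Dict


def wrap_user_defined(edu: Dict[str, Any]) -> Dict[str, Any]:
    """Embed a flat user-defined EDU inside an `m.user_defined` envelope."""
    inner = {key: value for key, value in edu.items() if key != "edu_type"}
    inner["type"] = edu["edu_type"]
    return {"content": inner, "edu_type": "m.user_defined"}


def unwrap_user_defined(wrapped: Dict[str, Any]) -> Dict[str, Any]:
    """Recover the flat form from an `m.user_defined` envelope."""
    inner = dict(wrapped["content"])  # copy so the input is left untouched
    edu_type = inner.pop("type")
    return {**inner, "edu_type": edu_type}


flat = {
    "content": {"printer_id": 10},
    "room_id": "!636q39766251:example.com",
    "sender": "@printbot:example.com",
    "origin": "example.com",
    "origin_server_ts": 1432735830211,
    "edu_type": "com.example.3dprint",
}
wrapped = wrap_user_defined(flat)
print(wrapped["edu_type"], wrapped["content"]["type"])
```

Converting in both directions returns the original object, so a server could support either representation without losing information.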

Alternatively, the additional requirements for user-defined types could be expanded to cover
the regular Matrix types as well, which would remove the need for optional fields - but could in
turn impact the federation between servers, if they're built to only handle the exact requirements
of the spec.

Member (@anoadragon453, Apr 7, 2022)

Unstable prefixes will need to be defined in order for an implementation of the MSC to be carried out. I suggest implementation replace:

  • ephemeral and ephemeral_default fields in the m.room.power_levels state event with org.matrix.msc2477.ephemeral and org.matrix.msc2477.ephemeral_default respectively.
  • the PUT /_matrix/client/v1/rooms/{roomId}/ephemeral/{eventType}/{txnId} endpoint with PUT /_matrix/client/unstable/org.matrix.msc2477/rooms/{roomId}/ephemeral/{eventType}/{txnId}.

And an experimental room version with name org.matrix.msc2477 should be used where this feature is enabled.

Contributor Author

And a whole lot later than it should've been, but I've added a section on an unstable prefix.

Wasn't sure about adding blurbs on when the m.room.power_levels keys should be allowed, but I figured that it'd be a bit superfluous since the only method that will act on them is limited to the experimental room version.

[MSC1763]: https://github.com/matrix-org/matrix-doc/pull/1763
[MSC2228]: https://github.com/matrix-org/matrix-doc/pull/2228
[MSC2409]: https://github.com/matrix-org/matrix-doc/pull/2409
[MSC2487]: https://github.com/matrix-org/matrix-doc/pull/2487