Re-use work of getting state for a given state_group (_get_state_groups_from_groups) #15617

MadLittleMods wants to merge 12 commits into develop.

Conversation
```python
state_groups_we_have_already_fetched.add(group)
...
else:
```
To not complicate the diff, I've held off on applying the same treatment to SQLite. We can iterate on this in another PR, or just opt for people to use Postgres in order to see the performance benefit.
```python
results[group] = partial_state_map_for_state_group
...
state_groups_we_have_already_fetched.add(group)
```
> After
>
> [...]
>
> In the sample request above, simulating how the app would make the queries with the changes in this PR, we don't actually see that much benefit because it turns out that there isn't much `state_group` sharing amongst events. That 14s SQL query is just fetching the state at each of the `prev_events` of one event, and only one seems to share `state_groups`. [...]
>
> Initially, I thought the lack of sharing was quite strange, but this is because of the state_group snapshotting feature: if an event requires more than 100 hops on `state_event_edges`, then Synapse will create a new `state_group` with a snapshot of all of that state. It seems like this isn't done very efficiently though. Relevant docs.
>
> And it turns out the event is an `org.matrix.dummy_event` which Synapse automatically puts in the DAG to resolve outstanding forward extremities. These events aren't even shown to clients, so we don't even need to waste time waiting for them to backfill. Tracked by #15632
>
> Generally, I think this PR could bring great gains in conjunction with running some sort of state compressor over the database to get a lot more sharing, in addition to trying to fix the online state_group snapshotting logic to be smarter. I don't know how the existing state compressors work, but I imagine we could create snapshots and bucket for years -> months -> weeks -> days -> hours -> individual events and create new `state_group` chains which utilize these from biggest to smallest to get maximal sharing.

— "After" section of the PR description
This PR hasn't made as big of an impact as I thought it would for that type of request. Are we still interested in a change like this? It may work well for sequential events that we backfill.

It seems like our state_group sharing is really sub-par, and the way that state_groups can only have a max of 100 hops puts an upper limit on how much gain this PR can give. I didn't anticipate that's how state_groups worked; I thought it was one state_group per state change, which it is until it starts doing snapshots.

Maybe it's more interesting to improve our state_group logic to be much smarter first, and then re-visit something like this. Or look into the state compressor stuff to optimize our backlog, which would help for the Matrix Public Archive. I'm not sure if the current state compressors optimize for disk space or for sharing, or how inter-related those two goals are.
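The 100-hop cap described above can be sketched roughly like this (a hypothetical simplification for illustration only; `MAX_CHAIN_LENGTH`, `chain_length`, and `should_snapshot` are made-up names, not Synapse's actual code):

```python
MAX_CHAIN_LENGTH = 100  # hops allowed before a full snapshot group is created


def chain_length(group, edges):
    """Count hops from `group` back to the root of its delta chain.

    `edges` is a hypothetical dict mapping state_group -> prev_state_group,
    standing in for the state_group_edges table.
    """
    hops = 0
    while group in edges:
        group = edges[group]
        hops += 1
    return hops


def should_snapshot(group, edges):
    # Once the delta chain is MAX_CHAIN_LENGTH hops long, store the full
    # state as a new snapshot group instead of yet another delta.
    return chain_length(group, edges) >= MAX_CHAIN_LENGTH
```

Under this model, every snapshot starts a fresh chain, which is one reason two nearby events can end up with no shared `state_group` ancestry at all.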
It sounds like this didn't make much of an improvement and significantly complicates the code. I'm going to close this due to that, but I did grab the nice improvements to the comments and moved them to #16383.
Re-use work of getting state for a given `state_group`. This is useful when we `_compute_event_context_with_maybe_missing_prevs(...)` -> `_get_state_groups_from_groups(...)`

Part of making `/messages` faster: #13356

Background (explaining the situation before this PR)

When we have to `_compute_event_context_with_maybe_missing_prevs`, `_get_state_groups_from_groups` currently takes up most of the time. It makes the following recursive query for each given state group, which walks up the entire state_group chain and returns the complete distinct state, which means thousands of membership events at the very least. For example, with a random `/messages` request in the Matrix HQ room, we did 10x of these queries which each took 0.5-2 seconds and returned roughly 88k events every time, totaling 14s.

https://explain.depesz.com/s/OuOe
https://explain.dalibo.com/plan/ef6bc290f2dced17

`EXPLAIN ANALYZE` query output
Full Jaeger trace JSON
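For illustration, here is a hedged sketch of that kind of recursive chain-walking query, using SQLite's `WITH RECURSIVE` against simplified stand-in tables (Synapse's real schema and Postgres query differ in detail; the five-group chain below is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE state_group_edges (state_group INTEGER, prev_state_group INTEGER);
    CREATE TABLE state_groups_state (
        state_group INTEGER, type TEXT, state_key TEXT, event_id TEXT
    );
    """
)
# Chain 1 <- 2 <- 3 <- 4 <- 5, with one membership event added per group
for g in range(2, 6):
    conn.execute("INSERT INTO state_group_edges VALUES (?, ?)", (g, g - 1))
for g in range(1, 6):
    conn.execute(
        "INSERT INTO state_groups_state VALUES (?, 'm.room.member', ?, ?)",
        (g, f"@user{g}:example.com", f"$event{g}"),
    )

# Walk the edges from the requested group back to the root, then pull
# every state row along the chain -- this is what makes the query cost
# proportional to the full chain length, not to the delta since last time.
rows = conn.execute(
    """
    WITH RECURSIVE chain(state_group) AS (
        VALUES(?)
        UNION ALL
        SELECT prev_state_group FROM state_group_edges, chain
        WHERE state_group_edges.state_group = chain.state_group
    )
    SELECT s.state_group, s.type, s.state_key, s.event_id
    FROM state_groups_state AS s
    JOIN chain USING (state_group)
    """,
    (5,),
).fetchall()
print(len(rows))  # all 5 rows along the 5 -> 1 chain
```

Asking for the state at group 5 touches every row back to group 1, which is why 88k-event chains in Matrix HQ make this so expensive.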
Before

Say we have `state_groups` 1-10, where `state_group` 1 would be the `m.room.create`, and then onwards, etc.

Currently, when you ask for the state at `state_groups = [5, 8]`, we will fetch state from `5 -> 1` and `8 -> 1`, which is pretty wasteful because we're mostly fetching the same thing again, and time consuming (imagine this with Matrix HQ, which has chains of 88k events).

After
Now instead, when you ask for the state at `state_groups = [5, 8]`, we fetch `5 -> 1` as normal, but then only fetch `8 -> 5` and layer it on top of the work we did for the state at `state_group = 5`.

In terms of query performance, our first query will take the same amount of time as before, but any subsequent state_groups that are shared will take significantly less time.
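The before/after difference can be sketched with a small in-memory simulation (hypothetical code, not Synapse's actual implementation; the `prev_group` and `delta` dicts stand in for the `state_group_edges` and `state_groups_state` tables):

```python
# Edges 2->1, 3->2, ..., 10->9; each group adds one membership event.
prev_group = {g: g - 1 for g in range(2, 11)}
delta = {g: {("m.room.member", f"@u{g}:hs"): f"$event{g}"} for g in range(1, 11)}


def state_chain(group):
    """Walk the state_group chain back to the root, returned root-first."""
    chain = []
    while group is not None:
        chain.append(group)
        group = prev_group.get(group)
    return list(reversed(chain))


def get_state_before(groups):
    """Before: fetch the full chain for every group (5->1 and 8->1)."""
    results, rows_fetched = {}, 0
    for group in groups:
        state = {}
        for g in state_chain(group):
            state.update(delta[g])
            rows_fetched += 1
        results[group] = state
    return results, rows_fetched


def get_state_after(groups):
    """After: fetch the smallest group fully, then only deltas on top of it."""
    results, rows_fetched = {}, 0
    base_group = None
    for group in sorted(groups):
        chain = state_chain(group)
        if base_group is not None and base_group in chain:
            # Re-use the state we already computed; only walk 8 -> 5 here.
            todo = chain[chain.index(base_group) + 1:]
            state = dict(results[base_group])
        else:
            todo, state = chain, {}
        for g in todo:
            state.update(delta[g])
            rows_fetched += 1
        results[group] = state
        base_group = group
    return results, rows_fetched


before, n_before = get_state_before([5, 8])
after, n_after = get_state_after([5, 8])
assert before == after  # same resolved state either way
print(n_before, n_after)
```

With `state_groups = [5, 8]`, the naive approach walks 5 + 8 = 13 rows while the re-use approach walks 5 + 3 = 8, and the gap widens the more the chains overlap.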
In the sample request above, simulating how the app would make the queries with the changes in this PR, we don't actually see that much benefit because it turns out that there isn't much `state_group` sharing amongst events. That 14s SQL query is just fetching the state at each of the `prev_events` of one event, and only one seems to share `state_groups`.

`state_group` chains not actually shared that much in that 14s database query 😬

*(collapsed: a mermaid `flowchart BT` with nine disjoint subgraphs, one per `state_group` chain, showing almost no sharing between the chains)*

Initially, I thought the lack of sharing was quite strange, but this is because of the state_group snapshotting feature: if an event requires more than 100 hops on `state_event_edges`, then Synapse will create a new `state_group` with a snapshot of all of that state. It seems like this isn't done very efficiently though. Relevant docs.

And it turns out the event is an `org.matrix.dummy_event` which Synapse automatically puts in the DAG to resolve outstanding forward extremities. These events aren't even shown to clients, so we don't even need to waste time waiting for them to backfill. Tracked by #15632

Generally, I think this PR could bring great gains in conjunction with running some sort of state compressor over the database to get a lot more sharing, in addition to trying to fix the online state_group snapshotting logic to be smarter. I don't know how the existing state compressors work, but I imagine we could create snapshots and bucket for years -> months -> weeks -> days -> hours -> individual events and create new `state_group` chains which utilize these from biggest to smallest to get maximal sharing.

Dev notes