[nexus] Fix pagination in instance-by-sled query by hawkw · Pull Request #6534 · oxidecomputer/omicron

hawkw · 2024-09-06T17:52:49Z

The instance_and_vmm_list_by_sled_agent query in nexus-db-queries returns a list of all instances, their projects, their active VMMs, and the sleds those active VMMs are running on. This is used by the instance_watcher background task to find VMMs to perform health checks for. The query is paginated by sled IDs to limit the amount of data fetched in one query, and so that the returned VMMs are grouped by sled, allowing the instance_watcher task to reuse the same sled-agent client connection pool for all VMMs on a sled.

Unfortunately, the current pagination is a bit wrong, as @davepacheco kindly pointed out to me in this comment:

This is paginated by just sled_id, but that doesn't quite work. The
pagination marker must be unique for each row. I think you'll want to
use paginated_multicolumn with another id here, maybe either the
instance id or vmm id? As is, I'd expect this code to break when the
list of instances returned for a particular sled spans a pagination
boundary. (What I'd expect would happen there is: we process each page
fine up through the first page of results containing that sled id in
it, then we use that sled id as the marker, and the subsequent query
would fetch results with sled ids >= the one given, and so you'd
miss all the rows for that sled id that weren't on the first page of
results for that sled id.) I was surprised to see that this is not
documented on either Paginator::found_batch() nor paginated() :(

The only reason we haven't encountered this bug yet is that currently, the batch size for the query is fairly high. However, PR #6527 reduces the batch size substantially to serve as a limit on the number of concurrently in flight health checks, meaning we'll see some VMMs not being health checked any time a sled has more than 16 VMMs on it.

This commit fixes the query by changing it to use
paginated_multicolumn to paginate the results by both the sled's ID and the VMM's ID. This is made possible by #6530, which changed paginated_multicolumn to allow the query fragment that's paginated to be a FROM clause that's a JOIN, in addition to a single table --- because we must join the sled and vmm tables in order to get both IDs. This change required restructuring the order in which the query is constructed, due to Diesel Reasons, so I took the liberty to add some comments while I was moving stuff around.

@davepacheco

The `instance_and_vmm_list_by_sled_agent` query in `nexus-db-queries` returns a list of all instances, their projects, their active VMMs, and the sleds those active VMMs are running on. This is used by the `instance_watcher` background task to find VMMs to perform health checks for. The query is paginated by sled IDs to limit the amount of data fetched in one query, and so that the returned VMMs are grouped by sled, allowing the `instance_watcher` task to reuse the same sled-agent client connection pool for all VMMs on a sled. Unfortunately, the current pagination is a bit wrong, as @davepacheco kindly pointed out to me in [this comment][1]: > This is paginated by just `sled_id`, but that doesn't quite work. The > pagination marker must be unique for each row. I think you'll want to > use `paginated_multicolumn` with another id here, maybe either the > instance id or vmm id? As is, I'd expect this code to break when the > list of instances returned for a particular sled spans a pagination > boundary. (What I'd expect would happen there is: we process each page > fine up through the first page of results containing that sled id in > it, then we use that sled id as the marker, and the subsequent query > would fetch results with sled ids `>=` the one given, and so you'd > miss all the rows for that sled id that weren't on the first page of > results for that sled id.) I was surprised to see that this is not > documented on either `Paginator::found_batch()` nor `paginated()` :( The only reason we haven't encountered this bug yet is that currently, the batch size for the query is fairly high. However, PR #6527 reduces the batch size substantially to serve as a limit on the number of concurrently in flight health checks, meaning we'll see some VMMs not being health checked any time a sled has more than 16 VMMs on it. This commit fixes the query by changing it to use `paginated_multicolumn` to paginate the results by both the sled's ID *and* the VMM's ID. This is made possible by #6530, which changed `paginated_multicolumn` to allow the query fragment that's paginated to be a `FROM` clause that's a `JOIN`, in addition to a single table --- because we must join the `sled` and `vmm` tables in order to get both IDs. This change required restructuring the order in which the query is constructed, due to Diesel Reasons, so I took the liberty to add some comments while I was moving stuff around. [1]: #6519 (review)

davepacheco

No major issues here but I had some questions about the query.

davepacheco · 2024-09-09T19:01:16Z

+            // Ignore VMMs which are in states that are not known to exist on a
+            // sled. Since this query drives instance-watcher health checking,
+            // it is not necessary to perform health checks for VMMs that don't
+            // actually exist in real life.


This is probably tangential to all of this work, but do we have anything that notices a case where a VM is unexpectedly running on a sled?

Hmm, the instance-watcher task will not do that, since it's asking if the sled-agents actually have the instances they're believed to have, rather than asking them "hey, what instances do you have?".

I think there may be a code path in cpapi_instances_put that handles the case where a sled-agent has sent nexus an update regarding a VMM that Nexus doesn't believe exists, but off the top of my head, I don't think that code path does anything besides returning an error to the sled-agent...

Co-authored-by: David Pacheco <dap@oxidecomputer.com>

hawkw requested review from davepacheco and sunshowers September 6, 2024 17:52

hawkw mentioned this pull request Sep 6, 2024

[nexus] Mark VMMs on Expunged sleds as Failed #6519

Merged

davepacheco reviewed Sep 9, 2024

View reviewed changes

hawkw and others added 2 commits September 9, 2024 12:14

Update nexus/db-queries/src/db/datastore/instance.rs

0350d9f

Co-authored-by: David Pacheco <dap@oxidecomputer.com>

query review feedback from @davepacheco

6f40029

hawkw requested a review from davepacheco September 9, 2024 19:58

davepacheco approved these changes Sep 9, 2024

View reviewed changes

hawkw enabled auto-merge (squash) September 9, 2024 20:25

hawkw merged commit 1c24f45 into main Sep 9, 2024

hawkw deleted the eliza/fix-instance-list-by-sled branch September 9, 2024 21:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[nexus] Fix pagination in instance-by-sled query#6534

[nexus] Fix pagination in instance-by-sled query#6534
hawkw merged 3 commits into
mainfrom
eliza/fix-instance-list-by-sled

hawkw commented Sep 6, 2024

Uh oh!

davepacheco left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

davepacheco Sep 9, 2024

Uh oh!

hawkw Sep 9, 2024

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

hawkw commented Sep 6, 2024

Uh oh!

davepacheco left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

davepacheco Sep 9, 2024

Choose a reason for hiding this comment

Uh oh!

hawkw Sep 9, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants