Add BlueprintZoneFilter to all_omicron_zones #5348
Created using spr 1.3.6-beta.1
…o-all_omicron_zones
This is closer. I added a single test so far that ensures that filtering around external DNS works. A few more tests may be warranted. With #5438, @sunshowers added the disposition field and filtered the one join on
I think this is good to go.
Is this the PR where we're also supposed to fix up DataStore::vpc_resolve_to_sleds() (which implicitly uses all_omicron_zones, sorta, by directly querying the bp_omicron_zone table, but doesn't currently consider the zone disposition)?
Nevermind! The vpc_resolve_to_sleds() fix landed in #5238
) -> impl Iterator<Item = (Uuid, &OmicronZoneConfig)> {
    self.blueprint_zones.iter().flat_map(|(sled_id, z)| {
        z.zones.iter().map(|z| (*sled_id, &z.config))
    self.blueprint_zones.iter().flat_map(move |(sled_id, z)| {
Two questions:
- Can we remove all_blueprint_zones() entirely now? (Last time I looked, every caller immediately mapped the results of that function to just the OmicronZoneConfig, but was calling it because this function didn't take a filter yet.)
- If not, could this function be implemented in terms of it instead? self.all_blueprint_zones(filter).map(|(sled_id, z)| (sled_id, &z.config)) I think?
Ahh I have a wip change that needs all_blueprint_zones.
Ok, I'll go with door number 2 then - can we rewrite the implementation of this to use all_blueprint_zones()?
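A minimal sketch of what "door number 2" could look like, assuming heavily simplified stand-ins for the real types. Here `SledId` is a plain integer rather than a `Uuid`, the `NotExpunged` filter variant and the `example_blueprint()` helper are hypothetical, and the actual implementation in the PR may differ:

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the sled Uuid used in the real code.
type SledId = u32;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Disposition { InService, Expunged }

// Hypothetical filter variants, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BlueprintZoneFilter { All, NotExpunged }

impl Disposition {
    fn matches(self, filter: BlueprintZoneFilter) -> bool {
        match (self, filter) {
            (Disposition::InService, BlueprintZoneFilter::All) => true,
            (Disposition::InService, BlueprintZoneFilter::NotExpunged) => true,
            (Disposition::Expunged, BlueprintZoneFilter::All) => true,
            (Disposition::Expunged, BlueprintZoneFilter::NotExpunged) => false,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
struct OmicronZoneConfig { name: String }

struct BlueprintZoneConfig { config: OmicronZoneConfig, disposition: Disposition }

struct BlueprintZonesConfig { zones: Vec<BlueprintZoneConfig> }

struct Blueprint { blueprint_zones: BTreeMap<SledId, BlueprintZonesConfig> }

impl Blueprint {
    // Yields every zone whose disposition matches the filter.
    fn all_blueprint_zones<'a>(
        &'a self,
        filter: BlueprintZoneFilter,
    ) -> impl Iterator<Item = (SledId, &'a BlueprintZoneConfig)> + 'a {
        self.blueprint_zones.iter().flat_map(move |(sled_id, z)| {
            z.zones
                .iter()
                .filter(move |z| z.disposition.matches(filter))
                .map(move |z| (*sled_id, z))
        })
    }

    // Implemented in terms of all_blueprint_zones(), as suggested above.
    fn all_omicron_zones<'a>(
        &'a self,
        filter: BlueprintZoneFilter,
    ) -> impl Iterator<Item = (SledId, &'a OmicronZoneConfig)> + 'a {
        self.all_blueprint_zones(filter).map(|(sled_id, z)| (sled_id, &z.config))
    }
}

// Hypothetical fixture: two sleds, one of which carries an expunged zone.
fn example_blueprint() -> Blueprint {
    let zone = |name: &str, disposition| BlueprintZoneConfig {
        config: OmicronZoneConfig { name: name.to_string() },
        disposition,
    };
    let mut blueprint_zones = BTreeMap::new();
    blueprint_zones.insert(1, BlueprintZonesConfig {
        zones: vec![
            zone("crucible", Disposition::InService),
            zone("old-dns", Disposition::Expunged),
        ],
    });
    blueprint_zones.insert(2, BlueprintZonesConfig {
        zones: vec![zone("nexus", Disposition::InService)],
    });
    Blueprint { blueprint_zones }
}
```

With this shape, the disposition check lives in exactly one place, and all_omicron_zones() only projects each zone down to its config.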
Self::InService => match filter {
    BlueprintZoneFilter::All => true,
    BlueprintZoneFilter::SledAgentPut => true,
    BlueprintZoneFilter::Crucible => true,
I don't love this name. This:
all_omicron_zones(BlueprintZoneFilter::Crucible)
looks like it's filtering to the zones that have zone type Crucible, to me. I'm not even 100% sure what we're saying here, exactly. I would understand SledFilter::ShouldHaveCrucibleZones, but what does BlueprintZoneFilter::Crucible mean when applied to a non-crucible zone?
(I have similar concerns about ::External and ::SledAgentPut, but this seemed like the easiest one on which I could state the concern.)
> I would understand SledFilter::ShouldHaveCrucibleZones, but what does BlueprintZoneFilter::Crucible mean when applied to a non-crucible zone?
I feel like this comment somewhat questions the overall premise of the BlueprintZoneFilter. We could indeed have SledFilter filter out any unnecessary zones by looking at zone disposition, but that would likely be somewhat incoherent, as the sled filter would then be filtering zones based on disposition. I totally get what you are saying here though, and I think you have a point. You are talking about filtering on behaviors, or what should happen as a result of a filter matched with a disposition, rather than on individual zones, since a filter for an individual zone doesn't do anything for other zones.
However, crucible zones are kinda special and they do go on all disks, unless the disposition is expunged. I wonder if, instead of specifying crucible, we could cover other zones with this similar property, such as SledAgentPut, and even VpcFirewall. Those all get deployed unless expunged. The downside of that is lack of specificity in the caller.
I'm kinda ranting here, but I wonder if there's a larger change to be made...
So... the main reason we introduced BlueprintZoneFilter, versus for example passing in a callback that returns a boolean, is to ensure that we have an accounting about every place that is making decisions about which zones it's considering.
In this case, BlueprintZoneFilter::Crucible is used to identify the set of zones for which we expect active dataset records to exist.
But you bring a really important point up: that BlueprintZoneFilter::Crucible can return non-Crucible zones in a manner that is somewhat surprising. So... what if BlueprintZoneFilter::Crucible also filtered based on zone kind? The first thing ensure_crucible_dataset_records_exist does is to not consider non-Crucible zones, so it would fit right in.
Also, I think it may be worth renaming this to CrucibleDatasetRecords or similar.
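As a rough sketch of the idea in this comment, the filter could take both the zone kind and the disposition into account, with an exhaustive `match` providing the "accounting" that a boolean callback would not. Everything here is an assumption for illustration: the `ZoneKind` variants, the free function `zone_matches`, and the exact truth values (in particular, that quiesced zones are still included in `SledAgentPut`) are not taken from the PR.

```rust
// Hypothetical, simplified stand-ins; not the real omicron types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Disposition { InService, Quiesced, Expunged }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ZoneKind { Crucible, ExternalDns, Nexus }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ZoneFilter {
    All,
    SledAgentPut,
    // The rename proposed above; also restricts by zone kind.
    CrucibleDatasetRecords,
}

// No wildcard arms: adding a new filter or disposition variant makes this
// function fail to compile until someone decides how it should behave.
fn zone_matches(disposition: Disposition, kind: ZoneKind, filter: ZoneFilter) -> bool {
    match filter {
        ZoneFilter::All => true,
        ZoneFilter::SledAgentPut => match disposition {
            Disposition::InService => true,
            // Assumption: quiesced zones are still sent to sled agent.
            Disposition::Quiesced => true,
            Disposition::Expunged => false,
        },
        ZoneFilter::CrucibleDatasetRecords => match (kind, disposition) {
            (ZoneKind::Crucible, Disposition::InService) => true,
            (ZoneKind::Crucible, Disposition::Quiesced) => true,
            (ZoneKind::Crucible, Disposition::Expunged) => false,
            (ZoneKind::ExternalDns, _) => false,
            (ZoneKind::Nexus, _) => false,
        },
    }
}
```

Under these assumptions, a caller asking for CrucibleDatasetRecords never sees a non-Crucible zone, which addresses the "surprising" behavior described above.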
I think it's worth changing the name as well. I will go ahead and do that. As for SledAgentPut, I think that one is actually extremely useful. It provided a solid mechanism for limiting which zones actually get included in the put message to sled agent.
Right now, External really only maps to external DNS, unless I missed something. Maybe it's actually worth making that one more specific?
I retract my statement on External. I like how it covers multiple things. How about renaming it to ExternalResources though?
Ooh, or we could change the name of External to ExternallyReachable.
I ended up changing the name to ExternallyReachable in 514ff8c
I'm happy to change it again if you all disagree.
I also changed Crucible to CrucibleDatasets
> But you bring a really important point up: that BlueprintZoneFilter::Crucible can return non-Crucible zones in a manner that is somewhat surprising. So... what if BlueprintZoneFilter::Crucible also filtered based on zone kind? The first thing ensure_crucible_dataset_records_exist does is to not consider non-Crucible zones, so it would fit right in.
Yeah, I like this, although I think I'm still getting hung up on the fact that this primarily looks at the zone disposition. We want this call:
.all_omicron_zones(BlueprintZoneFilter::SomethingSomething)
to return (a) all crucible zones that (b) are still running in some capacity (either in service or quiesced). ::Crucible sounds like it's just returning all the crucible zones; CrucibleDatasets sounds like either "zones that provide datasets" or "zones that need datasets", but neither of those feels like it captures the "still running in some capacity" part of it.
I'm not sure where that leaves me, other than vaguely unsatisfied with the names (with my apologies!). I feel like I want a word for "in service or quiesced". Running doesn't seem right, but if it were, I'd propose something like:
- ::RunningCrucible (only non-expunged zones with type ::Crucible)
- ::Running (all non-expunged zones, which clients can additionally filter by type if they want to)
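To make the two proposed variants concrete, here is a small sketch under the assumption that "running" means "in service or quiesced". The `Zone` struct, the `filter_zones` function, and the `sample_zones()` fixture are hypothetical and exist only for illustration:

```rust
// Hypothetical, simplified stand-ins; not the real omicron types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Disposition { InService, Quiesced, Expunged }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ZoneKind { Crucible, Nexus }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ZoneFilter { Running, RunningCrucible }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Zone { id: u32, kind: ZoneKind, disposition: Disposition }

fn filter_zones<'a>(
    zones: &'a [Zone],
    filter: ZoneFilter,
) -> impl Iterator<Item = &'a Zone> + 'a {
    zones.iter().filter(move |z| {
        // "Running" here means in service or quiesced, i.e. not expunged.
        let running = match z.disposition {
            Disposition::InService | Disposition::Quiesced => true,
            Disposition::Expunged => false,
        };
        match filter {
            ZoneFilter::Running => running,
            ZoneFilter::RunningCrucible => running && z.kind == ZoneKind::Crucible,
        }
    })
}

// Hypothetical fixture covering each interesting combination.
fn sample_zones() -> Vec<Zone> {
    vec![
        Zone { id: 1, kind: ZoneKind::Crucible, disposition: Disposition::InService },
        Zone { id: 2, kind: ZoneKind::Crucible, disposition: Disposition::Quiesced },
        Zone { id: 3, kind: ZoneKind::Crucible, disposition: Disposition::Expunged },
        Zone { id: 4, kind: ZoneKind::Nexus, disposition: Disposition::InService },
    ]
}
```

In this sketch, ::RunningCrucible is equivalent to ::Running followed by a client-side filter on zone kind, so the dedicated variant mainly buys a more descriptive call site.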
> Ooh, or we could change the name of External to ExternallyReachable.
I like this name a lot more; I feel like it captures the intent of what we want (i.e., we filter by this before applying any changes related to making zones reachable from outside the rack).
I made the changes that @jgallagher, @sunshowers, and I discussed in a video chat.
jgallagher
left a comment
Thanks for sticking with this - I really like how the Should... names read at call sites.
I'm not actually sure these are quite correct, but putting these up there for
an early look.
TODO:
bp_omicron_zone table -- but that will be easier once the zones are a simple column on [WIP] [reconfigurator] start removing expunged zones #5211.