Skip to content

Extract upstream network configuration from PlanningInput#10601

Open
bnaecker wants to merge 2 commits into
mainfrom
ben/add-upstream-networking-to-planner-input
Open

Extract upstream network configuration from PlanningInput#10601
bnaecker wants to merge 2 commits into
mainfrom
ben/add-upstream-networking-to-planner-input

Conversation

@bnaecker

Copy link
Copy Markdown
Collaborator

@bnaecker bnaecker marked this pull request as draft June 11, 2026 03:40
- Move construction of the upstream network configuration out of the
  `BlueprintBuilder` and into the `PlanningInput`. The builder no longer
  scans all its zones to determine the upstream NTP servers, DNS
  servers, etc. Instead, read this from a combination of the DB and the
  parent blueprint at planning construction time. This will eventually
  come from the database entirely, when we persist operator decisions
  around this data to it.
- Fixes #9040
@bnaecker bnaecker force-pushed the ben/add-upstream-networking-to-planner-input branch from bdb50e0 to d07f0f1 Compare June 11, 2026 16:43
@bnaecker bnaecker marked this pull request as ready for review June 11, 2026 16:43
@bnaecker

Copy link
Copy Markdown
Collaborator Author

I tackled this issue as a small foray into Reconfigurator, in preparation for the work to run the control plane over IPv6 (or dual-stack). I took the half-measure proposed in #9040. We're still extracting most of the information from the parent or current blueprint, but that happens inside PlanningInputFromDb::assemble() rather than when we build each zone. That means we can more easily switch to reading the information from dedicated tables, instead of the current blueprint.

I think the new location of all the data here makes sense, but it took me a while to suss out what belongs on PlanningInput, Planner, and BlueprintBuilder. LMK if something seems out of place.

Comment thread nexus/types/src/deployment.rs Outdated
Comment thread nexus/types/src/deployment.rs Outdated
Comment thread nexus/types/src/deployment.rs Outdated

@jgallagher jgallagher left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up! Most of my suggested changes are of the "could we take this a bit further now" variety; I'm not 100% sure we can but I think we might be able to?

/// which we can infer the upstream configuration. (This should only be the
/// case for test blueprints - real systems always deploy at least one
/// boundary NTP zone).
pub fn upstream_ntp_config(&self) -> Option<UpstreamNtpConfig<'_>> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove these methods entirely now? If this information is coming from the planning input, I don't think we should also have methods to get it from the blueprint.

Maybe we're still using this where we load the info "from the db" (which is really "load the blueprint from the db then call this method")? I wonder if we should move these methods into that datastore implementation instead, and make them private, to make it clear that this information is supposed to be coming from PlanningInput. (This might also let us drop the Cow bits in favor of whatever type planning input wants to give us?)

/// Operator-supplied configuration for the rack's externally-facing service
/// networking.
///
///

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray extra newline

/// Adds a nexus zone on this sled.
pub fn sled_add_zone_nexus(
&mut self,
config: &OperatorNexusConfig<'_>,

@jgallagher jgallagher Jun 15, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple nits that apply to all three of the methods changed here:

  1. Stylistically I think all the rest of the sled_add_* methods take the sled ID as the first arg; could we move this to the end of the arg list (or at least after sled_id)?
  2. Since the caller is now required to provide this config (which I think is correct!), I don't think we need a separate sled_add_zone_nexus_with_config() - can we delete that one and move its impl into this one?

use std::net::IpAddr;

/// External networking configuration for Oxide-managed services, extracted from
/// the current target blueprint at `PlanningInput` construction time.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ug, I think there was a preexisting race here that this comment made obvious, sorry. PlanningInputFromDb::assemble() takes the parent_blueprint as an argument, and has a long comment about how it's important for planner cleanup. But when we call these methods as a part of assembling that input, they load the current target blueprint from the db; it's possible the blueprint they load is newer than the parent_blueprint passed to assemble().

I can't come up with a reason this could cause misbehavior; once we address all these issues and separate loading the desired config from the blueprint, we'll be loading these independently anyway. Although this highlights we probably do need some kind of generation guard on all of that config to ensure the planner doesn't act on old config (e.g., if parent_blueprint was created with "external networking operator config generation N", we need to ensure that assemble() doesn't load an external networking config generation < N).

Sorry this is kind of rambly; I'm not sure we need actual changes here. I wonder if these methods should take the blueprint as an argument? For now, it would be because it needs that for the actual data, and in the future, it could be to add a guard on the config generation? I'm not actually sure the latter is necessary though; if we've already loaded a blueprint from the db, I don't think it's possible for a subsequent read to see a config older than whatever was used to produce that blueprint.

cfg.domain.map(Cow::into_owned),
cfg.dns_servers.into_owned(),
),
None => (Vec::new(), None, Vec::new()),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably return an internal error here - this should never happen, and if it does we probably don't want to proceed with planning (although things may already be quite busted).

///
/// Uses [`Cow`] so the same type can represent both a borrowed view (e.g.,
/// returned from [`Blueprint::upstream_ntp_config`]) and an owned value
/// (e.g., stored in a `PlanningInput`).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this comment is right - it doesn't look like PlanningInput stores one of these, and instead it constructs one on demand. It looks like both construction points use Cow::Borrowed() - could these go back to plain &'a references?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Blueprint planner: Upstream network configuration should come from PlanningInput, not the parent blueprint

2 participants