Remove config of pf0hpf_if instead of setting it to down after configuration#457
Conversation
bcavnvidia
left a comment
There was a problem hiding this comment.
The logic around port_configs is a little funny and probably needs clean up. Could be now or could be later. No strong preference from me. 😄
3ad59a4 to
8c7a078
Compare
abvarshney-nv
left a comment
There was a problem hiding this comment.
I checked from dhcp-server pov, lgtm.
65a898a to
d181762
Compare
| // Note that the nvue config handles the blocking of traffic on the interface. This is only so that the host link reflects the correct state. | ||
| if let Err(err) = ethernet_virtualization::update_interface_state(&conf).await { | ||
| tracing::error!(error = format!("{err:#}"), "Updating interface state."); | ||
| } |
There was a problem hiding this comment.
Brian showed me that if we down the pf0hpf interface (in the dpu os itself) it will be reflected by the host, so instead of removing all of it, I removed the hbn stuff and targeted the new interface.
Also, I removed the cached state. I didn't feel like running ip link show was a big deal now that its not in the container and its way easier to test and the "is it down already" check is always right.
…ty to disable dhcp server
156a575 to
04538bf
Compare
bcavnvidia
left a comment
There was a problem hiding this comment.
Changes in this crate are always a little 😅 , but these changes look ok.
…uration (NVIDIA#457) When applying a new nvue config, it sets the interface to up and for the non-primary DPUs, agent was going back and bringing the interface down. This leaves a small window where the host can get a dhcp response, leaking the admin IP address assigned to the host. - removes the interface and vpcs from the nvue config -- bridge was always include in the non-fnn case. this was removed if not needed - disables dhcp server (it fails without the vpc) - ignores the dhcp server health when it is disabled - instead of bringing down the interface in the container, it brings down the interface on the dpu os, which is reflected in the host. (this is not required, but is a nice to have) - updates tests Tested both etv and fnn configs in the dev env. ## Description <!-- Describe what this PR does --> ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [X] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [X] Unit tests added/updated - [ ] Integration tests added/updated - [X] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance --> nvbug # 5931831
…uration (#457) When applying a new nvue config, it sets the interface to up and for the non-primary DPUs, agent was going back and bringing the interface down. This leaves a small window where the host can get a dhcp response, leaking the admin IP address assigned to the host. - removes the interface and vpcs from the nvue config -- bridge was always include in the non-fnn case. this was removed if not needed - disables dhcp server (it fails without the vpc) - ignores the dhcp server health when it is disabled - instead of bringing down the interface in the container, it brings down the interface on the dpu os, which is reflected in the host. (this is not required, but is a nice to have) - updates tests Tested both etv and fnn configs in the dev env. ## Description <!-- Describe what this PR does --> ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [X] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [X] Unit tests added/updated - [ ] Integration tests added/updated - [X] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance --> nvbug # 5931831
…EFIX_LIST (#538) ## Description - Rule 65535 in the DPU_TO_EVPN_DROP_PREFIX_LIST prefix-list was using route-map object syntax (action: {deny: {}}) instead of the string syntax required by prefix-list rules (action: deny), causing NVUE schema validation to reject the entire network config with the error: Config invalid at router.policy.prefix-list.DPU_TO_EVPN_DROP_PREFIX_LIST.rule.65535.action: {'deny': {}} is not of type 'string' - This rule was added in #457 to handle the case where no VPCs are configured (making rule 10 conditional), but was copy-pasted from the adjacent route-map section which uses a different schema for action - Fix updates the template and 4 test expected files ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [x] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [x] Unit tests added/updated - [ ] Integration tests added/updated - [ ] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance --> Signed-off-by: Laura Gray <lgray@nvidia.com>
…EFIX_LIST (NVIDIA#538) ## Description - Rule 65535 in the DPU_TO_EVPN_DROP_PREFIX_LIST prefix-list was using route-map object syntax (action: {deny: {}}) instead of the string syntax required by prefix-list rules (action: deny), causing NVUE schema validation to reject the entire network config with the error: Config invalid at router.policy.prefix-list.DPU_TO_EVPN_DROP_PREFIX_LIST.rule.65535.action: {'deny': {}} is not of type 'string' - This rule was added in NVIDIA#457 to handle the case where no VPCs are configured (making rule 10 conditional), but was copy-pasted from the adjacent route-map section which uses a different schema for action - Fix updates the template and 4 test expected files ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [x] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [x] Unit tests added/updated - [ ] Integration tests added/updated - [ ] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance --> Signed-off-by: Laura Gray <lgray@nvidia.com>
…EFIX_LIST (NVIDIA#538) ## Description - Rule 65535 in the DPU_TO_EVPN_DROP_PREFIX_LIST prefix-list was using route-map object syntax (action: {deny: {}}) instead of the string syntax required by prefix-list rules (action: deny), causing NVUE schema validation to reject the entire network config with the error: Config invalid at router.policy.prefix-list.DPU_TO_EVPN_DROP_PREFIX_LIST.rule.65535.action: {'deny': {}} is not of type 'string' - This rule was added in NVIDIA#457 to handle the case where no VPCs are configured (making rule 10 conditional), but was copy-pasted from the adjacent route-map section which uses a different schema for action - Fix updates the template and 4 test expected files ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [x] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [x] Unit tests added/updated - [ ] Integration tests added/updated - [ ] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance --> Signed-off-by: Laura Gray <lgray@nvidia.com> Signed-off-by: Thomas McRoberts <tmcroberts@nvidia.com>
…EFIX_LIST (#538) ## Description - Rule 65535 in the DPU_TO_EVPN_DROP_PREFIX_LIST prefix-list was using route-map object syntax (action: {deny: {}}) instead of the string syntax required by prefix-list rules (action: deny), causing NVUE schema validation to reject the entire network config with the error: Config invalid at router.policy.prefix-list.DPU_TO_EVPN_DROP_PREFIX_LIST.rule.65535.action: {'deny': {}} is not of type 'string' - This rule was added in #457 to handle the case where no VPCs are configured (making rule 10 conditional), but was copy-pasted from the adjacent route-map section which uses a different schema for action - Fix updates the template and 4 test expected files ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [x] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [x] Unit tests added/updated - [ ] Integration tests added/updated - [ ] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance --> Signed-off-by: Laura Gray <lgray@nvidia.com> Signed-off-by: Thomas McRoberts <tmcroberts@nvidia.com>
…EFIX_LIST (NVIDIA#538) ## Description - Rule 65535 in the DPU_TO_EVPN_DROP_PREFIX_LIST prefix-list was using route-map object syntax (action: {deny: {}}) instead of the string syntax required by prefix-list rules (action: deny), causing NVUE schema validation to reject the entire network config with the error: Config invalid at router.policy.prefix-list.DPU_TO_EVPN_DROP_PREFIX_LIST.rule.65535.action: {'deny': {}} is not of type 'string' - This rule was added in NVIDIA#457 to handle the case where no VPCs are configured (making rule 10 conditional), but was copy-pasted from the adjacent route-map section which uses a different schema for action - Fix updates the template and 4 test expected files ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [x] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [x] Unit tests added/updated - [ ] Integration tests added/updated - [ ] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance --> Signed-off-by: Laura Gray <lgray@nvidia.com>
## Description #457 introduced a new policy rule missing a match-block. This PR adds that missing block. ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [x] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [x] Unit tests added/updated - [ ] Integration tests added/updated - [x] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance -->
## Description #457 introduced a new policy rule missing a match-block. This PR adds that missing block. ## Type of Change <!-- Check one that best describes this PR --> - [ ] **Add** - New feature or capability - [ ] **Change** - Changes in existing functionality - [x] **Fix** - Bug fixes - [ ] **Remove** - Removed features or deprecated functionality - [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.) ## Related Issues (Optional) <!-- If applicable, provide GitHub Issue. --> ## Breaking Changes - [ ] This PR contains breaking changes <!-- If checked above, describe the breaking changes and migration steps --> ## Testing <!-- How was this tested? Check all that apply --> - [x] Unit tests added/updated - [ ] Integration tests added/updated - [x] Manual testing performed - [ ] No testing required (docs, internal refactor, etc.) ## Additional Notes <!-- Any additional context, deployment notes, or reviewer guidance -->
…r configuration (NVIDIA#457)" This reverts commit e508409.
…er configuration (NVIDIA#457)" This reverts commit 9f3c2da.
When applying a new nvue config, it sets the interface to up and for the non-primary DPUs, agent was going back and bringing the interface down. This leaves a small window where the host can get a dhcp response, leaking the admin IP address assigned to the host.
-- bridge was always include in the non-fnn case. this was removed if not needed
Tested both etv and fnn configs in the dev env.
Description
Type of Change
Related Issues (Optional)
Breaking Changes
Testing
Additional Notes
nvbug # 5931831