sandboxed containers: Add wait loop for CatalogSource readiness in env-cm step#79608
sandboxed containers: Add wait loop for CatalogSource readiness in env-cm step#79608tbuskey wants to merge 1 commit into
Conversation
env-cm creates brew-catalog CatalogSource but doesn't wait for it to be ready. If the catsrc isn't ready, subscriptions can fail. This adds wait_for_catsrc() function that polls for READY state (120s timeout, 5s intervals). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Tom Buskey <tbuskey@redhat.com>
WalkthroughA new ChangesCatalogSource readiness polling
🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tbuskey The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ci-operator/step-registry/sandboxed-containers-operator/env-cm/sandboxed-containers-operator-env-cm-commands.sh (1)
1-2:⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
wait_for_catsrcfailure is not guaranteed to stop the step.If readiness times out,
wait_for_catsrcreturns non-zero, but withoutset -euo pipefail(or an explicit|| return/exit), the script can continue and still run downstream steps.Suggested fix
#!/bin/bash +set -euo pipefail @@ - wait_for_catsrc "${CATALOG_SOURCE_NAME}" + wait_for_catsrc "${CATALOG_SOURCE_NAME}"(If you prefer not to rely on
-e, then use:)- wait_for_catsrc "${CATALOG_SOURCE_NAME}" + wait_for_catsrc "${CATALOG_SOURCE_NAME}" || exit 1As per coding guidelines: "Step registry script files must use
set -euo pipefail(without-x) as default and only enable-xwhen actively debugging".Also applies to: 125-125
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/sandboxed-containers-operator/env-cm/sandboxed-containers-operator-env-cm-commands.sh` around lines 1 - 2, Add strict shell failure handling to this step script by enabling "set -euo pipefail" at the top (immediately after the shebang) so that a non-zero return from wait_for_catsrc will stop the step; if you prefer to avoid global -e, ensure every call to wait_for_catsrc (and similar readiness/check helpers) is followed with an explicit "|| exit 1" to fail fast. Update the top of the script and any invocation sites of wait_for_catsrc to guarantee the step aborts on timeout/failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/sandboxed-containers-operator/env-cm/sandboxed-containers-operator-env-cm-commands.sh`:
- Line 91: The oc get call used to populate the local variable state (local
state=$(oc get catalogsource -n openshift-marketplace "${catsrc_name}" -o
jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null || echo ""))
can hang and bypass the overall 120s retry budget; modify that oc get invocation
to include a per-request timeout (e.g. --request-timeout=10s) so each poll
attempt is bounded and will fail fast within the loop's retry budget.
---
Outside diff comments:
In
`@ci-operator/step-registry/sandboxed-containers-operator/env-cm/sandboxed-containers-operator-env-cm-commands.sh`:
- Around line 1-2: Add strict shell failure handling to this step script by
enabling "set -euo pipefail" at the top (immediately after the shebang) so that
a non-zero return from wait_for_catsrc will stop the step; if you prefer to
avoid global -e, ensure every call to wait_for_catsrc (and similar
readiness/check helpers) is followed with an explicit "|| exit 1" to fail fast.
Update the top of the script and any invocation sites of wait_for_catsrc to
guarantee the step aborts on timeout/failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 0364a09b-5e01-4aa9-a5cd-220db599dd6b
📒 Files selected for processing (1)
ci-operator/step-registry/sandboxed-containers-operator/env-cm/sandboxed-containers-operator-env-cm-commands.sh
|
|
||
| local ready=false | ||
| for i in {1..24}; do | ||
| local state=$(oc get catalogsource -n openshift-marketplace "${catsrc_name}" -o jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null || echo "") |
There was a problem hiding this comment.
Add a request timeout to oc get inside the polling loop.
A stuck API call can block forever and bypass the intended 120s retry budget. Add --request-timeout so each poll attempt is bounded.
Suggested fix
- local state=$(oc get catalogsource -n openshift-marketplace "${catsrc_name}" -o jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null || echo "")
+ local state
+ state="$(oc --request-timeout=10s get catalogsource -n openshift-marketplace "${catsrc_name}" -o jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null || echo "")"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| local state=$(oc get catalogsource -n openshift-marketplace "${catsrc_name}" -o jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null || echo "") | |
| local state | |
| state="$(oc --request-timeout=10s get catalogsource -n openshift-marketplace "${catsrc_name}" -o jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null || echo "")" |
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 91-91: Declare and assign separately to avoid masking return values.
(SC2155)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/sandboxed-containers-operator/env-cm/sandboxed-containers-operator-env-cm-commands.sh`
at line 91, The oc get call used to populate the local variable state (local
state=$(oc get catalogsource -n openshift-marketplace "${catsrc_name}" -o
jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null || echo ""))
can hang and bypass the overall 120s retry budget; modify that oc get invocation
to include a per-request timeout (e.g. --request-timeout=10s) so each poll
attempt is bounded and will fail fast within the loop's retry budget.
|
[REHEARSALNOTIFIER]
A total of 49 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs. A full list of affected jobs can be found here Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
@tbuskey: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
env-cm creates brew-catalog CatalogSource but doesn't wait for it to be ready. If the catsrc isn't ready, subscriptions can fail.
This adds wait_for_catsrc() function that polls for READY state (120s timeout, 5s intervals).
Summary by CodeRabbit
This PR improves the reliability of the sandboxed-containers-operator's CI testing pipeline by ensuring that a CatalogSource is fully ready before dependent operations attempt to use it.
What changed
The
env-cmstep in the sandboxed-containers-operator CI configuration adds a newwait_for_catsrc()function that:openshift-marketplacenamespace until its.status.connectionState.lastObservedStatefield indicatesREADYThe step now invokes this wait function immediately after creating the CatalogSource in the Pre-GA test flow, closing a race condition where subscriptions could fail if they were created before the CatalogSource finished initializing.
Impact
This affects CI testing of the sandboxed-containers-operator in Pre-GA release scenarios. The change makes the test environment setup more robust by explicitly gating subsequent steps on the CatalogSource readiness, rather than relying on implicit timing assumptions or downstream retry mechanisms.