Merged
7 changes: 7 additions & 0 deletions runner/rbac.yaml
@@ -45,6 +45,13 @@ rules:
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "list", "watch", "create", "patch", "update"]
# Chaos Mesh Workflow: read-only. compute-target-height looks up the
# parent Workflow CR's UID so it can stamp the workflow-vars ConfigMap
# with an ownerReference. Deletion of the Workflow then cascades
# garbage-collection of the ConfigMap automatically.
- apiGroups: ["chaos-mesh.org"]
resources: ["workflows"]
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
20 changes: 7 additions & 13 deletions scenarios/README.md
@@ -207,11 +207,11 @@ namespace as the Workflow:
`$SEI_UPGRADE_NAME` per concurrent run, or treat the chain as serially
owned by one scenario at a time.

- **Cleanup:** ConfigMaps are not garbage-collected by the Workflow.
Operators clear them via the `sei.io/workflow-run` label (see Cleanup
above). A future enhancement is to set an `ownerReference` on the
ConfigMap pointing at the Workflow CR so it cascades on Workflow
deletion.
- **Cleanup:** the ConfigMap carries an `ownerReference` pointing at the
parent Workflow CR (`major-upgrade-$SEI_WORKFLOW_RUN_ID`). Deleting the
Workflow cascades garbage-collection of the ConfigMap automatically
via kube-controller-manager. Operators can still clean up by label
(`-l sei.io/workflow-run`) if multiple Workflows are torn down at once.
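
A minimal sketch of the merge patch that carries the `ownerReference` described above. The helper name `owner_ref_patch` and the placeholder run ID and UID are hypothetical, for illustration only; in the scenario the UID is resolved from the live Workflow CR with `kubectl get ... -o jsonpath='{.metadata.uid}'`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the scenario): prints the JSON merge patch
# that attaches a Workflow ownerReference to the workflow-vars ConfigMap.
#   $1 = run ID (the value of SEI_WORKFLOW_RUN_ID)
#   $2 = UID of the parent Workflow CR
owner_ref_patch() {
  run_id="$1"
  workflow_uid="$2"
  # controller and blockOwnerDeletion are false, matching the scenario's patch.
  printf '{"metadata":{"ownerReferences":[{"apiVersion":"chaos-mesh.org/v1alpha1","kind":"Workflow","name":"major-upgrade-%s","uid":"%s","controller":false,"blockOwnerDeletion":false}]}}\n' \
    "$run_id" "$workflow_uid"
}

# Placeholder values; a real UID comes from the cluster.
owner_ref_patch "run-123" "00000000-0000-0000-0000-000000000000"
```

On a live cluster, `kubectl get configmap "workflow-vars-${SEI_WORKFLOW_RUN_ID}" -o jsonpath='{.metadata.ownerReferences[*].uid}'` should print the same UID as the Workflow CR if the reference was stamped correctly.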

## Known limitations / deferred capability

@@ -244,19 +244,13 @@ namespace as the Workflow:
4. **The runner image is not yet auto-published.** Add a `runner` step to
`.github/workflows/ecr.yml` once this scenario is wired into a CI job.

5. **ConfigMap is not owner-referenced to the Workflow.** Cleanup is
manual today. A follow-up that adds an `ownerReferences` entry to
the ConfigMap (pointing at the Workflow CR) would make Workflow
deletion cascade. Punt until the runner manages the ConfigMap
lifecycle natively.

6. **Argo Workflows migration is still on the long-term roadmap.** The
5. **Argo Workflows migration is still on the long-term roadmap.** The
ConfigMap bridge is the MVP. Argo's `outputs.parameters` /
`inputs.parameters` is more ergonomic and avoids the per-run
ConfigMap garbage. Plan that migration once we have more than one
scenario worth porting.

7. **No fan-out from a single step.** The 4-vote step is hard-coded to
6. **No fan-out from a single step.** The 4-vote step is hard-coded to
4 children rather than `--per-node-selector=role=validator`. We could
collapse the four `vote-node-*` templates into one fan-out runner if
the SeiNodes carry a consistent label, but the explicit per-node form
20 changes: 19 additions & 1 deletion scenarios/major-upgrade.yaml
@@ -76,7 +76,11 @@
apiVersion: chaos-mesh.org/v1alpha1
kind: Workflow
metadata:
name: major-upgrade
# Workflow CR name carries the run ID so two concurrent applies don't
# collide on the same CR. The workflow-vars ConfigMap (see
# compute-target-height) sets ownerReferences to this CR so a Workflow
# deletion cascades to the ConfigMap.
name: major-upgrade-$SEI_WORKFLOW_RUN_ID
labels:
sei.io/scenario: major-upgrade
sei.io/workflow-run: "$SEI_WORKFLOW_RUN_ID"
@@ -146,6 +150,17 @@ spec:
POST=$((TARGET + 10))
PANIC_BOUNDARY=$((TARGET - 1))
echo "current=${CUR} target=${TARGET} post=${POST} panic_boundary=${PANIC_BOUNDARY}"
# Look up the parent Workflow's UID so we can stamp an
# ownerReference on the ConfigMap. When the Workflow CR is
# deleted, kube-controller-manager garbage-collects the
# ConfigMap automatically — no operator-managed cleanup.
WORKFLOW_UID=$(kubectl get workflow.chaos-mesh.org \
"major-upgrade-${SEI_WORKFLOW_RUN_ID}" \
-o jsonpath='{.metadata.uid}')
if [ -z "${WORKFLOW_UID}" ]; then
echo "failed to resolve Workflow UID for major-upgrade-${SEI_WORKFLOW_RUN_ID}" >&2
exit 1
fi
kubectl create configmap "workflow-vars-${SEI_WORKFLOW_RUN_ID}" \
--from-literal=TARGET_HEIGHT="${TARGET}" \
--from-literal=UPGRADE_HEIGHT="${TARGET}" \
@@ -155,6 +170,9 @@ spec:
| kubectl label -f - --local -o yaml \
sei.io/workflow-run="${SEI_WORKFLOW_RUN_ID}" \
sei.io/scenario=major-upgrade \
| kubectl patch -f - --local --type=merge --patch \
"{\"metadata\":{\"ownerReferences\":[{\"apiVersion\":\"chaos-mesh.org/v1alpha1\",\"kind\":\"Workflow\",\"name\":\"major-upgrade-${SEI_WORKFLOW_RUN_ID}\",\"uid\":\"${WORKFLOW_UID}\",\"controller\":false,\"blockOwnerDeletion\":false}]}}" \
-o yaml \
| kubectl apply -f -
env:
- {name: SEI_DEPLOYMENT, value: "$SEI_DEPLOYMENT"}