Adding Helm Parameter CI Tests #2868
Conversation
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED. Needs approval from an approver in each of these files. The full list of commands accepted by this bot can be found here.
Code Coverage Diff: This PR does not change the code coverage.
(force-pushed b3ad7d8 to b074ab6)
/test pull-aws-ebs-csi-driver-e2e-single-az
/test pull-aws-ebs-csi-driver-verify
(force-pushed b074ab6 to 3334fde)
/test pull-aws-ebs-csi-driver-e2e-single-az
```shell
param_set_legacy-compat() {
  GINKGO_FOCUS="\[param:(useOldCSIDriver|legacyXFS)\]"
  HELM_EXTRA_FLAGS="--set=useOldCSIDriver=true,node.legacyXFS=true"
}

param_set_selinux() {
  GINKGO_FOCUS="\[param:selinux\]"
  HELM_EXTRA_FLAGS="--set=node.selinux=true"
}
```
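For illustration, a minimal sketch of how a dispatcher might consume these `param_set_*` functions: call the function to populate the two variables, then hand them to the e2e runner. This is not the PR's actual script; `echo` stands in for `./hack/e2e/run.sh` so the sketch is self-contained.

```shell
# Hypothetical dispatcher sketch; only param_set_selinux's values come from the PR.
param_set_selinux() {
  GINKGO_FOCUS="\[param:selinux\]"
  HELM_EXTRA_FLAGS="--set=node.selinux=true"
}

run_param_set() {
  "param_set_$1"   # populates GINKGO_FOCUS / HELM_EXTRA_FLAGS as globals
  echo "would run with: FOCUS=${GINKGO_FOCUS} FLAGS=${HELM_EXTRA_FLAGS}"
}

run_param_set selinux
```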
These are not in the PARAMETERS_ALL test for now; I will see whether they need a separate cluster or not.
/hold Have to revert the single-az makefile target before merging
(force-pushed 3334fde to 976d61c)
/test pull-aws-ebs-csi-driver-e2e-single-az
Noticed some tests are being skipped, looking into it.
```shell
# standard - Behavioral params (tagging, metrics, logging, storage classes, etc.)
# other    - Volume modification, volume attach limit, and metadata labeler
# debug    - debugLogs=true overrides individual logLevel settings
# infra    - Infrastructure/deployment params (resources, security, strategy, etc.)
```
Why are storage classes in standard and not in infra? They are a cluster resource.
Also, why are we dividing these into categories at all if we end up running them all anyway?
(force-pushed 21ccf1e to 8437c58)

(force-pushed 8437c58 to 68a45c5)
```makefile
# THIS WILL BE REVERTED BEFORE MERGING. Only here because these tests do not
# yet have their own testgrid, but reviewers need to see the JUnit output after
# tests run for this PR.
.PHONY: e2e/single-az
e2e/single-az: bin/helm bin/ginkgo
	AWS_AVAILABILITY_ZONES=us-west-2a \
	TEST_PATH=./tests/e2e/... \
	GINKGO_FOCUS="\[ebs-csi-e2e\] \[single-az\]" \
	GINKGO_PARALLEL=5 \
	HELM_EXTRA_FLAGS="--set=controller.volumeModificationFeature.enabled=true,sidecars.provisioner.additionalArgs[0]='--feature-gates=VolumeAttributesClass=true',sidecars.resizer.additionalArgs[0]='--feature-gates=VolumeAttributesClass=true',node.enableMetrics=true" \
	./hack/e2e/run.sh
	./hack/e2e/param-sets.sh run-all
```
Thanks for leaving a comment about it; I'm including it in the review so that we don't forget to revert this before the merge.
```shell
# Merge per-set JUnit XMLs into a single file with duplicate skipped tests removed.
# Each Ginkgo run reports ALL specs (most as skipped), so the same skipped test appears
# in every per-set file. This merges all results into one file, keeping non-skipped results
# (passed/failed) over skipped duplicates, and emitting each skipped test only once.
merge_junit_results() {
  local report_dir="${REPORT_DIR:-/logs/artifacts}"
  local output="${report_dir}/junit-params.xml"

  python3 - "$report_dir" "$output" <<'PYEOF'
import glob, sys, xml.etree.ElementTree as ET
```
I'd drop this merge script and let the JUnit files stand on their own.
Testgrid handles multiple files per job fine, and keeping them separate actually helps debugging, since you can tell which param set a failure came from just from the filename. The skipped-test noise is not an issue (the external test jobs already report 7000+ skips), but if we want to clean it up properly, the right move is putting these parameter tests in their own package so Ginkgo only discovers the specs we care about.
```shell
  fi
}

# Run all standard parameter sets sequentially
```
Why are param sets running sequentially? Can we parallelize?
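As a rough sketch of the parallelization idea, param sets could be launched as background jobs and reaped with `wait`. Note this is illustrative only: in practice each set installs the chart with conflicting values, so parallel runs would each need their own namespace or cluster. Here `sleep` stands in for a real per-set run.

```shell
# Hypothetical parallel runner; set names are from the PR, the rest is a sketch.
run_one_set() { echo "start $1"; sleep 0.1; echo "done $1"; }

pids=()
for set_name in standard other debug infra; do
  run_one_set "$set_name" &   # launch each set as a background job
  pids+=("$!")
done

fail=0
for pid in "${pids[@]}"; do
  wait "$pid" || fail=1       # collect exit statuses
done
echo "all sets finished (fail=${fail})"
```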
Another thing: most of these tests don't need a live cluster. Things like "does the controller deployment have 3 replicas" or "does the node daemonset have this toleration" are assertions about rendered Kubernetes objects, not runtime behavior. We could test the vast majority with `helm template` plus assertions on the rendered YAML.
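A minimal sketch of that approach: in CI the rendered output would come from something like `helm template ./charts/aws-ebs-csi-driver --set controller.replicas=3` (chart path and value name are illustrative, not from the PR); here a heredoc stands in for it so the sketch is self-contained.

```shell
# Stand-in for `helm template ...` output; in CI this would be the real render.
rendered=$(cat <<'EOF'
kind: Deployment
metadata:
  name: ebs-csi-controller
spec:
  replicas: 3
EOF
)

# Assertion on the rendered object, no live cluster needed.
replicas=$(printf '%s\n' "$rendered" | sed -n 's/^ *replicas: *//p')
if [ "$replicas" = "3" ]; then
  echo "PASS: controller replicas=3"
else
  echo "FAIL: expected replicas=3, got '${replicas}'" >&2
  exit 1
fi
```

In a real setup a YAML-aware tool such as `yq` would be more robust than `sed` for picking fields out of the rendered manifests.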
```shell
param_set_standard() {
  GINKGO_FOCUS="\[param:(extraCreateMetadata|k8sTagClusterId|extraVolumeTags|controllerMetrics|nodeMetrics|batching|defaultFsType|controllerLoggingFormat|nodeLoggingFormat|controllerLogLevel|nodeLogLevel|provisionerLogLevel|attacherLogLevel|snapshotterLogLevel|resizerLogLevel|nodeDriverRegistrarLogLevel|storageClasses|volumeSnapshotClasses|defaultStorageClass|snapshotterForceEnable|controllerUserAgentExtra|controllerEnablePrometheusAnnotations|nodeEnablePrometheusAnnotations|nodeKubeletPath|nodeTolerateAllTaints|controllerPodDisruptionBudget|provisionerLeaderElection|attacherLeaderElection|resizerLeaderElection|reservedVolumeAttachments|hostNetwork|nodeDisableMutation|nodeTerminationGracePeriod|nodeAllocatableUpdatePeriodSeconds)\]"
  HELM_EXTRA_FLAGS="--set=controller.extraCreateMetadata=true,controller.k8sTagClusterId=e2e-param-test,controller.extraVolumeTags.TestKey=TestValue,controller.enableMetrics=true,node.enableMetrics=true,controller.batching=true,controller.defaultFsType=xfs,controller.loggingFormat=json,node.loggingFormat=json,controller.logLevel=5,node.logLevel=5,sidecars.provisioner.logLevel=5,sidecars.attacher.logLevel=5,sidecars.snapshotter.logLevel=5,sidecars.resizer.logLevel=5,sidecars.nodeDriverRegistrar.logLevel=5,defaultStorageClass.enabled=true,storageClasses[0].name=test-sc,storageClasses[0].parameters.type=gp3,volumeSnapshotClasses[0].name=test-vsc,volumeSnapshotClasses[0].deletionPolicy=Delete,sidecars.snapshotter.forceEnable=true,controller.userAgentExtra=e2e-test,controller.enablePrometheusAnnotations=true,node.enablePrometheusAnnotations=true,node.kubeletPath=/var/lib/kubelet,node.tolerateAllTaints=true,controller.podDisruptionBudget.enabled=true,sidecars.provisioner.leaderElection.enabled=true,sidecars.attacher.leaderElection.enabled=true,sidecars.resizer.leaderElection.enabled=true,node.reservedVolumeAttachments=2,node.hostNetwork=true,node.serviceAccount.disableMutation=true,node.terminationGracePeriodSeconds=60,nodeAllocatableUpdatePeriodSeconds=30"
}
```
Say someone adds a new helm value `controller.foo=bar`. Here's what they have to do:

1. Write the Go test with the right `[param:foo]` tag
2. Figure out which param set it belongs in (standard? infra? other? a new one?)
3. Add `foo` to the `GINKGO_FOCUS` regex for that param set
4. Add `controller.foo=bar` to the `HELM_EXTRA_FLAGS` string for that param set
5. Hardcode "bar" in the Go test assertion and hope it matches what they put in step 4
6. Hope they didn't typo anything in these massive 1000+ character strings

We need to simplify this and have a single source of truth for what to deploy and what to assert. I suggest defining the test values once, in one place, such as a values YAML file per param set, and having the Go tests read expected values from it (or a shared const).
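A rough sketch of that single-source-of-truth idea: one values file per param set, read by both the helm install and the test assertions. The file path and keys below are illustrative, not from the PR.

```shell
# Hypothetical per-set values file; two keys borrowed from the standard set.
values_file=$(mktemp)
cat > "$values_file" <<'EOF'
controller:
  k8sTagClusterId: e2e-param-test
  batching: true
EOF

# Deploy side (not run here): helm upgrade --install ... -f "$values_file"
# Assert side: the test reads its expected value from the same file, so the
# deployed value and the assertion can never drift apart.
expected=$(sed -n 's/^  k8sTagClusterId: *//p' "$values_file")
echo "expected cluster id: ${expected}"
rm -f "$values_file"
```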
What type of PR is this?
/kind feature
What is this PR about? / Why do we need it?
This PR adds individual tests for the various helm parameters that can be configured when deploying the driver.
How was this change tested?
Tested manually by running:
Also ran in CI under the existing single-az test grid. It will have its own test grid after this merges, but it was put under single-az so reviewers can see the results and JUnit outputs.
Does this PR introduce a user-facing change?