diff --git a/.github/workflows/release.lock.yml b/.github/workflows/release.lock.yml
index 276e8fb139f..775183b2d36 100644
--- a/.github/workflows/release.lock.yml
+++ b/.github/workflows/release.lock.yml
@@ -5977,19 +5977,19 @@ jobs:
       - name: Download Go modules
         run: go mod download
       - name: Generate SBOM (SPDX format)
-        uses: anchore/sbom-action@fbfd9c6c189226748411491745178e0c2017392d # v0
+        uses: anchore/sbom-action@fbfd9c6c189226748411491745178e0c2017392d # v0.20.10
         with:
           artifact-name: sbom.spdx.json
           format: spdx-json
           output-file: sbom.spdx.json
       - name: Generate SBOM (CycloneDX format)
-        uses: anchore/sbom-action@fbfd9c6c189226748411491745178e0c2017392d # v0
+        uses: anchore/sbom-action@fbfd9c6c189226748411491745178e0c2017392d # v0.20.10
         with:
           artifact-name: sbom.cdx.json
           format: cyclonedx-json
           output-file: sbom.cdx.json
       - name: Upload SBOM artifacts
-        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
+        uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # v5
         with:
           name: sbom-artifacts
           path: |
diff --git a/pkg/workflow/data/action_pins.json b/pkg/workflow/data/action_pins.json
index 76aee711f42..767dcd6e27e 100644
--- a/pkg/workflow/data/action_pins.json
+++ b/pkg/workflow/data/action_pins.json
@@ -70,6 +70,11 @@
       "version": "v5",
       "sha": "330a01c490aca151604b8cf639adc76d48f6c5d4"
     },
+    "anchore/sbom-action@v0": {
+      "repo": "anchore/sbom-action",
+      "version": "v0",
+      "sha": "fbfd9c6c189226748411491745178e0c2017392d"
+    },
     "anchore/sbom-action@v0.20.10": {
       "repo": "anchore/sbom-action",
       "version": "v0.20.10",
diff --git a/specs/README.md b/specs/README.md
index e5b262fe21a..dc7cd95690b 100644
--- a/specs/README.md
+++ b/specs/README.md
@@ -19,6 +19,7 @@ This directory contains design specifications and implementation documentation f
 | [YAML Version Compatibility](./yaml-version-gotchas.md) | ✅ Documented | `pkg/workflow/compiler.go` |
 | [Schema Validation](./SCHEMA_VALIDATION.md) | ✅ Documented | `pkg/parser/schemas/` |
 | [GitHub Actions Security Best Practices](./github-actions-security-best-practices.md) | ✅ Documented | Workflow security guidelines and patterns |
+| [End-to-End Feature Testing](./end-to-end-feature-testing.md) | ✅ Documented | `.github/workflows/dev.md`, `.github/workflows/dev-hawk.md` |
 
 ## Security Reviews
 
@@ -42,4 +43,4 @@ When adding new specifications:
 
 ---
 
-**Last Updated**: 2025-11-13
+**Last Updated**: 2025-12-05
diff --git a/specs/end-to-end-feature-testing.md b/specs/end-to-end-feature-testing.md
new file mode 100644
index 00000000000..c7c56d7f720
--- /dev/null
+++ b/specs/end-to-end-feature-testing.md
@@ -0,0 +1,254 @@
+# End-to-End Feature Testing for Pull Requests
+
+This document describes how human developers can test new features end-to-end using agentic workflows directly in pull requests.
+
+## Overview
+
+When developing a new feature, you can test it end-to-end by:
+
+1. Having GitHub Copilot agent modify the `dev.md` agentic workflow to use the new feature
+2. Triggering the Dev workflow in the PR branch via CLI or web UI
+3. Waiting for the Dev workflow to finish and for Dev Hawk to analyze the results
+4. Iterating based on the feedback
+
+This approach allows you to test features in a real GitHub Actions environment before merging to main.
+
+## Testing Workflow
+
+### Step 1: Instruct GitHub Copilot Agent to Modify dev.md
+
+The `dev.md` workflow is located at `.github/workflows/dev.md` and serves as a testing playground for new features.
+
+**How to request changes:**
+
+In your pull request, instruct the GitHub Copilot agent to modify the `dev.md` workflow to exercise your new feature. For example:
+
+```
+@copilot please update the dev.md workflow to test the new feature by:
+- Adding the necessary configuration in the frontmatter
+- Updating the task description to use the feature
+- Including validation steps to verify the feature works correctly
+```
+
+**What the agent should do:**
+
+- Update the frontmatter YAML configuration to enable/configure the new feature
+- Modify the task instructions to exercise the feature
+- Add verification steps if applicable
+- Ensure the workflow will demonstrate the feature working correctly
+
+**After the agent makes changes:**
+
+The agent should run `make recompile` to regenerate the `.github/workflows/dev.lock.yml` file. This compiled workflow is what GitHub Actions will execute.
+
+### Step 2: Trigger the Dev Workflow
+
+Once the `dev.md` workflow has been updated and the changes are committed to your PR branch, you can trigger the workflow in two ways:
+
+#### Option A: Using the GitHub CLI
+
+```bash
+# Trigger the workflow on your PR branch
+gh workflow run dev.md --ref your-branch-name
+
+# Check the workflow run status
+gh run list --workflow=dev.md --limit 5
+```
+
+#### Option B: Using the GitHub Web UI
+
+1. Navigate to the **Actions** tab in the GitHub repository
+2. Select the **Dev** workflow from the left sidebar
+3. Click the **Run workflow** dropdown button
+4. Select your PR branch from the **Use workflow from** dropdown
+5. Click the **Run workflow** button
+
+### Step 3: Monitor the Dev Workflow Execution
+
+After triggering the workflow:
+
+**Watch the workflow run:**
+
+- Go to the **Actions** tab to see the workflow execution in real-time
+- Click on the specific run to see detailed logs
+- Monitor for any errors or unexpected behavior
+
+**Expected outcomes:**
+
+- ✅ **Success**: The workflow completes successfully, demonstrating the feature works
+- ⚠️ **Failure**: The workflow fails, indicating an issue with the feature or configuration
+
+### Step 4: Review Dev Hawk's Analysis
+
+After the Dev workflow completes, the **Dev Hawk** workflow will automatically run (see `.github/workflows/dev-hawk.md`). Dev Hawk is a monitoring agent that:
+
+- Detects when the Dev workflow completes on `copilot/*` branches
+- Analyzes the workflow outcome (success or failure)
+- Posts a detailed comment to your pull request with:
+  - Workflow status and link to the run
+  - Root cause analysis for failures (using the `gh aw audit` tool)
+  - Error details and recommendations
+  - Actionable next steps
+
+**What to look for in Dev Hawk's comment:**
+
+- **Success report**: Confirms the feature is working as expected
+- **Failure analysis**: Identifies what went wrong with specific error messages
+- **Recommendations**: Suggests fixes or next steps to resolve issues
+
+### Step 5: Iterate Based on Feedback
+
+Based on the results:
+
+**If the workflow succeeds:**
+
+- Review the workflow output to verify the feature behaves correctly
+- Check that the feature produces expected results
+- Consider additional test scenarios if needed
+
+**If the workflow fails:**
+
+- Review Dev Hawk's analysis in the PR comment
+- Examine the specific error messages and recommendations
+- Instruct the GitHub Copilot agent to fix the issues:
+  ```
+  @copilot the dev workflow failed with [error]. Please fix the dev.md workflow to address this issue.
+  ```
+- After fixes are committed, re-trigger the Dev workflow (repeat from Step 2)
+
+**Iteration cycle:**
+
+```
+Modify dev.md → Recompile → Trigger workflow → Review results → Iterate
+```
+
+Continue this cycle until the workflow succeeds and the feature works as expected.
+
+## Example Testing Scenarios
+
+### Example 1: Testing a New Tool Integration
+
+```yaml
+---
+engine: copilot
+tools:
+  new-tool:
+    config_key: value
+---
+
+# Test New Tool Integration
+
+Use the new-tool to perform [specific task] and verify the results.
+```
+
+### Example 2: Testing New Engine Features
+
+```yaml
+---
+engine: claude
+new_feature_flag: true
+---
+
+# Test New Engine Feature
+
+Demonstrate the new engine feature by [description of test].
+```
+
+### Example 3: Testing Safe Output Enhancements
+
+```yaml
+---
+engine: copilot
+safe-outputs:
+  new-output-type:
+    max: 5
+    target: "test-*"
+---
+
+# Test New Safe Output Type
+
+Create multiple instances of the new output type to verify rate limiting and targeting work correctly.
+```
+
+## Best Practices
+
+### For Effective Testing
+
+1. **Single feature focus**: Test one feature at a time in dev.md
+2. **Clear success criteria**: Define what success looks like for the test
+3. **Include validation**: Add steps to verify the feature works correctly
+4. **Use appropriate timeouts**: Set reasonable timeout-minutes for your test
+5. **Clean up between tests**: Revert dev.md changes after testing is complete
+
+### For Better Iteration
+
+1. **Read Dev Hawk's analysis carefully**: It often identifies the exact issue
+2. **Check workflow logs directly**: Sometimes additional context is in the full logs
+3. **Test incrementally**: Start with minimal configuration, then add complexity
+4. **Document unexpected behavior**: Note any issues in the PR for discussion
+
+### For Repository Hygiene
+
+1. **Don't merge dev.md changes**: The dev.md file should remain a simple, reusable test harness
+2. **Reset dev.md after testing**: Restore it to the default configuration
+3. **Focus PR changes on the actual feature**: Keep test changes separate from feature implementation
+
+## Troubleshooting
+
+### Dev Hawk Doesn't Comment
+
+**Possible reasons:**
+
+- The workflow wasn't triggered via `workflow_dispatch`
+- The branch doesn't match the `copilot/*` pattern
+- No pull request is associated with the commit
+
+**Solution:**
+
+- Ensure you're on a branch named `copilot/your-feature-name`
+- Verify the Dev workflow was triggered manually (workflow_dispatch)
+- Confirm a pull request exists for your branch
+
+### Workflow Fails to Trigger
+
+**Possible reasons:**
+
+- The lock file wasn't regenerated after modifying dev.md
+- The branch name is incorrect
+- Permissions issues
+
+**Solution:**
+
+- Run `make recompile` to regenerate the lock file
+- Verify you have the correct branch name
+- Check that the workflow file is valid YAML
+
+### Feature Doesn't Work as Expected
+
+**Debugging steps:**
+
+1. Check the workflow logs for error messages
+2. Review Dev Hawk's root cause analysis
+3. Use `gh aw audit ` locally to investigate further
+4. Compare with existing working workflows for similar features
+
+## Related Documentation
+
+- [Dev Workflow](.github/workflows/dev.md) - The test harness workflow
+- [Dev Hawk Workflow](.github/workflows/dev-hawk.md) - The monitoring agent
+- [Testing Specification](./testing.md) - Overall testing framework
+- [Contributing Guide](../CONTRIBUTING.md) - How to contribute to gh-aw
+
+## Integration with CI/CD
+
+While this manual testing approach is useful for rapid iteration during development, features should also have:
+
+- **Unit tests**: In `pkg/*/` test files
+- **Integration tests**: Testing the feature in isolation
+- **Automated workflows**: For continuous validation
+
+The end-to-end testing described here complements (but does not replace) automated testing.
+
+---
+
+**Last Updated**: 2025-12-05