Automated Pattern Adherence Tracking System #5447
- Parse frontmatter from pattern markdown files
- Extract code-link URLs and file paths
- Add test for pattern parser

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add try-catch for individual file read failures
- Validate code-link URLs before parsing
- Warn on malformed GitHub URLs
- Check patterns directory exists with helpful error

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Recursively scan pattern directories
- Filter out non-pattern files
- Report pattern statistics

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Fetch product-directory.json via gh CLI
- Filter to forms by analytics_category
- Extract relevant form metadata

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add array validation before filtering products
- Make test assertion work for both category fields
- Add timeout to gh CLI command

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Search GitHub code API for pattern imports
- Extract application names from file paths
- Handle rate limit and search failures gracefully

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…analysis

- Fix command injection vulnerability with proper shell escaping
- Use path.parse() for robust file extension removal
- Add input validation for pattern code file parameter
- Add JSON parsing error handling
- Add timeout to prevent hanging

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
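The commit above describes shell escaping. A related technique, shown here as a minimal sketch rather than the PR's actual code, is to skip the shell entirely and pass arguments as an array via `execFile`. The helper names `baseNameOf`, `buildSearchArgs`, and `searchCode`, the query shape, and the 30-second timeout are assumptions made for illustration.

```javascript
const path = require('node:path');
const { execFile } = require('node:child_process');

// Hypothetical helper: validate the pattern code file parameter and strip its
// extension with path.parse() instead of string slicing.
function baseNameOf(codeFile) {
  if (typeof codeFile !== 'string' || codeFile.trim() === '') {
    throw new TypeError('codeFile must be a non-empty string');
  }
  return path.parse(codeFile).name;
}

// Build the argument vector for `gh api`. Each element reaches the child
// process as a separate argument, so no shell interprets quotes, semicolons,
// or backticks contained in codeFile. The query shape is an assumption.
function buildSearchArgs(codeFile, repo) {
  const query = `${baseNameOf(codeFile)} repo:${repo}`;
  return ['api', '--method', 'GET', 'search/code', '-f', `q=${query}`];
}

// Run the search with a timeout so a stalled call cannot hang the script,
// and surface JSON parsing failures as a readable error.
function searchCode(codeFile, repo, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    execFile('gh', buildSearchArgs(codeFile, repo), { timeout: timeoutMs }, (err, stdout) => {
      if (err) return reject(err);
      try {
        resolve(JSON.parse(stdout));
      } catch (parseErr) {
        reject(new Error(`Could not parse gh output: ${parseErr.message}`));
      }
    });
  });
}
```

Because no shell is involved, escaping rules never come into play; the trade-off is that shell features such as pipes are unavailable.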
- Cross-reference importing apps with form products
- Calculate usage counts and compliance percentages
- Generate structured adherence data

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add null check for pattern codeFile before analysis
- Parallelize pattern processing with Promise.all
- Add division-by-zero protection for compliance percentage
- Add JSDoc documentation for return type

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
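The division-by-zero protection mentioned above can be sketched as follows. The function name `compliancePercentage` and the rounding to a whole number are assumptions, although rounding is consistent with the figures reported later in this PR (15 of 55 forms reported as 27%).

```javascript
// Percentage of forms that use a pattern, rounded to a whole number.
// Returns 0 when there are no forms, rather than NaN from 0 / 0.
function compliancePercentage(usageCount, totalForms) {
  if (!Number.isFinite(totalForms) || totalForms <= 0) return 0;
  return Math.round((usageCount / totalForms) * 100);
}

console.log(compliancePercentage(15, 55)); // 27
console.log(compliancePercentage(11, 55)); // 20
console.log(compliancePercentage(0, 0));   // 0
```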
- Generate JSON data files for consumption
- Create markdown report with summary and matrix
- Add main execution function
- Sort patterns by usage for readability

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Remove redundant pattern lookup in detailed section
- Use lookup map for O(1) form-pattern matrix checks
- Improve performance for large datasets

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
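A lookup of the kind described above might look like the sketch below. The data shape (`{ pattern, forms }`) and the names `buildUsageLookup` and `usesPattern` are hypothetical; the point is only that a `Map` of `Set`s replaces a repeated array scan for each matrix cell.

```javascript
// Build a Map from pattern title to the Set of form identifiers that use it.
function buildUsageLookup(adherence) {
  const lookup = new Map();
  for (const entry of adherence) {
    lookup.set(entry.pattern, new Set(entry.forms));
  }
  return lookup;
}

// Constant-time check for one cell of the form-pattern matrix.
function usesPattern(lookup, pattern, formId) {
  const forms = lookup.get(pattern);
  return forms !== undefined && forms.has(formId);
}

// Illustrative data only.
const lookup = buildUsageLookup([
  { pattern: 'Dates', forms: ['simple-forms/20-10206', 'caregivers'] },
  { pattern: 'Names', forms: ['caregivers'] },
]);
console.log(usesPattern(lookup, 'Dates', 'caregivers'));            // true
console.log(usesPattern(lookup, 'Names', 'simple-forms/20-10206')); // false
```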
- Generated pattern adherence JSON data
- Created markdown report with compliance summary
- Added implementation plan documentation

Analysis shows 9 codified patterns across 55 forms:
- Dates: 27% compliance (15 forms)
- Addresses: 20% compliance (11 forms)
- Names: 20% compliance (11 forms)
- Phone numbers: 0% compliance (needs investigation)
Pull request overview
This PR implements an automated pattern adherence tracking system to monitor which VA.gov forms use codified design patterns. The core implementation is complete with a Node.js script that parses pattern metadata, fetches the product directory, analyzes code imports via GitHub API, and generates compliance reports.
Key Changes:
- Pattern metadata parser with YAML frontmatter extraction
- GitHub API integration for product directory and code search
- Automated report generation (JSON and Markdown formats)
- Test suite for validation
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| scripts/collect-pattern-adherence.js | Core tracking script with pattern discovery, import analysis, and report generation |
| scripts/test-pattern-parser.js | Test suite validating parser, discovery, and product directory integration |
| src/assets/data/metrics/pattern-adherence.json | Generated JSON report with pattern-to-forms mapping data |
| src/_data/metrics/pattern-adherence.json | Jekyll data file (duplicate of above for template access) |
| src/assets/data/metrics/pattern-adherence-report.md | Human-readable markdown report with compliance matrix |
| docs/plans/2026-01-09-pattern-adherence-tracking.md | Detailed implementation plan with task breakdown and testing guidance |
Improves pattern adherence detection accuracy and adds support for analyzing a local vets-website repository to avoid GitHub API rate limits.

Key changes:
- Add --vets-website-path CLI flag to use local repo instead of GitHub API
- Add PATTERN_EXPORT_MAPPING to identify pattern-specific exports
- Rewrite findImporters() to detect barrel export patterns correctly
- Fix regex to match combined suffixes (e.g., ssnOrVaFileNumberNoHintUI)
- Add findJSFiles() helper for recursive local file discovery
- Support both local filesystem and GitHub API modes

This fixes the issue where patterns using combined suffixes like "NoHintUI" or "NoHintSchema" were not being detected. Form 20-10206 now correctly shows SSN pattern usage.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
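The combined-suffix matching described above could be approached as in this sketch. The actual regex in the script is not shown in this PR conversation, so `exportUsageRegex` and its pattern are assumptions: an export base name, an optional capitalized modifier such as "NoHint", and a final "UI" or "Schema".

```javascript
// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a matcher for an export base name followed by an optional modifier
// and a final "UI" or "Schema" suffix, e.g. ssnOrVaFileNumberNoHintUI.
function exportUsageRegex(baseName) {
  return new RegExp(`\\b${escapeRegExp(baseName)}(?:[A-Z][A-Za-z0-9]*)?(?:UI|Schema)\\b`);
}

const ssn = exportUsageRegex('ssnOrVaFileNumber');
console.log(ssn.test('ssn: ssnOrVaFileNumberNoHintUI(),'));     // true
console.log(ssn.test('ssn: ssnOrVaFileNumberNoHintSchema,'));   // true
console.log(ssn.test('ssn: ssnOrVaFileNumberUI(),'));           // true
console.log(ssn.test('email: emailUI(),'));                     // false
```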
Updates reports with complete pattern detection results using the improved barrel export detection logic.

Key improvements over previous reports:
- Email: 0% → 42% (23 forms)
- Phone: 0% → 40% (22 forms)
- Names: 0% → 35% (19 forms)
- SSN: 0% → 35% (19 forms, including form 20-10206 ✓)
- Dates: 13% → 38% (21 forms)
- Relationship: 0% → 20% (11 forms)

Form 20-10206 now correctly shows SSN pattern usage, confirming the regex fix for combined suffixes (NoHintUI, NoHintSchema) is working as expected.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds special handling for the Signature pattern which uses declarative configuration rather than direct imports.

The Signature pattern (FormSignature.jsx) is unique:
- Located in components/ not web-component-patterns/
- Forms don't import it directly
- Forms use declarative `statementOfTruth:` config in preSubmitInfo
- Processed automatically by forms-system SubmitController

Detection approach:
- Search config/form.js files for `statementOfTruth` configuration
- Extract application name from file path
- Currently only works with local vets-website (--vets-website-path)

Results:
- Signature pattern: 0% → 25% (14 forms)
- Form 20-10206 now correctly shows Signature usage ✓
- Checked 92 form config files, found 30 matches across 8 apps

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
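The detection approach above can be sketched against a local checkout. This is an illustration under stated assumptions, not the script's implementation: the names `findFormConfigs` and `findSignatureApps` are hypothetical, and the app identifier is taken to be the folder that contains `config/form.js`.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Recursively collect every file named form.js that sits in a config/ folder.
function findFormConfigs(dir, found = []) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      if (entry.name !== 'node_modules') findFormConfigs(fullPath, found);
    } else if (entry.name === 'form.js' && path.basename(dir) === 'config') {
      found.push(fullPath);
    }
  }
  return found;
}

// Return the sorted, de-duplicated app folders whose form config contains a
// `statementOfTruth:` key. Paths are relative to applicationsDir.
function findSignatureApps(applicationsDir) {
  const apps = new Set();
  for (const configPath of findFormConfigs(applicationsDir)) {
    const source = fs.readFileSync(configPath, 'utf8');
    if (/\bstatementOfTruth\s*:/.test(source)) {
      // App folder = parent of the config/ folder that holds form.js.
      const appDir = path.dirname(path.dirname(configPath));
      apps.add(path.relative(applicationsDir, appDir).split(path.sep).join('/'));
    }
  }
  return [...apps].sort();
}
```

A text search like this cannot tell whether the key is commented out or unused, which may be worth keeping in mind when reading the resulting counts.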
Adds automated weekly tracking of pattern adherence across VA.gov forms.

Workflow features:
- Runs weekly on Mondays at 6 AM UTC
- Uses sparse checkout (src/applications only) to optimize performance
- Analyzes 9 codified patterns across 55 forms
- Automatically creates PRs when data changes
- Includes manual workflow_dispatch trigger for testing

Technical approach:
- Sparse checkout of vets-website (only src/applications, ~200MB vs ~500MB)
- Uses --vets-website-path flag for local filesystem analysis
- Supports Signature pattern detection via statementOfTruth config
- Avoids GitHub API rate limits by using local files

Automation:
- Creates dated branch (pattern-adherence-update-YYYY-MM-DD)
- Commits JSON and markdown reports
- Opens PR with compliance summary
- Includes test plan checklist

This enables automated tracking of design system pattern adoption without manual intervention.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Addresses Copilot review comments #2 and #3 from PR #5447.

Issue: Product directory data has inconsistent path formatting. Some paths have leading slashes ("/src/applications/dependents") while others don't ("src/applications/caregivers"). This caused regex matching failures when extracting application names.

Fix: Updated all three path extraction points to handle an optional leading slash:
- Line 305: findSignaturePattern() app extraction
- Line 457: findImporters() app extraction
- Line 513: buildPatternAdherence() app matching

Changed regex from:
    /^([^\/]+)/ or /^src\/applications\//
to:
    /^\/?([^\/]+)/ and /^\/?src\/applications\//

This ensures both formats are correctly processed:
    "simple-forms/..." -> "simple-forms" ✓
    "/simple-forms/..." -> "simple-forms" ✓
    "src/applications/caregivers" -> "caregivers" ✓
    "/src/applications/dependents" -> "dependents" ✓

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
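The two updated regexes quoted above can be exercised directly. The wrapper names `firstSegment` and `stripApplicationsPrefix` are hypothetical; only the regex literals come from the commit message.

```javascript
// First path segment, with or without a leading slash.
function firstSegment(relativePath) {
  const match = relativePath.match(/^\/?([^\/]+)/);
  return match ? match[1] : null;
}

// Strip the "src/applications/" prefix, with or without a leading slash.
function stripApplicationsPrefix(pathToCode) {
  return pathToCode.replace(/^\/?src\/applications\//, '');
}

console.log(firstSegment('simple-forms/20-10206/pages/ssn.js'));      // simple-forms
console.log(firstSegment('/simple-forms/20-10206/pages/ssn.js'));     // simple-forms
console.log(stripApplicationsPrefix('src/applications/caregivers'));  // caregivers
console.log(stripApplicationsPrefix('/src/applications/dependents')); // dependents
```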
Addresses Copilot review comment #4 from PR #5447.

Adds detailed documentation to scripts/README.md covering:

**Overview**:
- Script purpose and usage
- CLI options (--vets-website-path)
- Performance comparison (local vs GitHub API)

**Technical Details**:
- Pattern discovery and detection strategies
- Pattern export mapping configuration
- Regex pattern matching for various naming conventions
- Special case handling (Signature pattern)
- Path extraction with leading slash support

**Operations**:
- How the script works (step-by-step)
- Output file formats (JSON + markdown)
- GitHub Actions workflow integration
- Sparse checkout optimization

**Maintenance**:
- Troubleshooting common issues
- Known limitations
- Future enhancement ideas
- Related documentation links

This provides future maintainers with complete context for understanding and modifying the pattern adherence tracking system.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copilot Review Response

Thanks for the thorough review! All issues have been addressed:

✅ Fixed - Comments #2 & #3: Inconsistent path formatting

Issue: Product directory data has inconsistent leading slashes.

Fix (commit 0469ec6): Updated all path extraction regex patterns to handle optional leading slashes:

    // Before
    /^([^\/]+)/ or /^src\/applications\//
    // After
    /^\/?([^\/]+)/ and /^\/?src\/applications\//

Now correctly extracts app names from both formats.

ℹ️ Obsolete - Comment #1: filenameWithoutExt validation

This code no longer exists. The new approach maps pattern files to their exported names and searches for actual export usage via regex matching, so no search query is constructed from filenames.

✅ Fixed - Comment #4: Missing scripts/README.md

Added (commit 8fa03c4): Comprehensive documentation that provides complete context for future maintainers.
The pull_request trigger was intentionally removed to prevent infinite loops when the workflow commits updated metrics data back to PR branches. Added an inline comment documenting this decision and called it out in the PR description. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Rename test-pattern-parser.js to integration-test-pattern-parser.js
- Restore pull_request trigger with branches-ignore on metrics-dashboard
- Remove generated_at timestamp to avoid unnecessary weekly diffs
- Fix misleading code link in report (now links to GitHub source)
- Add separate Docs link for pattern permalink in report
- Fix plan doc architecture description: HTML → JSON + markdown
- Update plan doc references to renamed integration test file

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Add Jest unit tests for parsePatternMetadata, isFormProduct, and generateMarkdownReport (23 tests)
- Update README form matching docs to show 2-segment identifiers
- Remove stale generated_at from README JSON example
- Add token for vets-website checkout in pattern-adherence workflow
- Update PR description with current compliance data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Regenerate report and JSON data with fixed code (GitHub code links, no generated_at timestamp, separate Docs links)
- Update README form matching pseudo-code to show normalization
- Fix metrics-dashboard comment to match actual branches-ignore behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    filename,
    title,
    permalink,
    hasCodeLink: true,
When code-link exists but the /blob/(main|master)/... path can’t be extracted, this returns hasCodeLink: true with codeFile: null. That pattern will still be counted as “codified” (via hasCodeLink) but skipped during import analysis, inflating total_patterns and producing misleading 0%-usage entries. Consider treating an unparseable code-link as not codified (set hasCodeLink: false) or exclude such patterns from codifiedPatterns.
    - hasCodeLink: true,
    + hasCodeLink: false,
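One way to apply the reviewer's suggestion is to derive `hasCodeLink` from whether a path was actually extracted. The sketch below assumes hypothetical helper names (`extractCodeFile`, `describePattern`) and an illustrative URL; the `/blob/(main|master)/` shape comes from the comment above.

```javascript
// Extract the repository-relative file path from a GitHub blob URL, or null
// when the URL does not contain /blob/main/... or /blob/master/...
function extractCodeFile(codeLink) {
  if (typeof codeLink !== 'string') return null;
  const match = codeLink.match(/\/blob\/(?:main|master)\/(.+)$/);
  return match ? match[1] : null;
}

function describePattern({ filename, title, permalink, codeLink }) {
  const codeFile = extractCodeFile(codeLink);
  return {
    filename,
    title,
    permalink,
    // Only count the pattern as codified when a usable path was extracted.
    hasCodeLink: codeFile !== null,
    codeFile,
  };
}

// Illustrative URLs only.
const parsed = describePattern({
  filename: 'dates.md',
  title: 'Dates',
  permalink: '/patterns/dates',
  codeLink: 'https://github.com/example-org/example-repo/blob/main/src/patterns/datePatterns.jsx',
});
console.log(parsed.hasCodeLink, parsed.codeFile); // true src/patterns/datePatterns.jsx
```

With this shape, `hasCodeLink` and `codeFile` cannot disagree, so a pattern with an unparseable link is never counted as codified.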
    // Process all patterns in parallel
    const adherenceDataPromises = codifiedPatterns.map(async (pattern) => {
      console.log(`\nProcessing: ${pattern.title}`);

      // Find applications that import this pattern
      const importingApps = pattern.codeFile
        ? await findImporters(pattern.codeFile)
        : [];
buildPatternAdherence processes all patterns in parallel, but each findImporters run scans thousands of files under src/applications. Running these scans concurrently can create significant I/O pressure (and longer runtimes / possible EMFILE errors) in CI. Consider limiting concurrency (e.g., a small worker pool) and/or reusing a single precomputed file list / content cache across patterns.
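The "small worker pool" idea in the comment above can be sketched without extra dependencies. `mapWithConcurrency` is a hypothetical helper, not part of the PR; it runs a fixed number of lanes that each pull the next item until none remain.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the same order as the input array.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let nextIndex = 0;

  // Each lane claims the next unprocessed index, awaits it, then repeats.
  async function runLane() {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, runLane));
  return results;
}
```

In the script's terms this might replace the `codifiedPatterns.map(async ...)` plus `Promise.all` pairing with something like `mapWithConcurrency(codifiedPatterns, 3, processPattern)`, where both the limit of 3 and `processPattern` are placeholders.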
    path_to_code: form.path_to_code,
    analytics_category: form.analytics_category,
    platform_console_category: form.platform_console_category,
    github_product_label: form.github_product_label
The objects returned from getFormProducts() set fields to undefined when missing in the product directory, which causes those keys to be omitted in the generated JSON. This makes the forms schema inconsistent for consumers. Consider normalizing optional fields to null (e.g., platform_console_category: form.platform_console_category ?? null, github_product_label: form.github_product_label ?? null, etc.) so the output shape is stable.
    - path_to_code: form.path_to_code,
    - analytics_category: form.analytics_category,
    - platform_console_category: form.platform_console_category,
    - github_product_label: form.github_product_label
    + path_to_code: form.path_to_code ?? null,
    + analytics_category: form.analytics_category ?? null,
    + platform_console_category: form.platform_console_category ?? null,
    + github_product_label: form.github_product_label ?? null
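The effect the reviewer describes follows from how `JSON.stringify` treats `undefined` object properties. A short demonstration, using a hypothetical product-directory entry that lacks `github_product_label`:

```javascript
// Hypothetical entry with one optional field missing.
const form = { path_to_code: 'src/applications/caregivers' };

const withUndefined = {
  path_to_code: form.path_to_code,
  github_product_label: form.github_product_label, // undefined
};
const withNull = {
  path_to_code: form.path_to_code ?? null,
  github_product_label: form.github_product_label ?? null,
};

// Properties whose value is undefined are dropped during serialization.
console.log(JSON.stringify(withUndefined));
// {"path_to_code":"src/applications/caregivers"}

// null is serialized, so every record has the same set of keys.
console.log(JSON.stringify(withNull));
// {"path_to_code":"src/applications/caregivers","github_product_label":null}
```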
    # Skip if last commit was made by GitHub Action
    if: |
      github.event.head_commit.author.email != 'action@github.com' &&
      !contains(github.event.head_commit.message, '📊 Update pattern adherence data')
This workflow only runs on schedule and workflow_dispatch, but the job-level if references github.event.head_commit.*, which is not present for those event payloads. This can cause the expression evaluation to fail or behave unexpectedly, preventing the job from running. Consider removing this guard entirely (since there’s no push trigger) or rewriting it to use fields that exist for these events (e.g., github.actor, or gating on github.event_name == 'push').
    - # Skip if last commit was made by GitHub Action
    - if: |
    -   github.event.head_commit.author.email != 'action@github.com' &&
    -   !contains(github.event.head_commit.message, '📊 Update pattern adherence data')
    // Extract app identifier from path using up to 2 segments
    // e.g., "simple-forms/20-10206/pages/..." -> "simple-forms/20-10206"
    const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
    const appId = segments.length >= 2
      ? `${segments[0]}/${segments[1]}`
      : (segments[0] || null);
appId extraction here takes the first 2 path segments unconditionally when available (e.g., caregivers/config/form.js becomes caregivers/config). That won’t match product-directory path_to_code entries that point at the app root (e.g., src/applications/caregivers → caregivers), causing false negatives for single-segment apps. Consider normalizing importer IDs to the application root (e.g., use 2 segments only when the second segment is an actual app folder, not pages/config).
    - // Extract app identifier from path using up to 2 segments
    - // e.g., "simple-forms/20-10206/pages/..." -> "simple-forms/20-10206"
    - const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
    - const appId = segments.length >= 2
    -   ? `${segments[0]}/${segments[1]}`
    -   : (segments[0] || null);
    + // Extract app identifier from path, normalizing to application root
    + // - For two-segment apps: "simple-forms/20-10206/pages/..." -> "simple-forms/20-10206"
    + // - For single-segment apps: "caregivers/config/form.js" -> "caregivers"
    + const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
    + const [firstSegment, secondSegment] = segments;
    + let appId = null;
    + if (firstSegment) {
    +   if (!secondSegment) {
    +     // Single-segment app (e.g., "caregivers/...")
    +     appId = firstSegment;
    +   } else {
    +     // If the second segment is a non-root folder like "pages" or "config",
    +     // treat the app as a single-segment root. Otherwise, use the first two segments.
    +     const nonAppFolders = ['pages', 'config'];
    +     if (nonAppFolders.includes(secondSegment)) {
    +       appId = firstSegment;
    +     } else {
    +       appId = `${firstSegment}/${secondSegment}`;
    +     }
    +   }
    + }
    // Extract app identifier from path using up to 2 segments
    const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
    const appId = segments.length >= 2
Same as findImporters: for config paths like caregivers/config/form.js, this will record appId as caregivers/config instead of the app root caregivers, so Signature usage detection will miss single-segment apps. Consider extracting the application root consistently (e.g., first segment, or first two segments only when the second segment is not config/pages).
    - // Extract app identifier from path using up to 2 segments
    - const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
    - const appId = segments.length >= 2
    + // Extract app identifier from path, treating "config" and "pages" as structural
    + const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
    + const hasTwoSegments = segments.length >= 2;
    + const secondIsStructural = hasTwoSegments && (segments[1] === 'config' || segments[1] === 'pages');
    + const appId = hasTwoSegments && !secondIsStructural
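Since both review comments ask for the same normalization in two places, one option is a single shared helper. The sketch below is an assumption about how that could look; `extractAppId` and `STRUCTURAL_FOLDERS` are hypothetical names, and the folder list contains only the two names the reviewer mentions.

```javascript
// Folders that belong to an app's internal layout rather than naming a sub-app.
// Only "pages" and "config" come from the review comments; a real list may need more.
const STRUCTURAL_FOLDERS = new Set(['pages', 'config']);

// Normalize a path under src/applications to its application root:
//   "simple-forms/20-10206/pages/ssn.js" -> "simple-forms/20-10206"
//   "caregivers/config/form.js"          -> "caregivers"
function extractAppId(filePath) {
  const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
  const [first, second] = segments;
  if (!first) return null;
  if (!second || STRUCTURAL_FOLDERS.has(second)) return first;
  return `${first}/${second}`;
}

console.log(extractAppId('simple-forms/20-10206/pages/ssn.js')); // simple-forms/20-10206
console.log(extractAppId('/caregivers/config/form.js'));         // caregivers
```

Like the suggested changes, this still treats any other second segment as part of the app identifier, so a single-segment app with folders such as `components/` would need those names added to the set.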
Summary
Implements automated tracking of which VA.gov forms use which codified design patterns. This addresses issue #5446.
Status: ✅ Implementation complete - includes core script, workflow automation, tests, and documentation
What's Implemented
Core Script (scripts/collect-pattern-adherence.js)
GitHub Actions Workflow (.github/workflows/pattern-adherence.yml)
Generated Reports
Analysis Results (9 codified patterns across 55 forms):
Output Files:
- src/assets/data/metrics/pattern-adherence.json (27KB)
- src/_data/metrics/pattern-adherence.json (27KB - Jekyll data)
- src/assets/data/metrics/pattern-adherence-report.md (10KB)
Documentation
- docs/plans/2026-01-09-pattern-adherence-tracking.md
- scripts/README.md
Intentional behavior changes
metrics-dashboard.yml: pull_request trigger guarded
The pull_request trigger on the metrics-dashboard workflow previously caused infinite loops when the workflow committed updated metrics data back to the PR branch. A branches-ignore guard now prevents runs on automated metrics-update-* and pattern-adherence-update-* branches. A comment in the YAML explains this decision.
Notable Findings
Testing
- __tests__/collect-pattern-adherence.test.js (23 tests covering parsePatternMetadata, isFormProduct, generateMarkdownReport)
- scripts/integration-test-pattern-parser.js (requires gh CLI)
Security
✅ All security issues addressed: