Automated Pattern Adherence Tracking System #5447

Merged
humancompanion-usds merged 38 commits into main from starts-form-pattern-adherence on Feb 11, 2026

@humancompanion-usds
Collaborator

@humancompanion-usds humancompanion-usds commented Jan 9, 2026

Summary

Implements automated tracking of which VA.gov forms use which codified design patterns. This addresses issue #5446.

Status: ✅ Implementation complete - includes core script, workflow automation, tests, and documentation

What's Implemented

Core Script (scripts/collect-pattern-adherence.js)

  • ✅ Pattern metadata parser - extracts code-links from pattern frontmatter
  • ✅ Pattern discovery - recursively finds all pattern files
  • ✅ Product directory integration - fetches form list via GitHub API
  • ✅ Import analysis - searches vets-website for pattern usage via GitHub Code Search
  • ✅ Pattern-to-forms mapping - cross-references imports with forms
  • ✅ Report generation - outputs JSON + markdown reports
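
As a minimal sketch of the first step in that list, the function below pulls a `code-link` value out of a pattern file's frontmatter and keeps the repository-relative path after `/blob/main/` or `/blob/master/`. The function name, the regexes, and the sample URL are illustrative assumptions, not the code in scripts/collect-pattern-adherence.js; a production parser would more likely use a YAML library.

```javascript
// Hypothetical helper: names and regexes are assumptions for illustration.
function parseCodeLink(markdown) {
  // Frontmatter is the block between the first pair of '---' lines.
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!frontmatter) return { hasCodeLink: false, codeFile: null };

  // Look for a single-line `code-link: <url>` entry.
  const link = /^code-link:\s*["']?([^"'\s]+)["']?\s*$/m.exec(frontmatter[1]);
  if (!link) return { hasCodeLink: false, codeFile: null };

  // Keep only the path after /blob/main/ or /blob/master/.
  const file = /github\.com\/[^/]+\/[^/]+\/blob\/(?:main|master)\/(.+)$/.exec(link[1]);
  return { hasCodeLink: true, codeLink: link[1], codeFile: file ? file[1] : null };
}

// Made-up sample document; the URL is a placeholder, not a real pattern file.
const sample = [
  '---',
  'title: Example pattern',
  'code-link: https://github.com/example-org/example-repo/blob/main/src/patterns/example.js',
  '---',
  'Body text.',
].join('\n');

console.log(parseCodeLink(sample).codeFile); // src/patterns/example.js
```

Note that a `code-link` whose path cannot be extracted comes back with `codeFile: null`, so callers need to decide whether such a pattern still counts as codified.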

GitHub Actions Workflow (.github/workflows/pattern-adherence.yml)

  • ✅ Weekly scheduled run (Mondays 6 AM UTC)
  • ✅ Manual trigger support
  • ✅ Sparse checkout of vets-website for performance
  • ✅ Auto-creates PRs with updated data

Generated Reports

Analysis Results (9 codified patterns across 55 forms):

| Pattern | Compliance | Forms Using |
|---|---|---|
| Phone numbers | 31% | 17/55 |
| Addresses | 27% | 15/55 |
| Dates | 25% | 14/55 |
| SSN/VA file number | 25% | 14/55 |
| Email address | 24% | 13/55 |
| Signature | 22% | 12/55 |
| Names | 16% | 9/55 |
| Relationship | 5% | 3/55 |
| Confirmation | 0% | 0/55 ⚠️ |
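
The percentages in this table are consistent with dividing forms-using by total forms and rounding to the nearest whole percent. A small sketch of that arithmetic follows, including a guard for an empty form list; the function name is hypothetical and the script's actual rounding rule is an assumption inferred from the numbers.

```javascript
// Hypothetical helper: rounds usage to a whole percent and returns 0 when there are no forms.
function compliancePercent(formsUsing, totalForms) {
  if (!Number.isFinite(totalForms) || totalForms <= 0) return 0;
  return Math.round((formsUsing / totalForms) * 100);
}

console.log(compliancePercent(17, 55)); // 31 (Phone numbers)
console.log(compliancePercent(3, 55));  // 5 (Relationship)
console.log(compliancePercent(0, 0));   // 0 (no division by zero)
```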

Output Files:

  • src/assets/data/metrics/pattern-adherence.json (27KB)
  • src/_data/metrics/pattern-adherence.json (27KB - Jekyll data)
  • src/assets/data/metrics/pattern-adherence-report.md (10KB)

Documentation

Intentional behavior changes

metrics-dashboard.yml: pull_request trigger guarded

The pull_request trigger on the metrics-dashboard workflow previously caused infinite loops when the workflow committed updated metrics data back to the PR branch. A branches-ignore guard now prevents runs on automated metrics-update-* and pattern-adherence-update-* branches. A comment in the YAML explains this decision.

Notable Findings

⚠️ Confirmation pattern shows 0% usage - The "Keep a record of submitted information" pattern links to a README rather than a component file, so the import-based detection does not find matches. Future enhancement could add a dedicated detection strategy for this pattern.

Testing

  • ✅ Jest unit tests: __tests__/collect-pattern-adherence.test.js (23 tests covering parsePatternMetadata, isFormProduct, generateMarkdownReport)
  • ✅ Integration tests: scripts/integration-test-pattern-parser.js (requires gh CLI)
  • ✅ Manual test run successful with local vets-website

Security

✅ All security issues addressed:

  • Uses execFileSync (no shell) with argument validation
  • Input validation on all external data
  • Timeouts on all API calls
  • Error handling prevents script crashes
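
To illustrate the first and third bullets, here is a hedged sketch of calling an external program through `execFileSync` with an argument array and a timeout. The wrapper name and the validation rule are assumptions; the demo uses the current Node binary as a stand-in for the `gh` CLI so that it runs anywhere.

```javascript
const { execFileSync } = require('node:child_process');

// Hypothetical wrapper: validates arguments, then runs the program without a shell.
function runTool(file, args, timeoutMs = 30000) {
  if (!Array.isArray(args) || !args.every((a) => typeof a === 'string')) {
    throw new TypeError('args must be an array of strings');
  }
  // Arguments go straight to the program, so shell metacharacters are not interpreted.
  return execFileSync(file, args, { encoding: 'utf8', timeout: timeoutMs });
}

// Demo: the child process echoes its first argument back unchanged.
const out = runTool(process.execPath, ['-e', 'console.log(process.argv[1])', '$(echo injected)']);
console.log(out.trim()); // the literal text $(echo injected), not "injected"
```

Because no shell is involved, there is nothing to escape; the trade-off is that pipes and redirects have to be done in JavaScript instead.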

humancompanion-usds and others added 12 commits January 9, 2026 13:54
- Parse frontmatter from pattern markdown files
- Extract code-link URLs and file paths
- Add test for pattern parser

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add try-catch for individual file read failures
- Validate code-link URLs before parsing
- Warn on malformed GitHub URLs
- Check patterns directory exists with helpful error

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Recursively scan pattern directories
- Filter out non-pattern files
- Report pattern statistics

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fetch product-directory.json via gh CLI
- Filter to forms by analytics_category
- Extract relevant form metadata

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add array validation before filtering products
- Make test assertion work for both category fields
- Add timeout to gh CLI command

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Search GitHub code API for pattern imports
- Extract application names from file paths
- Handle rate limit and search failures gracefully

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…analysis

- Fix command injection vulnerability with proper shell escaping
- Use path.parse() for robust file extension removal
- Add input validation for pattern code file parameter
- Add JSON parsing error handling
- Add timeout to prevent hanging

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Cross-reference importing apps with form products
- Calculate usage counts and compliance percentages
- Generate structured adherence data

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add null check for pattern codeFile before analysis
- Parallelize pattern processing with Promise.all
- Add division-by-zero protection for compliance percentage
- Add JSDoc documentation for return type

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Generate JSON data files for consumption
- Create markdown report with summary and matrix
- Add main execution function
- Sort patterns by usage for readability

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove redundant pattern lookup in detailed section
- Use lookup map for O(1) form-pattern matrix checks
- Improve performance for large datasets

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Generated pattern adherence JSON data
- Created markdown report with compliance summary
- Added implementation plan documentation

Analysis shows 9 codified patterns across 55 forms:
- Dates: 27% compliance (15 forms)
- Addresses: 20% compliance (11 forms)
- Names: 20% compliance (11 forms)
- Phone numbers: 0% compliance (needs investigation)
Contributor

Copilot AI left a comment

Pull request overview

This PR implements an automated pattern adherence tracking system to monitor which VA.gov forms use codified design patterns. The core implementation is complete with a Node.js script that parses pattern metadata, fetches the product directory, analyzes code imports via GitHub API, and generates compliance reports.

Key Changes:

  • Pattern metadata parser with YAML frontmatter extraction
  • GitHub API integration for product directory and code search
  • Automated report generation (JSON and Markdown formats)
  • Test suite for validation

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| scripts/collect-pattern-adherence.js | Core tracking script with pattern discovery, import analysis, and report generation |
| scripts/test-pattern-parser.js | Test suite validating parser, discovery, and product directory integration |
| src/assets/data/metrics/pattern-adherence.json | Generated JSON report with pattern-to-forms mapping data |
| src/_data/metrics/pattern-adherence.json | Jekyll data file (duplicate of above for template access) |
| src/assets/data/metrics/pattern-adherence-report.md | Human-readable markdown report with compliance matrix |
| docs/plans/2026-01-09-pattern-adherence-tracking.md | Detailed implementation plan with task breakdown and testing guidance |

Comment thread scripts/collect-pattern-adherence.js Outdated
Comment thread src/assets/data/metrics/pattern-adherence.json
Comment thread scripts/collect-pattern-adherence.js Outdated
Comment thread scripts/collect-pattern-adherence.js Outdated
@humancompanion-usds humancompanion-usds self-assigned this Jan 9, 2026
humancompanion-usds and others added 4 commits January 9, 2026 15:26
Improves pattern adherence detection accuracy and adds support for
analyzing a local vets-website repository to avoid GitHub API rate limits.

Key changes:
- Add --vets-website-path CLI flag to use local repo instead of GitHub API
- Add PATTERN_EXPORT_MAPPING to identify pattern-specific exports
- Rewrite findImporters() to detect barrel export patterns correctly
- Fix regex to match combined suffixes (e.g., ssnOrVaFileNumberNoHintUI)
- Add findJSFiles() helper for recursive local file discovery
- Support both local filesystem and GitHub API modes

This fixes the issue where patterns using combined suffixes like
"NoHintUI" or "NoHintSchema" were not being detected. Form 20-10206
now correctly shows SSN pattern usage.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Updates reports with complete pattern detection results using the
improved barrel export detection logic.

Key improvements over previous reports:
- Email: 0% → 42% (23 forms)
- Phone: 0% → 40% (22 forms)
- Names: 0% → 35% (19 forms)
- SSN: 0% → 35% (19 forms, including form 20-10206 ✓)
- Dates: 13% → 38% (21 forms)
- Relationship: 0% → 20% (11 forms)

Form 20-10206 now correctly shows SSN pattern usage, confirming
the regex fix for combined suffixes (NoHintUI, NoHintSchema) is
working as expected.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds special handling for the Signature pattern which uses declarative
configuration rather than direct imports.

The Signature pattern (FormSignature.jsx) is unique:
- Located in components/ not web-component-patterns/
- Forms don't import it directly
- Forms use declarative `statementOfTruth:` config in preSubmitInfo
- Processed automatically by forms-system SubmitController

Detection approach:
- Search config/form.js files for `statementOfTruth` configuration
- Extract application name from file path
- Currently only works with local vets-website (--vets-website-path)

Results:
- Signature pattern: 0% → 25% (14 forms)
- Form 20-10206 now correctly shows Signature usage ✓
- Checked 92 form config files, found 30 matches across 8 apps

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
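
The Signature detection described in the commit above can be sketched roughly as follows. This is an illustration only: the function name is hypothetical, files are passed in memory rather than read from disk, and deriving the app identifier from everything before `config/form.js` is an assumption about how the real script maps paths to apps.

```javascript
// Hypothetical sketch: `files` is an array of { path, content } objects.
function findSignatureApps(files) {
  const apps = new Set();
  for (const { path, content } of files) {
    // Only form config files are relevant.
    if (!/(^|\/)config\/form\.jsx?$/.test(path)) continue;
    // A declarative `statementOfTruth:` key signals Signature pattern usage.
    if (!/\bstatementOfTruth\s*:/.test(content)) continue;
    // Treat everything before config/form.js, relative to src/applications, as the app.
    const app = path
      .replace(/^\/?src\/applications\//, '')
      .replace(/\/?config\/form\.jsx?$/, '');
    if (app) apps.add(app);
  }
  return [...apps].sort();
}

// Made-up file contents for demonstration.
const demoFiles = [
  { path: 'src/applications/simple-forms/20-10206/config/form.js', content: 'preSubmitInfo: { statementOfTruth: {} }' },
  { path: 'src/applications/caregivers/config/form.js', content: 'preSubmitInfo: {}' },
  { path: 'src/applications/other/pages/intro.js', content: 'statementOfTruth: {}' },
];

console.log(findSignatureApps(demoFiles)); // [ 'simple-forms/20-10206' ]
```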
Adds automated weekly tracking of pattern adherence across VA.gov forms.

Workflow features:
- Runs weekly on Mondays at 6 AM UTC
- Uses sparse checkout (src/applications only) to optimize performance
- Analyzes 9 codified patterns across 55 forms
- Automatically creates PRs when data changes
- Includes manual workflow_dispatch trigger for testing

Technical approach:
- Sparse checkout of vets-website (only src/applications, ~200MB vs ~500MB)
- Uses --vets-website-path flag for local filesystem analysis
- Supports Signature pattern detection via statementOfTruth config
- Avoids GitHub API rate limits by using local files

Automation:
- Creates dated branch (pattern-adherence-update-YYYY-MM-DD)
- Commits JSON and markdown reports
- Opens PR with compliance summary
- Includes test plan checklist

This enables automated tracking of design system pattern adoption
without manual intervention.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Comment thread scripts/collect-pattern-adherence.js Fixed
Addresses Copilot review comments #2 and #3 from PR #5447.

Issue:
Product directory data has inconsistent path formatting - some paths
have leading slashes ("/src/applications/dependents") while others
don't ("src/applications/caregivers"). This caused regex matching
failures when extracting application names.

Fix:
Updated all three path extraction points to handle optional leading slash:
- Line 305: findSignaturePattern() app extraction
- Line 457: findImporters() app extraction
- Line 513: buildPatternAdherence() app matching

Changed regex from:
  /^([^\/]+)/  or  /^src\/applications\//
to:
  /^\/?([^\/]+)/  and  /^\/?src\/applications\//

This ensures both formats are correctly processed:
  "simple-forms/..." -> "simple-forms" ✓
  "/simple-forms/..." -> "simple-forms" ✓
  "src/applications/caregivers" -> "caregivers" ✓
  "/src/applications/dependents" -> "dependents" ✓

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Addresses Copilot review comment #4 from PR #5447.

Adds detailed documentation to scripts/README.md covering:

**Overview**:
- Script purpose and usage
- CLI options (--vets-website-path)
- Performance comparison (local vs GitHub API)

**Technical Details**:
- Pattern discovery and detection strategies
- Pattern export mapping configuration
- Regex pattern matching for various naming conventions
- Special case handling (Signature pattern)
- Path extraction with leading slash support

**Operations**:
- How the script works (step-by-step)
- Output file formats (JSON + markdown)
- GitHub Actions workflow integration
- Sparse checkout optimization

**Maintenance**:
- Troubleshooting common issues
- Known limitations
- Future enhancement ideas
- Related documentation links

This provides future maintainers with complete context for
understanding and modifying the pattern adherence tracking system.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@humancompanion-usds
Collaborator Author

Copilot Review Response

Thanks for the thorough review! All issues have been addressed:

✅ Fixed - Comments #2 & #3: Inconsistent path formatting

Issue: Product directory data has inconsistent leading slashes ("/src/applications/..." vs "src/applications/..."), causing regex matching failures.

Fix (commit 0469ec6): Updated all path extraction regex patterns to handle optional leading slashes:

// Before
/^([^\/]+)/  or  /^src\/applications\//

// After  
/^\/?([^\/]+)/  and  /^\/?src\/applications\//

Now correctly extracts app names from both formats.
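
A minimal sketch of how the two "after" regexes can be combined into one helper; the function name is hypothetical and the real script applies these regexes at three separate call sites rather than through a shared function.

```javascript
// Hypothetical helper built from the two "after" regexes quoted above.
function extractAppName(pathToCode) {
  // Drop an optional leading slash plus the src/applications/ prefix, if present.
  const relative = pathToCode.replace(/^\/?src\/applications\//, '');
  // Take the first path segment, again tolerating a leading slash.
  const match = /^\/?([^\/]+)/.exec(relative);
  return match ? match[1] : null;
}

console.log(extractAppName('src/applications/caregivers'));  // caregivers
console.log(extractAppName('/src/applications/dependents')); // dependents
console.log(extractAppName('/simple-forms/20-10206/pages')); // simple-forms
```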

ℹ️ Obsolete - Comment #1: filenameWithoutExt validation

This code no longer exists. The findImporters() function was completely rewritten to use PATTERN_EXPORT_MAPPING instead of filename-based searches (commits a7118ed, e95fb36).

The new approach maps pattern files to their exported names and searches for actual export usage via regex matching - no search query construction from filenames.
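
For readers unfamiliar with the approach, a rough sketch of export-name matching is shown below. The mapping contents, the file name key, and the exact regex are assumptions made for illustration; only the `PATTERN_EXPORT_MAPPING` name and the `NoHintUI` / `NoHintSchema` suffix examples come from this PR.

```javascript
// Hypothetical mapping: the real PATTERN_EXPORT_MAPPING contents are not shown in this thread.
const PATTERN_EXPORT_MAPPING = {
  'ssnPattern.jsx': ['ssn', 'ssnOrVaFileNumber'],
};

// Build a regex matching an export base name followed by optional capitalized
// modifiers (e.g. "NoHint") and a final UI or Schema suffix.
function buildExportRegex(baseNames) {
  const alternatives = baseNames
    .map((name) => name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('|');
  return new RegExp(`\\b(?:${alternatives})(?:[A-Z][a-z0-9]*)*?(?:UI|Schema)\\b`);
}

function usesPattern(source, codeFile) {
  const names = PATTERN_EXPORT_MAPPING[codeFile];
  return Array.isArray(names) && buildExportRegex(names).test(source);
}

// The import path below is a placeholder.
const source = "import { ssnOrVaFileNumberNoHintUI } from 'example/web-component-patterns';";
console.log(usesPattern(source, 'ssnPattern.jsx'));               // true
console.log(usesPattern('const ssnippet = 1;', 'ssnPattern.jsx')); // false
```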

✅ Fixed - Comment #4: Missing scripts/README.md

Added (commit 8fa03c4): Comprehensive documentation at scripts/README.md covering:

  • Script purpose, usage, and CLI options
  • Pattern detection strategies (standard + special cases)
  • Pattern export mapping and regex matching
  • Performance comparison (local vs GitHub API)
  • GitHub Actions workflow integration
  • Troubleshooting guide and known limitations

Documentation provides complete context for future maintainers.

The pull_request trigger was intentionally removed to prevent infinite
loops when the workflow commits updated metrics data back to PR branches.
Added an inline comment documenting this decision and called it out in
the PR description.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Comment thread docs/plans/2026-01-09-pattern-adherence-tracking.md Outdated
Comment thread scripts/integration-test-pattern-parser.js
Comment thread scripts/collect-pattern-adherence.js
Comment thread scripts/collect-pattern-adherence.js Outdated
Comment thread .github/workflows/metrics-dashboard.yml
Comment thread docs/plans/2026-01-09-pattern-adherence-tracking.md Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comment thread scripts/README.md Outdated
Comment thread src/assets/data/metrics/pattern-adherence-report.md
Comment thread scripts/collect-pattern-adherence.js
Comment thread .github/workflows/pattern-adherence.yml
- Rename test-pattern-parser.js to integration-test-pattern-parser.js
- Restore pull_request trigger with branches-ignore on metrics-dashboard
- Remove generated_at timestamp to avoid unnecessary weekly diffs
- Fix misleading code link in report (now links to GitHub source)
- Add separate Docs link for pattern permalink in report
- Fix plan doc architecture description: HTML → JSON + markdown
- Update plan doc references to renamed integration test file

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Add Jest unit tests for parsePatternMetadata, isFormProduct, and
  generateMarkdownReport (23 tests)
- Update README form matching docs to show 2-segment identifiers
- Remove stale generated_at from README JSON example
- Add token for vets-website checkout in pattern-adherence workflow
- Update PR description with current compliance data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Comment thread src/assets/data/metrics/pattern-adherence-report.md
Comment thread scripts/collect-pattern-adherence.js
Comment thread scripts/README.md Outdated
Comment thread src/assets/data/metrics/pattern-adherence-report.md Outdated
Comment thread .github/workflows/metrics-dashboard.yml Outdated
- Regenerate report and JSON data with fixed code (GitHub code links,
  no generated_at timestamp, separate Docs links)
- Update README form matching pseudo-code to show normalization
- Fix metrics-dashboard comment to match actual branches-ignore behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 11, 2026 17:11
@humancompanion-usds humancompanion-usds merged commit 99335f3 into main Feb 11, 2026
9 of 10 checks passed
@humancompanion-usds humancompanion-usds deleted the starts-form-pattern-adherence branch February 11, 2026 17:13
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

filename,
title,
permalink,
hasCodeLink: true,

Copilot AI Feb 11, 2026

When code-link exists but the /blob/(main|master)/... path can’t be extracted, this returns hasCodeLink: true with codeFile: null. That pattern will still be counted as “codified” (via hasCodeLink) but skipped during import analysis, inflating total_patterns and producing misleading 0%-usage entries. Consider treating an unparseable code-link as not codified (set hasCodeLink: false) or exclude such patterns from codifiedPatterns.

Suggested change
- hasCodeLink: true,
+ hasCodeLink: false,
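
The alternative mentioned in this comment, excluding such patterns from `codifiedPatterns`, might look like the sketch below. The function name and the returned shape are hypothetical; only the `hasCodeLink`, `codeFile`, and `title` field names come from the code under review.

```javascript
// Hypothetical sketch: count a pattern as codified only when its code file path was extracted.
function selectCodifiedPatterns(patterns) {
  const codified = [];
  const skipped = [];
  for (const pattern of patterns) {
    if (pattern.hasCodeLink && typeof pattern.codeFile === 'string' && pattern.codeFile !== '') {
      codified.push(pattern);
    } else if (pattern.hasCodeLink) {
      // code-link present but unparseable: surface it instead of reporting 0% usage.
      skipped.push(pattern.title);
    }
  }
  return { codified, skipped };
}

// Made-up pattern records for demonstration.
const demoPatterns = [
  { title: 'A', hasCodeLink: true, codeFile: 'src/a.jsx' },
  { title: 'B', hasCodeLink: true, codeFile: null },
  { title: 'C', hasCodeLink: false, codeFile: null },
];
const selection = selectCodifiedPatterns(demoPatterns);
console.log(selection.codified.length, selection.skipped); // 1 [ 'B' ]
```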

Comment on lines +591 to +598
// Process all patterns in parallel
const adherenceDataPromises = codifiedPatterns.map(async (pattern) => {
  console.log(`\nProcessing: ${pattern.title}`);

  // Find applications that import this pattern
  const importingApps = pattern.codeFile
    ? await findImporters(pattern.codeFile)
    : [];

Copilot AI Feb 11, 2026

buildPatternAdherence processes all patterns in parallel, but each findImporters run scans thousands of files under src/applications. Running these scans concurrently can create significant I/O pressure (and longer runtimes / possible EMFILE errors) in CI. Consider limiting concurrency (e.g., a small worker pool) and/or reusing a single precomputed file list / content cache across patterns.
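
One way to limit concurrency without adding a dependency is a small worker pool, sketched here. The helper name and the limit of 2 are illustrative assumptions; in the script, the worker would be a call to `findImporters` for each pattern.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  async function runner() {
    // Each runner claims the next unclaimed index until the list is exhausted.
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}

// Demo: track how many workers overlap while doubling five numbers.
let active = 0;
let peak = 0;
const demo = mapWithLimit([1, 2, 3, 4, 5], 2, async (n) => {
  active += 1;
  peak = Math.max(peak, active);
  await new Promise((resolve) => setTimeout(resolve, 5));
  active -= 1;
  return n * 2;
});
demo.then((doubled) => console.log(doubled, 'peak concurrency:', peak));
```

Results keep the input order regardless of which runner finishes first, so the report ordering would not change.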

Comment on lines +283 to +286
path_to_code: form.path_to_code,
analytics_category: form.analytics_category,
platform_console_category: form.platform_console_category,
github_product_label: form.github_product_label

Copilot AI Feb 11, 2026

The objects returned from getFormProducts() set fields to undefined when missing in the product directory, which causes those keys to be omitted in the generated JSON. This makes the forms schema inconsistent for consumers. Consider normalizing optional fields to null (e.g., platform_console_category: form.platform_console_category ?? null, github_product_label: form.github_product_label ?? null, etc.) so the output shape is stable.

Suggested change
- path_to_code: form.path_to_code,
- analytics_category: form.analytics_category,
- platform_console_category: form.platform_console_category,
- github_product_label: form.github_product_label
+ path_to_code: form.path_to_code ?? null,
+ analytics_category: form.analytics_category ?? null,
+ platform_console_category: form.platform_console_category ?? null,
+ github_product_label: form.github_product_label ?? null
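
The underlying behavior is that `JSON.stringify` drops keys whose value is `undefined` but keeps keys whose value is `null`. A short sketch of the difference, using a hypothetical `normalizeForm` helper and a made-up partial record:

```javascript
// Hypothetical sketch: normalize optional fields so every form object has the same keys.
const OPTIONAL_FIELDS = [
  'path_to_code',
  'analytics_category',
  'platform_console_category',
  'github_product_label',
];

function normalizeForm(form) {
  const normalized = {};
  for (const field of OPTIONAL_FIELDS) {
    // `??` falls back only for null or undefined, so empty strings are preserved.
    normalized[field] = form[field] ?? null;
  }
  return normalized;
}

const partial = { path_to_code: 'src/applications/caregivers' };
// Undefined values disappear from the serialized output:
console.log(JSON.stringify({ github_product_label: partial.github_product_label })); // {}
// Normalized values keep a stable shape:
console.log(JSON.stringify(normalizeForm(partial)));
```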

Comment on lines +14 to +18
# Skip if last commit was made by GitHub Action
if: |
  github.event.head_commit.author.email != 'action@github.com' &&
  !contains(github.event.head_commit.message, '📊 Update pattern adherence data')


Copilot AI Feb 11, 2026

This workflow only runs on schedule and workflow_dispatch, but the job-level if references github.event.head_commit.*, which is not present for those event payloads. This can cause the expression evaluation to fail or behave unexpectedly, preventing the job from running. Consider removing this guard entirely (since there’s no push trigger) or rewriting it to use fields that exist for these events (e.g., github.actor, or gating on github.event_name == 'push').

Suggested change
- # Skip if last commit was made by GitHub Action
- if: |
-   github.event.head_commit.author.email != 'action@github.com' &&
-   !contains(github.event.head_commit.message, '📊 Update pattern adherence data')

Comment on lines +546 to +551
// Extract app identifier from path using up to 2 segments
// e.g., "simple-forms/20-10206/pages/..." -> "simple-forms/20-10206"
const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
const appId = segments.length >= 2
  ? `${segments[0]}/${segments[1]}`
  : (segments[0] || null);

Copilot AI Feb 11, 2026

appId extraction here takes the first 2 path segments unconditionally when available (e.g., caregivers/config/form.js becomes caregivers/config). That won’t match product-directory path_to_code entries that point at the app root (e.g., src/applications/caregivers -> caregivers), causing false negatives for single-segment apps. Consider normalizing importer IDs to the application root (e.g., use 2 segments only when the second segment is an actual app folder, not pages/config).

Suggested change
- // Extract app identifier from path using up to 2 segments
- // e.g., "simple-forms/20-10206/pages/..." -> "simple-forms/20-10206"
- const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
- const appId = segments.length >= 2
-   ? `${segments[0]}/${segments[1]}`
-   : (segments[0] || null);
+ // Extract app identifier from path, normalizing to application root
+ // - For two-segment apps: "simple-forms/20-10206/pages/..." -> "simple-forms/20-10206"
+ // - For single-segment apps: "caregivers/config/form.js" -> "caregivers"
+ const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
+ const [firstSegment, secondSegment] = segments;
+ let appId = null;
+ if (firstSegment) {
+   if (!secondSegment) {
+     // Single-segment app (e.g., "caregivers/...")
+     appId = firstSegment;
+   } else {
+     // If the second segment is a non-root folder like "pages" or "config",
+     // treat the app as a single-segment root. Otherwise, use the first two segments.
+     const nonAppFolders = ['pages', 'config'];
+     if (nonAppFolders.includes(secondSegment)) {
+       appId = firstSegment;
+     } else {
+       appId = `${firstSegment}/${secondSegment}`;
+     }
+   }
+ }
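
A different way to meet the same goal, offered only as a sketch, is to resolve each path against the set of app roots already known from the product directory instead of hard-coding folder names such as `pages` and `config`. The function name and the `knownRoots` parameter are hypothetical and are not part of the reviewed script.

```javascript
// Hypothetical alternative: match against known app roots rather than a fixed list of folder names.
function resolveAppId(filePath, knownRoots) {
  const segments = filePath
    .replace(/^\/?src\/applications\//, '')
    .split('/')
    .filter(Boolean);
  // Prefer the longest matching root: two segments first, then one.
  for (const depth of [2, 1]) {
    if (segments.length >= depth) {
      const candidate = segments.slice(0, depth).join('/');
      if (knownRoots.has(candidate)) return candidate;
    }
  }
  return null;
}

// Made-up roots for demonstration.
const roots = new Set(['simple-forms/20-10206', 'caregivers']);
console.log(resolveAppId('simple-forms/20-10206/pages/ssn.js', roots)); // simple-forms/20-10206
console.log(resolveAppId('caregivers/config/form.js', roots));          // caregivers
console.log(resolveAppId('/unknown/config/form.js', roots));            // null
```

This avoids misclassifying an app whose second-level folder has an unexpected name, at the cost of returning `null` for apps missing from the product directory.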

Comment on lines +362 to +364
// Extract app identifier from path using up to 2 segments
const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
const appId = segments.length >= 2

Copilot AI Feb 11, 2026

Same as findImporters: for config paths like caregivers/config/form.js, this will record appId as caregivers/config instead of the app root caregivers, so Signature usage detection will miss single-segment apps. Consider extracting the application root consistently (e.g., first segment, or first two segments only when the second segment is not config/pages).

Suggested change
- // Extract app identifier from path using up to 2 segments
- const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
- const appId = segments.length >= 2
+ // Extract app identifier from path, treating "config" and "pages" as structural
+ const segments = filePath.replace(/^\//, '').split('/').filter(Boolean);
+ const hasTwoSegments = segments.length >= 2;
+ const secondIsStructural = hasTwoSegments && (segments[1] === 'config' || segments[1] === 'pages');
+ const appId = hasTwoSegments && !secondIsStructural

5 participants