
feat: LLM Wiki generation (opt-in) + /archie-set-wiki toggle #47

Open
csacsi wants to merge 103 commits into main from feat/llm-wiki-design


@csacsi csacsi commented Apr 22, 2026

Summary

  • Adds a full LLM-generated wiki pipeline (.archie/wiki/) built from blueprint + scan data, with pages for components, data models, decisions, pitfalls, capabilities, guidelines, rules, and utilities. Incremental rebuild, lint, and viewer integration included.
  • Wiki generation is opt-in (off by default). Toggle it with the new /archie-set-wiki command (on / off / no argument for status). The setting is persisted to .archie/archie_config.json; the ARCHIE_WIKI_ENABLED env var overrides it.
  • Viewer dashboard gains a Wiki tab that auto-enables when a wiki exists, with data-models, guidelines, rules, and root pages in the sidebar.
  • Scanner emits symbols[] (Python, TS/JS, Swift functions) to power the Utilities catalog.
  • Deep-scan: Wave 1 forces re-run when blueprint is missing expected keys; adds a Data models agent; blueprint freshness check via check_blueprint_completeness.py.
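
The flag precedence described above (env var, then `.archie/archie_config.json`, then the off-by-default fallback) can be sketched roughly as follows. This is an illustration only, not the code in `renderer.py` or `wiki_builder.py`: the function name `resolve_wiki_enabled`, the accepted env spellings, and the assumption that a falsy env value also overrides a persisted `true` are all hypothetical.

```python
import json
import os
from pathlib import Path

# Assumed spellings for the env override; the PR only shows "true".
TRUTHY = {"1", "true", "yes", "on"}
FALSY = {"0", "false", "no", "off"}


def resolve_wiki_enabled(project_root, env=None):
    """Return True if wiki generation should run.

    Precedence: ARCHIE_WIKI_ENABLED > archie_config.json > default (False).
    """
    env = os.environ if env is None else env
    raw = env.get("ARCHIE_WIKI_ENABLED", "").strip().lower()
    if raw in TRUTHY:
        return True
    if raw in FALSY:
        return False

    # No usable env override: fall back to the persisted config file.
    config_path = Path(project_root) / ".archie" / "archie_config.json"
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # missing or unreadable config means "not opted in"
    if not isinstance(config, dict):
        return False
    return config.get("wiki_enabled") is True
```

Treating a missing or malformed config file as "disabled" keeps a fresh project on the opt-in default without raising.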

What's in

  • wiki_builder.py, wiki_index.py, full wiki page templates (capabilities, data-models, decisions, pitfalls, guidelines, rules, components, root pages, diagram)
  • /archie-set-wiki slash command + intent_layer.py config get|set subcommand (allowlisted, currently wiki_enabled)
  • Viewer Wiki tab (share/viewer + archie/standalone/viewer.py)
  • Scanner symbols[] extraction for Python / TS / JS / Swift
  • Blueprint data_models[] synthesis + Wave 1 Data models agent
  • Bugfix: renderer.py + wiki_builder.py now read .archie/archie_config.json (matches what config set writes — the old .archie/archie.json path never existed, so the flag was silently dead)

Merge prep

`main` was merged into this branch in `c65e5ae` to pull in Gabor's telemetry, structured findings, and pitfall pipeline. Conflicts were resolved by preserving both sets of logic.

Test plan

  • `/archie-set-wiki` shows `disabled (default)` on a fresh project
  • `/archie-set-wiki on` persists and wiki builds on next scan
  • `/archie-set-wiki off` disables and subsequent scans skip wiki
  • `ARCHIE_WIKI_ENABLED=true` overrides a persisted `false`
  • Viewer Wiki tab auto-enables when `.archie/wiki/` exists
  • `/archie-scan` + `/archie-deep-scan` still succeed end-to-end with wiki both on and off
  • `scripts/verify_sync.py` passes (21 scripts, 5 commands)

🤖 Generated with Claude Code

csacsi and others added 30 commits April 17, 2026 12:30
Notes that Tobi Lütke's qmd is orthogonal (read-side retrieval) and can
be used by users over .archie/wiki/ without any change to Archie.
Lists v1.2+ candidates (wiki-query skill, MCP server, Karpathy log.md,
viewer graph view).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan 1: Core builder — wiki_builder.py + wiki_index.py generate
        .archie/wiki/** from blueprint.json, Referenced-by sections,
        renderer.py CLAUDE/AGENTS patches, feature flag.
Plan 2: Capabilities agent — new Wave 1 agent, blueprint.capabilities[]
        synthesis, capability pages and promoted index section.
Plan 3: Incremental + lint — scan-diff-driven page refresh with
        SHA256 gating, scoped capabilities re-run, five lint kinds.
Plan 4: Viewer — /wiki/* route, markdown-to-HTML renderer, sidebar
        from backlinks, --with-wiki-ui flag.

Each plan is independently shippable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Appends build_wiki(), _build_slug_map(), _write(), and main() to wiki_builder.py
to orchestrate Pass 1 end-to-end. Also extends slugify() with CamelCase/PascalCase
splitting so component names like UserService produce correct user-service slugs.
Adds 2 integration tests (13 total pass).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
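
As a rough illustration of the CamelCase/PascalCase splitting mentioned in this commit, a slug function could look like the sketch below. The name `slugify` comes from the commit message, but the body is an assumption, not the implementation in `wiki_builder.py`.

```python
import re


def slugify(name: str) -> str:
    """Turn a component name into a lowercase, hyphen-separated slug."""
    # Split a lowercase letter or digit followed by an uppercase letter:
    # "UserService" -> "User Service".
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    # Split an acronym from the capitalized word that follows it:
    # "HTTPClient" -> "HTTP Client".
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", s)
    # Collapse every run of non-alphanumeric characters into one hyphen.
    s = re.sub(r"[^A-Za-z0-9]+", "-", s).strip("-")
    return s.lower()


print(slugify("UserService"))  # user-service
```
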
Implements Pass 2 module wiki_index.py: extract_links parses relative .md
links from pages, build_backlinks inverts them into a stable backlinks graph.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
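
The link extraction and inversion described in this commit might be sketched as below. The function names match the commit message, but the signatures, the link regex, and the `{page_path: markdown}` input shape are assumptions made for illustration.

```python
import posixpath
import re

# Matches [text](target.md) and [text](target.md#anchor); captures the target.
_LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s#]+\.md)(?:#[^)\s]*)?\)")


def extract_links(page_path, markdown):
    """Return wiki-relative targets of the relative .md links in one page."""
    base = posixpath.dirname(page_path)
    targets = []
    for raw in _LINK_RE.findall(markdown):
        if "://" in raw or raw.startswith("/"):
            continue  # skip absolute URLs and absolute paths
        targets.append(posixpath.normpath(posixpath.join(base, raw)))
    return targets


def build_backlinks(pages):
    """Invert outgoing links into {target: sorted pages that link to it}."""
    inverted = {}
    for source in sorted(pages):
        for target in extract_links(source, pages[source]):
            # Ignore self-links and links to pages that do not exist.
            if target != source and target in pages:
                inverted.setdefault(target, set()).add(source)
    return {target: sorted(srcs) for target, srcs in sorted(inverted.items())}
```

Sorting both the keys and each source list is one way to keep the resulting graph stable across runs, so a serialized copy does not churn in diffs.
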
Pass 2 in build_wiki: build_backlinks → write_backlinks → inject_referenced_by
→ write_provenance. Pages with inbound links get an idempotent '## Referenced by'
block (hidden marker for re-injection). _meta/backlinks.json + provenance.json emitted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
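
One way to make the '## Referenced by' injection idempotent is to strip any previously injected block, located by its hidden marker, before appending a fresh one. The sketch below assumes a hypothetical HTML-comment marker and assumes `sources` already holds hrefs relative to the page; the real marker text and link formatting are not shown in this PR.

```python
import re

# Hypothetical hidden marker; the actual marker string may differ.
MARKER = "<!-- archie:referenced-by -->"
_BLOCK_RE = re.compile(r"\n*" + re.escape(MARKER) + r".*\Z", re.DOTALL)


def inject_referenced_by(markdown, sources):
    """Append (or replace) the trailing '## Referenced by' block."""
    # Drop an existing injected block so repeated runs do not stack up.
    body = _BLOCK_RE.sub("", markdown).rstrip("\n")
    if not sources:
        return body + "\n"
    lines = [MARKER, "## Referenced by", ""]
    lines += [f"- [{src}]({src})" for src in sorted(sources)]
    return body + "\n\n" + "\n".join(lines) + "\n"
```

Running the function twice with the same inputs yields the same page, and running it with no sources removes a stale block.
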
Adds wiki_enabled(), claude_md_wiki_pointer(), and agents_md_wiki_section()
helpers to renderer.py; injects wiki pointers into CLAUDE.md and AGENTS.md
output when the flag is on (default).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Step 8 (Build the LLM Wiki) after the Intent Layer phase, renumber
old steps 8→9 and 9→10, and add a Wiki summary line to the scan report
template with provenance.json as the data source.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add "Before you implement anything" section above "Browse by type" in
render_index when capabilities exist; falls back to the prior blockquote
when none are present.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Inserts a Capabilities agent into the deep-scan Wave 1 parallel block (after the UI Layer agent). Includes trigger-condition guard, output wired to /tmp/archie_agent_capabilities.json, Step 4 save + merge references, and cleanup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add merge_capabilities() to validate and append Capabilities agent output,
cross-referencing uses_components/constrained_by_decisions/related_pitfalls
against known blueprint names; wire into CLI as the 5th file arg.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
csacsi and others added 29 commits April 18, 2026 22:00
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds merge_data_models() and _load_data_models_file() to merge.py, mirroring
merge_capabilities(). Extends CLI argv routing to detect and process a trailing
data_models file. All three new TDD tests pass; 112 tests green with no regressions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds test_wiki_data_models_end_to_end that runs one full builder pass and
asserts all three integration points together: data-model pages exist,
component pages carry the Data models section, and the index emits both
the browse-by-type bullet and the dedicated section.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three small fixture files (Swift, TypeScript, Python) with mixed public
and private functions for T2-T4 extractor tests to assert against.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add _extract_swift_functions() helper to standalone scanner with two-pass
regex extraction: top-level public funcs + public extension methods. Private,
fileprivate, internal, and unmodified funcs are skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add _extract_typescript_functions helper that matches exported function
declarations and export const arrow functions, skipping private/unexported
and underscore-prefixed symbols. Strict TDD: 2 new tests confirm green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
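
A regex-based extractor along the lines described in this commit could look like the sketch below. It is written in Python on the assumption that the scanner is Python; the function name, the patterns, and the returned name list are illustrative and will miss less common TypeScript forms (for example, arrow functions with a function-typed annotation).

```python
import re

# `export function foo(`, `export async function foo(`, `export default function foo(`
_FUNC_RE = re.compile(
    r"^export\s+(?:default\s+)?(?:async\s+)?function\s*\*?\s*([A-Za-z_$][\w$]*)",
    re.MULTILINE,
)
# `export const foo = (args) => ...` or `export const foo = async x => ...`
_ARROW_RE = re.compile(
    r"^export\s+const\s+([A-Za-z_$][\w$]*)\s*=\s*(?:async\s+)?"
    r"(?:\([^)]*\)|[A-Za-z_$][\w$]*)\s*(?::\s*[^=]+?)?\s*=>",
    re.MULTILINE,
)


def extract_typescript_functions(source):
    """Return names of exported functions, in source order."""
    hits = []
    for pattern in (_FUNC_RE, _ARROW_RE):
        for match in pattern.finditer(source):
            name = match.group(1)
            if name.startswith("_"):
                continue  # underscore-prefixed symbols are treated as private
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]
```
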
Add _extract_python_functions helper that mirrors the Swift and TypeScript
extractors: regex-matches column-0 def/async-def lines, skips underscore
names, and returns the symbols[] schema shape (exported, language, kind).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
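
The column-0 matching described in this commit can be illustrated with a short sketch. The `exported`, `language`, and `kind` fields come from the commit message; the `name`, `file`, and `line` fields, the function name, and the exact regex are assumptions.

```python
import re

# Anchored at column 0, so indented defs (methods, nested functions) are skipped.
_DEF_RE = re.compile(r"^(?:async\s+)?def\s+([A-Za-z_]\w*)\s*\(", re.MULTILINE)


def extract_python_functions(source, path):
    """Return symbols[] entries for public module-level functions."""
    symbols = []
    for match in _DEF_RE.finditer(source):
        name = match.group(1)
        if name.startswith("_"):
            continue  # underscore names are treated as private
        symbols.append({
            "name": name,
            "file": path,
            # 1-based line number of the def statement.
            "line": source.count("\n", 0, match.start()) + 1,
            "language": "python",
            "kind": "function",
            "exported": True,
        })
    return symbols
```
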
Wire Swift/TS/Python language extractors into run_scan() via a new
extract_symbols() helper; filter test files by path patterns and emit a
top-level symbols[] key in the scan result for wiki_builder (T6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add _categorize_symbol() and render_utilities_catalog() to wiki_builder.py,
wire into build_wiki() to emit utilities.md from scan.json.symbols[].
Strict TDD: unit + integration tests added and passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends _page_type_from_dir to map data-models/ and guidelines/ and
rules/ directories, plus single-file root-level pages (utilities.md,
technology.md, etc.) so auto-injected Referenced-by labels are correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add _normalize_field_type() to map language-specific type notations
(Optional<T>, T?, T | None, capitalized primitives) to canonical form,
and wire it into render_data_model so the Fields table shows
"canonical (raw)" when they differ.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
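
To make the normalization concrete, here is a minimal sketch. The canonical spellings (lowercase primitives, a trailing `?` for optional types) and the helper names are assumptions; the commit does not say what canonical form `_normalize_field_type()` actually produces.

```python
import re

# Assumed canonical primitive names.
_PRIMITIVES = {
    "string": "string", "str": "string",
    "int": "int", "integer": "int",
    "bool": "bool", "boolean": "bool",
    "float": "float", "double": "float",
}
_NULLISH = {"None", "null", "undefined"}


def normalize_field_type(raw):
    """Map Optional<T>, T?, T | None and capitalized primitives to one form."""
    t = raw.strip()
    optional = False
    wrapped = re.fullmatch(r"Optional\s*[<\[]\s*(.+?)\s*[>\]]", t)
    if wrapped:
        t, optional = wrapped.group(1), True
    elif t.endswith("?"):
        t, optional = t[:-1].strip(), True
    else:
        parts = [p.strip() for p in t.split("|")]
        rest = [p for p in parts if p not in _NULLISH]
        if len(rest) == 1 and len(parts) > 1:
            t, optional = rest[0], True
    base = _PRIMITIVES.get(t.lower(), t)
    return base + "?" if optional else base


def display_type(raw):
    """Render 'canonical (raw)' when the two differ, else just the type."""
    raw = raw.strip()
    canonical = normalize_field_type(raw)
    return canonical if canonical == raw else f"{canonical} ({raw})"
```
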
Adds _extract_data_model_refs helper and a ## Related models section to
data-model pages, linking any field whose raw type string contains a known
model name (word-boundary regex). Self-references and unknown types are
silently skipped. Fixture extended with bidirectional User↔Session refs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
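
The word-boundary matching described in this commit might look like the following sketch. The model shape (`name` plus a `fields` list of `{name, type}` dicts) and the function signature are assumptions about the blueprint's `data_models[]` entries.

```python
import re


def extract_data_model_refs(model, known_names):
    """Return sorted names of other known models referenced by field types."""
    own = model["name"]
    refs = set()
    for field in model.get("fields", []):
        raw_type = field.get("type", "")
        for name in known_names:
            if name == own:
                continue  # self-references are skipped
            # \b keeps "User" from matching inside "UserSettings".
            if re.search(r"\b" + re.escape(name) + r"\b", raw_type):
                refs.add(name)
    return sorted(refs)
```
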
Tiny standalone script that verifies .archie/blueprint.json contains all
expected top-level keys, returning OK/MISSING/MALFORMED/STALE with exit
codes suitable for pipeline use. Covered by 6 TDD tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
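
A completeness check with these four outcomes could be sketched as below. The expected key list, the numeric exit codes, and the reading of STALE as "valid JSON that lacks expected keys" are all assumptions; the commit only names the statuses.

```python
import json
from pathlib import Path

# Assumed key list and exit codes, for illustration only.
EXPECTED_KEYS = ("components", "decisions", "pitfalls", "capabilities", "data_models")
EXIT_CODES = {"OK": 0, "MISSING": 1, "MALFORMED": 2, "STALE": 3}


def check_blueprint(project_root):
    """Return (status, absent_keys) for .archie/blueprint.json."""
    path = Path(project_root) / ".archie" / "blueprint.json"
    if not path.is_file():
        return "MISSING", []
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except ValueError:  # includes json.JSONDecodeError
        return "MALFORMED", []
    if not isinstance(data, dict):
        return "MALFORMED", []
    absent = [key for key in EXPECTED_KEYS if key not in data]
    return ("STALE", absent) if absent else ("OK", [])
```

A command-line wrapper would print the status and call `sys.exit(EXIT_CODES[status])`, which lets a pipeline step branch on the result.
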
… sidebar

The wiki sidebar hardcoded only the original 5 page types from Plan 1
(capabilities, decisions, components, patterns, pitfalls). After
Plans 5a/5b.1/5b.2 added data-models, guidelines, rules, and root-level
single-page outputs (utilities.md, technology.md, frontend.md,
quick-reference.md, architecture.md), those sections rendered as files
on disk but never appeared in the navigation — making the new pages
unreachable from the viewer UI.

Extends _SIDEBAR_ORDER and adds a "More" group for root-level pages.
Also surfaces decisions/index.md as an Overview link at the top of the
Decisions section (parallels the structure introduced by Plan 5a).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…i exists

The viewer now mounts the wiki as a Wiki tab inside the dashboard
instead of requiring a separate URL. Tab is auto-revealed when
.archie/wiki/ exists; --with-wiki-ui flag is no longer required
(kept for backwards compatibility); --no-wiki-ui added for opt-out.

Also: render_component links Public interface entries to their
data-model pages when the name matches a known data_models[*].name
(e.g. Models component now links Place/Tag/Article to their
data-models/*.md pages instead of just listing the names).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The wiki is now a tab inside /archie-viewer, auto-revealed when
.archie/wiki/ exists, so /archie-wiki has no purpose. Removes the
canonical command and the npm-package asset, and unregisters the
command from archie.mjs. archie-viewer.md is updated to mention the
Wiki tab.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Integrate telemetry, structured findings, and pitfall pipeline from main
with the wiki generation logic on this branch. Phase 4 of /archie-scan now
runs: evolve blueprint → findings → report → satellite files → wiki
capabilities/incremental/lint → cleanup → telemetry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a persistent feature flag for wiki generation that users can toggle
without needing to know env vars or config internals.

- intent_layer.py: new `config get|set <key> [value]` subcommand with
  allowlist (wiki_enabled). Separate from scan-config write so users
  don't need to supply scope/monorepo_type.
- /archie-set-wiki: new slash command. No arg = status, `on`/`off` = set.
  Surfaces ARCHIE_WIKI_ENABLED env override if active.
- archie.mjs installer: copy the new command into target projects.

Precedence preserved: env var > archie_config.json > default (true).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
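
An allowlisted `config get|set` of the kind described in this commit might be sketched as follows. The helper names, the accepted value spellings, and the error types are hypothetical; only the allowlist entry `wiki_enabled` and the `.archie/archie_config.json` path come from this PR.

```python
import json
from pathlib import Path

ALLOWED_KEYS = {"wiki_enabled"}
# Assumed spellings accepted on the command line.
_BOOL_WORDS = {"on": True, "true": True, "off": False, "false": False}


def _config_path(project_root):
    return Path(project_root) / ".archie" / "archie_config.json"


def _load(project_root):
    try:
        data = json.loads(_config_path(project_root).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def config_get(project_root, key):
    """Return the stored value for an allowlisted key, or None if unset."""
    if key not in ALLOWED_KEYS:
        raise KeyError(f"unsupported config key: {key}")
    return _load(project_root).get(key)


def config_set(project_root, key, value):
    """Persist one allowlisted key, leaving other config entries untouched."""
    if key not in ALLOWED_KEYS:
        raise KeyError(f"unsupported config key: {key}")
    word = value.strip().lower()
    if word not in _BOOL_WORDS:
        raise ValueError(f"expected on/off/true/false, got: {value!r}")
    data = _load(project_root)
    data[key] = _BOOL_WORDS[word]
    path = _config_path(project_root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return data[key]
```

Merging into the existing file rather than overwriting it means a toggle does not disturb values such as the scan scope written by other subcommands.
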
Wiki generation now requires explicit opt-in via `/archie-set-wiki on`
or `ARCHIE_WIKI_ENABLED=true`. Default for fresh installs: no wiki
build during scan/deep-scan.

Also fixes a config-file path mismatch: renderer.py and wiki_builder.py
previously read `.archie/archie.json`, but `intent_layer.py config set`
writes to `.archie/archie_config.json`. The read path now matches the
write path, so `/archie-set-wiki on` actually takes effect end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented Apr 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| archie | Ready | Preview, Comment | Apr 22, 2026 0:50am |
