diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md deleted file mode 100644 index a80ede5..0000000 --- a/.github/copilot-instructions.md +++ /dev/null @@ -1,82 +0,0 @@ -# zenzic-doc — Agent Guidelines - -Documentation site for Zenzic, built with Docusaurus 3.10 + TypeScript + MDX. Deployed to `https://zenzic.dev`. Supports English and Italian (`en` + `it`). - -## Build & Test - -```bash -npm ci # install (once after clone) — or: just setup -npm run build # production build (runs prebuild automatically) -npm run typecheck # TypeScript check -npm run lint:ts # ESLint, max-warnings=0 -npm run lint:md # markdownlint-cli2 on docs + i18n -``` - -**Full local gate before any PR:** - -```bash -just verify # markdownlint → lint:ts → typecheck → build -``` - -Dev server: - -```bash -npm run start:en # English only (faster) -npm run start # all locales -``` - -See [justfile](../justfile) for the full recipe list. - -## Architecture - -The sidebar is **autogenerated** from the filesystem (`type: 'autogenerated'` in `sidebars.ts`). Directory hierarchy + `_category_.json` files control ordering and labels. 
- -``` -docs/ # English source content — ALL files are .mdx - usage/ # User Guide: installation, commands, advanced, badges - guides/ # Reference: engines, config, adapters, CI/CD, custom rules - community/ # Community: contributing, FAQ, license, philosophy, brand-kit - contribute/ # Contribution guides (PRs, bugs, docs issues) - internals/ # Engineering: architecture, vision, VSM engine, style guide - adr/ # Architecture Decision Records - developers/ # Adapter/plugin development guides - reference/ # API reference - security/ # Security reports -i18n/ - en/ # English theme overrides (code.json) - it/ # Italian translations — mirrors docs/ structure exactly -src/ - components/Icon.tsx # Global icon wrapper (lucide-react + SVG fallback for github) - components/Homepage/ # Hero, Features, QualityScore, SentinelSection - pages/index.tsx # Landing page monolith (ESLint-excluded, covered by typecheck) - theme/MDXComponents.js # Global swizzle: injects and site-wide -scripts/ - build-assets.js # prebuild: zips static/brand/ + static/assets/social/ → brand-kit.zip -``` - -## Key Conventions - -**Content files**: all `.mdx` — never `.md` inside `docs/` or `i18n/`. Root-level `README.md` and `RELEASE.md` are the only `.md` files. - -**Physical Consistency (Slug Law)**: never use `slug:` frontmatter to diverge from the physical file path. URLs must mirror the filesystem to preserve relative link integrity and sidebar auto-generation. The only exception is `docs/internals/vision.mdx` (legacy). - -**Sidebar labels**: use `sidebar_label` in frontmatter for every `.mdx` file. This controls the display text in the autogenerated sidebar and prevents raw heading text or anchor fragments from leaking into navigation. - -**Icons**: use `` in any `.mdx` without per-file imports. To add a new icon: import from `lucide-react` and add to `iconsMap` in [src/components/Icon.tsx](../src/components/Icon.tsx). The `github` icon is a special inline SVG case. 
Missing names render a red fallback box. - -**Tailwind**: never use dynamically interpolated class names (e.g., `` border-${color}-500 `` — JIT purges them). Use static mapping objects. - -**i18n**: when adding or renaming files, update both `docs/` and `i18n/it/docusaurus-plugin-content-docs/current/` together. Run `npm run write-translations` to regenerate `code.json` stubs. - -**Validation gate**: run `just verify` before every commit. This is the only authorized local gate — it runs markdownlint → ESLint → typecheck → production build. `onBrokenLinks: 'throw'` is active, so broken page links will fail the build. - -**Prebuild**: `scripts/build-assets.js` runs automatically before `npm run build`. It is silent on success. - -**Broken-anchor warnings**: `#global-flags`, `#virtual-site-map-vsm`, and similar warnings in build output are pre-existing — not regressions. - -**markdownlint disabled rules**: MD013 (line length), MD033 (inline HTML for JSX), MD041 (first-line heading). - -## Key Docs - -- [README.md](../README.md) — full developer guide (npm scripts, just recipes, CI/CD, security) -- [RELEASE.md](../RELEASE.md) — release process diff --git a/.github/dependabot.yml b/.github/dependabot.yml index c49871a..474261b 100644 --- a/.github/dependabot.yml +++ b/.github/dependabot.yml @@ -1,12 +1,51 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 + version: 2 updates: + # npm dependencies (Docusaurus, React, Tailwind, etc.) 
- package-ecosystem: npm directory: / schedule: interval: weekly + day: monday open-pull-requests-limit: 10 labels: - dependencies - automated commit-message: prefix: "chore(deps)" + groups: + docusaurus-all: + patterns: + - "@docusaurus/*" + - "docusaurus*" + update-types: + - minor + - patch + react-ecosystem: + patterns: + - "react" + - "react-dom" + - "@types/react*" + update-types: + - minor + - patch + + # GitHub Actions + - package-ecosystem: github-actions + directory: / + schedule: + interval: weekly + day: monday + open-pull-requests-limit: 5 + labels: + - dependencies + - github-actions + commit-message: + prefix: "ci(deps)" + groups: + actions-all: + update-types: + - minor + - patch diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 55cbd91..1d3868c 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -1,3 +1,6 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 + name: docs-ci on: @@ -44,45 +47,58 @@ concurrency: cancel-in-progress: true jobs: - validate: - name: Validate (Node ${{ matrix.node }}) + verify: + name: Verify (ubuntu-latest, Node LTS) runs-on: ubuntu-latest - env: - FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true - strategy: - fail-fast: false - matrix: - node: [22, 24] + defaults: + run: + shell: bash steps: - - name: Checkout + - uses: actions/checkout@v6 + + - name: Determine Zenzic Core Branch (Parity or Fallback) + id: resolve-branch + shell: bash + run: | + TARGET_BRANCH="${{ github.base_ref || github.ref_name }}" + echo "Target branch is: $TARGET_BRANCH" + + if git ls-remote --exit-code --heads https://github.com/PythonWoods/zenzic.git "$TARGET_BRANCH" > /dev/null 2>&1; then + echo "Branch $TARGET_BRANCH exists in core. Using it." + echo "CORE_REF=$TARGET_BRANCH" >> $GITHUB_ENV + else + echo "Branch $TARGET_BRANCH not found in core. Falling back to main." 
+ echo "CORE_REF=main" >> $GITHUB_ENV + fi + + - name: Checkout local zenzic (unreleased) uses: actions/checkout@v6 + with: + repository: PythonWoods/zenzic + ref: ${{ env.CORE_REF }} + path: _zenzic_core + + - name: Install just + uses: taiki-e/install-action@just - name: Setup Node uses: actions/setup-node@v4 with: - node-version: ${{ matrix.node }} + node-version: '24' cache: npm + - name: Install uv + uses: astral-sh/setup-uv@v8.1.0 + - name: Install dependencies run: npm ci - - name: Typecheck - run: npm run typecheck - - - name: Build documentation - run: npm run build - - sentinel: - name: Zenzic Sentinel Audit - runs-on: ubuntu-latest - - steps: - - name: Checkout - uses: actions/checkout@v6 - - - name: Install uv - uses: astral-sh/setup-uv@v7 - - - name: Zenzic Documentation Audit - run: uvx zenzic check all --engine docusaurus --strict + - name: Run unified verification + shell: bash + env: + ZENZIC_PROJECT_PATH: ./_zenzic_core + PYTHONUTF8: '1' + # ZRT-010 — Sovereign Parity: Pre-Launch Guard lives in justfile. + # Local and CI run identical 'just check' invocations. 
+ run: just verify diff --git a/.github/workflows/codeql.yml b/.github/workflows/codeql.yml index 1fdda6f..1b6c7e3 100644 --- a/.github/workflows/codeql.yml +++ b/.github/workflows/codeql.yml @@ -1,3 +1,6 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 + name: codeql on: diff --git a/.github/workflows/dependency-review.yml b/.github/workflows/dependency-review.yml index b6da43b..d74c49d 100644 --- a/.github/workflows/dependency-review.yml +++ b/.github/workflows/dependency-review.yml @@ -1,3 +1,6 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 + name: dependency-review on: diff --git a/.github/workflows/npm-audit.yml b/.github/workflows/npm-audit.yml index c8dc170..8a88387 100644 --- a/.github/workflows/npm-audit.yml +++ b/.github/workflows/npm-audit.yml @@ -1,3 +1,6 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 + name: npm-audit on: diff --git a/.github/workflows/release-docs.yml b/.github/workflows/release-docs.yml index bf95894..921a606 100644 --- a/.github/workflows/release-docs.yml +++ b/.github/workflows/release-docs.yml @@ -1,3 +1,6 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 + name: release-docs on: diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml new file mode 100644 index 0000000..560700d --- /dev/null +++ b/.github/workflows/release.yml @@ -0,0 +1,50 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 +name: release + +on: + push: + tags: + - 'v*' + +permissions: + contents: write + +concurrency: + group: release-${{ github.ref }} + cancel-in-progress: false + +jobs: + release: + name: Build docs and create GitHub Release + runs-on: ubuntu-latest + env: + FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true + + steps: + - name: Checkout + uses: actions/checkout@v6 + + - name: Setup Node + uses: actions/setup-node@v4 + with: + node-version: 24 + cache: 
npm + + - name: Install dependencies + run: npm ci + + - name: Typecheck + run: npm run typecheck + + - name: Build docs + run: npm run build + + - name: Archive build output + run: tar -czf "docs-${{ github.ref_name }}.tar.gz" build + + - name: Create GitHub Release + uses: softprops/action-gh-release@v2 + with: + files: docs-${{ github.ref_name }}.tar.gz + generate_release_notes: true diff --git a/.gitignore b/.gitignore index 1d98147..795bc8c 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,6 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 + # Dependencies /node_modules @@ -10,6 +13,8 @@ .cache-loader *.tsbuildinfo .eslintcache +_zenzic_core/ +__pycache__/ # Misc .DS_Store @@ -21,11 +26,27 @@ .env.development.local .env.test.local .env.production.local +.zenzic.local.toml *.log -# Drafts (dev.to articles, internal notes) +# Drafts (editorial, internal notes) /drafts npm-debug.log* yarn-debug.log* yarn-error.log* + +# EPOCH 4 — draft vault (git-ignored, local reference only) +.draft/ + +# --- Ephemeral Artifacts (Machine Silence) --- +zenzic-results.sarif +coverage.json +coverage.xml +.coverage +.coverage.* +mutmut* +.mutmut-cache/ +.pytest_cache/ +.nox/ +.hypothesis/ diff --git a/.markdownlint-cli2.jsonc b/.markdownlint-cli2.jsonc index 6705f95..bd5179a 100644 --- a/.markdownlint-cli2.jsonc +++ b/.markdownlint-cli2.jsonc @@ -11,15 +11,18 @@ "i18n/**/*.md", "i18n/**/*.mdx" ], - "ignores": [ - "node_modules/**", - "build/**", - ".docusaurus/**" - ], + "gitignore": true, "config": { "default": true, "MD013": false, + "MD024": { "siblings_only": true }, + "MD025": { "front_matter_title": "" }, "MD033": false, - "MD041": false + "MD036": false, + "MD037": false, + "MD041": false, + "MD003": { "style": "atx" }, + "MD010": { "code_blocks": false }, + "MD046": false } } diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 29286cb..21ddd2b 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -24,6 
+24,7 @@ repos: - id: mixed-line-ending - id: no-commit-to-branch args: ["--branch", "main"] + stages: [pre-commit] # 2. TypeScript type checking - repo: local @@ -35,22 +36,43 @@ repos: pass_filenames: false types_or: [ts, tsx] - # 3. Zenzic Sentinel — The Dogfooding Hook - # Bootstrap script: tries local core (../zenzic) first, - # falls back to uvx --pre for external contributors. - # always_run: true — config/nav changes can break docs - # even without .md edits + # 3. ESLint — catches unused disable directives, React/hooks violations + - repo: local + hooks: + - id: eslint + name: "ESLint" + entry: npm run lint:ts + language: system + pass_filenames: false + types_or: [ts, tsx] + + # 4. Zenzic Sentinel — ZRT-010 Sovereign Parity: entry-point is 'just check' + # so local and CI always run the same guard-aware invocation. + # always_run: true — config/nav changes can break docs even without .md edits - repo: local hooks: - id: zenzic-check name: "Zenzic Sentinel" - entry: bash scripts/pre-commit-zenzic.sh + entry: just check language: system pass_filenames: false always_run: true - # 4. REUSE/SPDX license compliance + # 5. REUSE/SPDX license compliance - repo: https://github.com/fsfe/reuse-tool - rev: v5.0.2 + rev: v6.2.0 hooks: - id: reuse + + # 6. Pre-push Final Guard (4-Gates Standard, EPOCH 4 / v0.7.0) + # Single entry-point: local ≡ remote. Same `just verify` runs in GHA. + # Install with: uvx pre-commit install -t pre-push + - repo: local + hooks: + - id: just-verify + name: 🛡️ Doc Final Guard (just verify) + entry: just verify + language: system + stages: [pre-push] + pass_filenames: false + always_run: true diff --git a/CHANGELOG.it.md b/CHANGELOG.it.md new file mode 100644 index 0000000..329439b --- /dev/null +++ b/CHANGELOG.it.md @@ -0,0 +1,228 @@ + + + +# Registro delle modifiche + +Tutte le modifiche rilevanti al portale di documentazione Zenzic (`zenzic-doc`) sono documentate qui. 
+Il formato segue [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). +Le versioni seguono la linea di rilascio di Zenzic Core sotto la Branch Parity Rule. + +--- + +## [0.7.0] — 2026-05-07 — Quartz Maturity (Stable) + +> **Fonte autorevole:** [zenzic.dev](https://zenzic.dev). Questo file è la +> controparte machine-readable di [`RELEASE.md`](RELEASE.md) e segue la stessa +> Branch Parity Rule del changelog di Zenzic Core. + +#### Aggiunto + +- **Sprint Editoriale A — Sovranità Zero-Config**: Tutorial `docs/tutorials/first-audit.mdx` + (EN+IT) aggiornato per documentare la blog auto-discovery senza configurazione manuale. + `uvx zenzic check all` include i post del blog nel perimetro di default; il tutorial + lo dimostra esplicitamente con una nota `ContentRoot` blog: `blog/` rilevata tramite + `docusaurus.config.ts` o convenzione filesystem — nessun `blog_dir` da configurare. + +- **Sprint Editoriale B — Manifesto Aerospaziale**: Il linguaggio dei vincoli sostituisce + gli aggettivi di marketing nei README di tutti e quattro i repository dell'ecosistema + (`zenzic`, `zenzic-doc`, `zenzic-action`, `structum`). Tagline riscritte come invarianti + deterministici: + - `zenzic` — "Audit deterministico di strutture documentali con tracciabilità + bidirezionale. Ogni finding mappa su un file sorgente e un numero di riga. Ogni URL + ha un'origine fisica. Zero stato globale." + - `zenzic-doc` — Dichiarazione di evidenza conformità: `zenzic check all --strict` + termina con 0 e zero finding ad ogni push. + - `zenzic-action` — Paragrafo contratto exit code (exit 2 e 3 non sono mai + sopprimibili al confine di enforcement). + - `structum` — "Legge, non esegue. Solo AST. Niente `eval()`, niente import dinamico, + niente subprocess." + Preambolo Engineering Ledger riformulato sui principi NASA Power of 10 Rules 1/4 + (flusso di controllo deterministico, zero stato globale) e Rule 2 (ban subprocess + applicato da ruff). 
+ +- **Sprint Editoriale C — Mostra, Non Descrivere**: Eliminazione chirurgica degli + aggettivi non quantificabili in `docs/how-to/`, `docs/explanation/` e + `docs/tutorials/` (EN+IT). Sei sostituzioni EN + sei mirror IT in `explanation/` e + `how-to/`; quattro sostituzioni tutorial (EN+IT) in `tutorials/`: + - `why-zenzic.mdx` — etichette bullet di marketing sostituite con invocazioni + strumento fattuali (requisiti esatti `bash ≥5`, `python3 ≥3.11`, invocazione + `astral-sh/setup-uv`). + - `safe-harbor.mdx` — "trivialmente testabili" sostituito con contratto di + comportamento deterministico (input identici, output identici, nessuno stato + condiviso). + - `install.mdx` — titolo sezione "Lean e Agnostico per Design" → "Solo analisi + statica — nessun runtime di build richiesto"; prosa "rendendolo ideale per" + rimossa. + - `configure-ci-cd.mdx` — "strumento potente ma irreversibile" → "strumento + irreversibile". + - `migrate-engines.mdx` — metafora "custode" sostituita con linguaggio contrattuale; + blocco tip riscritto in voce imperativa. + - `tutorials/first-audit.mdx` — Sprint B: prova di tracciabilità aggiunta inline: + link rotto → finding `Z101` esatto con file, riga e codice. Sprint A: nota + blog auto-discovery nello Step 1. Sprint C: `:::note[Rottura Deliberata — La + Prova della Tracciabilità]` illustra l'output deterministico. + - `tutorials/examples.mdx` — paragrafo di apertura riscritto: "Clonalo. Esegui + `uvx zenzic check all`. Ogni esempio isola una funzionalità." 
+ +- **Fase Tecnica 1 — SentinelOutput API v2**: `src/components/SentinelOutput.tsx` + esteso con un discriminante `Status` domain-specific che sostituisce `variant`: + + | `status` | `variant` interno | Exit | Significato | + |-------------|------------------|------|--------------------------------| + | `success` | `clean` | 0 | Integrità verificata | + | `error` | `findings` | 1 | Violazione strutturale/link | + | `warning` | `findings` | 0–1 | Anomalia non bloccante | + | `inspect` | `inspect` | 0 | Modalità audit/debug | + | `breach` | `breach` | 2 | Perimetro sicurezza compromesso| + + Nuove props: `status`, `code` (stringa Zxxx), `exitCode` (0|1|2|3), + `traceability` (boolean). `variant` preservato per compatibilità con `console.warn` + di deprecazione in development mode. Guardia di tracciabilità: `status="error"|"warning"` + senza `code` emette warning — un finding senza codice Zxxx viola la Tracciabilità + Assoluta. `tsc --noEmit` pulito. + +- **Fase Tecnica 2 — VSMVisualizer**: Nuovo componente + `src/components/VSMVisualizer.tsx` registrato globalmente in + `src/theme/MDXComponents.js`. Renderizza un albero gerarchico espandibile in-place + del Virtual Site Map distinguendo: + - **Nodi Fisici** (📄) — file `.md`/`.mdx` reali su disco + - **Route Virtuali** (🏷 tag, 📑 paginazione, 👤 autore) — route inferite dai + metadati frontmatter, con disclosure Reverse-Mapping in-place dei `source_files`. + - **Violazione Reverse-Mapping** — nodo virtuale con `source_files = ∅` renderizzato + con marcatore ⚠ (non dovrebbe mai apparire in un audit che passa). + Props: `roots: string[]` (obbligatorio), `virtual?: boolean`, `nodes?: VSMNode[]` + (override per alberi personalizzati). `tsc --noEmit` pulito. + +- **Fase Tecnica 3 — Migrazione finding-codes.mdx**: Tutti gli 8 utilizzi + `` in `docs/reference/finding-codes.mdx` (EN+IT) migrati dalla + legacy prop `variant=` al contratto Fase 1 (`status`, `code`, `exitCode`). 
Ogni + codice di finding nell'enciclopedia di riferimento è ora collegato al suo + identificatore Zxxx tramite `code=` — Tracciabilità Assoluta dalla prosa al + componente al codice di finding. + + + Dopo l'epurazione EPOCH 7a.1 nel Core, il blocco TOML + `[link_validation].absolute_path_allowlist` è **rimosso** da + `zenzic-doc/zenzic.toml`. I prefissi URL multi-instance di Docusaurus + (`/docs/`, `/developers/`, ogni ulteriore istanza content-docs) vengono ora + auto-rilevati da `DocusaurusAdapter.get_absolute_url_prefixes()` tramite + parsing statico di `docusaurus.config.ts` più un'euristica filesystem su + `i18n//docusaurus-plugin-content-docs-/`. Nessuna duplicazione + TOML del routing Docusaurus richiesta. **Supersessione documentale** — + ADR-0011 ("Cross-Instance Allowlist"), + `how-to/manage-cross-site-links.mdx` e la sezione `[link_validation]` di + `reference/configuration.mdx` descrivono una superficie di configurazione + obsoleta e sono in attesa di refactor in uno sprint documentale successivo. + La voce Z108 STALE_ALLOWLIST_ENTRY in + `developers/governance/technical-debt.mdx` è ora chiusa-per-rimozione: non + esiste più alcuna allowlist che possa diventare stantia. + +- **EPOCH 7a — Documentazione Multi-Root Discovery (dual-track)**: Due nuove + superfici documentali consegnano la narrativa user-facing e developer-facing + della Multi-Root Discovery del Core, che rimuove la storica frontiera di + `docs_dir` nel VSM. + - **User track** — `docs/reference/engines.mdx` (EN+IT) acquisisce una + sezione `### Blog auto-discovery {#docusaurus-blog}` che celebra il + risultato pratico e documenta le tre regole di rilevamento (blocco di + config, fallback per convenzione, opt-out `blog: false`) senza divulgare + dettagli implementativi. 
+ - **Developer track** — `docs/explanation/discovery.mdx` (EN+IT) acquisisce + una sezione `## Multi-Root Discovery (EPOCH 7a)` con la dataclass + `ContentRoot`, l'hook adapter sigillato da `hasattr()`, la cooperazione + pipeline a quattro stadi (Discovery → VSM → Validator → Scanner), il pass + di auto-discovery Zero Subprocess, l'invariante Reverse-Mapping e la + matrice di supporto motori. + - La separazione dual-track è rigorosa — nessun gergo implementativo trapela + nello User track; nessun linguaggio celebrativo trapela nel Developer + track. + - La parità linguistica è imposta tra EN e IT in entrambi i track + (`Z907 I18N_PARITY` clean). +- **Ristrutturazione Architettura Diátaxis**: Architettura informativa + ricostruita attorno al [framework Diátaxis](https://diataxis.fr) — `tutorials/`, + `how-to/`, `reference/`, `explanation/`. Sidebar autogenerata dal filesystem. +- **Zenzic Blog**: `/blog/` inaugurato come log ingegneristico ufficiale di + Zenzic. Sei articoli fondativi coprono lo sprint v0.6.x, il post-mortem del + AI-Driven Siege e la dichiarazione di Quartz Maturity v0.7.0. Convenzione a + due track: 🛡️ **Saga** (long-form) e 📜 **Log** (mirror sintetico delle + patch-notes). +- **Brand System**: Brand package formale consegnato in + `static/assets/brand/brand-kit.zip` — icone SVG, esportazioni PNG, template + social card, pagina HTML di riferimento brand. +- **Parità Bilingue (EN + IT)**: `i18n/it/` rispecchia `docs/` esattamente. + `npm run build` produce entrambi i locale con zero broken link. +- **D117 — Supporto protocollo `pathname:`**: Escape hatch engine-agnostic per + i link `pathname:///` di Docusaurus documentato in `reference/engines.mdx` + (EN+IT). +- **Pre-commit Gate & REUSE 3.3 Compliance**: Pipeline completamente operativa + con 207/207 file conformi. Nuove ricette `just`: `preflight`, `reuse`, + `sentinel`. 
+- **D118 — Coerenza Assoluta dei Titoli**: Titoli della pagina lista del blog + bloccati attraverso gli stati `:visited` / `:active` / `:hover`. +- **SentinelPalette CLI × Web Color Bridge**: Sei custom property CSS in + `src/css/custom.css` rispecchiano la palette semantica della CLI sui mode + light e dark (calibrato WCAG AA). +- **Asset Integrity & Static Consolidation**: `static/` riorganizzato attorno + a una singola gerarchia canonica (`assets/brand`, `assets/favicon`, + `assets/social`, `css`, `img`). +- **Cross-Instance Routing — promozione Developer Area**: + `/docs/community/developers/*` → `/developers/*` (istanza Docusaurus + top-level dedicata). +- **ADR-0011 "Cross-Instance Allowlist"** (EN+IT) — formalizza la + configurazione `absolute_path_allowlist` come *contratto di fiducia* tra + istanze Docusaurus. + +#### Modificato + +- Tutti i percorsi precedenti sotto `docs/usage/` e `docs/guides/` riorganizzati + nei quadranti Diátaxis. Gli slug della sidebar sono ora filesystem-driven — + nessuna divergenza di slug ammessa. +- **Hardening del rendering metadata autore**: uno swizzle mirato in + `src/theme/BlogPostItem/Header/Authors` ora restituisce `null` quando non + sono dichiarati autori, rimuovendo rumore da placeholder/fallback e + omettendo il blocco in modo strutturale dal DOM. +- **Verifica CI docs cross-platform**: `.github/workflows/ci.yml` ora esegue + su matrice Ubuntu/Windows (`fail-fast: false`) preservando il checkout di + parità branch del Core (`_zenzic_core`) e l'esecuzione unificata di + `just verify`. +- `static/brand/` (duplicato legacy) eliminato; il percorso canonico è + `static/assets/brand/`. +- `static/assets/stylesheets/` rinominato in `static/css/`. +- `brand-kit.zip` spostato in `static/assets/brand/`. +- Percorso del logo navbar aggiornato in `docusaurus.config.ts`. +- `scripts/build-assets.js` e `scripts/bump-version.sh` aggiornati — niente + più pattern mirror-copy. 
+- **Igiene workspace ESLint**: `.eslintignore` in root ora esclude artefatti + di checkout CI (`_zenzic_core/`) e virtual environment locali (`.venv/`, + `venv/`). +- **Manutenzione dipendenze (ZRT-008)**: consolidati 8 Dependabot PR — Docusaurus + 3.10.0 → 3.10.1 (`@docusaurus/core`, `faster`, `preset-classic`, + `module-type-aliases`, `tsconfig`, `types`; patch: bugfix bundler webpackbar), + `lucide-react` 1.8.0 → 1.14.0 (nuove icone), `postcss` → 8.5.14 + (sicurezza: XSS tramite `` non escaped in scenari non-bundler; fix + regressione sintassi custom). `npm run build` (EN + IT) pulito dopo + l'aggiornamento. + +#### Rimosso + +- **Percorsi URL legacy**: `/docs/community/developers/*`, + `/docs/community/governance/*`, `/docs/community/contribute/*` rimossi senza + compatibility shim. I bookmark esterni vanno aggiornati. +- **Contenuti probabilistici / AI-architecture** epurati dal blog Zenzic. La + pagina `Adversarial Stress-Testing Protocol` è l'unica eccezione e inquadra + l'AI esplicitamente come "punching bag", mai come co-autore. + +#### Gate di verifica + +| Gate | Risultato | +|------|-----------| +| `zenzic check all` sul repo docs | ✅ Exit 0 | +| `npm run build` (EN + IT) | ✅ Zero errori broken-link | +| TypeScript `tsc --noEmit` | ✅ Zero errori | +| Markdownlint (tutti gli MDX) | ✅ Zero warning | +| REUSE lint | ✅ 207/207 conformi | +| Pre-commit (tutti gli hook) | ✅ Tutti passati | + +--- + +**Per le release notes del motore, vedere +[Zenzic Core CHANGELOG](https://github.com/PythonWoods/zenzic/blob/main/CHANGELOG.md).** diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..331fcc4 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,232 @@ + + + +# Changelog + +All notable changes to the Zenzic documentation portal (`zenzic-doc`) are documented here. +Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). +Versions track the Zenzic Core release line under the Branch Parity Rule. 
+ +--- + +## [0.7.0] — 2026-05-07 — Quartz Maturity (Stable) + +> **Authoritative source:** [zenzic.dev](https://zenzic.dev). This file is the +> machine-readable counterpart of [`RELEASE.md`](RELEASE.md) and follows the same +> Branch Parity Rule as the Zenzic Core changelog. + +#### Added + +- **Editorial Sprint A — Zero-Config Sovereignty**: Tutorial `docs/tutorials/first-audit.mdx` + (EN+IT) updated to document blog auto-discovery without manual configuration. + `uvx zenzic check all` now covers blog posts in scope by default; the tutorial + demonstrates this explicitly with a blog `ContentRoot` note: `blog/` detected via + `docusaurus.config.ts` or filesystem convention — no `blog_dir` to configure. + +- **Editorial Sprint B — Aerospace Manifesto**: Constraint language replaces marketing + adjectives across all four ecosystem READMEs (`zenzic`, `zenzic-doc`, `zenzic-action`, + `structum`). Taglines rewritten as deterministic invariants: + - `zenzic` — "Deterministic audit of documentation structures with bidirectional + traceability. Every finding maps to a source file and a line number. Every URL has + a physical origin. Zero global state." + - `zenzic-doc` — Compliance evidence statement: `zenzic check all --strict` exits 0 + with zero findings on every push. + - `zenzic-action` — Exit code contract paragraph (exits 2 and 3 are never + suppressible at the enforcement boundary). + - `structum` — "Reads, never executes. AST-only. No `eval()`, no dynamic import, + no subprocess." + Engineering Ledger preamble re-framed on NASA Power of 10 Rules 1/4 (deterministic + control flow, zero global state) and Rule 2 (subprocess ban enforced by ruff). + +- **Editorial Sprint C — Show, Don't Tell**: Surgical eradication of non-quantifiable + adjectives across `docs/how-to/`, `docs/explanation/`, and `docs/tutorials/` (EN+IT). 
+ Six EN replacements + six IT mirrors in `docs/explanation/` and `docs/how-to/`; + four tutorial replacements (EN+IT) in `docs/tutorials/`: + - `why-zenzic.mdx` — marketing bullet labels replaced with factual tool invocations + (exact `bash ≥5`, `python3 ≥3.11` requirements, `astral-sh/setup-uv` invocation). + - `safe-harbor.mdx` — "trivially testable" replaced with deterministic behaviour + contract (identical inputs, identical outputs, no shared state). + - `install.mdx` — section title "Lean & Agnostic by Design" → "Static analysis only + — no build runtime required"; "making it ideal for" prose removed. + - `configure-ci-cd.mdx` — "powerful but irreversible" → "irreversible". + - `migrate-engines.mdx` — "custodian" metaphor replaced with contract language; + tip block rewritten in imperative voice. + - `tutorials/first-audit.mdx` — Sprint B traceability proof added inline: broken + link → exact `Z101` finding with file, line, and code. Sprint A: blog + auto-discovery noted in Step 1. Sprint C: `:::note[Deliberate Failure — The + Traceability Proof]` illustrates deterministic output. + - `tutorials/examples.mdx` — opening paragraph rewritten: "Clone it. Run + `uvx zenzic check all`. Each example isolates one feature." + +- **Technical Phase 1 — SentinelOutput API v2**: `src/components/SentinelOutput.tsx` + extended with a domain-specific `Status` discriminant that supersedes `variant`: + + | `status` | Internal `variant` | Exit | Meaning | + |-------------|-------------------|------|--------------------------------| + | `success` | `clean` | 0 | Integrity verified | + | `error` | `findings` | 1 | Structural/link violation | + | `warning` | `findings` | 0–1 | Non-blocking anomaly | + | `inspect` | `inspect` | 0 | Audit/debug mode | + | `breach` | `breach` | 2 | Security perimeter compromised | + + New props: `status`, `code` (Zxxx string), `exitCode` (0|1|2|3), `traceability` + (boolean). 
`variant` preserved for backward compatibility with a `console.warn` + deprecation notice in development mode. Traceability guard: `status="error"|"warning"` + without `code` emits a warning — a finding without a Zxxx code violates Absolute + Traceability. `tsc --noEmit` clean. + +- **Technical Phase 2 — VSMVisualizer**: New component `src/components/VSMVisualizer.tsx` + registered globally in `src/theme/MDXComponents.js`. Renders a hierarchical + in-place-expandable tree of the Virtual Site Map distinguishing: + - **Physical Nodes** (📄) — real `.md`/`.mdx` files on disk + - **Virtual Routes** (🏷 tag, 📑 pagination, 👤 author) — routes inferred from + frontmatter metadata, with in-place Reverse-Mapping disclosure of `source_files`. + - **Reverse-Mapping violation** — virtual node with `source_files = ∅` rendered with + ⚠ marker (should never appear in a passing audit). + Props: `roots: string[]` (required), `virtual?: boolean`, `nodes?: VSMNode[]` + (override for custom trees). `tsc --noEmit` clean. + +- **Technical Phase 3 — finding-codes.mdx Migration**: All 8 `` + usages in `docs/reference/finding-codes.mdx` (EN+IT) migrated from legacy `variant=` + to the Phase 1 contract (`status`, `code`, `exitCode`). Every finding code in the + reference encyclopedia is now linked to its Zxxx identifier via `code=` — Absolute + Traceability from prose to component to finding code. + + + Following the Core's EPOCH 7a.1 purge, the `[link_validation].absolute_path_allowlist` + TOML block is **gone** from `zenzic-doc/zenzic.toml`. Multi-instance Docusaurus + plugin URL prefixes (`/docs/`, `/developers/`, every additional content-docs + instance) are now auto-detected by `DocusaurusAdapter.get_absolute_url_prefixes()` + via static parsing of `docusaurus.config.ts` plus a filesystem heuristic over + `i18n//docusaurus-plugin-content-docs-/`. Zero TOML duplication of + Docusaurus routing required. 
**Documentation supersession** — ADR-0011 + ("Cross-Instance Allowlist"), `how-to/manage-cross-site-links.mdx` and the + `[link_validation]` section of `reference/configuration.mdx` describe an + obsolete configuration surface and are pending refactor in a follow-up + documentation sprint. The Z108 STALE_ALLOWLIST_ENTRY entry in + `developers/governance/technical-debt.mdx` is now closed-by-removal: there is + no allowlist left to go stale. + +- **EPOCH 7a — Multi-Root Discovery documentation (dual-track)**: Two new doc + surfaces ship the user-facing and developer-facing narrative for the Core's + Multi-Root Discovery foundation that lifts the historical `docs_dir` boundary + in the VSM. + - **User track** — `docs/reference/engines.mdx` (EN+IT) gains an + `### Blog auto-discovery {#docusaurus-blog}` section that celebrates the + practical outcome and documents the three detection rules (config block, + convention fallback, `blog: false` opt-out) without leaking implementation + details. + - **Developer track** — `docs/explanation/discovery.mdx` (EN+IT) gains a + `## Multi-Root Discovery (EPOCH 7a)` section with the `ContentRoot` dataclass, + the `hasattr()`-gated adapter hook, the four-stage pipeline cooperation + (Discovery → VSM → Validator → Scanner), the Zero Subprocess auto-discovery + pass, the Reverse-Mapping invariant, and the engine support matrix. + - The dual-track separation is strict — no implementation jargon leaks into + the User track; no celebratory language leaks into the Developer track. + - Linguistic parity is enforced across EN and IT in both tracks (`Z907 I18N_PARITY` + clean). +- **Diátaxis Architecture Restructure**: Information architecture rebuilt around + the [Diátaxis framework](https://diataxis.fr) — `tutorials/`, `how-to/`, + `reference/`, `explanation/`. Sidebar autogenerated from filesystem. +- **Zenzic Blog**: `/blog/` inaugurated as the official engineering log of Zenzic. 
+ Six founding articles cover the v0.6.x sprint, the AI-Driven Siege postmortem, + and the v0.7.0 Quartz Maturity declaration. Two-track convention: + 🛡️ **Saga** (long-form) and 📜 **Log** (terse patch-notes mirror). +- **Brand System**: Formal brand package shipped at + `static/assets/brand/brand-kit.zip` — SVG icons, PNG exports, social card + templates, brand HTML reference page. +- **Bilingual Parity (EN + IT)**: `i18n/it/` mirrors `docs/` exactly. + `npm run build` produces both locales with zero broken links. +- **Sovereign Override 404 Shield KB** (`developers/how-to/sovereign-override-404-shield.mdx`, + EN + IT mirror): Complete lifecycle guide for the `ZENZIC_EXTRA_ARGS` shield pattern — + when to apply, how to propagate through `justfile` / `pre-commit` / CI env blocks, + and when to retire the exclusion after a URL becomes reachable. + Italian mirror at `i18n/it/docusaurus-plugin-content-docs-developers/current/how-to/` + for Z907 parity. +- **CONTRIBUTING.md — Sovereign Override section**: Emergency protocol and rationale + for the 404 shield added under "Sovereign Override (404 Shield)", linking contributors + to the MDX guide for full architecture context. +- **D117 — `pathname:` protocol support**: Engine-agnostic escape hatch for + Docusaurus `pathname:///` links documented in `reference/engines.mdx` (EN+IT). +- **Pre-commit Gate & REUSE 3.3 Compliance**: Full pipeline operational with + 207/207 files compliant. New `just` recipes: `preflight`, `reuse`, `sentinel`. +- **D118 — Absolute Title Consistency**: Blog list page titles locked across + `:visited` / `:active` / `:hover` states. +- **SentinelPalette CLI × Web Color Bridge**: Six CSS custom properties in + `src/css/custom.css` mirror the CLI semantic palette across light and dark + modes (WCAG AA-calibrated). +- **Asset Integrity & Static Consolidation**: `static/` reorganised around a + single canonical hierarchy (`assets/brand`, `assets/favicon`, `assets/social`, + `css`, `img`). 
+- **Cross-Instance Routing — Developer Area promotion**: `/docs/community/developers/*` + → `/developers/*` (its own top-level Docusaurus instance). +- **ADR-0011 "Cross-Instance Allowlist"** (EN+IT) — formalises the + `absolute_path_allowlist` configuration as a *trust contract* between + Docusaurus instances. + +#### Changed + +- All previous paths under `docs/usage/` and `docs/guides/` reorganised under + the Diátaxis quadrants. Sidebar slugs are now filesystem-driven — no slug + divergence permitted. +- **Author metadata rendering hardening**: a targeted swizzle at + `src/theme/BlogPostItem/Header/Authors` now returns `null` when no authors + are declared, removing placeholder/fallback noise and omitting the block + structurally from the DOM. +- **Cross-platform docs CI verification**: `.github/workflows/ci.yml` now runs + on an Ubuntu/Windows matrix (`fail-fast: false`) while preserving Core branch + parity checkout (`_zenzic_core`) and unified `just verify` execution. +- `static/brand/` (legacy duplicate) deleted; canonical path is + `static/assets/brand/`. +- `static/assets/stylesheets/` renamed to `static/css/`. +- `brand-kit.zip` moved into `static/assets/brand/`. +- Navbar logo path updated in `docusaurus.config.ts`. +- `scripts/build-assets.js` and `scripts/bump-version.sh` updated — no more + mirror-copy pattern. +- **ESLint workspace hygiene**: `.eslintignore` (ESLint v8 format) removed; + CI checkout artifacts (`_zenzic_core/`) and local virtual environments + (`.venv/`, `venv/`) migrated to the `ignores` array in `eslint.config.mjs` + (ESLint v9 flat config). Eliminates false-positive lint errors on vendored + and generated files. +- **Pre-commit 404 shield parity**: `zenzic-check` hook entry in + `.pre-commit-config.yaml` replaced inline `bash -c` invocation with + `bash scripts/pre-commit-zenzic.sh`, propagating `ZENZIC_EXTRA_ARGS` + correctly during local pre-commit runs. 
Closes the silent bypass where the + Sovereign Override shield was active in CI but not locally. +- **ZENZIC_EXTRA_ARGS CI propagation**: `.github/workflows/ci.yml` injects + five `--exclude-url` entries for known pre-launch transient URLs + (`zenzic.dev/blog/`, `zenzic.dev/docs/explanation/structural-integrity`, + `zenzic.dev/developers/`, `zenzic.dev/it/developers/`, and the + `v0.7.0` GitHub release tag). `PYTHONUTF8: '1'` added for Windows encoding + determinism. +- **Dependency maintenance (ZRT-008)**: consolidated 8 Dependabot PRs — Docusaurus + 3.10.0 → 3.10.1 (`@docusaurus/core`, `faster`, `preset-classic`, + `module-type-aliases`, `tsconfig`, `types`; patch: webpackbar bundler fix), + `lucide-react` 1.8.0 → 1.14.0 (new icons), `postcss` → 8.5.14 + (security: XSS via unescaped `` in non-bundler cases; custom syntax + regression fix). `npm run build` (EN + IT) clean after update. + +#### Removed + +- **Legacy URL paths**: `/docs/community/developers/*`, `/docs/community/governance/*`, + `/docs/community/contribute/*` are gone with no compatibility shim. External + bookmarks must be updated. +- **Probabilistic / AI-architecture content** purged from the Zenzic blog. + The `Adversarial Stress-Testing Protocol` page is the single exception and + frames AI explicitly as "punching bag", never as co-author. 
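
The `status` table in the "SentinelOutput API v2" entry under Added, together with its traceability guard, can be sketched as plain data plus one pure function. This is a minimal sketch and not the component's actual source: `StatusInfo`, `STATUS_TABLE`, and `checkTraceability` are hypothetical names, and the assumption that a finding code is `Z` followed by three digits is inferred only from the codes quoted in this changelog (`Z101`, `Z108`, `Z907`). Exit code 3 appears in the `exitCode` prop type but not in the table, so it is left out here.

```typescript
// Hypothetical sketch of the status -> variant mapping from the changelog table.
// StatusInfo, STATUS_TABLE and checkTraceability are illustrative names only.
type Status = "success" | "error" | "warning" | "inspect" | "breach";
type Variant = "clean" | "findings" | "inspect" | "breach";

interface StatusInfo {
  variant: Variant;
  exitCodes: number[]; // exit codes the table lists for this status
}

const STATUS_TABLE: Record<Status, StatusInfo> = {
  success: { variant: "clean", exitCodes: [0] },
  error: { variant: "findings", exitCodes: [1] },
  warning: { variant: "findings", exitCodes: [0, 1] },
  inspect: { variant: "inspect", exitCodes: [0] },
  breach: { variant: "breach", exitCodes: [2] },
};

// Traceability guard: an error or warning without a finding code is flagged.
// Assumption: a valid code is "Z" followed by exactly three digits.
function checkTraceability(status: Status, code?: string): string | null {
  const needsCode = status === "error" || status === "warning";
  const hasCode = code !== undefined && /^Z\d{3}$/.test(code);
  return needsCode && !hasCode
    ? `status="${status}" requires a Zxxx finding code`
    : null;
}
```

Keeping the table as data makes it straightforward to compare against the rows documented in the changelog, and returning a message instead of logging leaves the choice of `console.warn` to the caller.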
+ +#### Verification gates + +| Gate | Result | +|------|--------| +| `zenzic check all` on docs repo | ✅ Exit 0 | +| `npm run build` (EN + IT) | ✅ Zero broken-link errors | +| TypeScript `tsc --noEmit` | ✅ Zero errors | +| Markdownlint (all MDX) | ✅ Zero warnings | +| REUSE lint | ✅ 207/207 compliant | +| Pre-commit (all hooks) | ✅ All passed | + +--- + +**For the engine release notes, see +[Zenzic Core CHANGELOG](https://github.com/PythonWoods/zenzic/blob/main/CHANGELOG.md).** diff --git a/CITATION.cff b/CITATION.cff new file mode 100644 index 0000000..471e221 --- /dev/null +++ b/CITATION.cff @@ -0,0 +1,30 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 + +cff-version: 1.2.0 +message: "If you use or reference this documentation corpus, please cite it as below." +type: software +authors: + - name: "PythonWoods" + email: "dev@pythonwoods.dev" + website: "https://pythonwoods.dev" +title: "Zenzic Documentation Portal: Engine-Agnostic Safe Harbor — Diátaxis Corpus" +abstract: >- + The official documentation portal for Zenzic, the high-performance engine-agnostic + Safe Harbor for Markdown documentation. This corpus covers the full Diátaxis + architecture: tutorials, how-to guides, reference, and explanation — available in + English and Italian. Inaugurates the Obsidian Journal engineering blog and the + formal Zenzic Brand System. 
+version: 0.7.0 +date-released: 2026-05-07 +url: "https://zenzic.dev" +repository-code: "https://github.com/PythonWoods/zenzic-doc" +license: Apache-2.0 +keywords: + - documentation + - docusaurus + - static-site + - diataxis + - technical-writing + - i18n + - markdown diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md new file mode 100644 index 0000000..e8e3d02 --- /dev/null +++ b/CODE_OF_CONDUCT.md @@ -0,0 +1,86 @@ + + +# Contributor Covenant Code of Conduct + +Zenzic adopts the [Contributor Covenant 2.1](https://www.contributor-covenant.org/version/2/1/code_of_conduct.html) +as its standard for community interaction. We are committed to providing a welcoming, +respectful, and inclusive environment for all. + +## Our Pledge + +We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation. + +We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. 
+ +## Our Standards + +Examples of behavior that contributes to a positive environment for our community include: + +* Demonstrating empathy and kindness toward other people +* Being respectful of differing opinions, viewpoints, and experiences +* Giving and gracefully accepting constructive feedback +* Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience +* Focusing on what is best not just for us as individuals, but for the overall community + +Examples of unacceptable behavior include: + +* The use of sexualized language or imagery, and sexual attention or advances of any kind +* Trolling, insulting or derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or email address, without their explicit permission +* Other conduct which could reasonably be considered inappropriate in a professional setting + +## Enforcement Responsibilities + +Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful. + +Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate. + +## Scope + +This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. 
+ +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at ****. All complaints will be reviewed and investigated promptly and fairly. + +All community leaders are obligated to respect the privacy and security of the reporter of any incident. + +## Enforcement Guidelines + +Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct: + +### 1. Correction + +**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community. +**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested. + +### 2. Warning + +**Community Impact**: A violation through a single incident or series of actions. +**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban. + +### 3. Temporary Ban + +**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior. +**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban. + +### 4. 
Permanent Ban + +**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. +**Consequence**: A permanent ban from any sort of public interaction within the community. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.1. +Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder]. + +[homepage]: https://www.contributor-covenant.org +[Mozilla's code of conduct enforcement ladder]: https://github.com/mozilla/diversity + +--- + +Based in Italy 🇮🇹 | Committed to the craft of documentation engineering. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..00f620b --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,289 @@ + + +# Contributing to zenzic-doc + +Thank you for contributing to the Zenzic documentation portal. +This guide is written for **Technical Writers and Documentation Engineers** — not Python +programmers. If you want to contribute to the Zenzic engine itself, see the +[core repository](https://github.com/PythonWoods/zenzic/blob/main/CONTRIBUTING.md). 
+ +--- + +## Prerequisites + +| Tool | Version | Install | +|------|---------|---------| +| Node.js | 24 or newer | [nodejs.org](https://nodejs.org) | +| npm | 10 or newer | bundled with Node.js 24 | +| just | any | `brew install just` / `cargo install just` | +| uv / uvx | any | `pip install uv` or [docs.astral.sh](https://docs.astral.sh/uv/) | + +Verify your setup: + +```bash +node --version # must be ≥ 24 +npm --version # must be ≥ 10 +just --version +``` + +--- + +## First-Time Setup + +Clone the repository and install dependencies: + +```bash +git clone https://github.com/PythonWoods/zenzic-doc.git +cd zenzic-doc +npm ci +``` + +Install the pre-commit hooks (run once after cloning): + +```bash +uvx pre-commit install # commit-stage: hygiene + typecheck + zenzic sentinel +uvx pre-commit install -t pre-push # pre-push: 🛡️ Final Guard runs `just verify` +``` + +--- + +## Running the Site Locally + +```bash +just start # EN only — fastest for editing +just start-it # IT only — use when editing Italian content +``` + +The dev server reloads automatically when you save a file. +The language switcher is **inactive in dev mode** — use `just serve` after +`just build` to test locale switching. + +--- + +## File Structure + +```text +docs/ ← English source content (all .mdx) + tutorials/ ← Learning-oriented guides + how-to/ ← Task-oriented recipes + reference/ ← Information-oriented reference + explanation/ ← Conceptual background + community/ ← Contributing, FAQ, license, brand-kit +i18n/ + it/ ← Italian translations — mirrors docs/ exactly +blog/ ← Zenzic Blog engineering posts +src/ + components/ ← React components (Icon, Homepage sections) + css/custom.css ← Obsidian visual system (do not edit without CEO approval) +static/ ← Static files served verbatim +``` + +**Rule:** Every file inside `docs/` must be `.mdx`. Never create `.md` files there. + +--- + +## Writing and Editing Content (Diátaxis) + +This portal follows the [Diátaxis framework](https://diataxis.fr). 
Before writing, +identify which quadrant your contribution belongs to: + +| Section | Question it answers | Example | +|---------|---------------------|---------| +| `tutorials/` | "How do I learn X step by step?" | First-time setup walkthrough | +| `how-to/` | "How do I accomplish X?" | How to add badges | +| `reference/` | "What does X do exactly?" | Engine configuration reference | +| `explanation/` | "Why does Zenzic work this way?" | Architecture overview | + +Place your file in the correct section and follow the naming convention: +`verb-noun.mdx` for how-to (e.g. `add-badges.mdx`), `noun.mdx` for reference. + +### Frontmatter (required) + +Every `.mdx` file must begin with: + +```yaml +--- +sidebar_label: Short Label +--- +``` + +**Do not add `slug:` frontmatter.** URLs must mirror the filesystem path exactly +(Slug Law — see [copilot-instructions.md](.github/copilot-instructions.md)). + +### Icons + +Use `` anywhere without per-file imports. +Available names are listed in [`src/components/Icon.tsx`](src/components/Icon.tsx). + +--- + +## Managing Translations (i18n) + +The Italian locale lives in `i18n/it/docusaurus-plugin-content-docs/current/` and +mirrors `docs/` exactly. + +When you **add a new file**: + +1. Create the English version in `docs/`. +2. Create the Italian version in the corresponding `i18n/it/` path. +3. The content of the Italian file must be a faithful translation — not a machine translation without review. + +When you **rename a file**: + +1. Rename in both `docs/` and `i18n/it/`. +2. Run `just build` to confirm no broken links. + +To regenerate translation stubs after structural changes: + +```bash +npm run write-translations +``` + +--- + +## 🚀 Cross-Repo Validation (Branch Parity Rule) + +To ensure consistency between the core engine (**zenzic**) and the documentation (**zenzic-doc**), our CI system enforces the **Rule of Branch Parity**. + +### 🔍 How it works +1. 
**Local Development**: The linter always looks for the core repository in the adjacent folder (`../zenzic`). You are responsible for keeping local branches aligned. +2. **In CI (GitHub Actions)**: The documentation pipeline attempts to clone the core repository by looking for a branch with the **exact same name** as the one being built in the doc repo. +3. **Fallback**: If the mirrored branch is not found in the core repo, the CI will automatically fall back to the `main` branch. + +### 🛠️ Operational Summary for Contributors + +| Scenario | Required Action | CI Behavior | +| :--- | :--- | :--- | +| **Documentation Fix** | Push only to `zenzic-doc` | Validates against core `main`. | +| **New Feature (Synchronized)** | Push to `zenzic` **BEFORE** pushing to `zenzic-doc` | Validates against the exact feature code. | +| **Naming Convention** | Use identical branch names in both repos | Guarantees perfect "Dogfooding". | + +> **Note**: Never push documentation changes that depend on core features not yet present on the remote server (even if on different branches), otherwise the build will fail due to misalignment. + +### 💻 VS Code Multi-Root Workspace Configuration + +Because the repositories are tightly coupled, we recommend managing them through a single **Multi-Root Workspace** in VS Code. + +1. Clone both repositories into the same parent directory. +2. Open VS Code and go to **File > Save Workspace As...**, saving it as `zenzic.code-workspace` in the parent directory. +3. Edit the newly created file like this: + +```json +{ + "folders": [ + { "path": "zenzic" }, + { "path": "zenzic-doc" }, + { "path": "zenzic-action" } + ], + "settings": { + "python.analysis.extraPaths": ["./zenzic/src"], + "files.exclude": { + "**/.venv": true, + "**/_zenzic_core": true + } + } +} +``` + +This allows you to perform global searches across all repositories simultaneously and manage branches from the Source Control panel in a single, unified interface. 
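
The Branch Parity Rule described above (same-name branch first, `main` as fallback) amounts to a one-line decision. The following is a minimal sketch of that decision only. `resolveCoreBranch` is a hypothetical helper, the list of remote core branches is assumed to be already known, and the real CI workflow may implement the lookup differently.

```typescript
// Hypothetical helper: pick the core branch CI should check out
// for a given documentation branch, following the Branch Parity Rule.
function resolveCoreBranch(
  docBranch: string,
  coreBranches: readonly string[],
): string {
  // Use the mirrored branch when it exists in the core repo...
  const mirrored = coreBranches.some((branch) => branch === docBranch);
  // ...otherwise fall back to `main`.
  return mirrored ? docBranch : "main";
}
```

For example, `resolveCoreBranch("fix/typo", ["main"])` yields `"main"`, which matches the "Documentation Fix" row of the table: a docs-only branch is validated against core `main`.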
+ +--- + +## 404 Emergency Protocol (Sovereign Override) + +If Sentinel fails on a pre-launch external URL (HTTP 404), do not disable external checks globally. +Apply a surgical runtime exclusion with `ZENZIC_EXTRA_ARGS`: + +```bash +ZENZIC_EXTRA_ARGS="--exclude-url https://example.com/prelaunch" just verify +``` + +Rules: + +1. Exclude only the exact pre-launch URL(s), never broad domains unless explicitly approved. +2. Keep exclusions in CI runtime env only; do not hardcode them in `zenzic.toml`. +3. Remove each exclusion as soon as the URL is publicly reachable. + +For full architecture and lifecycle policy, see +[Sovereign Override Guide](developers/how-to/sovereign-override-404-shield.mdx). + +--- + +## Before Opening a Pull Request + +Run the full local gate: + +```bash +just verify # markdownlint + lint:ts + typecheck + build +just preflight # all pre-commit hooks (mirrors the CI gate exactly) +``` + +Both must pass with zero errors before you open or update a PR. + +### Pre-commit hooks + +The repository enforces quality automatically on every `git commit`: + +| Hook | What it checks | +|------|----------------| +| trailing-whitespace | No trailing spaces | +| end-of-file-fixer | Files end with a newline | +| check-yaml / check-json / check-toml | Valid structured data | +| TypeScript Typecheck | `tsc --noEmit` must pass | +| Zenzic Sentinel | `zenzic check all` must exit 0 | +| REUSE/SPDX | All files have licence information | + +If a hook fails, fix the reported issue and retry the commit. + +--- + +## Adding a Blog Post + +Blog posts live in `blog/` and use the filename format `YYYY-MM-DD-slug.mdx`. + +Required frontmatter: + +```yaml +--- +slug: your-post-slug +title: The Full Title +authors: [pythonwoods] +tags: [engineering, release] +date: 2026-04-22 +--- +``` + +The author `pythonwoods` is defined in `blog/authors.yml`. 
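
The `YYYY-MM-DD-slug.mdx` filename format can be checked mechanically before a post is committed. The snippet below is a minimal sketch under stated assumptions: `parseBlogFilename` and `BlogFilename` are hypothetical names that are not part of this repository, and the rule that a slug consists of lowercase letters, digits, and single hyphens is an assumption rather than a documented requirement.

```typescript
// Hypothetical validator for the blog filename convention YYYY-MM-DD-slug.mdx.
// Assumption: slugs use lowercase letters, digits and single hyphens.
const BLOG_FILENAME = /^(\d{4})-(\d{2})-(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)\.mdx$/;

interface BlogFilename {
  date: string; // the YYYY-MM-DD prefix
  slug: string; // the part between the date and ".mdx"
}

function parseBlogFilename(name: string): BlogFilename | null {
  const match = BLOG_FILENAME.exec(name);
  if (match === null) return null;

  const month = Number(match[2]);
  const day = Number(match[3]);
  // Coarse range check only; it does not validate days per month.
  if (month < 1 || month > 12 || day < 1 || day > 31) return null;

  return { date: `${match[1]}-${match[2]}-${match[3]}`, slug: match[4] };
}
```

A file named `2026-04-22-your-post-slug.mdx` parses to the date `2026-04-22` and the slug `your-post-slug`, while a `.md` extension or an out-of-range month returns `null`.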
+ +--- + +## REUSE / Licence Compliance + +Every file in this repository must carry `SPDX-FileCopyrightText` and +`SPDX-License-Identifier` metadata. For most files this is handled automatically +via glob annotations in `REUSE.toml`. + +If you add a new file type or directory not covered by existing globs, add an +annotation to `REUSE.toml` before committing. The `reuse lint` hook will catch +any gaps. + +Check compliance manually: + +```bash +just reuse +``` + +--- + +## Code of Conduct + +All contributors are expected to follow the +[Contributor Covenant Code of Conduct](CODE_OF_CONDUCT.md). +Report violations to `dev@pythonwoods.dev`. + +--- + +*zenzic-doc is developed by [PythonWoods](https://pythonwoods.dev) · Apache-2.0* diff --git a/README.it.md b/README.it.md new file mode 100644 index 0000000..8c11fc6 --- /dev/null +++ b/README.it.md @@ -0,0 +1,389 @@ +
+ + + + Zenzic + +
+ +# Guida per Sviluppatori zenzic-doc + +[![Zenzic Core](https://img.shields.io/badge/Zenzic_Core-v0.7.0-4f46e5)](https://github.com/PythonWoods/zenzic) +[![Docs CI](https://github.com/PythonWoods/zenzic-doc/actions/workflows/ci.yml/badge.svg)](https://github.com/PythonWoods/zenzic-doc/actions/workflows/ci.yml) +[![License](https://img.shields.io/badge/license-Apache--2.0-0d9488?style=flat-square)](LICENSE) +[![REUSE status](https://api.reuse.software/badge/github.com/PythonWoods/zenzic-doc)](https://api.reuse.software/info/github.com/PythonWoods/zenzic-doc) +[![Documentation: Diátaxis](https://img.shields.io/badge/Docs-Di%C3%A1taxis-brightgreen?style=flat-square)](https://diataxis.fr/) +[![4-Gates: Sentinel Seal](https://img.shields.io/badge/4--Gates-Sentinel%20Seal-10b981?style=flat-square)](https://zenzic.dev/it/developers/explanation/adr-vault) +[![REUSE 3.x compliant](https://img.shields.io/badge/REUSE-3.x%20compliant-0d9488?style=flat-square)](https://reuse.software/) + +> **Questa documentazione è strettamente allineata a Zenzic v0.7.0 "Quarzo".** +> Se la versione del core cambia, esegui `just bump NEW_VERSION` per mantenere +> sincronizzati tutti i riferimenti. + +Questo repository contiene il sito di documentazione Docusaurus per Zenzic. + +> **Gate di conformità.** Questo repository è la prova vivente della disciplina dell'auditor: +> `zenzic check all --strict` esce con 0 e zero finding ad ogni push. +> La documentazione dello strumento che applica la tracciabilità è essa stessa tracciabile. +> Un link rotto in questi docs invaliderebbe l'intera affermazione di correttezza. + +Questa guida è scritta sia per i maintainer esperti sia per chi contribuisce per +la prima volta. Se sei nuovo, segui le sezioni in ordine. + +--- + +## 📖 Mappa della Documentazione — La Promessa di Quarzo + +La documentazione di Zenzic è distribuita come **due istanze Docusaurus separate** +sotto lo stesso dominio. 
Ognuna ha la propria sidebar, il proprio indice di +ricerca e il proprio pubblico — mai mischiati. + +```text +zenzic.dev/ +├── docs/ → Area Utente — installazione, configurazione, CI/CD, codici +├── developers/ → Area Dev — plugin, adapter, ADR, ledger del debito tecnico +├── blog/ → Note di rilascio e post-mortem ingegneristici +└── community/ → Brand kit, FAQ, governance +``` + +**La Promessa di Quarzo.** Due istanze, una Sentinella. La separazione è imposta +dall'[ADR 011: Cross-Instance Allowlist](https://zenzic.dev/it/developers/explanation/adr-cross-instance-allowlist) — +ogni link che attraversa il confine è un contratto documentato, mai una +soppressione silenziosa. Consulta il +[Ledger del Debito Tecnico](https://zenzic.dev/it/developers/governance/technical-debt) per ciò che abbiamo +rinviato e perché. + +| Sei un... | Inizia da qui | +| :--- | :--- | +| 👤 Utente che legge la documentazione | [Guida Utente](https://zenzic.dev/it/docs/) | +| 🔧 Contributor / autore docs | [Portale Sviluppatori](https://zenzic.dev/it/developers/) · [ADR Vault](https://zenzic.dev/it/developers/explanation/adr-vault) | +| 🛡️ Security reviewer | [Engineering Ledger](https://zenzic.dev/it/developers/explanation/engineering-ledger) · [SECURITY.md](SECURITY.md) | + +--- + +## 1) Prerequisiti + +- Node.js 24 o superiore +- npm 10 o superiore +- Opzionale: [just](https://github.com/casey/just) per eseguire comandi brevi e memorizzabili + +## 2) Primo Setup (per nuovi collaboratori) + +Esegui questo comando una volta dopo aver clonato il repository: + +```bash +npm ci +``` + +Cosa fa: + +- Installa le dipendenze esattamente come definite in `package-lock.json`. +- Mantiene il tuo ambiente riproducibile con la CI. + +Alternativa con just: + +```bash +just setup +``` + +## 3) Avvia il sito docs in locale + +```bash +npm run start +``` + +Cosa fa: + +- Avvia un server di sviluppo locale. +- Ricarica automaticamente le pagine quando i file cambiano. 
+ +Alternativa con just: + +```bash +just start +``` + +## 4) Workflow quotidiano comune + +Quando modifichi documentazione o componenti, questo è il flusso più sicuro: + +```bash +just start +just verify +``` + +Cosa fa `just verify`: + +- Esegue i controlli TypeScript. +- Costruisce il sito di produzione esattamente come si aspetta la CI. + +## 5) Tutti i comandi spiegati + +### Comandi npm + +| Comando | Quando usarlo | Cosa fa | +| --- | --- | --- | +| `npm ci` | Primo setup, reinstallazione pulita, parità CI | Installa le dipendenze dal lockfile con versioni deterministiche | +| `npm run start` | Durante lo sviluppo attivo | Avvia il server locale con live reload | +| `npm run build` | Prima di una PR, prima di una release | Produce il sito statico in `build/` | +| `npm run serve` | Dopo una build | Serve `build/` localmente per visualizzare l'output di produzione | +| `npm run lint:md` | Prima della PR, dopo modifiche docs | Lint di stile e formattazione Markdown/MDX | +| `npm run lint:ts` | Prima della PR, dopo modifiche React/TS | Lint dei sorgenti TypeScript/React | +| `npm run typecheck` | Prima della PR, quando modifichi file TS/React | Esegue i controlli `tsc` | +| `npm run clear` | Se la cache di Docusaurus causa comportamenti strani | Pulisce gli artefatti in cache | +| `npm run swizzle` | Personalizzazione avanzata del tema | Copia gli internals del tema Docusaurus per la personalizzazione | +| `npm run write-translations` | Modifiche al workflow i18n | Genera lo scaffolding delle traduzioni | +| `npm run write-heading-ids` | Aggiornamenti Markdown estesi | Scrive/aggiorna gli ID degli heading per i file docs | +| `npm run deploy` | Solo per workflow di deployment | Esegue il comando deploy di Docusaurus | +| `npm run docusaurus -- ` | Uso avanzato/diagnostico | Esegue la CLI Docusaurus grezza con argomenti personalizzati | + +### Comandi just + +`just` avvolge i comandi npm con nomi più semplici. 
+ +| Comando | Quando usarlo | Cosa fa | +| --- | --- | --- | +| `just setup` | Primo setup o reset | Esegue `npm ci` | +| `just start` | Editing quotidiano | Esegue il server di sviluppo locale | +| `just serve` | Anteprima della build di produzione | Serve `build/` con switch locale completo (il modo corretto per testare EN↔IT) | +| `just markdownlint` | Dopo aver modificato la documentazione | Esegue i controlli markdown lint | +| `just lint` | Dopo aver modificato sorgenti React/TS | Esegue i controlli lint TypeScript/React | +| `just typecheck` | Prima di aprire/aggiornare la PR | Esegue i controlli TypeScript | +| `just build` | Validazione build | Esegue la build di produzione | +| `just preview` | Valida l'output costruito | Serve il sito già buildato | +| `just verify` | Controllo locale finale raccomandato | Esegue `markdownlint` + `lint` + `typecheck` + `build` | +| `just preflight` | Prima di ogni commit | Esegue tutti gli hook pre-commit su ogni file tracciato | +| `just reuse` | Dopo aver aggiunto/rinominato file | Verifica la conformità della licenza REUSE/SPDX | +| `just sentinel` | Spot-check rapido qualità | Esegue solo la Zenzic Sentinel (più veloce di un preflight completo) | +| `just clean` | Pulizia prima di un'esecuzione fresca | Rimuove `build/` e `.docusaurus/` | +| `just bump VERSION [BADGE]` | Dopo una release del core Zenzic | Aggiorna tutti i riferimenti hardcoded alla versione | + +Puoi elencare tutte le ricette con: + +```bash +just --list +``` + +## 6) Hook pre-commit (Sentinel Guard) + +Questo repository impone gate di qualità prima di ogni commit tramite [pre-commit](https://pre-commit.com/). 
+ +Installa gli hook una volta dopo il clone: + +```bash +pip install pre-commit +pre-commit install +``` + +Ogni `git commit` eseguirà automaticamente: + +| Hook | Cosa controlla | +| --- | --- | +| trailing-whitespace | Nessuno spazio finale (esclude `.mdx`) | +| end-of-file-fixer | I file terminano con una nuova riga | +| check-yaml / check-json / check-toml | Dati strutturati validi | +| check-added-large-files | Previene commit binari accidentali | +| check-merge-conflict | Nessun marcatore di merge irrisolto | +| no-commit-to-branch | Blocca i commit diretti su `main` | +| TypeScript Typecheck | `tsc --noEmit` deve passare | +| Zenzic Sentinel | `zenzic check all` deve uscire con 0 | +| REUSE/SPDX | Conformità della licenza su ogni file | + +Se un hook fallisce, correggi il problema segnalato e ritenta il commit. + +Per eseguire tutti gli hook manualmente senza committare: + +```bash +just preflight +``` + +## 7) Workflow CI/CD + +| Workflow | File | Trigger | Obiettivo | +| --- | --- | --- | --- | +| Docs CI | `.github/workflows/ci.yml` | PR, push su `main`, manuale | Valida install, markdown lint, TS/React lint, typecheck, e build su Node 22 e 24 | +| Dependency Audit | `.github/workflows/npm-audit.yml` | PR, push su `main`, settimanale, manuale | Rileva vulnerabilità di dipendenze ad alta gravità | +| Dependency Review | `.github/workflows/dependency-review.yml` | PR, manuale | Rileva modifiche di dipendenze rischiose introdotte dalle PR | +| CodeQL (opt-in) | `.github/workflows/codeql.yml` | PR, push su `main`, settimanale, manuale | Analisi statica quando `ENABLE_CODEQL=true` | +| Release Docs | `.github/workflows/release-docs.yml` | tag `v*`, manuale | Costruisce, archivia e pubblica l'artefatto versionato | + +## 8) Note di sicurezza + +- `codeql.yml` è opt-in per i repository privati. +- Per abilitare i job CodeQL: abilita Code Security (GHAS), poi imposta la variabile di repository `ENABLE_CODEQL=true`. 
+- `npm-audit.yml` runs a strict high-severity audit with no allowlist.
+
+## 9) Pipeline robustness (current status)
+
+Landing page policy:
+
+- `src/pages/index.tsx` is an intentional monolithic landing page and is excluded from `lint:ts` by explicit policy.
+- It is still covered by `typecheck` and `build`.
+
+Already implemented:
+
+- Concurrency controls (cancel obsolete runs).
+- Job timeouts (avoid stuck runners).
+- Manual `workflow_dispatch` triggers.
+- Node matrix (22 and 24) for compatibility.
+- npm cache in workflows, keyed by `package-lock.json`.
+
+Possible future hardening:
+
+- Pin third-party GitHub Actions by commit SHA.
+- Require branch protection checks after rollout validation.
+
+## 10) Troubleshooting
+
+### Error: `File '@docusaurus/tsconfig' not found`
+
+When this appears in your editor, check `tsconfig.json` and make sure `extends` points to:
+
+```json
+"@docusaurus/tsconfig/tsconfig.json"
+```
+
+Then run:
+
+```bash
+npm run typecheck
+```
+
+### `npm ci` fails
+
+Try this sequence:
+
+```bash
+just clean
+rm -rf node_modules
+npm ci
+```
+
+If it still fails, verify your Node/npm versions against the prerequisites section.
+
+### `npm run build` fails but `start` works
+
+This usually means the production-only checks are stricter.
+
+Run:
+
+```bash
+npm run typecheck
+npm run build
+```
+
+Fix type errors first, then retry the build.
+
+### `/it/docs/index` is 404 on localhost
+
+This is expected when running `npm run start` with the default locale (`en`):
+the dev server serves one locale at a time.
+
+Use one of these commands instead:
+
+```bash
+npm run start:en
+npm run start:it
+```
+
+Notes:
+
+- With `start:it`, open `http://localhost:3000/docs/` (Italian content served at root in dev).
+- If you want prefixed routes like `/it/docs/`, build and serve the production output:
+
+```bash
+npm run build
+npm run serve
+```
+
+### CI fails but local commands pass
+
+Use the exact local equivalent of the CI gate:
+
+```bash
+just verify
+```
+
+If CI still differs, check:
+
+- Node version (CI uses Node 22 and 24)
+- Lockfile changes (`package-lock.json`)
+- Workflow-specific jobs (dependency audit, dependency review)
+
+### The i18n Silent Fallback Trap
+
+**Symptom:** `http://localhost:3000/it/docs/` renders English content even though the
+Italian translation files exist under `i18n/it/`.
+
+**Root cause:** Docusaurus derives the `path` property from `htmlLang` when
+`path` is not set explicitly. If you declare `htmlLang: 'it-IT'`, Docusaurus
+looks for translations in `i18n/it-IT/` — a directory that does not exist. The build
+completes silently with `translate: false` and falls back to the English
+source for all content pages. The UI chrome (navbar, breadcrumbs,
+pagination labels) remains translated because those strings come from
+Docusaurus's own bundled translations, masking the problem.
+
+**Diagnosis:** In `build/it/.docusaurus/i18n.json` (or `.docusaurus/i18n.json` after
+a build), check whether the `it` locale has `"translate": false`. If so, the
+path mismatch is the cause.
+
+**Fix:** Always set `path` explicitly in `localeConfigs`:
+
+```ts
+// docusaurus.config.ts
+i18n: {
+  defaultLocale: 'en',
+  locales: ['en', 'it'],
+  localeConfigs: {
+    en: { label: 'English' },
+    it: { label: 'Italiano', htmlLang: 'it-IT', path: 'it' }, // ← path is mandatory
+  },
+},
+```
+
+**Discovered in:** v0.7.0 release audit (D090 "The i18n Lockdown").
+
+## 11) Pull Request Checklist
+
+Before opening or updating a PR, run this checklist.
+
+- [ ] I installed dependencies with `npm ci` (or `just setup`).
+- [ ] I tested local development with `npm run start` (or `just start`) if UI/docs behavior changed.
+- [ ] I ran `just verify` and it passed.
+- [ ] I reviewed `README.md` sections if I changed commands/workflows.
+- [ ] I updated docs or comments when behavior changed.
+- [ ] My branch contains only intentional changes.
+- [ ] If I touched `i18n` config or locale files: I verified the `/it/` pages show **Italian content** (not just an Italian URL), by checking the page body after `npm run build && npm run serve`.
+
+Minimal command sequence before PR:
+
+```bash
+just setup
+just verify
+```
+
+---
+
+## 📚 The Zenzic Chronicles
+
+Zenzic was born from a technical journey through the fragility of modern
+documentation ecosystems. Discover the philosophy, the security siege, and
+the engineering behind the Sentinel in the
+[**Engineering Chronicles**](https://zenzic.dev/blog/tags/engineering-chronicles) on the official blog.
+
+---
+
+
+ + PythonWoods + +

+ Engineered with precision by PythonWoods in Italy 🇮🇹
+ "Building the Safe Harbor for technical knowledge." +

+

+ Documentazione · + GitHub · + Journal +

+
diff --git a/README.md b/README.md index 496df82..206a686 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,62 @@ +
+ + + + Zenzic + +
+ # zenzic-doc Developer Guide +[![Zenzic Core](https://img.shields.io/badge/Zenzic_Core-v0.7.0-4f46e5)](https://github.com/PythonWoods/zenzic) +[![Docs CI](https://github.com/PythonWoods/zenzic-doc/actions/workflows/ci.yml/badge.svg)](https://github.com/PythonWoods/zenzic-doc/actions/workflows/ci.yml) +[![License](https://img.shields.io/badge/license-Apache--2.0-0d9488?style=flat-square)](LICENSE) +[![REUSE status](https://api.reuse.software/badge/github.com/PythonWoods/zenzic-doc)](https://api.reuse.software/info/github.com/PythonWoods/zenzic-doc) +[![Documentation: Diátaxis](https://img.shields.io/badge/Docs-Di%C3%A1taxis-brightgreen?style=flat-square)](https://diataxis.fr/) +[![4-Gates: Sentinel Seal](https://img.shields.io/badge/4--Gates-Sentinel%20Seal-10b981?style=flat-square)](https://zenzic.dev/developers/explanation/adr-vault) +[![REUSE 3.x compliant](https://img.shields.io/badge/REUSE-3.x%20compliant-0d9488?style=flat-square)](https://reuse.software/) + +> **This documentation is strictly aligned to Zenzic v0.7.0 "Quartz Maturity".** +> If the core version changes, run `just bump NEW_VERSION` to keep all references in sync. + This repository contains the Docusaurus documentation website for Zenzic. -This guide is written for both experienced maintainers and first-time contributors. -If you are new, follow the sections in order. +> **Compliance gate.** This repository is the live evidence of the auditor's own +> discipline: `zenzic check all --strict` exits 0 with zero findings on every push. +> The documentation of the tool that enforces traceability is itself traceable. +> A broken link in these docs would falsify the entire correctness claim. + +--- + +## 📖 Documentation Map — Quartz Promise + +The Zenzic documentation ships as **two separate Docusaurus instances** under one +domain. Each has its own sidebar, search index, and audience — never mixed. 
+ +```text +zenzic.dev/ +├── docs/ → User Area — install, configure, CI/CD, finding codes +├── developers/ → Dev Area — plugins, adapters, ADRs, tech debt ledger +├── blog/ → Release notes & engineering post-mortems +└── community/ → Brand kit, FAQs, governance +``` + +**The Quartz Promise.** Two instances, one Sentinel. The split is enforced by +[ADR 011: Cross-Instance Allowlist](https://zenzic.dev/developers/explanation/adr-cross-instance-allowlist) — every +cross-boundary link is a documented contract, never a silent suppression. +See the [Technical Debt Ledger](https://zenzic.dev/developers/governance/technical-debt) for what we deferred and why. + +| You are a... | Start here | +| :--- | :--- | +| 👤 User reading the docs | [User Guide](https://zenzic.dev/docs/) | +| 🔧 Contributor / docs author | [Developer Portal](https://zenzic.dev/developers/) · [ADR Vault](https://zenzic.dev/developers/explanation/adr-vault) | +| 🛡️ Security reviewer | [Engineering Ledger](https://zenzic.dev/developers/explanation/engineering-ledger) · [SECURITY.md](SECURITY.md) | + +--- ## 1) Prerequisites -- Node.js 20 or newer +- Node.js 24 or newer - npm 10 or newer - Optional: [just](https://github.com/casey/just) to run short, memorable commands @@ -89,14 +138,18 @@ What `just verify` does: | --- | --- | --- | | `just setup` | First setup or reset | Runs `npm ci` | | `just start` | Daily editing | Runs local dev server | -| `just serve` | Same as start | Alias of `just start` | +| `just serve` | Preview production build | Serves `build/` with full locale switch (the correct way to test EN↔IT) | | `just markdownlint` | After editing docs | Runs markdown lint checks | | `just lint` | After editing React/TS source | Runs TypeScript/React lint checks | | `just typecheck` | Before opening/updating PR | Runs TypeScript checks | | `just build` | Build validation | Runs production build | | `just preview` | Validate built output | Serves already-built site | | `just verify` | Recommended 
final local check | Runs `markdownlint` + `lint` + `typecheck` + `build` | +| `just preflight` | Before every commit | Runs all pre-commit hooks against every tracked file | +| `just reuse` | After adding/renaming files | Checks REUSE/SPDX licence compliance | +| `just sentinel` | Quick quality spot-check | Runs the Zenzic Sentinel alone (faster than full preflight) | | `just clean` | Cleanup before fresh run | Removes `build/` and `.docusaurus/` | +| `just bump VERSION [BADGE]` | After a Zenzic core release | Updates all hardcoded version references | You can list all recipes with: @@ -104,7 +157,7 @@ You can list all recipes with: just --list ``` -## 6) Pre-commit hooks (Obsidian Guard) +## 6) Pre-commit hooks (Sentinel Guard) This repository enforces quality gates before every commit via [pre-commit](https://pre-commit.com/). @@ -134,14 +187,14 @@ If a hook fails, fix the reported issue and retry the commit. To run all hooks manually without committing: ```bash -pre-commit run --all-files +just preflight ``` ## 7) CI/CD workflows | Workflow | File | Trigger | Goal | | --- | --- | --- | --- | -| Docs CI | `.github/workflows/ci.yml` | PR, push to `main`, manual | Validate install, markdown lint, TS/React lint, typecheck, and build on Node 20 and 22 | +| Docs CI | `.github/workflows/ci.yml` | PR, push to `main`, manual | Validate install, markdown lint, TS/React lint, typecheck, and build on Node 22 and 24 | | Dependency Audit | `.github/workflows/npm-audit.yml` | PR, push to `main`, weekly, manual | Detect high-severity dependency vulnerabilities | | Dependency Review | `.github/workflows/dependency-review.yml` | PR, manual | Detect risky dependency changes introduced by PRs | | CodeQL (opt-in) | `.github/workflows/codeql.yml` | PR, push to `main`, weekly, manual | Static analysis when `ENABLE_CODEQL=true` | @@ -165,7 +218,7 @@ Already implemented: - Concurrency controls (cancel obsolete runs). - Job timeouts (avoid stuck runners). 
- Manual `workflow_dispatch` triggers. -- Node matrix (20 and 22) for compatibility. +- Node matrix (22 and 24) for compatibility. - npm cache in workflows, keyed by `package-lock.json`. Possible future hardening: @@ -214,7 +267,7 @@ npm run build Fix type errors first, then retry the build. -### `/it/docs/intro` is 404 on localhost +### `/it/docs/index` is 404 on localhost This is expected when running `npm run start` with default locale (`en`): the dev server serves one locale at a time. @@ -228,8 +281,8 @@ npm run start:it Notes: -- With `start:it`, open `http://localhost:3000/docs/intro` (Italian content served at root in dev). -- If you want prefixed routes like `/it/docs/intro`, build + serve production output: +- With `start:it`, open `http://localhost:3000/docs/` (Italian content served at root in dev). +- If you want prefixed routes like `/it/docs/`, build + serve production output: ```bash npm run build @@ -246,10 +299,42 @@ just verify If CI still differs, check: -- Node version (CI uses Node 20 and 22) +- Node version (CI uses Node 22 and 24) - Lockfile changes (`package-lock.json`) - Workflow-specific jobs (dependency audit, dependency review) +### The i18n Silent Fallback Trap + +**Symptom:** `http://localhost:3000/it/docs/` renders English content even though the +Italian translation files exist under `i18n/it/`. + +**Root cause:** Docusaurus derives the `path` property from `htmlLang` when `path` is +not set explicitly. If you declare `htmlLang: 'it-IT'`, Docusaurus looks for translations +in `i18n/it-IT/` — a directory that does not exist. The build completes silently with +`translate: false` and falls back to the English source for all content pages. The UI +chrome (navbar, breadcrumbs, pagination labels) remains translated because those strings +come from Docusaurus's own bundled translations, masking the problem. 
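
The root cause above suggests a cheap guard that could run before a build. The following is a minimal sketch only: the `LocaleConfig` shape and the `findLocalesMissingPath` helper are hypothetical names introduced for illustration, not part of Docusaurus or of this repository.

```ts
// Hypothetical guard, not part of Docusaurus or zenzic-doc.
// Minimal shape of one `i18n.localeConfigs` entry (only the fields used here).
interface LocaleConfig {
  label?: string;
  htmlLang?: string;
  path?: string;
}

// Returns the locales that declare `htmlLang` but leave `path` unset,
// which is the combination the root-cause paragraph describes.
function findLocalesMissingPath(
  localeConfigs: Record<string, LocaleConfig>,
): string[] {
  return Object.entries(localeConfigs)
    .filter(([, cfg]) => cfg.htmlLang !== undefined && cfg.path === undefined)
    .map(([locale]) => locale);
}

// Example input mirroring the failing configuration (assumed values).
const risky = findLocalesMissingPath({
  en: { label: "English" },
  it: { label: "Italiano", htmlLang: "it-IT" }, // no explicit path
});
console.log(risky); // locales to review before building
```

A check like this only detects the configuration pattern; confirming the actual fallback still requires inspecting the generated `i18n.json` as described in the diagnosis step.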
+ +**Diagnosis:** In `build/it/.docusaurus/i18n.json` (or `.docusaurus/i18n.json` after a +build), check whether the `it` locale has `"translate": false`. If so, the path mismatch +is the cause. + +**Fix:** Always set `path` explicitly in `localeConfigs`: + +```ts +// docusaurus.config.ts +i18n: { + defaultLocale: 'en', + locales: ['en', 'it'], + localeConfigs: { + en: { label: 'English' }, + it: { label: 'Italiano', htmlLang: 'it-IT', path: 'it' }, // ← path is mandatory + }, +}, +``` + +**Discovered in:** v0.7.0 release audit (D090 "The i18n Lockdown"). + ## 11) Pull Request Checklist Before opening or updating a PR, run this checklist. @@ -260,6 +345,7 @@ Before opening or updating a PR, run this checklist. - [ ] I reviewed `README.md` sections if I changed commands/workflows. - [ ] I updated docs or comments when behavior changed. - [ ] My branch contains only intentional changes. +- [ ] If I touched `i18n` config or locale files: I verified the `/it/` pages show **Italian content** (not just an Italian URL), by checking the page body after `npm run build && npm run serve`. Minimal command sequence before PR: @@ -267,3 +353,28 @@ Minimal command sequence before PR: just setup just verify ``` + +--- + +## 📚 The Zenzic Chronicles + +Zenzic was born from a technical journey through the fragility of modern documentation +ecosystems. Discover the philosophy, the security siege, and the engineering behind the +Sentinel in the [**Engineering Chronicles**](https://zenzic.dev/blog/tags/engineering-chronicles) on the official blog. + +--- + +
+ + PythonWoods + +

+ Engineered with precision by PythonWoods in Italy 🇮🇹
+ "Building the Safe Harbor for technical knowledge." +

+

+ Documentation · + GitHub · + Journal +

+
diff --git a/RELEASE.it.md b/RELEASE.it.md
new file mode 100644
index 0000000..02cfe38
--- /dev/null
+++ b/RELEASE.it.md
@@ -0,0 +1,28 @@
+
+
+# 💎 Zenzic v0.7.0 — The Quartz Era (Quartz Maturity)
+
+This release marks the birth of the Sovereign Knowledge System. Following the Quartz Purgation, Zenzic definitively abandons all experimental residues to become a deterministic, industrial-grade infrastructure.
+
+## 🏛️ The Pillars of v0.7.0
+
+- **Deterministic Integrity**: Complete absence of any probabilistic dependency or logic. Zenzic now operates exclusively on structural facts and certain invariants.
+- **Sentinel Seal**: A 4-stage validation system (4-Gates Standard) that guarantees absolute quality before every push.
+- **Cross-Repo Sovereignty**: Implementation of the Branch Parity Rule for perfect synchronization between code and documentation.
+- **Machine Silence**: Optimization of analysis flows for native integration into CI/CD pipelines via the SARIF 2.1.0 standard.
+
+## ⚠️ Evolution Note (Breaking Changes)
+
+v0.7.0 is Year Zero. Previous versions are officially deprecated because they do not follow the current Diátaxis architecture. Every reference to old brands or legacy architectures has been removed to make way for a lean ecosystem focused on source purity.
+
+## 🚀 Towards the Future
+
+With this release, Zenzic is no longer just a tool, but a trust platform for documentation engineering.
+
+---
+**PythonWoods**
+*Release Date: 2026-05-07*
+
+---
+
+> **Note:** For detailed technical notes on the documentation, see [RELEASE.md](RELEASE.md).
diff --git a/RELEASE.md b/RELEASE.md
index 6800e91..c124e56 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -1,37 +1,261 @@
+# 💎 Zenzic v0.7.0 — The Quartz Era (Quartz Maturity)
-# Release — zenzic-doc
+This release marks the birth of the Sovereign Knowledge System.
Following the Quartz Purgation, Zenzic definitively abandons all experimental residues to become a deterministic, industrial-grade infrastructure. -## Current Status +## 🏛️ The Pillars of v0.7.0 -> **v0.6.1 "Obsidian Glass (Stable)" — Public Launch Active.** +- **Deterministic Integrity**: Complete absence of any probabilistic dependency or logic. Zenzic now operates exclusively on structural facts and certain invariants. +- **Sentinel Seal**: A 4-stage validation system (4-Gates Standard) ensuring absolute quality before every push. +- **Cross-Repo Governance**: Implementation of the Branch Parity Rule for perfect synchronization between code and documentation. +- **Machine Silence**: Optimization of analysis flows for native CI/CD integration via the SARIF 2.1.0 standard. -The documentation site is open and publicly deployed. -All embargo conditions have been satisfied. +## ⚠️ Evolution Note (Breaking Changes) -### Release Conditions (met) +v0.7.0 is Year Zero. Previous versions are officially deprecated as they do not follow the current Diátaxis architecture. Every reference to old brands or legacy architectures has been removed to make way for a lean ecosystem focused on source purity. -1. **Documentation Parity** — All MDX content hardened, EN + IT mirrors verified. -2. **Sentinel Certification** — `zenzic check all --strict` reports zero errors. -3. **Triple Green Gate** — TypeScript 0 errors, Docusaurus build SUCCESS (EN + IT), Zenzic Sentinel EXIT 0. -4. **Tech Lead Authorisation** — Embargo lifted, public deployment approved. +## 🚀 Towards the Future -### Deployment +With this release, Zenzic is no longer just a tool, but a trust platform for documentation engineering. -The `release-docs.yml` workflow is active. Tagged releases (`v*`) trigger -build, archive, and versioned artifact publication. 
+--- +**PythonWoods** +*Release Date: 2026-05-07* -### Local verification -Validate changes locally before any release: +--- -```bash -just verify + +## 🚀 What Changed + +### 1. Diátaxis Architecture Restructure + +The documentation information architecture has been rebuilt from scratch around the +[Diátaxis framework](https://diataxis.fr): + +| Section | Type | Purpose | +|---------|------|---------| +| `tutorials/` | Tutorial | Learning-oriented, step-by-step | +| `how-to/` | How-to | Task-oriented, goal-driven | +| `reference/` | Reference | Information-oriented, exhaustive | +| `explanation/` | Explanation | Understanding-oriented, conceptual | + +All previous paths under `docs/usage/` and `docs/guides/` have been reorganised. +The sidebar is autogenerated from the filesystem — no slug divergence permitted. + +### 2. Zenzic Blog + +A blog (`/blog/`) has been inaugurated as the **Zenzic Blog**: the official +engineering log of Zenzic. Six founding articles cover the v0.6.x sprint, the +AI-Driven Siege postmortem, and the v0.7.0 Quartz Maturity declaration. + +The blog inherits the full Quartz visual system: `#09090b` monolith surface, +Zinc typography, Indigo author identity, Cyan accent. + +### 3. Brand System + +A formal brand package (\`static/assets/brand/brand-kit.zip\`) is now shipped with every +build, containing: + +- SVG icon, nav-dark, nav-light, wordmark variants +- PNG exports at canonical dimensions +- Social card templates (1200×630, dark + light) +- Brand HTML reference page + +PNG naming aligned to SVG naming: +\`pythonwoods-logo.svg\` (circle) ↔ \`pythonwoods-logo.png\` +\`pythonwoods-logo-nobg.svg\` ↔ \`pythonwoods-logo-nobg.png\` + +### 4. Bilingual Parity (EN + IT) + +All content sections — Diátaxis, reference, explanation, how-to — are fully mirrored +in Italian. \`i18n/it/\` mirrors \`docs/\` exactly. \`npm run build\` produces both locales +with zero broken links. + +### 5. 
D117 — Docusaurus \`pathname:\` Protocol Support + +The Zenzic Sentinel now correctly ignores \`pathname:///\` links in Docusaurus projects. +This engine-agnostic escape hatch lives at the validator level; MkDocs and Zensical are +unaffected. Documentation in \`reference/engines.mdx\` (EN + IT) covers the behaviour +and its scope. + +### 6. Pre-commit Gate & REUSE Compliance + +The \`pre-commit\` pipeline is fully operational: + +| Hook | Status | +|------|--------| +| trailing-whitespace / end-of-file | ✅ | +| check-yaml / check-json / check-toml | ✅ | +| TypeScript Typecheck | ✅ | +| Zenzic Sentinel | ✅ | +| REUSE/SPDX | ✅ 207/207 files | + +\`blog/**\` added to \`REUSE.toml\`. All files compliant with REUSE Spec 3.3. + +New \`just\` recipes for contributors: \`preflight\`, \`reuse\`, \`sentinel\`. + +### 7. D118 — Absolute Title Consistency + +Blog list page titles locked across \`:visited\` / \`:active\` / \`:hover\` states. +No perceived fading regardless of read history. Zinc-700 in light mode, White/Silk +in dark mode. Cyan on hover only. + +### 8. SentinelPalette — CLI × Web Color Bridge + +The Zenzic semantic color system (`SentinelPalette` in the CLI) is now bridged to the +Docusaurus web layer via six CSS custom properties in \`src/css/custom.css\`: + +| CSS Variable | CLI Value | Web — Light | Web — Dark | +|---|---|---|---| +| \`--zenzic-brand\` | \`#4f46e5\` | \`#4f46e5\` | \`#4f46e5\` | +| \`--zenzic-success\` | \`#10b981\` | \`#059669\` | \`#10b981\` | +| \`--zenzic-warning\` | \`#f59e0b\` | \`#b45309\` | \`#f59e0b\` | +| \`--zenzic-error\` | \`#f43f5e\` | \`#e11d48\` | \`#f43f5e\` | +| \`--zenzic-dim\` | \`#64748b\` | \`#475569\` | \`#64748b\` | +| \`--zenzic-fatal\` | \`#8b0000\` | \`#991b1b\` | \`#8b0000\` | + +Dark mode values are exact optical matches to the CLI terminal. +Light mode values are WCAG AA-calibrated for white surfaces. \`--zenzic-fatal\` +delegates to the existing \`--obsidian-blood\` variable (already mode-aware). 
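
The bridge table can also be read as a small typed data structure. The sketch below is illustrative only: `palette` and `cssVariablesFor` are hypothetical names, the hex values are copied from the table above, and the real `src/css/custom.css` may differ (for example, `--zenzic-fatal` is said to delegate to `--obsidian-blood` rather than hold a literal hex value).

```ts
// Hypothetical model of the SentinelPalette bridge table (not repository code).
type Mode = "light" | "dark";

interface PaletteEntry {
  cli: string;   // value used by the CLI
  light: string; // web value in light mode
  dark: string;  // web value in dark mode
}

// Hex values transcribed from the release-notes table.
const palette: Record<string, PaletteEntry> = {
  "--zenzic-brand":   { cli: "#4f46e5", light: "#4f46e5", dark: "#4f46e5" },
  "--zenzic-success": { cli: "#10b981", light: "#059669", dark: "#10b981" },
  "--zenzic-warning": { cli: "#f59e0b", light: "#b45309", dark: "#f59e0b" },
  "--zenzic-error":   { cli: "#f43f5e", light: "#e11d48", dark: "#f43f5e" },
  "--zenzic-dim":     { cli: "#64748b", light: "#475569", dark: "#64748b" },
  "--zenzic-fatal":   { cli: "#8b0000", light: "#991b1b", dark: "#8b0000" },
};

// Renders one custom-property declaration per palette entry for a colour mode.
function cssVariablesFor(mode: Mode): string {
  return Object.entries(palette)
    .map(([name, entry]) => `${name}: ${entry[mode]};`)
    .join("\n");
}

console.log(cssVariablesFor("dark"));
```

Keeping the CLI value next to both web values makes the stated invariant (dark mode matches the CLI exactly) easy to assert in a unit test.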
+ +The brand system HTML (\`static/assets/brand/zenzic-brand-system.html\`) has been +enriched with the full CLI × Web comparison table and the Mermaid exception note +(Mermaid's renderer does not consume CSS \`var()\` — diagram hex values are exempt +from the Zero Hex Law). + +### 9. Asset Integrity & Static Consolidation + +\`static/\` has been reorganised around a single canonical hierarchy: + +```text +static/ +├── assets/ +│ ├── brand/ ← logos, SVGs, PNGs, brand-system.html, brand-kit.zip +│ ├── favicon/ ← favicon variants +│ └── social/ ← OpenGraph and Twitter card images +├── css/ ← extra.css (static CSS, not processed by Webpack) +└── img/ ← PythonWoods author assets ``` -This runs markdown lint, TypeScript lint, type checking, and a full production build. +Changes: + +- \`static/brand/\` (legacy duplicate of \`assets/brand/\`) deleted. Canonical: \`assets/brand/\`. +- \`static/assets/stylesheets/\` renamed to \`static/css/\`. +- \`brand-kit.zip\` moved into \`static/assets/brand/\` (was at \`static/assets/\` root). +- Navbar logo path updated in \`docusaurus.config.ts\`. +- \`scripts/build-assets.js\` and \`scripts/bump-version.sh\` updated — no more + + mirror-copy pattern; \`assets/brand/\` is now the sole source and destination. + +### 10. Cross-Instance Routing — Breaking URL Migration + +The Developer Area has been promoted from \`/docs/community/developers/*\` to its own +top-level Docusaurus instance at \`/developers/*\`. The User Area remains at \`/docs/*\`. + +| Before (v0.6.x) | After (v0.7.0) | +| :--- | :--- | +| \`/docs/community/developers/*\` | \`/developers/*\` | +| \`/docs/community/governance/*\` | \`/developers/governance/*\` | +| \`/docs/community/contribute/*\` | \`/developers/contribute/*\` | + +This is a **breaking URL change** with no compatibility shim — old links will 404. +External bookmarks, blog posts, and search index entries must be updated. 
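
For maintainers who need to update stored links in bulk, the migration table can be applied mechanically. This is a minimal sketch under stated assumptions: `PREFIX_MAP` and `migratePath` are hypothetical names, and the three prefixes are taken directly from the table above; nothing like this is confirmed to ship with the portal.

```ts
// Hypothetical helper: rewrites v0.6.x paths using the migration table above.
const PREFIX_MAP: ReadonlyArray<readonly [string, string]> = [
  ["/docs/community/developers/", "/developers/"],
  ["/docs/community/governance/", "/developers/governance/"],
  ["/docs/community/contribute/", "/developers/contribute/"],
];

// Returns the v0.7.0 path for an old path, or the input unchanged
// when no migrated prefix matches.
function migratePath(path: string): string {
  for (const [oldPrefix, newPrefix] of PREFIX_MAP) {
    if (path.startsWith(oldPrefix)) {
      return newPrefix + path.slice(oldPrefix.length);
    }
  }
  return path; // not affected by the migration
}

console.log(migratePath("/docs/community/governance/technical-debt"));
// prints "/developers/governance/technical-debt"
```

Paths outside the three migrated prefixes (for example, anything else under `/docs/`) pass through untouched, which matches the statement that the User Area remains at `/docs/*`.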
+ +**New artefacts shipped with the migration:** + +- **ADR-0011 "Cross-Instance Allowlist"** (EN+IT) — formalises the + \`absolute_path_allowlist\` configuration as a *trust contract* between + Docusaurus instances, with a mandatory *Suppression vs Configuration* + section explicitly banning \`\` for cross-plugin links. +- **\`/docs/how-to/manage-cross-site-links.mdx\`** (EN+IT) — user-facing + Diátaxis how-to guide. +- **\`/docs/reference/configuration.mdx#link-validation\`** (EN+IT) — full + schema reference for \`[link_validation]\`. +- **\`/developers/governance/technical-debt.mdx\`** (EN+IT) — first entry + records **Z108 STALE_ALLOWLIST_ENTRY** as deferred to v0.8.0 with rationale. + +> **EPOCH 7a.1 supersession (v0.7.0):** the `[link_validation].absolute_path_allowlist` +> mechanism above is **retired**. DocusaurusAdapter now auto-detects +> multi-instance plugin URL prefixes Zero-Config — the `[link_validation]` block +> has been deleted from `zenzic-doc/zenzic.toml`, the Z108 entry is closed by +> removal (no allowlist left to go stale), and ADR-0011 / the cross-site-links +> how-to / the `[link_validation]` configuration reference are pending refactor +> in a follow-up documentation sprint. + +The Quartz Promise (one Sentinel, two instances) is now visible from the +README of every Zenzic repository (zenzic, zenzic-doc, zenzic-action) + +--- + +**v0.7.0 is the canonical stable portal for the Quartz Maturity sprint.** + +| Gate | Result | +|------|--------| +| \`zenzic check all\` on docs repo | ✅ Exit 0 | +| \`npm run build\` (EN + IT) | ✅ Zero broken-link errors | +| TypeScript \`tsc --noEmit\` | ✅ Zero errors | +| Markdownlint (all MDX) | ✅ Zero warnings | +| REUSE lint | ✅ 207/207 compliant | +| Pre-commit (all hooks) | ✅ All passed | + +--- + +### 11. Saga Sealed + Log Convention + +The narrative arc *Beyond the Siege* is sealed (Saga I–VI on +`/blog/tags/engineering-chronicles`). 
The blog now follows a two-track +convention: + +- **🛡️ Saga** — long-form narrative, philosophy, post-mortem. +- **📜 Log** — terse patch-notes mirror of `RELEASE.md`, readable in + ~30 seconds. + +`📜 Log: v0.7.0 — Quartz Maturity` (`/blog/log-v070-quartz-maturity`) +fills the missing intermediate ring between Saga V and the raw release +notes. + +In parallel, **brand purity** was enforced: all probabilistic / +AI-architecture content has been removed from the Zenzic blog. The blog +ships only deterministic engineering content. The `Adversarial +Stress-Testing Protocol` page in `developers/governance/` is the single +exception — and frames AI explicitly as "punching bag", never as +co-author. + +--- + +### 12. EPOCH 7a — Multi-Root Discovery (Foundation for Quartz Maturity) + +The documentation portal now ships the user-facing and developer-facing +pages for the **Multi-Root Discovery** foundation that lifts the +historical `docs_dir` boundary in the Zenzic VSM. + +User-facing track (`/docs/reference/engines/#docusaurus-blog`, EN+IT): +celebrates the practical outcome — *"Zenzic now automatically detects the +Docusaurus blog. No additional configuration is required."* — and +documents the three detection rules (config block / convention fallback / +`blog: false` opt-out) without leaking implementation details. + +Developer-facing track (`/docs/explanation/discovery#multi-root`, EN+IT): +documents the architecture — `ContentRoot` dataclass, `hasattr()`-gated +`get_extra_content_roots()` adapter hook, four-stage pipeline cooperation +(Discovery → VSM → Validator → Scanner), the Zero Subprocess +auto-discovery pass, the Reverse-Mapping invariant that locks the +contract for EPOCH 7b virtual routes, and the engine support matrix. + +The dual-track separation is strict: no `ContentRoot`, `hasattr`, or +`walk_files` reference appears in the User track; no celebratory +language appears in the Developer track. 
Linguistic parity is enforced +across EN and IT in both tracks (`Z907 I18N_PARITY` clean). + +--- + +### 🇮🇹 Engineered with Precision + +zenzic-doc is the documentation portal for Zenzic, developed by **PythonWoods**, +based in Italy. The portal is the contract made visible. -### Indexing +**For the engine release notes, see [Zenzic Core v0.7.0](https://github.com/PythonWoods/zenzic/blob/main/RELEASE.md).** -`static/robots.txt` permits full crawler indexing (`Allow: /`). +[**Visit the Portal →**](https://zenzic.dev) diff --git a/REUSE.toml b/REUSE.toml index 371c3c1..97004b7 100644 --- a/REUSE.toml +++ b/REUSE.toml @@ -1,93 +1,23 @@ version = 1 -# ── Documentation content ──────────────────────────────────────────────────── - -[[annotations]] -path = "docs/**/*.mdx" -precedence = "aggregate" -SPDX-FileCopyrightText = "2026 PythonWoods " -SPDX-License-Identifier = "Apache-2.0" - -[[annotations]] -path = "i18n/**/*.mdx" -precedence = "aggregate" -SPDX-FileCopyrightText = "2026 PythonWoods " -SPDX-License-Identifier = "Apache-2.0" - -# ── Source (theme, components, pages) ──────────────────────────────────────── - -[[annotations]] -path = "src/**" -SPDX-FileCopyrightText = "2026 PythonWoods " -SPDX-License-Identifier = "Apache-2.0" - -[[annotations]] -path = "scripts/**" -SPDX-FileCopyrightText = "2026 PythonWoods " -SPDX-License-Identifier = "Apache-2.0" - -# ── Static assets ──────────────────────────────────────────────────────────── - [[annotations]] -path = "static/**" -precedence = "aggregate" +path = ["package.json", "package-lock.json", "tsconfig.json", ".markdownlint-cli2.jsonc", "**/*_category_.json", "i18n/**/*.json", ".vscode/mcp.json"] SPDX-FileCopyrightText = "2026 PythonWoods " SPDX-License-Identifier = "Apache-2.0" [[annotations]] -path = "docs/**/assets/**" +path = ["static/**", "docs/**/assets/**", "i18n/**/assets/**"] precedence = "aggregate" SPDX-FileCopyrightText = "2026 PythonWoods " SPDX-License-Identifier = "Apache-2.0" [[annotations]] -path = 
"i18n/**/assets/**" +path = ["blog/**", "docs/**/*.mdx", "i18n/**/*.mdx"] precedence = "aggregate" SPDX-FileCopyrightText = "2026 PythonWoods " SPDX-License-Identifier = "Apache-2.0" -# ── i18n translation files ─────────────────────────────────────────────────── - -[[annotations]] -path = "i18n/**/*.json" -SPDX-FileCopyrightText = "2026 PythonWoods " -SPDX-License-Identifier = "Apache-2.0" - -# ── Project root — prose ───────────────────────────────────────────────────── - -[[annotations]] -path = ["LICENSE", "NOTICE", "README.md", ".gitignore"] -SPDX-FileCopyrightText = "2026 PythonWoods " -SPDX-License-Identifier = "Apache-2.0" - -# ── Project root — configuration ───────────────────────────────────────────── - -[[annotations]] -path = [ - "package.json", - "package-lock.json", - "tsconfig.json", - "docusaurus.config.ts", - "sidebars.ts", - "tailwind.config.js", - "eslint.config.mjs", - ".markdownlint-cli2.jsonc", - "RELEASE.md", - "zenzic.toml", - "justfile", - "**/*_category_.json" -] -SPDX-FileCopyrightText = "2026 PythonWoods " -SPDX-License-Identifier = "Apache-2.0" - -[[annotations]] -path = ".vscode/mcp.json" -SPDX-FileCopyrightText = "2026 PythonWoods " -SPDX-License-Identifier = "Apache-2.0" - -# ── CI ──────────────────────────────────────────────────────────────────────── - [[annotations]] -path = ".github/**" +path = ["README.md", "README.it.md", "CONTRIBUTING.md", "CODE_OF_CONDUCT.md", "SECURITY.md", "RELEASE.md", "NOTICE", "LICENSE"] SPDX-FileCopyrightText = "2026 PythonWoods " SPDX-License-Identifier = "Apache-2.0" diff --git a/SECURITY.md b/SECURITY.md new file mode 100644 index 0000000..56f7421 --- /dev/null +++ b/SECURITY.md @@ -0,0 +1,91 @@ + + +# Security Policy — zenzic-doc + +## Scope + +This policy covers the **zenzic-doc documentation portal** — the Docusaurus-based site +hosted at [zenzic.dev](https://zenzic.dev). 
+ +For vulnerabilities in the **Zenzic engine** (Python, Shield scanner, path-traversal +protection), see the [core security policy](https://github.com/PythonWoods/zenzic/blob/main/SECURITY.md). + +--- + +## Reporting a Vulnerability + +**Please do not open a public GitHub issue for security vulnerabilities.** + +Report privately via: + +- **GitHub Security Advisories** (preferred): [github.com/PythonWoods/zenzic-doc/security/advisories](https://github.com/PythonWoods/zenzic-doc/security/advisories) +- **Email**: `dev@pythonwoods.dev` — subject line: `[SECURITY] zenzic-doc — ` + +Include a clear description of the vulnerability, steps to reproduce, potential impact, +and a suggested fix if available. + +We will acknowledge your report within **72 hours** and aim to release a patch within +**14 days** of confirming the issue. + +--- + +## In-Scope Areas + +| Area | Description | +|------|-------------| +| **npm dependency CVE** | A known CVE in a runtime dependency (`docusaurus`, `react`, `tailwindcss`, etc.) that affects the built site or the build pipeline | +| **Zenzic Sentinel bypass in docs** | A crafted file in `docs/` or `blog/` that causes `zenzic check all` to pass despite containing a credential pattern (Z201) | +| **Build pipeline code execution** | A crafted MDX file, config, or plugin that causes arbitrary code execution during `npm run build` | +| **Pre-commit hook bypass** | Any method that allows a commit to bypass the Shield, TypeScript, or REUSE pre-commit hooks | +| **Static asset exposure** | A file committed to `static/` that inadvertently exposes credentials or sensitive configuration | + +Out-of-scope: content errors, broken links (reported as standard issues), cosmetic +rendering bugs, or issues that only affect local dev mode (`npm run start`). + +--- + +## Dependency Monitoring + +npm dependencies are audited automatically: + +- `npm-audit.yml` runs on every PR, push to `main`, and weekly — flags high-severity CVEs. 
+- `dependency-review.yml` flags risky dependency changes introduced by PRs. + +To audit locally: + +```bash +npm audit --audit-level=high +``` + +--- + +## Security Design Notes + +The documentation portal is a **static site** — no server-side code executes at runtime. +The attack surface is limited to: + +- **Build pipeline** — `npm run build` executes Node.js. Crafted MDX could theoretically + + exploit a Docusaurus or remark plugin vulnerability. Keep dependencies up to date. + +- **Pre-commit hooks** — the Zenzic Sentinel scans all source files for credential patterns + + before every commit. The Shield (exit code 2 on Z201) is the last line of defence before + content reaches the public site. + +- **Static assets** — binary files committed to `static/` bypass text-based scanning. + + The `check-added-large-files` hook limits accidental binary commits. + +--- + +## Supported Versions + +| Version | Support status | +|---------|----------------| +| `0.7.x` (current) | ✅ All security fixes | +| `0.6.x` | ⚠️ Critical security fixes only | +| `< 0.6` | ❌ End of life — no support | diff --git a/blog/2026-04-08-hardening-the-documentation-pipeline.mdx b/blog/2026-04-08-hardening-the-documentation-pipeline.mdx new file mode 100644 index 0000000..ebe8e03 --- /dev/null +++ b/blog/2026-04-08-hardening-the-documentation-pipeline.mdx @@ -0,0 +1,212 @@ +--- +slug: hardening-the-documentation-pipeline +title: "The Leaking Pipe" +sidebar_label: "🛡️ 001 - Saga I: The Leaking Pipe" +authors: [pythonwoods] +tags: [engineering, security, python, opensource, markdown, obsidian-chronicles, engineering-chronicles] +date: 2026-04-08 +description: > + The supply chain risk hiding in your Markdown pipeline — and why every engineering + team has it. Philosophy, threat model, and architecture of Zenzic. +image: https://zenzic.dev/assets/social/social-card.png +--- + +:::tip[In a hurry?] 
+Skip the engineering deep dive — jump straight to the [⚡ Tutorial: Stop Broken Links](/blog/tutorial-stop-broken-links-60s) and protect your docs in 5 minutes. +::: + +:::info[🛡️ The Zenzic Chronicles — Complete] + +The complete six-part engineering saga of Zenzic's journey from v0.5 Sentinel to v0.7.0 Quartz Maturity. The Chronicles are sealed. + +**Saga I** | [Saga II](/blog/docs-pipeline-security-risk-obsidian-bastion) | [Saga III](/blog/ai-driven-siege-shield-postmortem) | [Saga IV](/blog/beyond-the-siege-zenzic-v070-quartz) | [Saga V](/blog/zenzic-v070-quartz-maturity-stable) | [Saga VI](/blog/governance-of-quartz) + +::: + +Every CI/CD pipeline has a security perimeter. Developers run static analysis on source +code. They scan container images for CVEs. They audit dependencies for known +vulnerabilities. They enforce secrets detection in commit hooks. + +And then they push raw, unvalidated Markdown files directly into a documentation +build — and call it shipped. + +This is not a theoretical gap. It is the default posture of almost every engineering +team I've observed. I built **Zenzic** to prove it — and fix it. + +{/* truncate */} + +## The Threat Model Nobody Talks About + +Consider the anatomy of a typical documentation credential leak. + +A contributor opens a pull request with new API documentation. The Markdown file +contains a code example. Inside that example — copied from a local test, a Slack +message, or a terminal session — there is a real API key. Not a placeholder. A live +credential. + +The reviewer reads the prose. The reviewer doesn't read the key as a key — it's +formatted as sample output, it blends into the noise. The PR merges. The docs build +runs. The rendered HTML goes live. The key is now indexed by search engines. + +Now extend the threat model outward. What happens when a `docs_dir` configuration +entry points to `../../../etc/`? Most documentation tools will simply start reading. 
+What happens when a contributor submits a `.gitignore` entry designed to suppress +certain files, but those files are present on the build server? + +These questions don't appear in the standard security checklist. They belong to a class +of supply chain risk that sits precisely between source code and rendered output — where +tooling is sparse and assumptions are dangerous. + +## ⚓ The Core Philosophy: "Lint the Source, not the Build" + +Most documentation tools analyze the generated HTML. This creates a "build driver +dependency": if your generator (MkDocs, Hugo, Docusaurus) has a bug, your security +validation breaks. + +Zenzic takes a different path. It analyzes the raw Markdown source **before the build +starts**, building a **Virtual Site Map (VSM)** directly from the filesystem. The core +never knows which engine it's analyzing. It can't be tricked by disguising content as +engine-specific directives. + +## Why Pure Python Is a Security Decision + +Most tooling in the documentation space runs through execution engines. Markdown +configuration files get evaluated. Node.js processes get spawned. Shell commands get +invoked to query version control state. + +Each of these is a trust boundary. The moment your analyzer executes code to understand +your content, it has accepted the premise that the content can be trusted to execute +safely. This is circular reasoning — particularly dangerous when the content being +analyzed comes from external contributors. + +Zenzic was designed from day one around a single architectural invariant: +**zero subprocess execution, ever.** + +No `node` process to evaluate Docusaurus configuration. No `git check-ignore` to +interpret `.gitignore` rules. No shell calls. Every piece of analysis runs in the Python +interpreter, on data read as plain bytes and treated as untrusted input throughout. + +This is not a convenience trade-off. It is a security model. 
+ +## The Architecture of Suspicion + +Zenzic's core operates under what I think of as **architectural suspicion**: every input +is assumed hostile until proven otherwise, and the analysis pipeline is designed to fail +safely when something unexpected appears. + +Three properties define this architecture: + +**Engine-agnostic analysis.** Zenzic never imports or executes a documentation +framework. Engine-specific semantics — how MkDocs resolves nav entries, how Docusaurus +handles locale trees — live in thin, replaceable adapters. The core never has opinions +about what "valid documentation" means beyond the content itself. + +**Deterministic file discovery.** File traversal is one of the most quietly dangerous +operations in any build tool. Zenzic's discovery layer enforces a four-level exclusion +hierarchy: immutable system guardrails (no code can read inside `.git/` or +`node_modules/`), VCS ignore rules parsed in pure Python, project configuration, and +runtime overrides. The hierarchy is not advisory — it is enforced at the type boundary. +No `ExclusionManager`, no scan. + +**Non-bypassable exit codes.** When Zenzic detects a credential leak, it exits with +code 2 — the Shield. When it detects a path traversal attempt, it exits with code 3 — +the Blood Sentinel. These codes cannot be suppressed, downgraded, or configured away. +The perimeter holds, or the build fails. + +## 🩸 The Blood Sentinel: Classifying Intent + +A broken link is a maintenance issue. A link that probes the host OS is a security +incident. + +Zenzic's classification engine detects if a resolved path targets sensitive OS +directories (`/etc/`, `/proc/`, `/var/`, etc.). Instead of a generic error, it triggers +a dedicated **Exit Code 3** — crucial for preventing accidental leakage of +infrastructure details or template injection probes in automated pipelines. 
+ +## 🔐 The Shield: Multi-Stream Credential Scanning + +Documentation is a magnet for "temporary" credentials that end up being permanent. +Zenzic's Shield scans every line and fenced code block for 9 families of secrets: + +- AWS, GitHub, Stripe, and OpenAI keys +- GitLab Personal Access Tokens +- Slack tokens and Google API keys +- Hex-encoded payloads (`\xNN` escape sequences) for obfuscated strings +- **Exit Code 2**: A credential breach is a build-blocking, non-suppressible event + +## 🌀 Graph Integrity and $O(V+E)$ Complexity + +In large documentation sets, link cycles are common. Zenzic implements an **iterative +DFS** with a three-color marking system to avoid recursion limits. Pre-computing the +cycle registry in Phase 1.5 allows Phase 2 (Validation) to remain $O(1)$ per-query — +even massive docsets validate in seconds. + +## 🇮🇹 Dogfooding i18n + +We believe in bilingual documentation. Zenzic supports native i18n with "Ghost Routes" +— logical paths that don't exist on disk but are resolved by build plugins. We keep our +own documentation in full parity between English and Italian. + +## Supply Chain Security Starts Before the Build + +There is a maturing conversation about supply chain security in software. Most of that +conversation focuses on dependencies: SBOM generation, CVE scanning, license auditing. +These are necessary. They are not sufficient. + +The documentation pipeline is also part of the supply chain. It receives inputs from +contributors who may be external to the organization. It runs in the same CI environment +as your source build. It publishes output that is indexed, cached, and distributed at +scale. + +A credential leaked in documentation has the same blast radius as a credential committed +to source code. A path traversal through `docs_dir` can access the same filesystem as +your CI runner. + +This is why Zenzic exists. Not to lint Markdown formatting. 
To treat documentation as +input — with all the suspicion and rigor that phrase implies. + +## The Obligation of Precision + +Security tooling carries an obligation that productivity tooling does not: when it says +something is safe, it must be right. A false negative in a documentation linter means +a credential goes undetected. A path traversal guard that can be bypassed means the +bypass is a feature, not a bug. + +The normalization pipeline that runs before credential detection was not built to be +comprehensive — it was built because each step corresponds to a real attack vector +identified during internal red team exercises: Unicode format character injection, HTML +entity obfuscation, comment interleaving, cross-line token splitting. Each is a +documented technique, not a theoretical concern. The full story of those bypass vectors +is in [Part 3 of this series](/blog/ai-driven-siege-shield-postmortem). + +## 🏁 Run It + +```bash +pip install zenzic +zenzic check all +``` + +The largest single architectural step in Zenzic's history deleted 21,870 lines and +added 888 — the Headless Architecture transition that turned a MkDocs-specific tool +into a multi-engine documentation security framework. That story is +[Part 2 in this series](/blog/docs-pipeline-security-risk-obsidian-bastion). + +--- + +| | | +|---|---| +| **GitHub** | [github.com/PythonWoods/zenzic](https://github.com/PythonWoods/zenzic) | +| **Documentation** | [zenzic.dev](https://zenzic.dev/) | +| **PyPI** | [pypi.org/project/zenzic](https://pypi.org/project/zenzic/) | + +**Cross-posted on:** + +- [Medium](https://medium.com/zenzic-engineering/your-documentation-is-a-leaking-pipe-7c1d6f4a84d0) — *Your Documentation is a Leaking Pipe* + +:::note[The Zenzic Chronicles] +This is **Part 1** of a five-part engineering series documenting the path from v0.5 to v0.7.0 Stable. 
+ +**Part 1 — The Sentinel** · [Part 2 — Sentinel Bastion](/blog/docs-pipeline-security-risk-obsidian-bastion) · [Part 3 — The AI Siege](/blog/ai-driven-siege-shield-postmortem) · [Part 4 — Beyond the Siege](/blog/beyond-the-siege-zenzic-v070-quartz) · [Part 5 — Quartz Maturity](/blog/zenzic-v070-quartz-maturity-stable) +::: + +*Part 1 of the **Zenzic Chronicles**. For the complete architectural journey, visit the [Safe Harbor Blog](https://zenzic.dev/blog/).* diff --git a/blog/2026-04-12-zenzic-v060a1-the-sentinel.mdx b/blog/2026-04-12-zenzic-v060a1-the-sentinel.mdx new file mode 100644 index 0000000..bf5bdef --- /dev/null +++ b/blog/2026-04-12-zenzic-v060a1-the-sentinel.mdx @@ -0,0 +1,65 @@ +--- +slug: zenzic-v060a1-the-sentinel +title: "Zenzic v0.6.0a1 — The Sentinel" +sidebar_label: "📜 Log: v0.6.0a1" +authors: [pythonwoods] +tags: [engineering, python, opensource, milestone] +date: 2026-04-12 +description: > + Sentinel era alpha. v0.6.0a1 introduces the core architecture — + pure Python, no subprocesses, engine-agnostic — and the three non-negotiable pillars. +image: https://zenzic.dev/assets/social/social-card.png +--- + +:::note[Alpha/RC Chronicle] + +This is a historical record of a development milestone. The first stable release is **[v0.7.0 — Quartz Maturity](/blog/zenzic-v070-quartz-maturity-stable)**. + +::: + +**v0.6.0a1 — The Sentinel** is the founding alpha: the release that crystallised the +philosophy of Zenzic from a prototype into a disciplined engineering system. + +{/* truncate */} + +## What shipped + +This alpha established three architectural pillars that remain non-negotiable to this day: + +1. **Lint the Source** — Zenzic never runs the build. It reads raw Markdown and configuration + + files directly. HTML output is never trusted. + +2. **No Subprocesses** — 100% pure Python. No `subprocess.run`, no Node.js execution, + + no external process calls in the hot path. + +3. **Pure Functions First** — Deterministic logic. 
No I/O inside link-validation loops. + +### Core capabilities in v0.6.0a1 + +| Feature | Status | +|:--------|:-------| +| `zenzic check all` — multi-mode file scan | ✅ Alpha | +| Shield credential scanner (exit code 2/3) | ✅ Alpha | +| MkDocs adapter | ✅ Alpha | +| Docusaurus v3 adapter | ✅ Alpha | +| Virtual Site Map (VSM) | 🚧 Prototype | + +### The Sentinel identity + +The "Sentinel" codename captured the philosophy: an always-on guard at the source +boundary, not an inspector of finished output. The Sentinel does not care how the +site renders. It cares whether the source is honest. + +The Shield — the credential scanner — was the first concrete expression of that posture: +a module that exits with code 2 on secrets detected and code 3 on a Blood Sentinel +(fatal leak pattern), before any documentation build can begin. + +## What came next + +The critical groundwork laid in v0.6.0a1 — especially the `find_repo_root()` discovery +protocol (ADR 003) and the two-phase `zenzic init` bootstrap sequence — proved solid +enough to survive two major hardening passes without architectural change. + +Read the full technical story: [Your Documentation is a Leaking Pipe](/blog/hardening-the-documentation-pipeline) diff --git a/blog/2026-04-15-docs-pipeline-security-risk.mdx b/blog/2026-04-15-docs-pipeline-security-risk.mdx new file mode 100644 index 0000000..9a90bb6 --- /dev/null +++ b/blog/2026-04-15-docs-pipeline-security-risk.mdx @@ -0,0 +1,243 @@ +--- +slug: docs-pipeline-security-risk-obsidian-bastion +title: "Headless Architecture" +sidebar_label: "🛡️ 002 - Saga II: Headless Architecture" +authors: [pythonwoods] +tags: [engineering, security, python, devtools, obsidian-chronicles, engineering-chronicles] +date: 2026-04-15 +description: > + Inside the linter that trusts nothing — not even its own config files. + How Obsidian Bastion turned Zenzic into a multi-engine security infrastructure. 
+image: https://zenzic.dev/assets/social/social-card.png +--- + +:::info[🛡️ The Zenzic Chronicles — Complete] + +The complete six-part engineering saga of Zenzic's journey from v0.5 Sentinel to v0.7.0 Quartz Maturity. The Chronicles are sealed. + +[Saga I](/blog/hardening-the-documentation-pipeline) | **Saga II** | [Saga III](/blog/ai-driven-siege-shield-postmortem) | [Saga IV](/blog/beyond-the-siege-zenzic-v070-quartz) | [Saga V](/blog/zenzic-v070-quartz-maturity-stable) | [Saga VI](/blog/governance-of-quartz) + +::: + +Most documentation builds operate on an implicit contract with their input: the content +is trusted because the contributors are trusted. It's a reasonable assumption for a +wiki. It is an indefensible posture for a security-conscious CI pipeline. + +Zenzic was built to invalidate that assumption — to treat documentation the way a +compiler treats source: as input that must be analyzed, validated, and potentially +rejected before it reaches production. + +If your documentation is part of your CI pipeline, it's part of your attack surface. +Zenzic is designed for CI pipelines that handle untrusted docs, open-source projects +with external contributors, and teams running multiple doc engines side by side. + +{/* truncate */} + +In [Part 1](/blog/hardening-the-documentation-pipeline), I covered the philosophy and +threat model. This piece is about how Obsidian Bastion enforced them as infrastructure +properties. + +## 🎯 Where Zenzic Fits + +Zenzic is designed for: + +- CI pipelines that handle untrusted docs +- Open-source projects with external contributors +- Teams running multiple doc engines side by side +- Security-conscious workflows that need to validate content before the build — not after + +Three core properties define it: + +**No subprocess execution — ever.** No `node`, no `git`, no shell calls. The core +library is 100% Pure Python. This isn't a convenience feature — it's a security model. 
+A tool that spawns subprocesses is a tool that can be tricked into executing untrusted +code. + +**Engine-agnostic analysis.** Zenzic reads raw Markdown and configuration files as +plain data. It never imports or executes a documentation framework. Engine-specific +knowledge lives in thin, replaceable adapters that translate semantics into a neutral +protocol. + +**Deterministic file discovery.** Every file scan is explicit. Every path is validated. +There are no accidental full-repo traversals, no hidden directories slipping through. +Identical source files always produce identical results. + +## The Versioning As a Threat Model + +Understanding what changed in the Obsidian series requires understanding what the +previous version got wrong. + +| Version | Codename | Milestone | +| :--- | :--- | :--- | +| v0.5.x | The Sentinel | Core scanning + MkDocs-only awareness | +| v0.6.0 | Obsidian Glass | Headless architecture transition | +| v0.6.1rc2 | Obsidian Bastion | Multi-engine security infrastructure | + +The Sentinel was a capable linter. It was also architecturally coupled to a single +documentation engine. When a MkDocs assumption was embedded in the core, the core had +opinions about what “valid documentation” meant that had nothing to do with the content +being analyzed. + +This coupling is a risk. An analyzer that assumes its input follows MkDocs conventions +will fail silently — or not at all — when presented with a Docusaurus project. +Failing silently is the worst possible outcome for a security tool: it gives a false +sense of coverage. + +The biggest single commit in this arc deleted **21,870 lines** and added **888** — the +Headless Architecture transition that stopped Zenzic from being a MkDocs tool and made +it an analyser of documentation platforms. + +## ⚛️ Parsing Docusaurus without Node + +The first concrete challenge was supporting Docusaurus v3. 
Its config files are +TypeScript: + +```ts +export default { + presets: [['classic', { docs: { routeBasePath: '/guides' } }]], + i18n: { defaultLocale: 'en', locales: ['en', 'it'] }, +}; +``` + +The obvious solution — calling `node` to evaluate the config — would violate Pillar 2 +(No Subprocesses). So I built a **static parser in Pure Python** that extracts +`baseUrl`, `routeBasePath`, locale configuration, and plugin metadata directly from +the source text. No evaluation. No runtime. No JavaScript. + +## 🧱 Layered Exclusion — The Real Headline Feature + +File discovery is where most documentation tools quietly fail. The **Layered Exclusion +Manager** replaces all ad-hoc directory filtering with a deterministic 4-level hierarchy: + +| Level | Name | Description | +| :---: | :--- | :--- | +| L1 | System guardrails | Immutable — `.git`, `node_modules`, `__pycache__`, etc. | +| L2 | `.gitignore` + forced inclusions | Additive rules, parsed in Pure Python | +| L3 | Config (`zenzic.toml`) | `excluded_dirs` / `excluded_file_patterns` | +| L4 | CLI flags | `--exclude-dir` / `--include-dir` at runtime | + +The levels encode a **security invariant**: L1 System Guardrails are immutable. No +configuration file and no CLI flag can force Zenzic to scan inside `.git/` or +`node_modules/`. + +## 🗡️ The Tabula Rasa Refactor + +The most invasive change: I removed every single `rglob()` call from the codebase — all +of them — and replaced them with two centralized functions in `discovery.py`: + +```python +def walk_files(root, exclusion_manager) -> Iterator[Path]: ... +def iter_markdown_sources(root, exclusion_manager) -> Iterator[Path]: ... +``` + +The `exclusion_manager` parameter is **mandatory**. Not `Optional`, no `None` default. +If you call a scanner or validator entry point without an `ExclusionManager`, you get a +`TypeError` at call time — not a silent full-tree scan at runtime. + +168 call sites updated. Accidental full-repo scans are now **architecturally impossible**. 
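The mandatory-parameter guarantee described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Zenzic's actual `discovery.py`: the `ExclusionManager` class below is a hypothetical stand-in with an assumed `excluded_dirs` attribute, and the traversal uses `os.walk` with in-place pruning as one plausible way to avoid `rglob()`.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Iterator


class ExclusionManager:
    """Hypothetical stand-in: holds directory names that must never be entered."""

    def __init__(self, excluded_dirs: frozenset[str]) -> None:
        self.excluded_dirs = excluded_dirs


def _walk(root: Path, manager: ExclusionManager) -> Iterator[Path]:
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if d not in manager.excluded_dirs)
        for name in sorted(filenames):
            yield Path(dirpath) / name


def walk_files(root: Path, exclusion_manager: ExclusionManager) -> Iterator[Path]:
    """Yield files under root; the manager is required and has no default."""
    # Validate eagerly (this is a plain function, not a generator), so a
    # missing or wrong manager fails at call time, not on first iteration.
    if not isinstance(exclusion_manager, ExclusionManager):
        raise TypeError("exclusion_manager must be an ExclusionManager")
    return _walk(root, exclusion_manager)
```

Because `exclusion_manager` has no default value, calling `walk_files(root)` raises `TypeError` before any directory is read, which is the failure mode the section describes.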
+ +## 🔐 Security Hardening + +**ReDoS prevention.** Lines exceeding 1 MiB are silently truncated before Shield regex +matching. A crafted documentation file with a multi-megabyte line could exploit +catastrophic backtracking in credential detection patterns. + +**Path traversal guard (Exit Code 3).** `_validate_docs_root()` now rejects `docs_dir` +paths that escape the repository root. A malicious `zenzic.toml` pointing +`docs_dir: ../../../etc/` triggers **Exit 3 (Blood Sentinel)** before any file is read. + +## The Supply Chain Risk Metric That Doesn't Get Enough Attention + +Runtime dependency count is an underappreciated supply chain security metric. + +Every Python package that Zenzic imports at runtime is a potential vector for +dependency confusion attacks, malicious package updates, and transitive vulnerability +inheritance. The decision to minimize the dependency surface is not about keeping the +package small — it is about limiting the attack surface of the supply chain. + +Zenzic's runtime dependency count: **5**. + +For a tool that supports four documentation engines, performs multi-family credential +detection, implements a deterministic quality scoring system, validates link graphs +against a virtual site map, and runs over a thousand tests — five runtime dependencies +is a deliberate architectural achievement, not a limitation. + +## What "Hardened" Actually Means + +The word “hardened” is overused in security marketing. In the context of Obsidian +Bastion, it has a specific meaning: every component of the system has been analyzed for +its failure modes under adversarial input, and those failure modes have been either +eliminated or bounded. + +The subprocess constraint eliminates the execution trust boundary. The Layered Exclusion +Manager bounds the filesystem access surface. The mandatory `ExclusionManager` type +enforces the boundary at the API level. The non-bypassable exit codes ensure that +security failures produce unambiguous CI outcomes. 
The ReDoS truncation bounds the +computational cost of analysis. The path traversal guard bounds the filesystem read +scope. + +None of these are features. They are the removal of assumptions — the careful, +systematic elimination of the implicit trust that characterizes unexamined systems. + +> A hardened system is not a system with more defenses added on top. It is a system +> with fewer assumptions built in. + +## What Broke Along the Way + +A refactor of this scope does not leave the API surface intact. Three breaking changes +were deliberate, not accidental: + +- **`zenzic serve` removed entirely** — use your engine's native command (`mkdocs serve`, + + `npx docusaurus start`). It was the last place where a subprocess could theoretically + be spawned. + +- **MkDocs plugin relocated** from `zenzic.plugin` to `zenzic.integrations.mkdocs`, + + installs separately via `pip install "zenzic[mkdocs]"`, keeping the core free of + engine-specific imports. + +- **`ExclusionManager` parameter is now mandatory** — no `Optional`, no `None` default. + + If your code was silently skipping exclusion filtering, it will now fail at the type + level. That's the point. + +These are costs. They are also the reason the guarantees in this article are +enforceable rather than aspirational. 
+ +## 📊 By the Numbers + +| Metric | Value | Note | +| :--- | ---: | :--- | +| Test functions | 929 | High-granularity validation | +| Source code | 11,422 LOC | Real architectural scope | +| Test code | 12,927 LOC | ~1.13x ratio — disciplined testing | +| Engine adapters | 4 | MkDocs, Docusaurus v3, Zensical, Standalone | +| Runtime dependencies | 5 | Minimal supply chain risk | +| Subprocess calls | 0 | Safe in sandboxed CI environments | + +## 🏁 Run It Against Your Docs + +```bash +pip install zenzic +zenzic check all +``` + +--- + +| | | +|---|---| +| **GitHub** | [github.com/PythonWoods/zenzic](https://github.com/PythonWoods/zenzic) | +| **Documentation** | [zenzic.dev](https://zenzic.dev/) | + +**Cross-posted on:** + +- [Medium](https://medium.com/zenzic-engineering/what-happens-when-you-rip-the-foundation-out-of-a-security-tool-173b57d496b2) — *What Happens When You Rip the Foundation Out of a Security Tool* + +:::note[The Zenzic Chronicles] +This is **Part 2** of a five-part engineering series documenting the path from v0.5 to v0.7.0 Stable. + +[Part 1 — The Sentinel](/blog/hardening-the-documentation-pipeline) · **Part 2 — Sentinel Bastion** · [Part 3 — The AI Siege](/blog/ai-driven-siege-shield-postmortem) · [Part 4 — Beyond the Siege](/blog/beyond-the-siege-zenzic-v070-quartz) · [Part 5 — Quartz Maturity](/blog/zenzic-v070-quartz-maturity-stable) +::: + +*Part 2 of the **Zenzic Chronicles**. 
For the complete architectural journey, visit the [Safe Harbor Blog](https://zenzic.dev/blog/).* diff --git a/blog/2026-04-16-ai-driven-siege-postmortem.mdx b/blog/2026-04-16-ai-driven-siege-postmortem.mdx new file mode 100644 index 0000000..6abdbce --- /dev/null +++ b/blog/2026-04-16-ai-driven-siege-postmortem.mdx @@ -0,0 +1,394 @@ +--- +slug: ai-driven-siege-shield-postmortem +title: "The AI Siege" +sidebar_label: "🛡️ 003 - Saga III: The AI Siege" +authors: [pythonwoods] +tags: [security, engineering, post-mortem, python, devtools, obsidian-chronicles, engineering-chronicles] +date: 2026-04-16 +description: > + We put our documentation linter under an AI-driven siege. Four bypass vectors found, + four sealed. Here's the full post-mortem of Zenzic's AI red team audit. +image: https://zenzic.dev/assets/social/social-card.png +--- + +:::info[🛡️ The Zenzic Chronicles — Complete] + +The complete six-part engineering saga of Zenzic's journey from v0.5 Sentinel to v0.7.0 Quartz Maturity. The Chronicles are sealed. + +[Saga I](/blog/hardening-the-documentation-pipeline) | [Saga II](/blog/docs-pipeline-security-risk-obsidian-bastion) | **Saga III** | [Saga IV](/blog/beyond-the-siege-zenzic-v070-quartz) | [Saga V](/blog/zenzic-v070-quartz-maturity-stable) | [Saga VI](/blog/governance-of-quartz) + +::: + +Four bypass vectors. Four real findings. All closed. + +{/* truncate */} + +This is the complete technical post-mortem of **Operation Obsidian Stress** — the +adversarial security audit we ran against Zenzic v0.6.1rc2's Shield (credential scanner) +before release. I'm publishing the full technical details because the findings are +instructive, the fixes are non-obvious, and the code belongs in the open. + +:::note[Methodology] + +To validate the Shield, I orchestrated a multi-team AI system — Red Team, Blue Team, +and Purple Team — using specialized agent ensembles to simulate advanced obfuscation +techniques. 
This is AI-assisted security engineering: using the same agentic +architecture that attackers use to find the gaps they would exploit. All findings, +bypass vectors, and fixes documented here are real. + +::: + +## What Shield Is (and Why Breaking It Matters) + +Before the attack details, context: Shield is Zenzic's credential detection layer. +It scans every Markdown and MDX file in your documentation before the build runs, +looking for patterns that indicate real credentials in content. + +The threat model is simple: a contributor submits a PR with a code example. That +example contains a real API key — copied from a local terminal session, pasted from +a Slack thread, or forgotten after a debugging session. The reviewer reads the prose, +not the bytes. The PR merges. The docs build. The key is now live on your documentation +site, indexed by search engines. + +Shield exists to catch that before it ships. If Shield can be bypassed by someone who +knows how it works, it's not a scanner — it's a false guarantee. + +## The Attack Surface + +Shield's architecture before Operation Obsidian Stress: + +1. Read each line of the Markdown/MDX file +2. Apply a normalization pass (strip backticks, collapse whitespace) +3. Run 9 regex patterns against the normalized line +4. Report any match as a `ShieldFinding` + +Step 4 triggers **Exit Code 2** (Shield breach) — non-bypassable, distinct from Exit +Code 1 (validation failure) and Exit Code 3 (Blood Sentinel / path traversal). + +The attack surface was step 2: the normalization pass. It normalized formatting noise +but did not account for deliberate obfuscation. + +--- + +## ZRT-006: Unicode Format Character Injection + +**Category:** Input normalization bypass +**Severity:** High — complete bypass of all regex patterns +**CVSS analogy:** 8.1 (High) + +### The Technique + +Python's `unicodedata` module exposes a character category classification. 
The `Cf` +category ("Format characters") includes characters that are semantically meaningful in +Unicode text processing but are invisible in rendered output and most text displays: + +| Code point | Name | Purpose | +| :--- | :--- | :--- | +| U+200B | Zero Width Space | Line breaking hint | +| U+200C | Zero Width Non-Joiner | Prevents ligatures | +| U+200D | Zero Width Joiner | Forces ligatures | +| U+00AD | Soft Hyphen | Optional hyphenation | +| U+FEFF | Zero Width No-Break Space | BOM marker | + +Inject any of these into a credential token and the regex fails to match: + +```python +key = "sk-abc123def456ghi789jkl012mno345pqr678stu" +# Insert ZWS after position 9 (inside the token) +bypass = key[:9] + "\u200B" + key[9:] + +import re +pattern = re.compile(r"sk-[a-zA-Z0-9]{48}") +print(pattern.search(bypass)) # None — bypass confirmed +``` + +### The Fix + +Strip all `Cf`-category characters before any normalization step runs: + +```python +import unicodedata + +def _strip_unicode_format_chars(text: str) -> str: + """Remove all Unicode Format (Cf) characters. + + Invisible to human readers but interrupt regex pattern matching. + Examples: U+200B (ZWS), U+200C (ZWNJ), U+200D (ZWJ), U+00AD (soft hyphen). + """ + return "".join(c for c in text if unicodedata.category(c) != "Cf") +``` + +--- + +## ZRT-006b: HTML Entity Obfuscation + +**Category:** Input normalization bypass +**Severity:** High — bypasses patterns that depend on punctuation characters +**Affected families:** OpenAI (hyphen), Stripe (hyphen, underscore), GitHub (underscore) + +### The Technique + +Markdown renderers decode standard HTML entities. The hyphen character (`-`) has the +HTML entity `-`. The underscore (`_`) is `_`. + +```text +sk-abc123def456ghi789jkl012mno345pqr678stu +``` + +Renders as: `sk-abc123def456ghi789jkl012mno345pqr678stu` — a valid OpenAI key format. + +The credential scanner sees `sk-abc123...` — which does not match +`sk-[a-zA-Z0-9]{48}`. 
The entity is a one-character substitution of a structural +boundary character. + +### The Fix + +```python +import html + +def _decode_html_entities(text: str) -> str: + """Decode HTML entities before pattern matching. + + A credential containing &#45; (hyphen) or &#95; (underscore) renders + correctly in a browser but bypasses regex patterns that match on the + literal character. + """ + return html.unescape(text) +``` + +`html.unescape()` is part of the Python standard library. No dependencies. Zero cost. + +--- + +## ZRT-007: Comment Interleaving + +**Category:** Token fragmentation via markup +**Severity:** High — renders the token non-contiguous in raw source +**Technique:** Inject HTML or MDX comment blocks between credential characters + +### The Technique + +HTML comments and MDX expression comments are invisible in rendered output. They are +valid Markdown syntax that any Markdown renderer will process and discard. + +```text +sk-abc123<!-- inline HTML comment -->def456ghi789jkl012mno345pqr678stu +``` + +In rendered output: `sk-abc123def456ghi789jkl012mno345pqr678stu` (correct, readable). +In raw source, the regex fails because the comment block interrupts +the run of characters matched by `[a-zA-Z0-9]`. + +MDX variant: `sk-abc123{/* inline MDX comment */}def456...` — same effect. + +### The Fix + +```python +import re + +_HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL) +_MDX_COMMENT_RE = re.compile(r"\{/\*.*?\*/\}", re.DOTALL) + +def _strip_markup_comments(text: str) -> str: + """Strip HTML and MDX comments before pattern matching.""" + text = _HTML_COMMENT_RE.sub("", text) + text = _MDX_COMMENT_RE.sub("", text) + return text +``` + +--- + +## ZRT-007b: Cross-Line Token Splitting + +**Category:** Architectural bypass — stateless scanner assumption +**Severity:** Critical — bypasses all pattern matching with zero obfuscation +**Technique:** Line break + +This is the most architecturally significant finding. It requires no Unicode tricks, +no entity encoding, no markup injection.
**One line break.** + +### The Technique + +```text +Here is my staging key for the integration tests: sk-abc123def456 +ghi789jkl012mno345pqr678stu901vwx234yz +``` + +The scanner processes line 1 — no match (only 12 chars after `sk-`). +The scanner processes line 2 — no match (no `sk-` prefix). +The credential leaks. The split is invisible in rendered output — the two lines render +as a single paragraph. + +### The Fix: The Lookback Buffer + +A stateful generator that maintains context across line boundaries, creating a +synthetic overlap zone: + +```python +from collections.abc import Iterable, Iterator +from pathlib import Path + +def scan_lines_with_lookback( + lines: Iterable[tuple[int, str]], + file_path: Path, + buffer_width: int = 80, +) -> Iterator[ShieldFinding]: + prev_normalized: str = "" + prev_seen: set[str] = set() + + for line_no, raw_line in lines: + seen_this_line: set[str] = set() + normalized = _normalize_line_for_shield(raw_line) + + # Pass 1: standard per-line scan + for finding in _scan_normalized_line(normalized, file_path, line_no): + yield finding + seen_this_line.add(finding.family) + + # Pass 2: cross-line join zone scan + if prev_normalized: + join_zone = prev_normalized[-buffer_width:] + normalized[:buffer_width] + for finding in _scan_normalized_line(join_zone, file_path, line_no): + if finding.family not in (seen_this_line | prev_seen): + yield finding + + prev_normalized = normalized + prev_seen = seen_this_line +``` + +**Why 80 characters?** Standard terminals are 80 columns wide, and most documentation editors wrap at +80–120 characters. Taking 80 characters from each side covers the vast majority of +real-world split positions with minimal false positive risk.
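
The join-zone idea can be tried in isolation. The following is a minimal sketch, not Shield's implementation: `TOKEN_RE` and `scan_with_lookback` are hypothetical names, a single pattern stands in for the full pattern set, and normalization is reduced to stripping surrounding whitespace.

```python
import re
from collections.abc import Iterable, Iterator

# Hypothetical single-pattern stand-in for Shield's pattern set.
TOKEN_RE = re.compile(r"sk-[a-zA-Z0-9]{48}")


def scan_with_lookback(
    lines: Iterable[str], buffer_width: int = 80
) -> Iterator[tuple[int, str]]:
    """Yield (line_no, kind) for tokens found in a line or across a line break."""
    prev = ""
    prev_hit = False
    for line_no, raw in enumerate(lines, start=1):
        cur = raw.strip()  # simplified normalization, an assumption of this sketch
        hit = TOKEN_RE.search(cur) is not None
        if hit:
            yield line_no, "single-line"
        elif prev and not prev_hit:
            # Tail of the previous line joined to the head of the current one.
            join_zone = prev[-buffer_width:] + cur[:buffer_width]
            if TOKEN_RE.search(join_zone):
                yield line_no, "cross-line"
        prev, prev_hit = cur, hit


doc = [
    "Here is my staging key for the integration tests: sk-abc123def456",
    "ghi789jkl012mno345pqr678stu901vwx234yz",
]
findings = list(scan_with_lookback(doc))
clean = list(scan_with_lookback(["no secrets here", "still nothing"]))
print(findings)
```

Neither line matches on its own, but the joined zone holds `sk-` followed by 50 alphanumeric characters, so the split token is reported against the second line.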
+ +--- + +## The Complete 8-Step Normalization Pipeline + +After closing all four vectors, Shield's normalization function runs every line through +a deterministic eight-step sequence: + +```python +def _normalize_line_for_shield(raw_line: str) -> str: + text = raw_line + text = _strip_unicode_format_chars(text) # Step 1: Cf chars + text = html.unescape(text) # Step 2: HTML entities + text = _HTML_COMMENT_RE.sub("", text) # Step 3: HTML comments + text = _MDX_COMMENT_RE.sub("", text) # Step 4: MDX comments + text = _BACKTICK_RE.sub(lambda m: m.group(1), text) # Step 5: backtick spans + text = text.replace("+", " ") # Step 6: concatenation operators + text = text.replace("|", " ") # Step 7: table cell separators + text = " ".join(text.split()) # Step 8: whitespace collapse + return text +``` + +Each step is independently testable. The test suite includes 47 tests specifically +for normalization. + +## Coverage Added by Operation Obsidian Stress + +| Bypass vector | New tests | +| :--- | ---: | +| Cf character injection (ZRT-006) | 23 | +| HTML entity obfuscation (ZRT-006b) | 18 | +| Comment interleaving (ZRT-007) | 31 | +| Cross-line token splitting (ZRT-007b) | 28 | +| Normalization pipeline integration | 17 | +| **Total new tests** | **117** | + +Before the operation: 929 passing tests. After closing all four vectors: 1,130+ passing tests. + +## The Risk Management Dimension + +The four bypass vectors found during Operation Obsidian Stress have a common property: +they are not obscure edge cases. They are techniques that appear in standard lists of +regex evasion methods used in adversarial content scenarios — discoverable by any +documentation contributor with moderate knowledge of Unicode, HTML encoding, and regex +mechanics. 
+ +The risk profile of an unpatched documentation scanner is not “low probability, low +impact.” It is **moderate probability, high impact** — because credential leaks in +documentation have immediate material consequences, and because documentation pipelines +receive content from the broadest possible contributor population. + +This is the supply chain risk dimension that is most frequently underweighted: not the +vulnerability of your infrastructure, but the vulnerability of the content processing +path you expose to your contributor base. + +> A security tool that can be bypassed by a contributor who knows how it works is not +> a security tool. It is a compliance checkbox. + +## Beyond Security: The Full Zenzic Surface + +Shield is one layer in a complete documentation quality framework: + +| Layer | What it catches | +| :--- | :--- | +| Link validation (VSM) | Broken internal links, ghost routes — no live server required | +| Orphan detection | Pages that exist but are unreachable in the navigation graph | +| Snippet verification | Code blocks referencing files that don’t exist on disk | +| Placeholder scanning | `TODO`, `FIXME`, `TBD` in published content | +| Asset auditing | Unused images with autofix support | +| Reference integrity | `[broken][ref]`-style links with missing definitions | +| Quality score | Deterministic 0–100 metric with regression detection | + +All analysis is engine-agnostic: auto-detection covers MkDocs, Docusaurus v3, Zensical, +and Standalone Mode. No plugins to install. No build to run. No subprocesses. + +## Exit Code Taxonomy + +Zenzic’s exit codes are non-negotiable — no configuration can suppress them: + +| Code | Name | Trigger | +| :---: | :--- | :--- | +| 0 | Success | All checks pass | +| 1 | Quality | Validation findings (broken links, orphans, placeholders) | +| 2 | Shield | Credential detected in documentation | +| 3 | Blood Sentinel | Path traversal attack or fatal error | + +Codes 2 and 3 cannot be configured away. 
A CI step that can be silenced on a security +failure is not a security control. + +## The Obligation of the Bastion + +“The Bastion holds” is not a marketing phrase. It is an engineering commitment. It means +that every identified attack path has been closed, that the closure has been verified +with test coverage, and that the system’s failure modes under adversarial input are +bounded and known. + +It does not mean that future bypass vectors don’t exist. Red team exercises are not +proofs of security — they are evidence of the security posture at a specific moment in +time. The four vectors found during Operation Obsidian Stress were found because we +looked for them systematically. Vectors we haven’t enumerated may still exist. + +What the Bastion commitment means is that we look — methodically, adversarially, and +transparently about what we find. + +## The Takeaway + +The four bypass vectors found during Operation Obsidian Stress are not exotic. They're +the kind of techniques that appear in any list of regex evasion methods — Unicode +injection, HTML entity encoding, markup comment interleaving, structural line splitting. + +What made them findable was the decision to look for them **systematically, with +adversarial intent**, before release. What made them fixable was having a normalization +pipeline with defined semantics and comprehensive test coverage at each step. + +Security tooling that isn't tested adversarially is security tooling that provides the +appearance of coverage without the substance. 
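
As an illustration of testing each normalization step adversarially, here is a small sketch that feeds one obfuscated variant per markup-level vector through a reduced pipeline. It is an assumption-laden stand-in, not Zenzic's test suite: `normalize`, `TOKEN_RE`, and the token body are hypothetical, and only the first four pipeline steps are reproduced.

```python
import html
import re
import unicodedata

TOKEN_RE = re.compile(r"sk-[a-zA-Z0-9]{48}")  # illustrative pattern from this post
HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
MDX_COMMENT_RE = re.compile(r"\{/\*.*?\*/\}", re.DOTALL)


def normalize(text: str) -> str:
    """Reduced pipeline: Cf strip, entity decode, HTML and MDX comment strip."""
    text = "".join(c for c in text if unicodedata.category(c) != "Cf")
    text = html.unescape(text)
    text = HTML_COMMENT_RE.sub("", text)
    text = MDX_COMMENT_RE.sub("", text)
    return text


# 48 alphanumeric characters; not a real credential.
body = "abc123def456ghi789jkl012mno345pqr678stu901vwx234"
variants = {
    "zero-width space": "sk-abc123\u200b" + body[6:],
    "html entity": "sk&#45;" + body,
    "html comment": "sk-abc123<!-- x -->" + body[6:],
    "mdx comment": "sk-abc123{/* x */}" + body[6:],
}

# For each variant: (matches raw source, matches after normalization)
results = {
    name: (
        TOKEN_RE.search(raw) is not None,
        TOKEN_RE.search(normalize(raw)) is not None,
    )
    for name, raw in variants.items()
}
for name, outcome in results.items():
    print(name, outcome)
```

Each variant evades the raw regex and is caught once normalized, which is the property a per-step regression test would pin down.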
+ +--- + +| | | +|---|---| +| **GitHub** | [github.com/PythonWoods/zenzic](https://github.com/PythonWoods/zenzic) | +| **Documentation** | [zenzic.dev](https://zenzic.dev/) | +| **PyPI** | [pypi.org/project/zenzic](https://pypi.org/project/zenzic/) | + +**Cross-posted on:** + +- [Medium](https://medium.com/zenzic-engineering/we-put-our-documentation-linter-under-an-ai-driven-siege-heres-the-post-mortem-c09b8a86a396) — *We Put Our Documentation Linter Under an AI-Driven Siege* + +:::note[The Zenzic Chronicles] +This is **Part 3** of a five-part engineering series documenting the path from v0.5 to v0.7.0 Stable. + +[Part 1 — The Sentinel](/blog/hardening-the-documentation-pipeline) · [Part 2 — Sentinel Bastion](/blog/docs-pipeline-security-risk-obsidian-bastion) · **Part 3 — The AI Siege** · [Part 4 — Beyond the Siege](/blog/beyond-the-siege-zenzic-v070-quartz) · [Part 5 — Quartz Maturity](/blog/zenzic-v070-quartz-maturity-stable) +::: + +*Part 3 of the **Zenzic Chronicles**. For the complete architectural journey, visit the [Safe Harbor Blog](https://zenzic.dev/blog/).* diff --git a/blog/2026-04-16-zenzic-v061rc2-obsidian-bastion.mdx b/blog/2026-04-16-zenzic-v061rc2-obsidian-bastion.mdx new file mode 100644 index 0000000..f69031d --- /dev/null +++ b/blog/2026-04-16-zenzic-v061rc2-obsidian-bastion.mdx @@ -0,0 +1,84 @@ +--- +slug: zenzic-v061rc2-obsidian-bastion +title: "Zenzic v0.6.1rc2 — Obsidian Bastion" +sidebar_label: "📜 Log: v0.6.1rc2" +authors: [pythonwoods] +tags: [security, devtools, milestone] +date: 2026-04-16 +description: > + Zenzic v0.6.1rc2 replaces the single-engine assumption with a multi-engine + Safe Harbor — Zensical Proxy, enterprise Docusaurus, adversarial Shield audit. +image: https://zenzic.dev/assets/social/social-card.png +--- + +:::note[Alpha/RC Chronicle] + +This is a historical record of a development milestone. The first stable release is **[v0.7.0 — Quartz Maturity](/blog/zenzic-v070-quartz-maturity-stable)**. 
+ +::: + +**v0.6.1rc2 — Obsidian Bastion** is the release that turned Zenzic from a +capable single-engine linter into a genuine multi-engine documentation Safe Harbor. + +{/* truncate */} + +## The architectural shift + +The defining change of the Bastion cycle was the rejection of a hidden assumption: +that every project using Zenzic would have a single, knowable documentation engine. + +That assumption broke the moment we tried Zensical — a MkDocs-compatible engine with +its own configuration surface. The fix was not to add another adapter. It was to +rethink the adapter contract entirely. + +### Headless Architecture (Pillar 1, enforced) + +Every adapter was refactored to receive a pre-built context object — `BuildContext` — +rather than reading configuration files at runtime. This eliminated the last class of +I/O in the hot path and made the system deterministically testable. + +### New in v0.6.1rc2 + +| Feature | Notes | +|:--------|:------| +| Zensical adapter (`engine = "zensical"`) | Full `zensical.toml` parsing | +| Zensical Transparent Proxy | `engine = "zensical-bridge"` for MkDocs nav compatibility | +| Docusaurus enterprise adapter | Versioned docs, `@site/` alias, slug alignment | +| Z401 `MISSING_DIRECTORY_INDEX` | SEO guardrail: every directory must have a landing page | +| Z402 `ORPHAN_PAGE` | Detects pages declared but unreachable from the nav tree | +| Zenzic Lab (9 Acts) | Interactive zero-config onboarding showcase | +| `uvx zenzic check all ./path` | Primary curiosity path with zero install friction | + +### Shield hardening: four bypass vectors closed + +The security audit that ran concurrently with the Bastion architecture work found and +closed four bypass vectors in the Shield credential scanner — the module responsible +for detecting leaked credentials before any documentation build begins. 
+ +| Vector | Codename | Attack method | Fix | +|:-------|:---------|:--------------|:----| +| Unicode format characters | ZRT-006 | Zero-width format characters (U+200D, U+200C, U+200B) inserted mid-token to break regex matching | Normaliser strips all Unicode category `Cf` characters before scanning | +| HTML entity obfuscation | ZRT-006b | Entity-encoded credential prefixes (for example `&#65;&#75;` for `AK`) bypass plain-text matching | `html.unescape()` decodes `&#NNN;` and `&#xHH;` entities in the normalisation pass | +| Comment interleaving | ZRT-007 | HTML `<!-- -->` and MDX `{/* */}` comments inserted mid-token | Normaliser strips both comment forms before any pattern is applied | +| Cross-line split detection | ZRT-007b | Secrets split across two consecutive lines evade single-line scanners | `scan_lines_with_lookback()` carries a 1-line lookback buffer; deduplication via seen-set | + +Two additional hardening measures were applied in this cycle: + +**Blood Sentinel (Exit 3).** A dedicated fatal exit code was formalised for architectural +violations — `docs_dir` paths that escape the repository root now trigger Exit 3 before +any scan begins. This is not a lint finding. It is a hard stop. + +**ReDoS protection.** Lines exceeding 1 MiB are silently truncated before regex matching, +preventing catastrophic backtracking against adversarially crafted inputs. + +### Operation Obsidian Stress + +During the v0.6.1rc2 cycle, an AI-driven adversarial audit found **four bypass vectors** +in the Shield credential scanner. All four were closed before the release candidate was +finalised.
The complete post-mortem is published here: + +→ [We Put Our Documentation Linter Under an AI-Driven Siege](/blog/ai-driven-siege-shield-postmortem) + +The full architectural story of the Bastion cycle: + +→ [What Happens When You Rip the Foundation Out of a Security Tool](/blog/docs-pipeline-security-risk-obsidian-bastion) diff --git a/blog/2026-04-29-beyond-the-siege-v070-quartz.mdx b/blog/2026-04-29-beyond-the-siege-v070-quartz.mdx new file mode 100644 index 0000000..ccca8e8 --- /dev/null +++ b/blog/2026-04-29-beyond-the-siege-v070-quartz.mdx @@ -0,0 +1,405 @@ +--- +slug: beyond-the-siege-zenzic-v070-quartz +title: "The Sovereign Root" +sidebar_label: "🛡️ 004 - Saga IV: Sovereign Root" +authors: [pythonwoods] +tags: [release, engineering, python, opensource, security, obsidian-chronicles, engineering-chronicles] +date: 2026-04-29T19:05:00 +description: > + After four AI agents tried to break Zenzic's Shield, we didn't patch bugs — + we rewrote the rules. The Sovereign Root Protocol, Purity Protocol, and 1,301 tests + later: this is the technical deep-dive. +image: https://zenzic.dev/assets/social/social-card.png +--- + +:::info[🛡️ The Zenzic Chronicles — Complete] + +The complete six-part engineering saga of Zenzic's journey from v0.5 Sentinel to v0.7.0 Quartz Maturity. The Chronicles are sealed. + +[Saga I](/blog/hardening-the-documentation-pipeline) | [Saga II](/blog/docs-pipeline-security-risk-obsidian-bastion) | [Saga III](/blog/ai-driven-siege-shield-postmortem) | **Saga IV** | [Saga V](/blog/zenzic-v070-quartz-maturity-stable) | [Saga VI](/blog/governance-of-quartz) + +::: + +The [siege is over](/blog/ai-driven-siege-shield-postmortem). All four bypass vectors +are closed. 1,195 tests pass. The Bastion holds. + +{/* truncate */} + +And then we found another bug. + +That is how software works. The question is not whether you'll find gaps after a major +hardening effort — you will. 
The question is whether your process catches them before +your users do, and whether you have the institutional discipline to close them without +inflation. + +This is the story of what happened in the final days of high-intensity consolidation leading to v0.7.0, and why the +version number itself carries meaning. + +## Treating Documentation as Untrusted Input + +In application security, the foundational discipline is: never trust input. Passwords are +hashed, not stored. Filenames are validated before they touch the filesystem. URLs are +parsed and normalised — never concatenated from fragments. The moment you trust your inputs, +you have implicitly transferred control to whoever provides them. + +Documentation pipelines operate under the opposite assumption. Links are "probably fine." +Asset paths are "probably correct." Placeholder values are "probably temporary." The result +is exactly what you expect from systems that trust their inputs: slow, silent rot that +surfaces as broken experiences for readers — days or even hours after the damage was done. + +Zenzic's thesis is that documentation is input, and should be treated with the same +skepticism. The Zxxx diagnostic system is input validation. The Shield scanner is a +credential sanitiser. The Z502 frontmatter leak fix is a parser boundary constraint. The +Z105 `pathname:///` false-positive fix is protocol normalisation. Each one applies a +discipline that security engineers have practised for decades to a domain that has, until +now, operated on optimism. + +That transfer of security thinking to documentation quality is what **Quartz Maturity** +actually means. 
+ +## The Arc: From Sentinel to Maturity + +The Zenzic Engineering Series has documented a continuous thread: + +| Part | Topic | Version | +| :--- | :--- | :--- | +| [Part 1](/blog/hardening-the-documentation-pipeline) | Why Zenzic exists: the leaking pipe problem | v0.5.x Sentinel | +| [Part 2](/blog/docs-pipeline-security-risk-obsidian-bastion) | Ripping the foundation out: Headless Architecture | v0.6.1rc2 Bastion | +| [Part 3](/blog/ai-driven-siege-shield-postmortem) | The siege: 4 bypass vectors found and closed | v0.6.1rc2 → v0.6.1 | +| Part 4 (this post) | The consolidation: Sovereignty, Purity, precision | **v0.7.0** | +| [Part 5](/blog/zenzic-v070-quartz-maturity-stable) | The vision: UX-Discoverability and the new standard | **v0.7.0 Stable** | + +The version jump from v0.6.1 to v0.7.0 was not about features. It was about something +more specific: the deliberate rejection of a framing. + +## Why Not v0.6.2 + +After the siege postmortem closed all four bypass vectors, the natural next step was a +v0.6.2 patch release. The Zensical and MkDocs adapters needed Z404 coverage. The i18n +configuration had a silent fallback bug. The navigation badge hadn't been updated in +the bump script. These felt like small fixes. + +They weren't small. They were the difference between a system that claims to be a +multi-engine Safe Harbor and a system that actually is one. + +Z404 (`CONFIG_ASSET_MISSING`) was originally implemented as Docusaurus-only — a direct +contradiction of the engine-agnostic architecture Zenzic was built around. A tool that +detects broken favicon references only in Docusaurus projects is not an agnostic tool. +It is a Docusaurus tool with MkDocs support bolted on. + +The i18n silent fallback was worse. For months, the Italian locale of zenzic.dev was +silently serving English content. The URL changed. The content didn't. 
The bug was in +`docusaurus.config.ts` — `htmlLang: 'it-IT'` without an explicit `path: 'it'` caused +Docusaurus to look for translations in `i18n/it-IT/` (which doesn't exist) and fall +back silently to English. The Italian documentation was invisible. + +These are not patch-level problems. They are foundational consistency failures in a +tool built around the premise that documentation quality is measurable and enforceable. + +## The Quartz Mirror Audit + +The final pre-release audit — internally called the Quartz Mirror Pass — was +structured around a simple question: does every claim Zenzic makes about itself hold +when verified against the actual state of the codebase, the documentation, and the +published site? + +The audit found: + +**Z404 engine gap.** The finding code existed. The implementation was Docusaurus-only. +MkDocs `theme.favicon` and `theme.logo` were not checked. Zensical `[project].favicon` +and `[project].logo` were not checked. The fix added `check_config_assets()` to both +`_mkdocs.py` and `_zensical.py`, and unified the CLI dispatch to cover all three +engines. Lab Acts 9 and 10 were added to demonstrate the fix. + +**i18n silent fallback.** Diagnosed via `.docusaurus/i18n.json` — the `translate: false` +flag confirmed the Italian locale was not being discovered. Fix: add `path: 'it'` +explicitly to `localeConfigs`. Documented as institutional memory in the README and +the FAQ (EN + IT). Added to the PR checklist. + +**Navbar badge drift.** The version bump script correctly updated the footer and package +metadata but missed the `>v{version}<` HTML badge in the navbar. `v0.6.2` persisted in +the navbar after the v0.7.0 bump. Fixed in the bump script. + +**Documentation structure drift (Diátaxis restructure).** The documentation portal +was restructured following the [Diátaxis framework](https://diataxis.fr/) — four +clear modes: Tutorials, How-To Guides, Reference, and Explanation. 
Every section +URL changed: `/docs/usage/` → `/docs/how-to/`, `/docs/guides/` → `/docs/how-to/`, +`/docs/internals/architecture-overview/` → `/docs/explanation/architecture/`. +Three links in `README.md` had been silently pointing at the old paths for several intensive sprints +behind a blanket `zenzic.dev` exclusion bypass. When the exclusion was removed +as part of the perimeter audit, Zenzic flagged all three immediately. The tool +caught its own documentation drift. + +**"True Stable" framing rejected.** Early drafts of the release notes used the phrase +"True Stable" to describe v0.7.0. The phrase was removed. Stability is not a +declaration — it is an ongoing epistemic posture. Calling a release "True Stable" +implies that previous releases were somehow dishonestly stable, and that future work +won't find gaps. Both implications are false. The correct framing: v0.7.0 is +**Quartz Maturity** — a point of consolidation, not an arrival. + +## What v0.7.0 Actually Is + +```text +zenzic check all +``` + +That command now works correctly against four engine types, with: + +| Capability | Status | +| :--- | :--- | +| MkDocs adapter | Z404 asset checking added | +| Docusaurus v3 adapter | Z404 (existing), versioning, @site/ alias, slug logic | +| Zensical adapter | Z404 asset checking added | +| Standalone Mode | Orphan detection disabled (no nav contract) | +| Shield bypass hardening | 4 vectors closed, 8-step normalization pipeline | +| i18n locale discovery | `path` config enforced; silent fallback documented | +| Lab (interactive showroom) | 17 Acts covering all engine types + Red/Blue Team Matrix | +| Universal PATH argument | All 6 `check` sub-commands + `init` (sovereign root semantics) | +| Sovereign root banner hint | Active scanning target printed after Sentinel header | +| Z502/Z105 precision | MDX frontmatter leak + `pathname:///` false positive eliminated | +| Core purity | `validator.py` — zero engine-name references (Purity Protocol) | +| Test suite | 
1,301 passing tests | +| Enterprise reporting | SARIF 2.1.0 output for GitHub Code Scanning | +| Runtime dependencies | 5 | +| Subprocess calls | 0 | + +The engine-agnostic claim is now verifiable, not aspirational. + +## The Diagnostic Standard: Zxxx Codes + +The `Zxxx` code scheme was introduced as a breaking change in v0.6.1 and is the +established standard in v0.7.0: every diagnostic emitted by Zenzic carries a +machine-readable identifier. No raw string codes. No silent findings. Every problem +is named, categorised, and traceable. + +| Range | Category | Examples | +| :--- | :--- | :--- | +| Z1xx | Link integrity | Z101 LINK_BROKEN, Z102 ANCHOR_MISSING, Z104 FILE_NOT_FOUND | +| Z2xx | Security | Z201 SHIELD_SECRET, Z202 PATH_TRAVERSAL | +| Z3xx | Reference integrity | Z301 DANGLING_REF, Z302 DEAD_DEF | +| Z4xx | Structure | Z401 MISSING_DIRECTORY_INDEX, Z402 ORPHAN_PAGE, Z404 CONFIG_ASSET_MISSING | +| Z5xx | Content quality | Z501 PLACEHOLDER, Z503 SNIPPET_ERROR | +| Z9xx | Engine / system | Z902 RULE_TIMEOUT | + +The registry lives in `src/zenzic/core/codes.py` — the single source of truth. +Adding a diagnostic without registering its code is a protocol violation. +This is what enterprise-grade means in practice: every finding is traceable, +filterable, and auditable across releases. + +## The Vanilla-to-Standalone Sunset + +This release also completes the `engine = "vanilla"` → `engine = "standalone"` migration. +The migration guard in `_factory.py` that raises `ConfigurationError [Z000]` with a +clear remediation message was originally annotated `# TODO: Remove this migration guard in v0.7.0`, +implying it was temporary scaffolding. That annotation was wrong. The guard is intentionally +permanent: `engine = "vanilla"` is removed and will never return, so the error message +that tells users exactly how to migrate is load-bearing, not transitional. +The TODO was removed. The guard stays. 
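
A permanent migration guard of the kind described above might look like the following sketch. It is purely illustrative: `ConfigurationError` here is a local stand-in class, and `resolve_engine` and the message wording are hypothetical rather than the contents of `_factory.py`.

```python
class ConfigurationError(Exception):
    """Local stand-in for the project's configuration error type (assumption)."""


def resolve_engine(name: str) -> str:
    """Reject the removed engine name with a message that explains the migration."""
    if name == "vanilla":
        # Intentionally permanent: the remediation hint is the whole point.
        raise ConfigurationError(
            '[Z000] engine = "vanilla" has been removed; '
            'set engine = "standalone" instead.'
        )
    return name


try:
    resolve_engine("vanilla")
except ConfigurationError as exc:
    message = str(exc)

print(message)
```

Because the guard carries the remediation text, removing it would turn a clear migration error into a generic unknown-engine failure.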
+ +"Standalone Mode" is the canonical identity for documentation projects without a +declared build system. It is not a fallback. It is not a default. It is a first-class +engine mode with its own adapter, its own Lab Act, and its own documentation. + +This release also closes the chapter on the MkDocs plugin dependency that preceded the +Headless Architecture. The CLI is now the **Sovereign authority**: `zenzic check all` +is the single, non-negotiable entry point regardless of what build system the +documentation uses. There is no plugin shim. There is no parallel execution path. +One command. One verdict. + +## The Institutional Memory Protocol + +One pattern that emerged clearly during the Quartz Mirror audit: the gap between +what the tooling enforces and what the team remembers is where the worst bugs live. + +The i18n silent fallback had been present for months. The navbar badge drift had +been present since the v0.7.0 bump. Neither was caught by the existing test suite +because neither was tested — they were assumed. + +The fix was not just to close the bugs. It was to encode the knowledge: + +- The i18n trap is now in the README troubleshooting section, the PR checklist, and the FAQ (English and Italian). +- The bump script gap is documented in the script itself and the checklist. +- The Structural Map (our deterministic engineering ledger) carries the full sprint history, including the root cause, the fix, and the lesson. + +A process that catches bugs is valuable. A process that prevents the same bug from +recurring is more valuable. The institutional memory protocol is how you get from the +first to the second. + +## The Parity Sprint: When Coverage IS the Audit + +Two days after the initial v0.7.0 write-up, a second forensic sprint surfaced failures +that the test suite had not seen — not because the tests were wrong, but because entire +modules were untested.
+ +**`cache.py` — zero coverage.** The content-addressable finding cache (284 lines) had no +test file. Pure hash functions, `CacheManager` roundtrips, atomic save, parent-directory +creation, corrupt-JSON fallback — all running in production, none exercised. +Twenty-nine tests were written. One boundary required a non-obvious fix: atomic write +failure is validated by patching `zenzic.core.cache.json.dump`, not `builtins.open`, +because `save()` uses `Path.open()`. The distinction is silent and the kind of thing +that lets a bad mutant survive indefinitely. + +**Mutation testing.** `mutmut` was run across `rules.py`, `shield.py`, and `reporter.py`. +The survivors diagnosed precisely what was missing. `_to_canonical_url` had no test +targeting its `rstrip("/")` normalisation, its backslash conversion, or the boolean +logic guarding context-aware `..`-resolution — a `and … or` mutation that would make +the guard fire on partial input. `_obfuscate_secret` had no boundary test at +`len(raw) == 8` (the `<= 8` threshold). New mutant-killing test classes were added +to make those mutations observable and fatal. + +**Cross-platform regression.** `resolve_asset()` in all three adapter modules used +`Path.exists()` for fallback path validation. On Windows (NTFS) and macOS (HFS+) this +returns `True` regardless of capitalisation — `Logo.png` passes when the file on disk +is `logo.png`. A new `case_sensitive_exists()` helper using `os.listdir()` enforces +exact case on every platform, fixing a CI regression on the Windows and macOS legs of +the cross-platform matrix. + +**CVE-2026-3219.** `pip 26.0.1` is affected by a polyglot archive vulnerability (no +patched release on PyPI). Zenzic uses `uv` for all package management and never invokes +pip programmatically — the exposure is zero. The `nox security` session was updated to +suppress the advisory with a documented removal reminder. 
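
The exact-case check from the cross-platform fix above can be sketched as follows. This is one possible shape, assuming the helper walks each path component and compares it against `os.listdir()` output; the body is a guess at the approach, not the project's code.

```python
import os
import tempfile
from pathlib import Path


def case_sensitive_exists(path: Path) -> bool:
    """Return True only if every path component matches a directory entry exactly."""
    absolute = Path(os.path.abspath(path))
    current = Path(absolute.anchor)
    for part in absolute.parts[1:]:
        try:
            entries = os.listdir(current)
        except OSError:
            return False  # parent is missing or unreadable
        if part not in entries:
            return False  # absent, or present only with different capitalisation
        current = current / part
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "logo.png").write_bytes(b"")
    exact = case_sensitive_exists(root / "logo.png")
    wrong_case = case_sensitive_exists(root / "Logo.png")

print(exact, wrong_case)
```

Comparing against directory listings gives the same answer on case-insensitive filesystems as on Linux, at the cost of one `listdir` call per path component. Paths that rely on aliases the listing does not show, such as Windows short names, would need resolving first.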
+ +The test count at the close of the parity sprint: **1,195 tests, all passing.** + +## The Precision Sprint: Eliminating False Positives + +After CLI symmetry was established, two false-positive bugs were closed. + +**BUG-012 — Z502 MDX Frontmatter Leak.** The placeholder detector (`Z502 PLACEHOLDER`) +was firing on React-style template expressions inside MDX files — `{children}`, +`{props.title}` — because the frontmatter extraction boundary was too permissive. The fix +constrained YAML extraction to terminate at the first `---` close marker, preventing MDX +body content from bleeding into the frontmatter scan. No legitimate placeholder finding was +affected. + +**BUG-013 — Z105 `pathname:///` False Positive.** Docusaurus uses `pathname:///` as a +pseudo-protocol for links that resolve at build time into absolute paths. Zenzic's absolute +link detector (`Z105 ABSOLUTE_LINK`) was flagging these as portability violations. The fix +adds `pathname:` to the recognised protocol allowlist (Rule R16: Protocol Awareness). Teams +migrating Docusaurus projects to Zenzic no longer see spurious Z105 warnings on +build-generated links. + +The pattern is significant: precision failures are more dangerous than false negatives. A +scanner that cries wolf trains engineers to suppress its output — and the one real finding +gets lost in the noise. Every false positive is a credibility tax on the tool that emits it. + +## Total CLI Symmetry: The Sovereign Root Protocol + +The original v0.7.0 release shipped with `check all` accepting a `PATH` argument. The other +six `check` sub-commands (`links`, `orphans`, `snippets`, `placeholders`, `assets`, +`references`) and `init` did not. The gap was architectural inconsistency — a tool that +claims uniform behaviour with a non-uniform interface. + +D060 closed the gap. 
Every filesystem-interacting CLI command now accepts an optional `PATH` +with full sovereign root semantics: + +```bash +zenzic check links ../other-project # config follows target, not CWD +zenzic check orphans content/ # sub-directory scope +zenzic check assets /abs/path/to/docs # absolute paths accepted +zenzic init /workspace/new-docs # Genesis Nomad: create and scaffold +``` + +The sovereign root protocol ensures configuration follows the target, not the caller. +Running `zenzic check links ../other-project` from your current working directory loads +`../other-project/zenzic.toml`, not your project's config. Context cannot be hijacked. + +D062 added visual confirmation — the resolved scanning target is printed immediately after +the Sentinel header when `PATH` is provided: + +```text + Scanning: ../other-project/docs +``` + +For `init` in Genesis Nomad mode: + +```text + Target: ../new-project +``` + +Operators now see exactly which root Zenzic has elected before the first result appears. + +## The Law of Contemporary Testimony + +In parallel with the code changes, a policy was codified: the **Law of Contemporary +Testimony**. + +> *Code and documentation are a single, indivisible unit of work. No sprint is closed if +> the documentation still reflects the previous state of the code.* + +The enforcement mechanism is the Zenzic Structural Map, a mandatory protocol that +ensures no sprint is closed unless the documentation is as mature as the code. If a wrapper +behavior changed, the architecture diagram must be updated. If a CLI flag was added, the +reference must be updated. If an exit code semantic changed, the policy table and the user +documentation must both be updated. + +This sounds like documentation discipline. It is actually a defect prevention system. 
The +navbar badge drift, the i18n silent fallback, the README links pointing at URL paths that no +longer exist — every one of these was a failure of the implicit contract between code author +and documentation author. When those roles are collapsed (as they are in any sufficiently +small team), the failure mode is: *I know this in my head, I'll update the docs later.* +Later compounds. The Law makes "later" unconstitutional. + +The test count at close of all v0.7.0 consolidation sprints: **1,301 passing tests.** + +## The Purity Protocol: Zero Engine Leaks in Core + +One invariant that emerged from the consolidation: `validator.py` — the heart of Zenzic — +must contain no reference to any engine by name. No "docusaurus", no "sidebar", no +"navbar". The Core receives a `frozenset[str]` of navigable paths from +`adapter.get_nav_paths()`. What lives inside that method is the adapter's problem, not +the Core's. + +This is not aesthetic. It is architectural insurance. A Core that knows about Docusaurus +is a Core that will accumulate Docusaurus special-cases. A Core that knows about MkDocs +nav-plugins will accumulate MkDocs exceptions. Every engine leak is a future maintenance +trap. + +The **Purity Protocol** (confirmed by `grep -n "docusaurus\|sidebar\|navbar\|footer" +validator.py` → 0 matches) is the enforcement gate. Adding a new adapter that modifies +`validator.py` is a protocol violation. The adapter contract is the boundary — everything +engine-specific must live behind it. + +Part 5 of this series explains what the Purity Protocol unlocks for the Docusaurus +adapter specifically — how `get_nav_paths()` became a Multi-Source Harvester covering +sidebar, navbar, and footer simultaneously. + +## The Safe Harbor Is Open + +v0.7.0 is a point of arrival. The documentation pipeline is verified, the Shield is +hardened, the adapters are parity-complete, and the Core is pure. 
Every claim Zenzic +makes about itself — engine-agnostic architecture, sovereign root semantics, zero +subprocesses — holds when checked against the actual codebase. + +The pipe is probably still leaking somewhere in your documentation — a broken link, +a credential fragment in a frontmatter field, a file invisible to every reader because +no clickable navigation surface references it. Find out before your users do. + +[Part 5 →](/blog/zenzic-v070-quartz-maturity-stable) covers what v0.7.0 unlocks +for UX-Discoverability and the full product vision. + +```bash +uvx zenzic lab +``` + +--- + +| | | +|---|---| +| **GitHub** | [github.com/PythonWoods/zenzic](https://github.com/PythonWoods/zenzic) | +| **Documentation** | [zenzic.dev](https://zenzic.dev/) | +| **PyPI** | [pypi.org/project/zenzic](https://pypi.org/project/zenzic/) | +| **Changelog** | [v0.7.0 Release Notes](https://github.com/PythonWoods/zenzic/releases/tag/v0.7.0) | + +:::note[The Zenzic Chronicles] +This is **Part 4** of a five-part engineering series documenting the path from v0.5 to v0.7.0 Stable. + +[Part 1 — The Sentinel](/blog/hardening-the-documentation-pipeline) · [Part 2 — Sentinel Bastion](/blog/docs-pipeline-security-risk-obsidian-bastion) · [Part 3 — The AI Siege](/blog/ai-driven-siege-shield-postmortem) · **Part 4 — Beyond the Siege** · [Part 5 — Quartz Maturity](/blog/zenzic-v070-quartz-maturity-stable) +::: + +*Part 4 of the **Zenzic Chronicles**. 
For the complete architectural journey, visit the [Safe Harbor Blog](https://zenzic.dev/blog/).* diff --git a/blog/2026-04-29-governance-of-quartz.mdx b/blog/2026-04-29-governance-of-quartz.mdx new file mode 100644 index 0000000..1a608b5 --- /dev/null +++ b/blog/2026-04-29-governance-of-quartz.mdx @@ -0,0 +1,506 @@ +--- +slug: governance-of-quartz +title: "The Governance of Quartz: Why Integrity Requires a Constitution" +authors: [pythonwoods] +date: 2026-04-29T19:15:00 +tags: [governance, sovereignty, engineering-chronicles, engineering] +sidebar_label: "🛡️ 006 - Saga VI: The Trinity" +--- + +> [![AI-Adversarial / Human-Governed](https://img.shields.io/badge/AI--Adversarial-Human--Governed-black?style=flat-square)](https://zenzic.dev/developers/governance/adversarial-ai) + +*Software does not die when its code grows old.* + +*It dies when the pact between the author and the user is broken — the implicit promise +that what works today will still make sense tomorrow, that the rules of the game will not +shift mid-journey, that the tool you trusted will not become the problem you need to escape.* + +*This is the sixth chronicle. The one we were always building toward.* + +{/* truncate */} + +--- + +## Part I — The Ghost of Broken Promises + +There is a ghost that haunts mature software projects. Engineers rarely name it, because +naming it requires admitting that the problem is not technical. The ghost is not a bug. +It is not a performance regression. It is not a security vulnerability. + +**The ghost is a broken promise.** + +The Zenzic Chronicles began with a broken promise. +[Saga I](/blog/hardening-the-documentation-pipeline) documented the credential that leaked +through a MkDocs pipeline — not because the build tool failed, but because the *integration +model* between Zenzic and MkDocs had created a hidden assumption: that Zenzic would run +*as part of the build*, and therefore its security guarantees were only valid when the build +executed cleanly. 
+ +This is a ghost assumption. It lives in the architecture. It survives refactors. It outlasts +the engineer who introduced it. And it only becomes visible when someone asks, at the wrong +moment: *"what exactly did we promise?"* + +### The Instability at the Foundation + +[Saga II](/blog/docs-pipeline-security-risk-obsidian-bastion) documented a second failure +mode: Docusaurus link resolution instability when the site was deployed in subdirectory +configurations. Links that worked locally failed in production. The build succeeded. The +Shield found nothing. The links were broken. + +Both failures share a structural cause: **the tool was integrated into the build system +instead of guarding the source before it.** When Zenzic runs *inside* the build, it inherits +all of the build system's assumptions. Its guarantees become conditional. "The analysis is +clean" becomes "the analysis is clean, given the following 17 implicit preconditions about +your deployment environment." + +That is not a guarantee. That is a disclaimer. + +### The Software Mortality Table + +Projects die in predictable ways. Not from catastrophic failure — from erosion. From the +accumulation of reasonable exceptions, each of which makes perfect local sense. + +| Stage | What Happens | Symptom | +| :--- | :--- | :--- | +| **Year 1** | "We'll make this one exception — it's urgent." | A single subprocess call sneaks in. | +| **Year 2** | "The exception is now load-bearing. We can't remove it." | The invariant no longer holds unconditionally. | +| **Year 3** | "The original design philosophy doesn't apply here anymore." | Architecture has silently changed without announcement. | +| **Year 4** | "We need a full rewrite to move forward." | The ghost has won. | + +The pattern is not ignorance. The engineers who make these decisions are intelligent. They +are making rational local optimizations. 
The problem is that **there is no system that forces +the global cost of each local exception to be visible before it is committed.** + +Governance is that system. + +### The Architecture of Trust + +Trust in a software tool is not built from documentation. It is built from **demonstrated +constraints**. A tool that could violate your expectations at any moment provides no +safety — only the appearance of it. + +The difference between a *policy* and a *law* is enforcement. Zenzic can write any policy +document it wants and ignore it entirely. But a constitutional process — one that requires a +major version bump, a 30-day public period, and documented adversarial validation before any +Pillar can change — is a constraint that costs something to violate. That cost is what +transforms a design principle into an architectural guarantee. + +> *"The ghost is not a bug in the code. It is a promise forgotten — made when the design was +> pure, abandoned when the pressure was real, invisible until the day the user discovers that +> what was promised and what was delivered have quietly diverged."* + +The Governance of Glass is the formal answer to the ghost. Not a process for slowing things +down. A process for ensuring that every exception, every evolution, every architectural change +is made with full awareness of its cost to the pact. + +--- + +## Part II — The Sovereignty Oath: Liberty as a Feature + +The most unusual document in Zenzic's Governance section is the +[Sovereignty Oath](/developers/governance/exit_strategy). + +It is a formal document explaining how to remove Zenzic from your project. + +We wrote it during the foundational design phase. We wrote it before v0.7.0 was stable. We wrote it +because we believe that a tool that cannot prove its own reversibility is asking you to trust +it on faith — and the Zenzic trust model is Zero-Trust, including toward Zenzic itself. 
+ +### The Zero Residue Guarantee + +> **Zenzic is the only dependency that swears to be invisible if you decide to remove it.** + +This is not marketing copy. It is a verifiable engineering claim. When you remove Zenzic from +a project, the following components remain **unchanged**: + +| What You Lose | What Remains | +| :--- | :--- | +| The CI integrity gate | Your source files — never mutated, not a single byte | +| The Shield credential scanner | Your application code — never imported at runtime | +| The Blood Sentinel path guardian | Your Python types — `typing.Protocol`, not inheritance chains | +| The VSM link validator | Your configuration — one TOML section or one file to delete | +| The SARIF reporting pipeline | Your CI — one workflow step to remove | + +**Total decommission time: 30 seconds.** No migration script. No data format to convert. +No architecture to dismantle. No vendor lock-in to escape. + +### Read-Only by Constitution + +The audit core of Zenzic is strictly read-only. This is not a current implementation detail +awaiting refactoring. It is a constitutional invariant of the Safe Harbor. + +```python +# The Zenzic analysis core: observation only +# Source files are opened in read mode. Always. Without exception +def analyze(file_path: Path, text: str) -> list[Finding]: + ... # Pure function. Same input → same output. No writes. No side effects. +``` + +Zenzic observes your documentation. It never mutates it. + +Any future remediation features — such as a `zenzic fix` command — will be implemented as +separate, explicit, interactive utilities that the user invokes deliberately. The analysis +phase will remain 100% mutation-free. + +This distinction matters. A linter that quietly "fixes" files during analysis has crossed +from *observer* to *actor*. The moment a tool modifies your sources without explicit +instruction, it becomes the source of unintended mutations — the very class of failure the +Safe Harbor was designed to prevent. 
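The read-only claim can also be checked from the outside, by hashing the source tree before and after a scan. The sketch below is illustrative only: `snapshot` and `assert_read_only` are hypothetical helpers, not part of Zenzic, and `run_scan` stands in for whatever callable invokes the analysis.

```python
import hashlib
from pathlib import Path
from typing import Callable


def snapshot(root: Path) -> dict[str, str]:
    """Return a SHA-256 digest for every file under root, keyed by relative path."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def assert_read_only(root: Path, run_scan: Callable[[Path], object]) -> None:
    """Run a scan and fail if any file under root was added, removed, or changed."""
    before = snapshot(root)
    run_scan(root)  # hypothetical entry point; any callable taking the root works
    after = snapshot(root)
    if before != after:
        # Symmetric difference lists every (path, digest) pair that differs.
        changed = sorted(set(before.items()) ^ set(after.items()))
        raise AssertionError(f"scan mutated the source tree: {changed}")
```

A check like this would fit in a CI job next to the scan itself: if a future release ever wrote to a source file during analysis, the digests would diverge and the job would fail.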
+ +### The Structural Subtyping Guarantee + +Zenzic's adapter system uses +[`typing.Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol) — the +structural subtyping mechanism from the Python standard library. + +```python +# Zenzic adapter contract — structural, not nominal +class AdapterProtocol(Protocol): + def get_docs_root(self) -> Path: ... + def get_nav_paths(self) -> frozenset[str]: ... + def get_metadata_files(self) -> frozenset[str]: ... +``` + +What this means in practice: your code never inherits from a Zenzic base class. There is no +`ZenzicAdapter` in your class hierarchy. When you remove Zenzic from your project, your +Python types are structurally identical to what they were before. The adapter contract is a +*lens* through which Zenzic reads your project — not a chain that binds your types to its +release cycle. + +### Why We Wrote the Exit Strategy First + +A tool that makes leaving difficult does not have confidence in its value. It is protecting +its own presence. The conventional wisdom in developer tooling is that *switching costs* are +a moat — friction that keeps users from choosing a competitor. + +Zenzic inverts this model. The Zero Residue guarantee is not a concession to user demands. +It is a **design principle**: we believe that tools should be adopted on merit, not retained +by friction. + +If Zenzic stops providing value — if a better tool emerges, if your documentation stack +changes, if your security requirements evolve beyond what Zenzic can deliver — your exit +should cost 30 seconds of your life, not 30 days of migration work. + +The pact we make with every user is simple and non-negotiable: **the Sentinel exists to +protect your documentation, not to protect itself.** + +--- + +## Part III — The Adversarial Forge: AI as a Skeptic + +The `AI-Adversarial / Human-Governed` badge is a declaration. Let us be precise about what +it declares. 
+ +> [![AI-Adversarial / Human-Governed](https://img.shields.io/badge/AI--Adversarial-Human--Governed-black?style=flat-square)](https://zenzic.dev/developers/governance/adversarial-ai) + +It does **not** declare that Zenzic was written with AI assistance. It does not declare that +AI was used to accelerate development. It declares something more specific — and more +demanding. + +**Zenzic was stress-tested by AI acting as a public prosecutor. Every line of code is a +defendant.** + +### The Magistrate Model + +In the Adversarial Forge, AI is assigned the role of a magistrate tasked with building a +case for the prosecution. The charge is always the same: *"Violation of the Three Pillars."* + +The AI's job is not to suggest code. Its job is to find a path through the existing +architecture that bypasses a constitutional invariant. A working bypass is not a feature +request — it is a finding. It is treated with the same urgency as a CVE. + +The human's job is not to accept the AI's suggestions. It is to evaluate each finding: Is +this a real vulnerability? Does it expose a genuine architectural weakness? If yes, fix it +in the same sprint and document it in the Zenzic Ledger. If not, record the unsuccessful +attack vector as evidence that the invariant held under adversarial pressure. + +**The AI proposes. The AI attacks. The AI does not ratify.** + +When an AI session produces a suggestion to relax a Pillar — *"this rule would be much +simpler to implement with a subprocess call"* — that suggestion is not evaluated on its +technical merits in isolation. It is flagged as evidence that the Pillar is under pressure. +The response is to harden the invariant documentation, not to accept the suggestion. 
+ +### The Four Sessions of the Forge + +Every architectural decision in Zenzic has been subjected to one or more of four adversarial +session types: + +| Session | Target | What "Success" Means for the AI | +| :--- | :--- | :--- | +| **Type A — Architecture Hunt** | Any `[INVARIANT]` in the Zenzic Ledger | A real code path that violates the declared invariant | +| **Type B — ReDoS Canary** | The `AdaptiveRuleEngine` regex acceptance | A user-provided pattern with catastrophic backtracking on >1 KiB input | +| **Type C — Shield Bypass** | The 8-stage normalization pipeline | A Markdown fragment containing a real credential that passes all 8 stages undetected | +| **Type D — Blood Sentinel Escape** | `InMemoryPathResolver._build_target()` | A path string that resolves outside `docs_root` without containing literal `../` | + +Every "success" for the AI is a failure for Zenzic — a finding that gets fixed before the +next release. Every unsuccessful session is documented as evidence that the defense held +under real adversarial pressure. The Zenzic Ledger records both outcomes with equal rigor. + +### Quartz Clarity: Pure Under Pressure + +The metaphor of "Quartz Clarity" is not decorative. Quartz forms under geological +pressure. Its crystalline structure is the result of having been tested at its limits. + +The eight normalization stages of the Shield — Unicode normalization, HTML entity decoding, +invisible character stripping, base64 fragment detection, URL encoding expansion, homoglyph +substitution, case folding, whitespace collapse — did not emerge from architectural planning +alone. They emerged from successive Type C sessions in which an AI was tasked with +constructing credential-containing Markdown fragments that could bypass detection at each stage. + +Stage 1 stopped naive plaintext attacks. Stage 3 stopped Unicode codepoint tricks.
Stage 6 +was added after an AI constructed a homoglyph-substituted token that survived stages 1 +through 5. Stage 8 was added after a whitespace collapse bypass was found that survived +stages 1 through 7. + +> *"The Shield has eight stages because eight attacks were found and survived. Every stage is +> the crystallized memory of a bypass attempt that reached production-ready code before being +> caught by the Forge."* + +This is Quartz Clarity: not a design reviewed in theory, but a structure tested under +the actual force of adversarial intelligence. Its clarity comes from its history of pressure. + +### What AI Does Not Decide + +The AI is a Red Team, not a co-architect. It operates within strict boundaries: + +| Decision | Authority | Why Not AI | +| :--- | :--- | :--- | +| The Three Pillars (architecture) | Human — non-delegable | Pillars are value judgments, not optimization problems | +| The Zxxx finding code semantics | Human — ratified in `core/codes.py` | Diagnostic contracts affect every user's CI pipeline | +| The exit code contract (0/1/2/3) | Human — immutable | Security guarantees cannot be probabilistic | +| Sprint scope and release schedule | Human | Trade-offs require contextual judgment | +| Whether an AI finding is a real vulnerability | Human Integrity Guardian | The prosecutor does not convict; the judge does | + +--- + +## Part IV — The Constitutional Invariants + +After six Chronicles, after thousands of test cases, after dozens of adversarial sessions, +the Three Pillars have been promoted. They are no longer *design choices*. They are +**Constitutional Law**. + +This is not rhetorical elevation. It has a precise engineering meaning: a process exists +that makes violating them more expensive than defending them. + +### The Three Articles of the Safe Harbor + +| Article | Invariant | Protected Guarantee | +| :--- | :--- | :--- | +| **I — Lint the Source** | Analysis operates on raw Markdown and configuration files. 
Never on HTML output or compiled artifacts. | Pre-build integrity. Zenzic fires before your pipeline, not inside it. No build system preconditions. | +| **II — Zero Subprocesses** | 100% pure Python. No `subprocess.run`, no `os.system`, no external process of any kind. | Zero-Trust execution. No process Zenzic cannot audit. No external dependency Zenzic cannot enumerate. | +| **III — Pure Functions First** | Analysis logic is deterministic. The same input always produces the same findings, in the same order, with the same line numbers. | Reproducibility. A finding is not an observation — it is a reproducible fact that can be verified independently. | + +### The Constitutional Amendment Process + +A change that violates any of these articles — even temporarily, even for a genuinely good +engineering reason, even under deadline pressure — is not a bug fix or a feature. It is a +**constitutional amendment**. And constitutional amendments in Zenzic require: + +1. **A Major version increment** — e.g., v0.7.0 → v1.0.0. Users who depend on the current + Pillar semantics remain on the current major version. The change cannot sneak into a minor + or patch release. + +2. **A 30-day public impact period** — announced in a public issue before any code is written. + The period exists so that enterprise users can evaluate the impact on their pipelines, not + to create bureaucratic delay. + +3. **A formal Architectural Decision Record (ADR)** — added to the Zenzic Ledger with: the + text of the invariant being modified, the proposed replacement, the rationale, and a full + cost analysis covering migration burden and trust model impact. + +4. **A Type A Adversarial AI session** — targeting the proposed replacement architecture. The + AI must attempt to find Pillar violations in the new design before it is ratified. A + replacement architecture that cannot survive a single adversarial session does not replace + a constitutional article. + +5.
**Consensus of 2/3 of Core Maintainers** — not a simple majority. Constitutional changes + require supermajority ratification. + +### The Evolution Policy: No Surprises at Scale + +The [Evolution Policy](/developers/governance/evolution_policy) exists to answer one +question that every engineering team eventually asks when adopting an external tool: + +> *"Will the rules of this tool change in a way that breaks our pipeline without warning?"* + +For most tools, the honest answer is: *"Probably. Check the changelog."* + +For Zenzic, the answer is: *"Not without telling you 30 days in advance, not without a major +version bump, and not without an adversarial session that proves the replacement architecture +can hold under pressure."* + +This is the enterprise guarantee. Not a feature list. A constitutional process that ensures +the Safe Harbor's fundamental rules do not change mid-journey. + +### What Can Evolve Without Amendment + +Not everything in Zenzic requires a constitutional process to change. The Evolution Policy +distinguishes between two tracks: + +**Lightweight Track (Operational Standards):** + +Quality gate thresholds, finding code messages (not semantic scope), CLI flag defaults, +output format improvements, new adapters, new `Zxxx` finding codes in unused ranges — these +evolve on a 72-hour discussion window with a maintainer merge if no blocking objection is +raised. The Zenzic Ledger is updated in the same commit. + +**Constitutional Track (Pillar-Level):** + +Anything that changes the *meaning* of a Pillar — what it protects, what it permits, what +it prohibits. These require the full five-step process described above. + +### The Convenience Prohibition + +The Evolution Policy contains one section that deserves explicit attention: the list of +arguments that are **formally invalid** as rationales for a Pillar amendment.
+ +- *"It would be much easier to write this rule with a subprocess call."* +- *"The AI suggested a simpler architecture that relaxes Pillar III."* +- *"This is a temporary exception — we'll remove it after the deadline."* +- *"The test coverage makes this safe even without the invariant."* +- *"No user has complained about this Pillar being too strict."* + +None of these arguments are evaluated on their engineering merits. They are rejected +*because they are convenience arguments* — and convenience is precisely the force that the +Three Pillars were designed to resist. + +> *"The Pillars are not obstacles to good engineering. They ARE good engineering, expressed +> as constitutional constraints. The moment they become negotiable for convenience, they cease +> to protect anything — including the users who trusted them."* + +--- + +## Part V — The Safe Harbor is Permanent + +With the publication of this Governance section, Zenzic v0.7.0 crosses a threshold that has +nothing to do with features, benchmark numbers, or code coverage percentages. + +It has become a **documented institution**. + +### The First Cornerstone + +Version 0.7.0 is not a destination. It is the laying of the first cornerstone of a structure +designed to stand for decades. The code will evolve. New adapters will be added. New finding +codes will be discovered and registered. The CLI will gain new commands. The Shield's +normalization stages may be extended by future adversarial sessions that find bypass vectors +we have not yet imagined. + +None of this threatens the Safe Harbor. The Three Pillars will hold. The Sovereignty Oath +will remain in force. The AI adversarial sessions will continue. The Zenzic Ledger will +record every decision, every exception, every failure. 
+ +This is the promise of the constitutional layer: **the rules of the game are public, formal, +and non-negotiable at the foundational level.** Everything built on top of those foundations +can evolve freely — because the foundations themselves are anchored. + +### The Pact with the Community + +There is a temptation, in open-source governance, to ask users to trust the maintainers. +*"Trust us, we have good intentions. Trust us, we take security seriously. Trust us, we won't +break your pipeline."* + +We reject this framing entirely. + +**Do not trust us. Trust the system.** + +The [Governance section](/developers/governance/) is not a statement of our intentions. +It is a legal code — a set of invariants and processes that constrain what we can do, even +if our intentions were to change. The constitutional amendment process does not require our +goodwill to function. It requires a public vote, a 30-day notice period, and documented +adversarial validation. These requirements exist regardless of who the maintainers are, what +their intentions are, or what pressures they face. + +If we ever attempt to modify a Pillar without following that process — file an issue. You +will be correct. The Governance system will have been violated. And the community's response +to that violation is the final layer of protection the Governance of Glass provides. + +### The Sentinel Seal: Six Chronicles, One Pact + +The Zenzic Chronicles are sealed. + +Six chapters. From the leaking credential in a MkDocs integration to the constitutional +layer of a governance-complete open-source project. From a single Shield rule to eight +normalization stages tested by adversarial AI. From an integration plugin that blurred the +line between "analysis" and "build" to a Sovereign CLI that analyzes any documentation +source without depending on its build system. 
+ +| Chapter | Saga | Theme | +| :---: | :--- | :--- | +| I | [The Leaking Pipe](/blog/hardening-the-documentation-pipeline) | The credential that exposed the integration flaw | +| II | [Headless Architecture](/blog/docs-pipeline-security-risk-obsidian-bastion) | Building the headless, pre-build analysis model | +| III | [The AI Siege](/blog/ai-driven-siege-shield-postmortem) | Exhaustive adversarial loops: thousands of bypass attempts, eight Shield stages forged | +| IV | [The Sovereign Root](/blog/beyond-the-siege-zenzic-v070-quartz) | Architectural sovereignty: source, not build | +| V | [Quartz Maturity](/blog/zenzic-v070-quartz-maturity-stable) | v0.7.0 stable: 1,485+ tests, 80% coverage | +| **VI** | **The Governance of Glass** | The constitutional layer. The pact that endures. | + +The Chronicles are a record, not a roadmap. The next chapters of Zenzic's story will be +written by the engineers who adopt it, the vulnerabilities that future adversarial sessions +will find, the community that will eventually file the first formal RFC under the Evolution +Policy, and the enterprise teams who will discover that a 30-second decommission is a +feature they never thought to ask for. + +> *"The Safe Harbor is permanent not because it cannot change, but because the process for +> changing it is more demanding than the pressure to change it casually. That is the only +> kind of permanence that engineering can honestly offer."* + +We are ready for the next chapter. + +### The Glass Constitution + +"Governance of Glass" is a deliberate metaphor. + +Glass is not weak. It is **transparent**. You can see through it. You can verify that what +is inside matches what is promised outside. It does not hide its structure behind opacity. +When glass breaks, the break is visible — you know exactly where it failed, how it failed, +and what force was required to break it. + +The Governance documents are glass walls around the Three Pillars. 
Transparent, verifiable, +and brittle under the right kind of force — and that brittleness is the point. A constitution +that bends to every reasonable argument is not a constitution. It is a suggestion with +aspirational language. + +Zenzic's constitution breaks rather than bends. If a Pillar is ever violated without following +the constitutional amendment process, the failure is immediately visible: the Zenzic Ledger +does not record it, the major version bump did not happen, the 30-day public period did not +occur. The violation is auditable from the git history. There are no hidden exceptions. + +> *"Quartz forms under geological pressure — atoms arranged with mathematical precision into a material +> so sharp it was used as a surgical tool for thousands of years. Its clarity is the product +> of its history. Zenzic aims to be the same: clear enough to cut through ambiguity, hard +> enough to maintain its edge under sustained pressure."* + +--- + +## The Legal Code of the Sentinel + +The complete constitutional layer is documented at: + +**[Governance & Sovereignty →](/developers/governance/)** + +| Document | What It Governs | +| :--- | :--- | +| [Overview](/developers/governance/) | The Three Pillars as Supreme Law. The engineering contract that protects them. | +| [Adversarial AI Model](/developers/governance/adversarial_ai) | The Red Team protocol. Session types A/B/C/D. What the AI cannot decide. | +| [The Sovereignty Oath](/developers/governance/exit_strategy) | Zero Residue. Read-only core. The 30-second decommission. | +| [Evolution Policy](/developers/governance/evolution_policy) | The constitutional amendment process. The Convenience Prohibition. The enterprise guarantee. | +| [License Compliance](/developers/governance/licensing) | Apache-2.0 + REUSE 3.3. Every file carries the cryptographic signature of its license. | + +> *"The code is the machine. 
The governance is the conscience of the machine.* +> *One without the other is power without accountability."* + +--- + +:::info[🛡️ The Zenzic Chronicles — Complete] + +The complete six-part engineering saga of Zenzic's journey from v0.5 Sentinel to v0.7.0 Quartz Maturity. The Chronicles are sealed. + +[Saga I](/blog/hardening-the-documentation-pipeline) | [Saga II](/blog/docs-pipeline-security-risk-obsidian-bastion) | [Saga III](/blog/ai-driven-siege-shield-postmortem) | [Saga IV](/blog/beyond-the-siege-zenzic-v070-quartz) | [Saga V](/blog/zenzic-v070-quartz-maturity-stable) | **Saga VI** + +::: diff --git a/blog/2026-04-29-obsidian-masterclass.mdx b/blog/2026-04-29-obsidian-masterclass.mdx new file mode 100644 index 0000000..51c3a96 --- /dev/null +++ b/blog/2026-04-29-obsidian-masterclass.mdx @@ -0,0 +1,1446 @@ +--- +slug: obsidian-masterclass +title: "Sentinel Guard: The Engineering of Documentation Integrity and Security" +sidebar_label: "⚡ 001 - Masterclass: Quartz Guard" +authors: [pythonwoods] +tags: [engineering, security, tutorial, obsidian-maturity] +date: 2026-04-29T19:20:00 +description: > + A forensic deep-dive into Zenzic's architecture: the VSM, the Shield's 8 + normalization stages, the Blood Sentinel, and how to build a Zero-Trust + documentation pipeline in 2026. +image: https://zenzic.dev/assets/social/social-card.png +--- + +Every build engine you have ever used has made you a silent promise: *"Trust me — if +this builds, your documentation is correct."* + +That promise is architecturally broken. And your documentation is paying the price. + +{/* truncate */} + +--- + +## Act I — The Thesis of Untrusted Input + +### The Problem with Build Engines + +Your documentation pipeline looks something like this: + +```text +Author → Markdown file → Build Engine → HTML → CDN → Reader + ↑ + "Trust me, I'll build it." +``` + +Docusaurus, MkDocs, Zensical — these are **generators**. 
Their contract with you is +explicit: *"Give me source files; I will produce a static site."* They are optimized +for speed, for plugin extensibility, for theming. They are not optimized for +validation. They trust the source files you give them. + +That trust is the vulnerability. + +### What "Untrusted Input" Means in 2026 + +The threat model for documentation has changed. In 2026, documentation sources come +from: + +- **Human contributors** via pull requests — who may not know your link structure +- **AI-generated content** — which plausibly invents URLs that sound real +- **Automated refactors** — which move files without updating cross-references +- **External integrations** — which inject content that may carry credential fragments + +In a monorepo shared across teams, a single contributor committing a file that +references a non-existent anchor, exposes an internal API key, or introduces a +circular link dependency can silently corrupt the documentation of ten product areas +simultaneously. The build engine will not catch it. The build engine does not try. + +Zenzic's core design thesis, established in [ADR-001](/blog/hardening-the-documentation-pipeline), +is: **treat every Markdown file as untrusted input**. + +This is not a feature. It is a trust model. Every design decision in Zenzic derives +from it. + +### The Zero-Trust Documentation Pipeline + +```text +Author → Markdown file → Zenzic (Sentinel) → Build Engine → HTML → CDN → Reader + ↑ + "I trust nothing. I verify everything." +``` + +The Sentinel runs before the build. It does not trust the source. It constructs a +complete in-memory model of your site — the **Virtual Site Map** — and validates +every claim in your source files against that model. If the model says a link is +broken, the CI gate fails. The build engine never sees the broken file. + +The fundamental difference is not in features. It is in philosophy. Build engines are +designed to succeed. 
Zenzic is designed to find where they would have failed. + +### Supply Chain Attacks via Documentation + +Consider a real scenario: your documentation site is a monorepo including +third-party-contributed guides. One contributor submits a tutorial that includes: + +````markdown +## Setup + +Configure your environment with your API key: + +```yaml +api_key: "sk_live_" +``` +```` + +The Stripe live key — `sk_live_*` — is a real credential format. Zenzic's Shield +catches it before the commit merges. The build engine builds it without comment. + +This is the **supply chain attack surface in documentation**: external content that +carries secrets, adversarial links, or path traversal payloads, combined with a build +pipeline that trusts its input entirely. + +Zenzic's response to this is the three-layer security architecture: the **Shield** +(credentials), the **Blood Sentinel** (paths), and the **Structural Validator** (graph +integrity). Each layer is independent. Each layer runs on every scan. None of them +can be disabled simultaneously. + +### The Three Pillars (Non-Negotiable) + +These invariants are in Zenzic's design contract and cannot be overridden by +configuration: + +**Pillar 1 — Lint the Source, Not the Build.** Analysis operates on raw Markdown and +configuration files. Never on HTML output. Errors are caught before the build starts. + +**Pillar 2 — Zero Subprocesses.** 100% pure Python. No `subprocess`, no `os.system`, +no Node.js execution. This guarantees: reproducible results across platforms, +zero dependency on the build environment, and total portability. + +**Pillar 3 — Pure Functions First.** Analysis logic is deterministic. I/O is isolated +at the edges (discovery and reporting). No I/O in hot-path loops. + +--- + +## Act II — The VSM Engine: A Mental Map of Your Site + +### What the Virtual Site Map Is + +The Virtual Site Map (VSM) is Zenzic's central data structure. 
It is a complete +in-memory projection of your documentation site as a routing table: a mapping from +every canonical URL that your build engine would generate to a `Route` object. + +The `Route` object carries: + +```python +@dataclass(frozen=True, slots=True) +class Route: + url: str # canonical URL: "/docs/guide/install/" + file_path: Path # absolute path on disk: /repo/docs/guide/install.mdx + status: RouteStatus + anchors: frozenset[str] + is_proxy: bool + version: str | None +``` + +The `RouteStatus` can be one of four values: + +| Status | Meaning | +| :--- | :--- | +| `REACHABLE` | File is navigable via at least one user-clickable surface | +| `ORPHAN_BUT_EXISTING` | File exists on disk but no navigation surface links to it | +| `IGNORED` | System file (e.g., `_category_.json`) — not a content page | +| `CONFLICT` | Two files produce the same canonical URL — build collision | + +### Why the VSM Is Necessary + +Without the VSM, Zenzic could only answer: *"does this file exist on disk?"* That +question is easy. `pathlib.Path.exists()` answers it in a single syscall. + +The VSM enables Zenzic to answer a harder question: *"would this link resolve in the +rendered site, given how your specific build engine maps source files to URLs?"* + +Those are completely different questions. + +Consider Docusaurus: a file at `docs/guide/index.mdx` is served at `/docs/guide/`, +not at `/docs/guide/index/`. A link to `/docs/guide/index.html` would resolve to +nothing in the browser — even though the file exists on disk. + +Consider MkDocs: a file at `docs/api.md` with `nav:` entry `- API: api.md` is +reachable. The same file without a `nav:` entry is potentially an orphan depending +on `nav:` configuration. + +The VSM encodes these engine-specific routing rules in pure Python, without running +the build engine. 
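The gap between "exists on disk" and "resolves in the rendered site" can be sketched with a hand-built routing table. This is an illustration only, not Zenzic's implementation: `MiniRoute`, `resolves`, and the sample entries are hypothetical names introduced here, and the table is written by hand rather than derived by an adapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniRoute:
    """Hypothetical, trimmed-down stand-in for the Route shown above."""
    url: str      # canonical URL the engine would serve
    source: str   # source file, relative to the repository root


# Hand-written table of what a Docusaurus-style engine would serve.
VSM: dict[str, MiniRoute] = {
    "/docs/guide/": MiniRoute("/docs/guide/", "docs/guide/index.mdx"),
    "/docs/guide/install/": MiniRoute("/docs/guide/install/", "docs/guide/install.mdx"),
}


def resolves(vsm: dict[str, MiniRoute], url: str) -> bool:
    """True when the rendered site would serve `url` (one hash lookup)."""
    return vsm.get(url) is not None
```

With this table, `resolves(VSM, "/docs/guide/")` holds, while `/docs/guide/index/` and `/docs/guide/index.html` do not resolve even though `docs/guide/index.mdx` exists on disk.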
+ +### Building the VSM: The Architecture + +```text + ┌─────────────────────────────┐ + │ build_vsm() │ + │ (I/O boundary — called │ + │ once per scan) │ + └──────────┬──────────────────┘ + │ + ┌──────────────▼──────────────────┐ + │ Adapter.get_route_info() │ + │ (engine-specific, per-file) │ + └──────────────┬──────────────────┘ + │ + ┌────────────────────▼────────────────────────┐ + │ │ + ┌──────────▼─────────┐ ┌────────────────┐ ┌──────────▼──────────┐ + │ map_url(rel) │ │ classify_route │ │ get_nav_paths() │ + │ (canonical URL) │ │ (reachability)│ │ (sidebar+nav+footer)│ + └────────────────────┘ └────────────────┘ └─────────────────────┘ +``` + +The `build_vsm()` function is the only I/O boundary — it iterates over every Markdown +file in `docs_root` exactly once. All adapter calls are pure functions after that +initial read. No file is touched again during link validation. + +### O(1) Link Validation + +The VSM is a Python `dict[str, Route]` — a hash map keyed by canonical URL. + +When the validator needs to check whether a link target exists, it calls: + +```python +route = vsm.get(canonical_url) # dict.get() — O(1) hash lookup +if route is None: + # Z104: FILE_NOT_FOUND +``` + +This means validating 10,000 links against a 10,000-page site is not O(N²) — it is +10,000 independent O(1) lookups. The VSM is built once (O(N)) and then queried +indefinitely at O(1) per link. + +Compare this to naive implementations that call `Path.exists()` for every link target: +that is N×M syscalls for N links across M files, where each `stat()` call crosses the +user-kernel boundary. At 50,000 links across a large documentation site, the +difference between O(1) hash lookups and O(N×M) syscalls is the difference between +a 3-second scan and a 90-second scan. + +### The Anchor Cache + +In addition to the URL map, the VSM builder constructs an **anchor cache**: a mapping +from file path to the set of heading slugs that file declares. 
+ +```python +anchors_cache: dict[Path, set[str]] = { + Path("/repo/docs/guide/install.mdx"): { + "prerequisites", "installation", "next-steps" + }, + ... +} +``` + +When a link contains a fragment (`/docs/guide/install/#next-steps`), Zenzic: + +1. Resolves the URL to a file path via the VSM (O(1)) +2. Checks the fragment against `anchors_cache[file_path]` (O(1) set lookup) + +A broken anchor (Z102) is detected with two hash lookups. Zero I/O. Zero subprocess +calls. + +### Ghost Routes: i18n and Versioning + +The VSM handles cases where a URL exists but no physical file produces it — what +Zenzic calls **Ghost Routes**. Two categories: + +**i18n Ghost Routes:** Docusaurus generates locale-specific index pages (e.g., +`/it/docs/`) automatically, even when no physical `it/index.mdx` exists. The VSM +marks these as `is_proxy=True` and `status=REACHABLE`, because the build engine will +generate them. + +**Versioned Routes:** Zenzic's Docusaurus adapter uses an internal `_version_` sentinel +prefix to track versioned documentation trees. A file at +`docs/versioned_docs/version-0.6/guide.md` is indexed as `_version_/0.6/guide.md` +in the VSM and served at `/docs/0.6/guide/` — transparent to the validator. + +In both cases, the VSM answer is correct: the URL is reachable for a reader. No +physical file required. + +### Collision Detection + +Two source files can produce the same canonical URL — this is a build-time error in +Docusaurus and MkDocs. The VSM detects this during construction: + +```python +def _detect_collisions(routes: list[Route]) -> None: + seen: dict[str, Route] = {} + for route in routes: + if route.url in seen: + route.status = "CONFLICT" + seen[route.url].status = "CONFLICT" + else: + seen[route.url] = route +``` + +A `CONFLICT` route surfaces as a Zenzic finding before the build runs, preventing the +silent data loss that occurs when two files compete for the same URL. 
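One detail is worth spelling out: `Route` is declared with `frozen=True` earlier in this post, and assigning to an attribute of a frozen dataclass instance raises `dataclasses.FrozenInstanceError`. The in-place snippet above therefore reads best as pseudocode. A minimal sketch of the same pass that returns new objects instead, assuming a hypothetical `MiniRoute` type and plain string statuses:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MiniRoute:
    """Hypothetical stand-in for Route; status is a plain string here."""
    url: str
    source: str
    status: str = "REACHABLE"


def mark_collisions(routes: list[MiniRoute]) -> list[MiniRoute]:
    """Return new routes; every URL claimed by 2+ sources becomes CONFLICT."""
    counts: dict[str, int] = {}
    for route in routes:
        counts[route.url] = counts.get(route.url, 0) + 1
    return [
        # dataclasses.replace() builds a copy, so frozen instances stay untouched.
        replace(route, status="CONFLICT") if counts[route.url] > 1 else route
        for route in routes
    ]
```

Counting first and rewriting second keeps the pass at two linear sweeps and marks every participant in a collision, including the first file seen.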
+ +--- + +## Act III — The Shield: 8 Stages of Truth + +### The Problem with Naive Secret Detection + +A naive credential scanner applies regex patterns line by line: + +```python +if re.search(r"AKIA[0-9A-Z]{16}", line): + flag_secret() +``` + +This works when the secret is written plainly. In documentation, secrets are rarely +written plainly. They appear in: + +- **Markdown tables**: `| Key | `` `AKIA` `` | `` `1234567890ABCDEF` `` |` +- **Concatenated strings**: `` `AKIA` `` + `` `1234ABCD5678EFGH` `` +- **HTML-entity encoded values**: `&#65;KIA1234567890ABCDEF` +- **Unicode-obfuscated text**: `A\u200bK\u200bI\u200bA1234567890ABCDEF` (zero-width spaces) +- **Comment-interleaved tokens**: `ghp_ABC{/* comment */}DEF` +- **Cross-line YAML scalars**: key split across two lines by a folded block + +Zenzic's Shield is designed to defeat all of these patterns. It does so through a +**normalization pipeline** applied before regex matching. + +### The 8 Stages of Normalization + +The `_normalize_line_for_shield()` function applies these transformations in strict +order: + +#### Stage 1 — Unicode Format Character Stripping (ZRT-006) + +```python +normalized = "".join(c for c in line if unicodedata.category(c) != "Cf") +``` + +Unicode category `Cf` ("Format, other") includes invisible characters: zero-width +joiners (U+200D), zero-width non-joiners (U+200C), zero-width spaces (U+200B), and +word joiners (U+2060). An adversarial author can insert these between characters of a +secret key — the characters are visually invisible and collapse when copy-pasted, but +a naive regex will not match the fragmented token. + +Stage 1 strips them entirely, reconstructing the original token. + +#### Stage 2 — HTML Character Reference Decoding (ZRT-006) + +```python +normalized = html.unescape(normalized) +``` + +HTML character references (`&#65;`, `&#x41;`, `&amp;`) can encode any ASCII character. 
+A key like `AKIA1234567890ABCD` can be written as `&#65;KIA1234567890ABCD` in inline HTML within a Markdown file — and will render correctly in the browser while evading naive scanners. + +`html.unescape()` from the Python standard library handles all forms: decimal (`&#NNN;`), +hexadecimal (`&#xHH;`), and named references (`&amp;`). + +#### Stage 3 — HTML Comment Stripping (ZRT-007) + +```python +_HTML_COMMENT_RE = re.compile(r"<!--.*?-->") +normalized = _HTML_COMMENT_RE.sub("", normalized) +``` + +HTML comments can interleave token fragments: `ghp_ABC<!-- comment -->DEF`. After the +build, the comment is invisible. In the source, it splits the token. Stage 3 removes +the comment, joining `ghp_ABC` and `DEF` into `ghp_ABCDEF`, which is then matched by +the GitHub token pattern on a subsequent pass. + +#### Stage 4 — MDX Comment Stripping (ZRT-007) + +```python +_MDX_COMMENT_RE = re.compile(r"\{/\*.*?\*/\}") +normalized = _MDX_COMMENT_RE.sub("", normalized) +``` + +MDX files use JSX-style comments: `{/* ... */}`. The same interleaving attack applies. +Stage 4 handles the MDX-specific variant independently. + +#### Stage 5 — Backtick Code Span Unwrapping (ZRT-003) + +```python +_BACKTICK_INLINE_RE = re.compile(r"`([^`]*)`") +normalized = _BACKTICK_INLINE_RE.sub(r"\1", normalized) +``` + +Documentation authors frequently write tokens inside inline code spans for visual +formatting: `` `AKIA` ``. The backticks are presentation — they do not change the +semantics of the content. Stage 5 strips them, exposing the raw token to the regex +patterns. + +#### Stage 6 — Concatenation Operator Removal (ZRT-003) + +```python +_CONCAT_OP_RE = re.compile(r"[`'\"\s]*\+[`'\"\s]*") +normalized = _CONCAT_OP_RE.sub("", normalized) +``` + +Split-token patterns in documentation tables: + +```markdown +| Field | Value | +|-------|-------| +| Key | `AKIA` + `1234567890ABCDEF` | +``` + +The `+` operator joined with surrounding backticks is a common representation of +string concatenation in documentation. 
Stage 6 removes the concatenation construct, +joining the fragments into `AKIA1234567890ABCDEF`. + +#### Stage 7 — Table Pipe Replacement + +```python +_TABLE_PIPE_RE = re.compile(r"\|") +normalized = _TABLE_PIPE_RE.sub(" ", normalized) +``` + +Markdown table cells are separated by `|`. A secret split across cells would be: +`| AKIA | 1234567890ABCDEF |`. Stage 7 converts pipes to spaces, enabling +whitespace collapse in Stage 8 to produce a scannable line. + +#### Stage 8 — Whitespace Normalization + +```python +return " ".join(normalized.split()) +``` + +Collapses all whitespace runs (tabs, multiple spaces, newlines) into single spaces. +This is the final normalization before regex matching. The result is a clean, compact +line where all obfuscation techniques have been defeated. + +### The Lookback Buffer: Cross-Line Detection + +A secret that spans two lines defeats single-line scanning: + +```yaml +api_key: >- + AKIA + IOSFODNN7EXAMPLE +``` + +Each line individually contains only a fragment. Neither line matches the AWS access +key pattern `AKIA[0-9A-Z]{16}`. 
+ +Zenzic addresses this with `scan_lines_with_lookback()` — a stateful scanner that +maintains a 1-line lookback buffer: + +```python +def scan_lines_with_lookback( + lines: Iterator[tuple[int, str]], + file_path: Path | str, +) -> Iterator[SecurityFinding]: + prev_normalized: str = "" + for line_no, raw_line in lines: + normalized = _normalize_line_for_shield(raw_line) + # Scan the cross-line join: tail of previous line + head of current + cross_line = prev_normalized[-40:] + normalized[:40] + yield from scan_line_for_secrets(cross_line, file_path, line_no) + # Scan the current line independently + yield from scan_line_for_secrets(raw_line, file_path, line_no) + prev_normalized = normalized +``` + +The cross-line join concatenates the last 40 characters of the normalized previous +line with the first 40 characters of the normalized current line — enough to +reconstruct any secret split across a line boundary, while keeping memory bounded. + +### Dual-Form Scanning + +Even after normalization, Zenzic scans each line in **two forms**: + +1. **Raw form** — the line exactly as it appears in the source, ensuring that normally + + formatted secrets are always caught with correct column positions for reporting. + +2. **Normalized form** — after all 8 stages, ensuring that obfuscated secrets are + + reconstructed and matched. + +Duplicate findings (same secret type on the same line in both forms) are suppressed +via a `seen: set[str]` de-duplication pass. + +### ReDoS Prevention: The F2-1 Hardening + +Regex patterns applied to pathological inputs can cause catastrophic backtracking — +a ReDoS (Regular Expression Denial of Service) attack. A crafted Markdown file with +a megabyte-long line could cause a regex engine to consume unbounded CPU. + +Zenzic's F2-1 hardening establishes a maximum line length constant: + +```python +_MAX_LINE_LENGTH: int = 1_048_576 # 1 MiB +``` + +Lines exceeding this limit are silently truncated before scanning. 
No secret longer +than 1 MiB exists in practice; a line longer than 1 MiB is not legitimate +documentation. + +Additionally, all regex patterns used in `_SECRETS` undergo an **eager ReDoS +pre-flight check** at engine construction time (ZRT-002): + +```python +def _assert_regex_canary(rule: BaseRule) -> None: + """Verify that the rule's regex does not exhibit catastrophic backtracking.""" + # Applies a timing canary against a known-adversarial input. + # Raises PluginContractError if the pattern exceeds the time budget. +``` + +Custom rules loaded via the `zenzic.rules` entry-point group are subject to the same +pre-flight check before the first file is scanned. + +### The 9 Secret Families + +Zenzic's Shield v0.7.0 detects credentials across 9 families: + +| Family | Pattern | Example prefix | +| :--- | :--- | :--- | +| **OpenAI API key** | `sk-[a-zA-Z0-9]{48}` | `sk-a1B2c3...` | +| **GitHub token** | `gh[pousr]_[a-zA-Z0-9]{36}` | `ghp_`, `gho_`, `ghu_`, `ghs_`, `ghr_` | +| **AWS access key** | `AKIA[0-9A-Z]{16}` | `AKIAIOSFODNN7EXAMPLE` | +| **Stripe live key** | `sk_live_[0-9a-zA-Z]{24}` | `sk_live_4xK8...` | +| **Slack token** | `xox[baprs]-[0-9a-zA-Z]{10,48}` | `xoxb-`, `xoxa-`, ... | +| **Google API key** | `AIza[0-9A-Za-z\-_]{35}` | `AIzaSyB...` | +| **Private key header** | `-----BEGIN [A-Z ]+ PRIVATE KEY-----` | RSA, EC, DSA | +| **Hex-encoded payload** | `(?:\\x[0-9a-fA-F]{2}){3,}` | `\x41\x4b\x49\x41...` | +| **GitLab PAT** | `glpat-[A-Za-z0-9\-_]{20,}` | `glpat-aBcDeFgHiJkL...` | + +Each pattern is pre-compiled at import time — zero compilation overhead during scanning. +The set is additive: new families are added by appending to the `_SECRETS` list. + +### Exit Code 2: The Sacred Exit + +Any detection by the Shield causes Zenzic to exit with **code 2**. This exit code is +**non-suppressible** — it cannot be silenced by `--exit-zero`, `fail-on-error: false`, +or any configuration flag. 
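A minimal sketch of what "non-suppressible" can mean in code, under stated assumptions: the numeric values and the `Z201`, `Z202`, `Z203` codes are the ones listed later in this post, while `ExitCode`, `resolve_exit_code`, the `exit_zero` parameter, and the rule that the highest code wins when several apply are hypothetical choices made for illustration.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0
    QUALITY = 1    # suppressible
    SECURITY = 2   # Shield finding, never suppressible
    FATAL = 3      # Blood Sentinel finding, never suppressible


_SECURITY_CODES = frozenset({"Z201"})
_FATAL_CODES = frozenset({"Z202", "Z203"})


def resolve_exit_code(codes: list[str], exit_zero: bool = False) -> int:
    """Map finding codes to a process exit code.

    `exit_zero` can only downgrade quality findings; security and
    traversal findings bypass it entirely.
    """
    if any(c in _FATAL_CODES for c in codes):
        return ExitCode.FATAL      # assumption: highest severity wins
    if any(c in _SECURITY_CODES for c in codes):
        return ExitCode.SECURITY
    if codes and not exit_zero:
        return ExitCode.QUALITY
    return ExitCode.OK
```

The point of the structure is that the suppression flag is consulted only after the security branches have returned, so no configuration path reaches them.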
+ +The rationale: a CI system that can be configured to ignore credential exposure is not +a security gate. It is theater. Exit code 2 is the guarantee that the security contract +cannot be bypassed by configuration drift or operator error. + +```text +Exit 0 — All checks passed +Exit 1 — Quality findings (broken links, orphans, placeholders) — suppressible +Exit 2 — Security breach (Shield: credential detected) — NEVER suppressible +Exit 3 — Fatal breach (Blood Sentinel: path traversal) — NEVER suppressible +``` + +The Shield operates in **Pass 1A** — before any structural analysis. A file that +triggers exit 2 does not proceed to link validation or orphan detection. The +Sentinel reports the breach and stops. + +--- + +## Act IV — Blood Sentinel: Kernel-Level Sandboxing + +### Path Traversal in CI/CD + +In a CI/CD pipeline, Zenzic runs in a containerized runner. The runner has access to: + +- SSH keys: `/home/runner/.ssh/id_rsa` +- System secrets: `/etc/passwd`, `/etc/shadow` +- Runner tokens: `/var/run/secrets/kubernetes.io/serviceaccount/token` + +A Markdown file can embed a path traversal attack: + +```markdown +[Evil link](../../../../etc/passwd) +[Another attack](../../../home/runner/.ssh/id_rsa) +``` + +A documentation site that renders these files to HTML becomes a vector for exfiltrating +runner secrets, depending on the deployment mechanism and how static assets are served. + +More critically: Zenzic itself reads file contents to validate them. A path traversal +in a link target could cause Zenzic to validate `/etc/passwd` as a documentation file +and include its content in a report. This is the **tool-level attack** — abusing the +validator to read secrets from the runner filesystem. + +The Blood Sentinel prevents both categories. 
+ +### The `os.path.normpath` Collapse + +The defense is built into `InMemoryPathResolver._build_target()`: + +```python +def _build_target(self, source_file: Path, path_part: str) -> str: + if path_part.startswith("/"): + raw = self._root_str + os.sep + path_part.lstrip("/") + elif path_part.startswith("@site/docs/"): + raw = self._root_str + os.sep + path_part[len("@site/docs/"):] + elif path_part.startswith("@site/"): + raw = self._repo_root_str + os.sep + path_part[len("@site/"):] + else: + raw = str(source_file.parent) + os.sep + path_part + return os.path.normpath(raw) # ← The collapse +``` + +`os.path.normpath()` is pure string arithmetic — no syscalls, no `stat()`, no +`readlink()`. It collapses all `.` and `..` segments lexically, without consulting the filesystem. + +The result: + +```text +source: /repo/docs/guide/install.mdx +link: ../../../../etc/passwd + +raw = /repo/docs/guide/../../../../etc/passwd +normpath → /etc/passwd +``` + +The target string `/etc/passwd` is produced before any filesystem call is made. +Then the Shield check: + +```python +shield_ok = ( + target_str == self._root_str + or target_str.startswith(self._root_prefix) +) +if not shield_ok: + return PathTraversal(raw_href=href) +``` + +`/etc/passwd` does not start with `/repo/docs/` → `PathTraversal` returned +immediately. **Zero filesystem access. Zero data exposure. Exit 3.** + +### The Multi-Root Perimeter + +Zenzic handles multi-locale Docusaurus projects where both `docs/` and +`i18n/it/docusaurus-plugin-content-docs/current/` contain cross-referencing files. + +The `InMemoryPathResolver` constructor accepts an `allowed_roots` parameter — a list +of additional authorized boundaries: + +```python +_extra = [self._coerce_path(r) for r in (allowed_roots or [])] +_pairs: list[tuple[str, str]] = [] +for _r in [self._root_dir, *_extra]: + _s = str(_r) + _pairs.append((_s, _s + os.sep)) +self._allowed_root_pairs: tuple[tuple[str, str], ...] 
= tuple(_pairs) +``` + +The Shield check becomes: + +```python +shield_ok = any( + target_str == root_str or target_str.startswith(root_prefix) + for root_str, root_prefix in self._allowed_root_pairs +) +``` + +A relative link from `docs/guide.mdx` to `../i18n/it/guide.mdx` is valid only if +`i18n/it/docusaurus-plugin-content-docs/current/` is in `allowed_roots`. Without +explicit authorization, it produces `PathTraversal`. The perimeter is explicitly +declared, not inferred. + +### The `@site/` Alias: Security Analysis + +Docusaurus allows `@site/` as an alias for the project root in `import` statements +and static asset references. Zenzic maps this alias to `repo_root`: + +```python +elif path_part.startswith("@site/docs/"): + raw = self._root_str + os.sep + path_part[len("@site/docs/"):] +elif path_part.startswith("@site/"): + raw = self._repo_root_str + os.sep + path_part[len("@site/"):] +``` + +A path like `@site/../etc/passwd` becomes: + +```text +raw = /repo/../etc/passwd +normpath → /etc/passwd +``` + +The normpath collapse happens before the perimeter check. `@site/` is not an +escape hatch from the Blood Sentinel. It is an alias for a specific root, and all +`..` traversals through it are collapsed and checked identically. + +### Exit Code 3: Non-Negotiable Termination + +Path traversal findings (Z202/Z203) cause exit 3. Like exit 2, this is +non-suppressible. A path traversal in a documentation source is not a quality +finding. It is an attempted perimeter breach. The Sentinel terminates. + +```text +Z202 PATH_TRAVERSAL — confirmed: resolved path escapes docs_root +Z203 PATH_TRAVERSAL_SUSPICIOUS — unresolvable path with traversal segments +``` + +The distinction: Z202 is triggered when normpath produces a path that fails the prefix +check. Z203 is triggered when the href contains `../` segments but cannot be fully +resolved (e.g., missing fragments, malformed URLs). Both produce exit 3. 
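The collapse-then-check sequence described in this act can be exercised end to end with a small free function. This is a sketch rather than the `InMemoryPathResolver` itself: `resolve_href` and its parameters are hypothetical, and it uses `posixpath` so the behavior is the same on every platform.

```python
import posixpath


def resolve_href(source_file, href, docs_root, repo_root, allowed_roots=()):
    """Return the normalized target path, or None on a perimeter violation.

    Pure string arithmetic: nothing here touches the filesystem.
    """
    if href.startswith("/"):
        raw = posixpath.join(docs_root, href.lstrip("/"))
    elif href.startswith("@site/docs/"):
        raw = posixpath.join(docs_root, href[len("@site/docs/"):])
    elif href.startswith("@site/"):
        raw = posixpath.join(repo_root, href[len("@site/"):])
    else:
        raw = posixpath.join(posixpath.dirname(source_file), href)
    target = posixpath.normpath(raw)  # collapse "." and ".." first
    for root in (docs_root, *allowed_roots):
        # Compare against root + "/" so "/repo/docs-evil" cannot pass as "/repo/docs".
        if target == root or target.startswith(root + "/"):
            return target
    return None
```

The trailing separator in the prefix comparison is the detail that matters most: a bare `startswith(root)` would accept a sibling directory whose name merely begins with the root's name.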
+ +--- + +## Act V — The Docusaurus Adapter: isCategoryIndex and URL Collapsing + +### The Routing Problem + +Docusaurus maps source files to URLs through a set of rules that are not always +obvious to documentation authors. Zenzic must replicate these rules exactly in Python +to produce correct VSM entries. + +The most complex rule is **isCategoryIndex collapsing**: when a file's name matches +certain patterns, its URL is collapsed to the parent directory, not a file slug. + +### The Three Collapsing Cases + +From `_docusaurus.py`, the collapsing logic: + +```python +if parts: + file_name_lower = parts[-1].lower() + parent_name_lower = parts[-2].lower() if len(parts) >= 2 else None + if ( + file_name_lower == "index" # Case 1: index file + or file_name_lower == "readme" # Case 2: README file + or ( + parent_name_lower is not None + and file_name_lower == parent_name_lower # Case 3: folder-match + ) + ): + parts = parts[:-1] # collapse to parent +``` + +**Case 1 — Index collapse:** + +```text +docs/guide/index.mdx → /docs/guide/ +docs/index.mdx → /docs/ +``` + +**Case 2 — README collapse:** + +```text +docs/guide/README.md → /docs/guide/ +docs/README.md → /docs/ +``` + +**Case 3 — Folder-match collapse (isCategoryIndex):** + +```text +docs/guide/guide.mdx → /docs/guide/ (filename == parent dirname) +docs/api/api.md → /docs/api/ +``` + +This third case is frequently surprising to authors: a file named after its parent +directory is silently collapsed to the directory URL by Docusaurus. Zenzic replicates +this behavior exactly, producing the correct canonical URL in the VSM. 
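The three collapsing cases can be tried out in isolation. The function below is a sketch assembled from the excerpt above, assuming paths are given relative to the docs root and that no `slug:` override applies; `map_url` as a free function and its `route_base_path` parameter are hypothetical names.

```python
from pathlib import PurePosixPath


def map_url(rel_path: str, route_base_path: str = "docs") -> str:
    """Map a docs-relative source path to a canonical URL (no slug handling)."""
    parts = list(PurePosixPath(rel_path).with_suffix("").parts)  # strip .md/.mdx
    if parts:
        name = parts[-1].lower()
        parent = parts[-2].lower() if len(parts) >= 2 else None
        if name in ("index", "readme") or (parent is not None and name == parent):
            parts = parts[:-1]  # collapse to the parent directory
    return "/" + "/".join([route_base_path, *parts]) + "/"
```

Under these assumptions `guide/index.mdx`, `guide/README.md`, and `guide/guide.mdx` all map to `/docs/guide/`, which is also why any two of them present together would surface as a `CONFLICT`.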
+ +### URL Priority: Frontmatter Slug First + +Before filesystem derivation, Zenzic checks for a `slug:` frontmatter declaration: + +```python +# Stage 1: frontmatter slug override +slug = self._slug_map.get(rel_posix) +if slug is not None: + if slug.startswith("/"): + # Absolute slug: prefix with routeBasePath + rbp = self._route_base_path or "docs" + return "/" + rbp + slug.rstrip("/") + "/" + else: + # Relative slug: replace last path segment + parent = rel.parent + return "/" + parent.as_posix() + "/" + slug.strip("/") + "/" +``` + +The full URL resolution priority: + +1. **Frontmatter `slug:`** — absolute or relative override +2. **isCategoryIndex** — index/README/folder-match collapse +3. **Extension stripping** — `.md` / `.mdx` removed +4. **routeBasePath prefix** — default `"docs"`, configurable + +### The `provides_index()` Contract + +The `provides_index(directory_path)` method determines whether a directory has a +landing page — required for the Z401 (MISSING_DIRECTORY_INDEX) check: + +```python +def provides_index(self, directory_path: Path) -> bool: + index_files = ("index.md", "index.mdx", "README.md", "README.mdx") + if any((directory_path / f).exists() for f in index_files): + return True + category_json = directory_path / "_category_.json" + if category_json.exists(): + data = json.loads(category_json.read_text(encoding="utf-8")) + link = data.get("link", {}) + return isinstance(link, dict) and link.get("type") == "generated-index" + return False +``` + +A directory provides an index when: + +1. An `index.md`, `index.mdx`, `README.md`, or `README.mdx` exists inside it, **or** +2. A `_category_.json` declares `"link": { "type": "generated-index" }` — causing + + Docusaurus to auto-generate a category index page. + +I/O is permitted in `provides_index()` because it is called once per directory during +the discovery phase — never inside per-link or per-file hot loops. 
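To see the contract in action, the method can be lifted into a free function and pointed at a throwaway directory tree. The `demo` helper and the directory names are hypothetical; the body of `provides_index` is copied from the excerpt above.

```python
import json
import tempfile
from pathlib import Path


def provides_index(directory_path: Path) -> bool:
    """Free-function copy of the adapter method shown above."""
    index_files = ("index.md", "index.mdx", "README.md", "README.mdx")
    if any((directory_path / f).exists() for f in index_files):
        return True
    category_json = directory_path / "_category_.json"
    if category_json.exists():
        data = json.loads(category_json.read_text(encoding="utf-8"))
        link = data.get("link", {})
        return isinstance(link, dict) and link.get("type") == "generated-index"
    return False


def demo() -> dict[str, bool]:
    """Build four sample directories and report which ones provide an index."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("with_index", "generated", "label_only", "empty"):
            (root / name).mkdir()
        (root / "with_index" / "index.mdx").write_text("# Hi\n", encoding="utf-8")
        (root / "generated" / "_category_.json").write_text(
            json.dumps({"label": "G", "link": {"type": "generated-index"}}),
            encoding="utf-8",
        )
        (root / "label_only" / "_category_.json").write_text(
            json.dumps({"label": "L"}), encoding="utf-8"
        )
        return {d.name: provides_index(d) for d in sorted(root.iterdir())}
```

A `_category_.json` that only sets a label does not count: without a `generated-index` link, the directory has no landing page and would be a candidate for Z401.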
+ +### The Three-Surface Harvester + +For orphan detection, Zenzic's Docusaurus adapter aggregates navigation paths from +three sources: + +```python +def get_nav_paths(self) -> frozenset[str]: + """Merge sidebar + navbar + footer into a single navigable path set.""" + return ( + self._parse_sidebars() # sidebars.ts / sidebars.js + | self._parse_config_navigation() # navbar.items + footer.links + ) +``` + +**Sidebar parsing** (`_parse_sidebars()`): reads `sidebars.ts` or `sidebars.js` via +pure-Python regex. Strips JS-style line and block comments before parsing. Handles +both `type: 'doc'` explicit entries and bare string IDs. + +**Config navigation** (`_parse_config_navigation()`): reads `docusaurus.config.ts` +via regex, extracts `to:` URL paths from `navbar.items` and `footer.links`, strips +`baseUrl` and `routeBasePath` prefixes, and probes for `.md`/`.mdx` files on disk. + +A file is `ORPHAN_BUT_EXISTING` only if absent from sidebar AND navbar AND footer. +A changelog linked only in the navbar is `REACHABLE`. A legal notice linked only in +the footer is `REACHABLE`. This is **R21 — UX-Discoverability**. + +### The Slug Law: Physical Consistency + +Zenzic's own documentation enforces the **Slug Law** (ADR-003): no `slug:` frontmatter +that diverges from the physical file path. The rationale is architectural: the +autogenerated sidebar uses `type: 'autogenerated'` — it resolves URLs from file paths. +A diverged `slug:` creates a URL that the sidebar cannot resolve, causing navigation +failures without a build-time error. + +The VSM enforces this indirectly: if a `slug:` produces a URL that no sidebar entry +references, the file is `ORPHAN_BUT_EXISTING`. The Slug Law converts this from a +silent failure to a Zenzic finding. 
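The three-surface rule reduces to a set union followed by a membership test. A minimal sketch, assuming each surface has already been parsed into a collection of docs-relative paths; the `classify` function and its inputs are hypothetical.

```python
def classify(files, sidebar, navbar, footer):
    """Label each file REACHABLE if any navigation surface references it."""
    nav = frozenset(sidebar) | frozenset(navbar) | frozenset(footer)
    return {
        path: "REACHABLE" if path in nav else "ORPHAN_BUT_EXISTING"
        for path in files
    }
```

A changelog that appears only in `navbar` and a legal notice that appears only in `footer` both come back as `REACHABLE`; a file missing from all three is the only orphan.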
+ +--- + +## Act VI — The Rule Engine: Adaptive Parallelism + +### The AdaptiveRuleEngine + +Custom rules in Zenzic — declared in `[[custom_rules]]` or implemented as Python +classes via the `zenzic.rules` entry-point group — are applied through the +`AdaptiveRuleEngine`: + +```python +class AdaptiveRuleEngine: + def __init__(self, rules: Sequence[BaseRule]) -> None: + for rule in rules: + _assert_pickleable(rule) # eager pickle validation + _assert_regex_canary(rule) # ZRT-002: ReDoS pre-flight + self._rules = rules + + def run(self, file_path: Path, text: str) -> list[RuleFinding]: + """Pure function: file path + text → findings. No I/O.""" + findings: list[RuleFinding] = [] + for rule in self._rules: + try: + findings.extend(rule.check(file_path, text)) + except Exception as exc: + # Rule failures are caught and converted to RULE-ENGINE-ERROR findings. + # One faulty plugin cannot abort the scan of the entire docs tree. + findings.append(RuleFinding(...)) + return findings +``` + +Rules are validated **eagerly** at construction time, before the first file is scanned. +A rule that fails pickle serialization is rejected immediately — not silently inside a +worker process during a long parallel scan. + +### The 50-File Threshold + +Zenzic's scanner switches between sequential and parallel execution based on the +number of files: + +```python +ADAPTIVE_PARALLEL_THRESHOLD: int = 50 # in scanner.py + +use_parallel = workers != 1 and len(md_files) >= ADAPTIVE_PARALLEL_THRESHOLD +``` + +Below 50 files: sequential execution. The overhead of spawning a +`ProcessPoolExecutor` — approximately 200–400 ms on a cold interpreter — exceeds +the parallelism benefit for small documentation sets. 
+ +At or above 50 files: `ProcessPoolExecutor` is used: + +```python +with concurrent.futures.ProcessPoolExecutor(max_workers=actual_workers) as executor: + futures_map = { + executor.submit(_worker, item): item[0] + for item in work_items + } + for future in concurrent.futures.as_completed(futures_map): + results.extend(future.result()) +``` + +Each file is dispatched to an independent worker process. The worker receives a +serialized `(file_path, config, rules)` tuple via pickle — which is why the eager +pickle validation at `AdaptiveRuleEngine` construction is load-bearing. A +non-pickleable lambda in a custom rule would silently fail inside the worker process; +the eager check catches it in the main process at startup. + +### Pure Function Discipline: Why It Matters for Parallelism + +Pillar 3 — Pure Functions First — is not a style preference. It is an architectural +requirement for correctness under parallelism. + +A rule that holds mutable state between `check()` calls (e.g., a counter, a cache) +would produce scheduling-dependent results: each worker process holds its own copy of +the rule, so the state diverges depending on which files each worker happens to receive. A rule that +makes I/O calls inside `check()` would suffer from TOCTOU (time-of-check to +time-of-use) races in a parallel context. + +Pure functions — deterministic, stateless, side-effect-free — are safe to execute +concurrently without synchronization. The `AdaptiveRuleEngine` enforces this by +contract: a rule that fails validation raises `PluginContractError` and is not +admitted to the engine. 
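The cost of hidden state can be shown without spawning any processes: it is enough to run the same rule over the same files in two different orders, which is what a worker pool does implicitly. Both rules below are hypothetical examples, not Zenzic rules.

```python
class CountingRule:
    """Impure: the finding text depends on how many files came before."""

    def __init__(self) -> None:
        self.seen = 0

    def check(self, file_path: str, text: str) -> list[str]:
        self.seen += 1
        return [f"{file_path}: file #{self.seen}"]


def todo_rule(file_path: str, text: str) -> list[str]:
    """Pure: the output depends only on the arguments."""
    return [f"{file_path}: TODO" for line in text.splitlines() if "TODO" in line]


def run_in_order(check, files: list[tuple[str, str]]) -> list[str]:
    """Apply `check` to each (path, text) pair and return sorted findings."""
    out: list[str] = []
    for path, text in files:
        out.extend(check(path, text))
    return sorted(out)
```

Running `todo_rule` over a file list and over its reverse yields identical sorted findings; doing the same with a fresh `CountingRule` each time does not, which is exactly the nondeterminism a parallel scan would expose.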
+ +### The Pickle Serialization Check + +Custom rules loaded via `entry_points(group="zenzic.rules")` are validated with: + +```python +def _assert_pickleable(rule: BaseRule) -> None: + try: + pickle.dumps(rule) + except Exception as exc: + raise PluginContractError( + f"Rule '{rule.rule_id}' cannot be pickled and is incompatible with " + f"multiprocessing: {exc}" + ) from exc +``` + +This is an **eager contract check**: the error is raised before any file is touched, +with a clear message pointing to the rule that failed. Without this check, the failure +would manifest as a cryptic `BrokenPipeError` or `EOFError` inside a worker process +at scan time — far harder to diagnose. + +--- + +## Act VII — Enterprise Integration: SARIF and the Quality Gate + +### SARIF 2.1.0: Documentation in Your Security Dashboard + +SARIF (Static Analysis Results Interchange Format) is the standard output format for +security tools consumed by GitHub Code Scanning, Azure DevOps, and other CI/CD +platforms. + +Zenzic produces valid SARIF 2.1.0 with: + +```bash +zenzic check all ./docs --format sarif > zenzic.sarif +``` + +The SARIF output includes: + +- **Tool descriptor** with Zenzic version and URI +- **Rules array** with one entry per Zxxx code found (ID, name, helpUri, severity) +- **Results array** with location (file + line + column), message, and level + +A minimal SARIF result for a broken link: + +```json +{ + "ruleId": "Z101", + "level": "error", + "message": { + "text": "Z101 LINK_BROKEN: './install.mdx' → './guide/setup.mdx' does not exist" + }, + "locations": [{ + "physicalLocation": { + "artifactLocation": { "uri": "docs/install.mdx" }, + "region": { "startLine": 42, "startColumn": 12 } + } + }] +} +``` + +Upload to GitHub Code Scanning: + +```yaml title=".github/workflows/zenzic.yml" +name: Documentation Integrity Gate + +on: [push, pull_request] + +jobs: + sentinel: + runs-on: ubuntu-latest + steps: + + - uses: actions/checkout@v4 + + - name: Run Zenzic Sentinel + + run: 
uvx zenzic check all ./docs --format sarif > zenzic.sarif + + - name: Upload to GitHub Security + + uses: github/codeql-action/upload-sarif@v3 + with: + sarif_file: zenzic.sarif + if: always() # upload even when Zenzic fails +``` + +The `if: always()` is critical: when Zenzic exits with code 1 (quality findings), the +step is marked as failed — but the SARIF upload must still execute to surface the +findings in the Security tab. Without `if: always()`, a failed step would abort before +uploading, producing silence instead of visibility. + +For teams using `zenzic-action`: + +```yaml title=".github/workflows/zenzic.yml" + +- uses: PythonWoods/zenzic-action@v1 + + with: + version: "0.7.0" + format: sarif + upload-sarif: "true" +``` + +The action handles the SARIF upload and the `if: always()` semantics automatically, +including SARIF integrity validation — if the SARIF file is truncated by runner OOM +or SIGKILL, the action emits a `::warning` annotation rather than uploading a false- +clean result (Output-First Semantics, ADR-004 in zenzic-action). + +### Machine Silence: RULE R20 + +When `--format sarif` or `--format json` is active, Zenzic enforces **Machine Silence +(R20)**: zero Rich banners, headers, or informational panels are written to stdout. +The output stream is a machine-readable format and must remain 100% valid against its +schema. + +This is enforced at the CLI level: + +```python +_MACHINE_FORMATS = frozenset({"json", "sarif"}) + +if output_format not in _MACHINE_FORMATS: + print_header(console) +``` + +A script that pipes `zenzic check all --format json | jq '.findings'` receives +valid JSON with no banner contamination. 
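The Machine Silence gate can be illustrated with a small sketch. Only `_MACHINE_FORMATS` mirrors the CLI snippet above; `render_report`, the plain-text banner, and the shape of the findings dictionaries are assumptions made for illustration, not Zenzic's actual implementation.

```python
import json

# Mirrors the frozenset from the CLI snippet; everything else is hypothetical.
_MACHINE_FORMATS = frozenset({"json", "sarif"})


def render_report(output_format: str, findings: list[dict]) -> str:
    """Return the complete stdout payload for one scan (illustrative only)."""
    if output_format in _MACHINE_FORMATS:
        # Machine Silence: only the serialized payload, no banner or panels.
        # A real SARIF serializer would go here; the sketch reuses one JSON shape.
        return json.dumps({"findings": findings})
    # Human-readable formats may carry a banner (stand-in for the Rich header).
    lines = ["=== Zenzic ==="]
    lines += [f"{item['code']} {item['file']}" for item in findings]
    return "\n".join(lines)
```

Because the banner is added only on the human-readable branch, whatever a consumer reads from the machine branch can be passed straight to `json.loads` without stripping anything first.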
+ +### The Quality Score: `zenzic score` + +Beyond binary pass/fail, Zenzic provides a **quality score** — a 0–100 metric +computed from the weighted sum of findings across all check categories: + +```bash +zenzic score ./docs +``` + +The score can be used as a regression gate: + +```bash +zenzic diff ./docs # compare current score to last snapshot +``` + +`diff` compares the current scan result against a stored snapshot (`zenzic.snapshot.json` +in the repo root). A score regression (e.g., score drops from 97 to 91) causes a +non-zero exit, enabling CI to block merges that degrade documentation quality. + +This is the **Quality Gate pattern**: not a binary pass/fail, but a tracked trend +with a configurable failure threshold (`fail_under` in `zenzic.toml`). + +### The Diagnostic Code Registry: Zxxx + +Every Zenzic finding carries a `Zxxx` code from `core/codes.py` — the single source +of truth for the diagnostic registry. + +The full registry by category: + +| Range | Category | Codes | +| :--- | :--- | :--- | +| **Z1xx** | Link Integrity | Z101 LINK_BROKEN, Z102 ANCHOR_MISSING, Z103 UNREACHABLE_LINK, Z104 FILE_NOT_FOUND, Z105 ABSOLUTE_PATH, Z106 ALT_TEXT_MISSING | +| **Z2xx** | Security | Z201 SHIELD_SECRET, Z202 PATH_TRAVERSAL, Z203 PATH_TRAVERSAL_SUSPICIOUS | +| **Z3xx** | Reference Integrity | Z301 DANGLING_REF, Z302 DEAD_DEF, Z303 CIRCULAR_LINK | +| **Z4xx** | Structure | Z401 MISSING_DIRECTORY_INDEX, Z402 ORPHAN_PAGE, Z403 SNIPPET_UNREACHABLE, Z404 CONFIG_ASSET_MISSING | +| **Z5xx** | Content Quality | Z501 PLACEHOLDER, Z502 SHORT_CONTENT, Z503 SNIPPET_ERROR, Z504 QUALITY_REGRESSION | +| **Z9xx** | Engine / System | Z901 RULE_ERROR, Z902 RULE_TIMEOUT, Z903 UNUSED_ASSET, Z904 DISCOVERY_ERROR | + +The codes are stable across versions. A CI system that filters findings by `Z201` +(credentials) can do so independently of Zenzic version bumps. The codes are the +documented API surface for tooling integration. 
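As a rough sketch of what filtering by code could look like on the consumer side, the helper below walks the standard SARIF layout (`runs[].results[].ruleId`) and keeps only the results for one Zxxx code. The function name `results_with_rule` and the sample log, including its message texts, are hypothetical and are not taken from real Zenzic output.

```python
import json


def results_with_rule(sarif_text: str, rule_id: str) -> list[dict]:
    """Collect every SARIF result whose ruleId equals rule_id."""
    log = json.loads(sarif_text)
    return [
        result
        for run in log.get("runs", [])
        for result in run.get("results", [])
        if result.get("ruleId") == rule_id
    ]


# Hypothetical two-finding log, shaped like a SARIF 2.1.0 document.
SAMPLE = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "zenzic"}},
        "results": [
            {"ruleId": "Z101", "level": "error", "message": {"text": "broken link"}},
            {"ruleId": "Z201", "level": "error", "message": {"text": "secret detected"}},
        ],
    }],
})

secrets = results_with_rule(SAMPLE, "Z201")
```

A CI step could fail the job only when `secrets` is non-empty, which keeps the gate keyed to the stable code rather than to message wording.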
+ +--- + +## Act VIII — Performance: The Numbers + +### The Adaptive Parallelism Benchmark + +The 50-file threshold is a conservative heuristic derived from empirical measurement: + +| File count | Sequential (ms) | Parallel (ms) | Crossover | +| ---: | ---: | ---: | :--- | +| 10 | 28 | 380 | Sequential wins | +| 25 | 71 | 390 | Sequential wins | +| 50 | 142 | 395 | Roughly equal | +| 100 | 284 | 412 | Parallel wins | +| 500 | 1,420 | 680 | Parallel wins (2×) | +| 1,000 | 2,840 | 920 | Parallel wins (3×) | +| 10,000 | 28,400 | 4,200 | Parallel wins (6.7×) | + +*Measurements on a 4-core runner, cold start. Custom rules with moderate complexity.* + +The ~380 ms fixed overhead of `ProcessPoolExecutor` spawn is the reason the threshold +is not set lower. A threshold of 10 files would cause sequential scans of small repos +to pay the spawn cost without benefit. + +### VSM Construction vs Link Validation + +The scan time breakdown for a 1,000-file project: + +```text +Discovery (walk + read): ~450 ms (I/O bound — disk sequential) +VSM construction: ~120 ms (CPU bound — adapter URL mapping) +Anchor cache build: ~80 ms (CPU bound — heading slug extraction) +Link validation: ~95 ms (CPU bound — 50,000 hash lookups) +Orphan detection: ~35 ms (CPU bound — frozenset intersection) +Shield scan: ~210 ms (CPU bound — regex over 1M lines) +Report rendering: ~40 ms (CPU bound — Rich formatting) +───────────────────────────────────── +Total: ~1,030 ms +``` + +:::note Benchmark conditions +These figures are for **synthetic Markdown files** (minimal frontmatter, no JSX, ~10 +lines of prose). Real-world MDX files with frontmatter, JSX components, tables, and +dense link graphs cost significantly more per file. Measured against the real +`zenzic-doc` project (59 MDX pages): ~7 ms/file vs ~0.5 ms/file for synthetic files. +Run `python scripts/benchmark.py --repo ` to measure your own project. +::: + +Link validation at 50,000 links takes 95 ms — less than the report rendering phase. 
+This is the O(1) hash map in practice: 50,000 `dict.get()` calls at ~1.9 µs each. + +### Memory Profile + +The VSM for a 10,000-file project: + +```text +Route objects: 10,000 × ~280 bytes = ~2.8 MB +Anchor cache: 10,000 × ~1,200 bytes = ~12.0 MB +md_contents: 10,000 × ~8,000 bytes = ~80.0 MB +───────────────────────────────────────────────── +Total RSS: ~95 MB +``` + +The dominant cost is `md_contents` — the raw Markdown text held in memory for the +Shield scan. Zenzic holds all files in memory simultaneously to avoid repeated I/O +during multi-pass analysis. For projects above 50,000 files, a chunked processing +mode is planned for a future release. + +### Cross-Platform CI Matrix + +Zenzic's test suite runs a 3×3 platform matrix on every commit: + +```text +OS: [ubuntu-latest, windows-latest, macos-latest] +Python: [3.11, 3.12, 3.13 ] +``` + +9 parallel CI jobs. All 1,342+ tests must pass on all 9 combinations. This is the +**portability guarantee**: Zenzic's output is identical across all platforms. A +scan that passes on Ubuntu passes on macOS and Windows — critical for teams using +heterogeneous development environments. + +--- + +## Act IX — The Adapter Contract: Extending Zenzic + +### The BaseAdapter Protocol + +Zenzic's Core (`validator.py`, `scanner.py`) contains zero engine-name references. +This is **Purity Protocol** — Rule R21 (Protocol Sovereignty). Any engine-specific +behavior must be declared via the `AdapterProtocol` and queried by the Core. + +The adapter protocol (simplified): + +```python +class AdapterProtocol(Protocol): + def get_nav_paths(self) -> frozenset[str]: + """Return navigable paths from all user-clickable surfaces.""" + ... + + def map_url(self, rel: Path) -> str: + """Map a source file to its canonical URL.""" + ... + + def classify_route(self, rel: Path, nav_paths: frozenset[str]) -> RouteStatus: + """Classify a route as REACHABLE, ORPHAN_BUT_EXISTING, IGNORED, or CONFLICT.""" + ... 
+ + def provides_index(self, directory_path: Path) -> bool: + """True when the directory will have a landing page.""" + ... + + def get_metadata_files(self) -> list[Path]: + """Return Level 1 System Guardrail files (excluded from all checks).""" + ... + + def get_link_scheme_bypasses(self) -> frozenset[str]: + """Return URI schemes that bypass Z105 absolute-path validation.""" + ... +``` + +The Core calls `adapter.get_nav_paths()`. It receives a `frozenset[str]`. What +generated that frozenset — whether it came from `sidebars.ts`, `mkdocs.yml`, or +`zensical.toml` — is invisible to the Core. + +Adding a new adapter requires implementing this protocol. Adding an engine-specific +behavior by modifying `validator.py` is a **protocol violation** and will be rejected +in code review. + +### The `pathname:///` Bypass (Rule R16) + +Docusaurus uses `pathname:///` as a Diplomatic Courier — an escape hatch for linking +to static assets that are not part of the docs routing system: + +```markdown +[Download PDF](pathname:///assets/whitepaper.pdf) +``` + +The Z105 gate (ABSOLUTE_PATH) normally fires on any path starting with `/`. The +`pathname:///` URI scheme is exempt in Docusaurus mode: + +```python +def get_link_scheme_bypasses(self) -> frozenset[str]: + return frozenset({"pathname"}) +``` + +The Core queries `adapter.get_link_scheme_bypasses()` before applying Z105. This is +R16 — Protocol Awareness — in action: engine-specific behavior declared in the +adapter, queried by the Core, with no `if engine == "docusaurus"` in Core logic. + +In all other engines (MkDocs, Zensical, Standalone), `pathname:///` is unrecognized +and triggers Z105 normally. The bypass is scoped precisely. + +### Level 1 System Guardrails + +Adapter metadata files — `docusaurus.config.ts`, `mkdocs.yml`, `zensical.toml`, +`package.json`, `pyproject.toml` — are declared as **Level 1 System Guardrails** +via `get_metadata_files()`. 
These files are: + +- Permanently excluded from Z903 (UNUSED_ASSET) checks +- Permanently excluded from all quality checks +- Never presented to the user as orphans, placeholders, or short-content warnings + +The rationale (Rule R13 — Intelligent Perimeter): asking the user to manually exclude +their own build configuration files from analysis is a failure of the tool, not a +configuration task. The adapter knows what its metadata files are; the Core does not +need to be told. + +--- + +## Act X — Getting Started + +### Immediate Verification (No Installation) + +```bash +uvx zenzic lab +``` + +`uvx` resolves the latest Zenzic from PyPI, installs it in an isolated temporary +environment, and runs the interactive Lab. Seventeen Acts, each demonstrating a +distinct capability. The entire experience requires no project setup. + +Start with Act 3 — the Shield in action against a planted Stripe live key. Watch +the Sentinel exit with code 2. That exit code is the promise. + +### Your First Scan + +```bash +uvx zenzic check all ./docs +``` + +Zenzic will: + +1. Discover your documentation engine (Docusaurus, MkDocs, Zensical, or Standalone) +2. Build the VSM from your source files +3. Run the Shield across every line of every file +4. Validate all internal links against the VSM +5. Detect orphan pages via R21 (navbar + sidebar + footer analysis) +6. Report all findings with Zxxx codes, file paths, and line numbers + +On a 100-page Docusaurus site: expect 2–4 seconds, cold start. 
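For scripting around that first scan, a thin wrapper can translate the exit code into a message. This is a sketch under stated assumptions: the article describes exit 0 as a clean scan, exit 1 as quality findings, and shows the Shield exiting with 2; any other code is treated here as unknown. The names `EXIT_MEANINGS`, `describe_exit`, and `run_scan` are hypothetical.

```python
import subprocess

# Meanings as described in this article; other exit codes are not covered here.
EXIT_MEANINGS = {
    0: "clean: all checks passed",
    1: "quality findings reported",
    2: "security finding reported by the Shield",
}


def describe_exit(code: int) -> str:
    """Translate a Zenzic exit code into a short human-readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_scan(docs_dir: str = "./docs") -> str:
    """Run the scan command shown above and describe its exit code."""
    completed = subprocess.run(["uvx", "zenzic", "check", "all", docs_dir])
    return describe_exit(completed.returncode)
```

Calling `run_scan()` requires `uvx` on the PATH; `describe_exit` can be used on its own with an exit code captured elsewhere.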
+ +### Pinned CI Integration + +```yaml title=".github/workflows/zenzic.yml" +name: Documentation Integrity Gate + +on: + push: + branches: [main] + pull_request: + +jobs: + sentinel: + runs-on: ubuntu-latest + permissions: + security-events: write # required for SARIF upload + steps: + + - uses: actions/checkout@v4 + + - uses: astral-sh/setup-uv@v5 + + - uses: PythonWoods/zenzic-action@v1 + + with: + version: "0.7.0" # pinned — deterministic CI gate + format: sarif + upload-sarif: "true" +``` + +Version pinning (`version: "0.7.0"`) is mandatory for production pipelines. `latest` +is appropriate for exploration; it introduces non-determinism into your CI gate. + +### `zenzic.toml` Configuration + +```toml title="zenzic.toml" +docs_dir = "docs" +fail_under = 95 # quality score gate: fail if score drops below 95 + +# Excluded external URLs (temporary — remove after deployment) +excluded_external_urls = [ + "https://internal.corp.example.com/api", +] + +# Excluded asset patterns (Docusaurus sidebar metadata) +excluded_assets = [ + "**/_category_.json", +] + +[build_context] +engine = "docusaurus" +base_url = "/" +default_locale = "en" +locales = ["it", "fr"] +``` + +The 4-level configuration priority: **CLI flags > `zenzic.toml` > `pyproject.toml` +`[tool.zenzic]` > built-in defaults**. CLI flags always win. This allows temporary +overrides without modifying project configuration. + +### Standalone Mode + +For projects with no build system — raw Markdown directories, GitHub wikis, plain +doc trees: + +```toml title="zenzic.toml" +docs_dir = "." 
+ +[build_context] +engine = "standalone" +``` + +In Standalone mode: + +- Orphan detection (Z402) is **disabled** — there is no navigation contract +- Link validation still runs — broken links are broken regardless of engine +- The Shield still runs — credentials are credentials regardless of engine +- The Blood Sentinel still runs — path traversal is path traversal regardless of engine + +The security guarantees are engine-independent. Only the navigation contract is scoped. + +--- + +### Brand Integrity: Z905 BRAND_OBSOLESCENCE + +The fourth dimension of the Safe Harbor — beyond structural, security, and content +correctness — is **narrative integrity**. A documentation suite that refers to a +deprecated release codename has a different class of bug: it tells the wrong story. + +Configure `[project_metadata]` in `zenzic.toml` to activate the Brand Integrity layer: + +```toml title="zenzic.toml" +[project_metadata] +release_name = "Quartz" +obsolete_names = ["Obsidian"] +obsolete_names_exclude_patterns = ["CHANGELOG*.md", "adr-*.mdx"] +``` + + (Markdown) or {/* zenzic:ignore Z905 */} (MDX) to the line to suppress intentional references." + }]} +/> + +The `zenzic:ignore Z905` escape hatch is precise by design: it applies to a single line, +not a whole file. A CHANGELOG entry that says "Released under the Obsidian codename" +is historical fact. An architecture page that describes the current system as +"Obsidian-based" is a lie that the source code has already corrected. + +--- + +:::tip[The Sentinel's Filter — Why Every Quartz Rule Exists] +Every rule in the Quartz Core must pass a three-dimensional admission test before it ships: +**Structural Integrity** (broken links, orphans, missing indices), **Hardened Security** +(credentials, path traversal), or **Technical Accessibility** (machine-readable contracts +for downstream tooling — Z505 is the canonical example). Rules that fail this filter +— line length, list style, spelling — are deliberately out of scope. 
Zenzic is a Sentinel, +not a Proofreader. + +[Read the full rationale →](https://zenzic.dev/docs/explanation/structural-integrity) +::: + +## Epilogue: The Documentation is the Source + +The engineering tradition treats documentation as secondary — a description of the +system, not the system itself. This tradition is breaking down. + +In 2026, documentation is: + +- **The primary interface** for internal APIs in large organizations +- **The trust signal** that developers use to evaluate whether a library is maintained +- **The compliance artifact** that auditors examine in regulated industries +- **The attack surface** that adversaries probe for exposed credentials and path traversal + +A documentation pipeline that trusts its input is not a pipeline. It is a hope. + +Zenzic exists because the question *"is this documentation correct?"* is not the same +question as *"did this build succeed?"* A build that succeeds on broken documentation +has not validated anything. It has just run faster. + +The Safe Harbor is not a metaphor. It is an architectural guarantee: every file that +passes Zenzic's three layers — the Structural Validator, the Shield, and the Blood +Sentinel — has been verified against the navigation contract of your specific build +engine, scanned for all known credential formats with 8-stage normalization, and +checked for path traversal against an explicitly declared perimeter. + +That is the promise. Every exit-0 scan is the proof. 
+ +--- + +*For the full engineering history of how these layers were designed, tested under +AI-generated siege, and hardened across five sprints — read the +[🛡️ The Zenzic Chronicles →](/blog/hardening-the-documentation-pipeline).* + +--- + +| | | +|---|---| +| **GitHub** | [github.com/PythonWoods/zenzic](https://github.com/PythonWoods/zenzic) | +| **Documentation** | [zenzic.dev](https://zenzic.dev/) | +| **PyPI** | [pypi.org/project/zenzic](https://pypi.org/project/zenzic/) | +| **Lab** | `uvx zenzic lab` | diff --git a/blog/2026-04-29-tutorial-stop-broken-links.mdx b/blog/2026-04-29-tutorial-stop-broken-links.mdx new file mode 100644 index 0000000..0d47183 --- /dev/null +++ b/blog/2026-04-29-tutorial-stop-broken-links.mdx @@ -0,0 +1,120 @@ +--- +slug: tutorial-stop-broken-links-60s +title: "Stop Broken Links in 60s" +sidebar_label: "⚡ 002 - Tutorial: Get Started" +authors: [pythonwoods] +tags: [tutorial, quickstart, python, opensource, devtools, user-tutorials] +date: 2026-04-29T19:00:00 +description: > + Install Zenzic, run your first audit, and protect your documentation + pipeline in under 60 seconds. No setup, no configuration, no build required. +image: https://zenzic.dev/assets/social/social-card.png +--- + +Your docs have broken links. You just haven't found them yet. + +**Zenzic finds them before your readers do** — before you build, before you deploy, +before it's too late. + +{/* truncate */} + +:::tip[New to Zenzic?] +This tutorial gets you from zero to first audit in 60 seconds. After that, +explore how Zenzic was built in [The Zenzic Chronicles →](/blog/hardening-the-documentation-pipeline) +::: + +--- + +## Step 1 — Launch + +No install. No virtual environment. One command: + +```bash title="Terminal" +uvx zenzic check all ./docs +``` + +No browser, no build engine, no heavy framework — a single Python tool cached on +first run and ready in seconds from then on. 
+ +--- + +## Step 2 — Read the Report + +You'll see one of two results: + +**All clear:** + +```text +✨ Sentinel Seal: All checks passed. Your documentation is clean. +``` + +**Issues found:** + + + +Each finding carries a `Zxxx` code, a file path, a line number, and a clear description. +Fix what's flagged, re-run, and ship with confidence. + +--- + +## Step 3 — Protect Your CI + +One line in your GitHub Actions workflow: + +```yaml title=".github/workflows/zenzic.yml" + +- name: Audit documentation + + run: uvx zenzic check all ./docs +``` + +Every pull request is now guarded. Broken links, orphan pages, and leaked credentials +are caught before they reach `main`. + +--- + +## Why Zenzic + +- **Fast** — Zenzic is fast because it's lightweight. No build step, no Node.js, + + no browser launch. Analysis happens directly on your Markdown source files. + +- **Safe** — Zenzic is secure because it doesn't touch your system files. + + Read-only analysis, always. Your repository is observed, never modified. + +- **Universal** — Works with MkDocs, Docusaurus, Zensical, or any plain Markdown folder. + + Point it at your `docs/` directory and it figures out the rest. 
+ +--- + +## Go Further + +| Command | What it does | +|---------|-------------| +| `uvx zenzic check all` | Full audit: links, orphans, credentials, snippets | +| `uvx zenzic check links` | Link integrity only | +| `uvx zenzic score` | Quality score with trend tracking | +| `uvx zenzic check all --format sarif` | SARIF output for GitHub Code Scanning | + +Pin a specific version for reproducible CI: + +```bash title="Terminal" +uvx "zenzic==0.7.0" check all ./docs +``` + +--- + +The full engineering story behind Zenzic — from the first broken pipe to Quartz +Maturity — lives in **[The Zenzic Chronicles →](/blog/hardening-the-documentation-pipeline)** + +For a 1,525-line architectural deep-dive into every Zenzic component — verified by **1,301 tests** across Python 3.11, 3.12, and 3.13 — see the **[Obsidian Masterclass →](/blog/obsidian-masterclass)**. diff --git a/blog/2026-05-07-log-v070-quartz.mdx b/blog/2026-05-07-log-v070-quartz.mdx new file mode 100644 index 0000000..54867c4 --- /dev/null +++ b/blog/2026-05-07-log-v070-quartz.mdx @@ -0,0 +1,134 @@ +--- +slug: log-v070-quartz-maturity +title: "Log: v0.7.0 — Quartz Maturity" +sidebar_label: "📜 Log: v0.7.0" +authors: [pythonwoods] +tags: [release, milestone, engineering-chronicles] +date: 2026-05-07T10:00:00 +description: >- + Patch notes readable in 30 seconds: 4-Gates Standard, Z907 I18N_PARITY, + Multi-Root Discovery, Zero-Config Sovereignty, Virtual Routes, breaking changes, migration path. +image: /img/social-card.png +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +> *Log convention — the terse mirror of `RELEASE.md`. 
For the narrative +> deep-dive, see [Saga V — Beyond the +> Siege](/blog/beyond-the-siege-zenzic-v070-quartz).* + +**TL;DR.** Zenzic v0.7.0 "Quartz Maturity" closes the post-Obsidian arc with +seven epochs: the **4-Gates Standard** (EPOCH 4), the language-agnostic +**Z907 I18N_PARITY** check (EPOCH 5), **Cross-Instance Sovereignty** now +Zero-Config (EPOCH 6), **Multi-Root Discovery** (EPOCH 7a), **Zero-Config +Sovereignty** with `[link_validation]` removed (EPOCH 7a.1), and **Virtual +Routes** with the `zenzic inspect routes` JSON API (EPOCH 7b). +1,485+ tests. Some breaking changes; clean migration path. + +{/* truncate */} + +## Breaking changes + +| From (≤ v0.6.x) | To (v0.7.0) | Migration | +|-----------------|-------------|-----------| +| `engine = "vanilla"` | `engine = "standalone"` | rename in `zenzic.toml` | +| MkDocs plugin (in-tree) | external adapter | drop dependency, use Sentinel CLI | +| `just preflight` | `just verify` | recipe rename — same 4-Gates content | +| Hook id `zenzic-check-all` | `zenzic-verify` | bump `rev:` to `v0.7.0` | +| `[link_validation]` TOML schema | *(removed)* | delete the block — URL prefixes auto-detected | + +No silent deprecation shims. Industry-grade only. + +## EPOCH 4 — The Safe Port (4-Gates Standard) + +A single command runs every quality gate locally: + +```bash +just verify +``` + +Sequence: `pre-commit` → `pytest` (with coverage) → `zenzic check all +--strict` → exit-code parity self-test. The same four gates run in CI; what +passes locally passes in the cloud. + +## EPOCH 5 — Z907 I18N_PARITY + +Language-agnostic translation parity check. Configure in `zenzic.toml`: + +```toml +[i18n] +enabled = true +default_locale = "en" +locales = ["en", "it"] +parity_strict = true +``` + +When `parity_strict = true`, every page in `default_locale` must have a +mirror in every other locale. Missing translations surface as `Z907 +I18N_PARITY` findings. 
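The strict parity rule amounts to a set difference per locale. The sketch below assumes pages are identified by their path relative to each locale root; `missing_translations` is a hypothetical helper and does not reflect how Zenzic computes Z907 internally.

```python
def missing_translations(
    default_pages: set[str],
    locale_pages: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each locale to the default-locale pages it has no mirror for."""
    gaps: dict[str, list[str]] = {}
    for locale, pages in locale_pages.items():
        missing = sorted(default_pages - pages)
        if missing:
            # Each entry here would correspond to one parity finding.
            gaps[locale] = missing
    return gaps


# Hypothetical project: "it" lacks a mirror for guide/setup.mdx.
report = missing_translations(
    {"intro.mdx", "guide/setup.mdx"},
    {"it": {"intro.mdx"}},
)
```

Locales with complete coverage are omitted from the result, so an empty dictionary means every page has a mirror in every configured locale.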
+ +## EPOCH 6 — Cross-Instance Sovereignty (Zero-Config) + +Multi-instance Docusaurus setups (`docs/`, `developers/`, every additional +`@docusaurus/plugin-content-docs` instance) are now fully supported without +manual TOML configuration. `DocusaurusAdapter.get_absolute_url_prefixes(repo_root)` +discovers every plugin's `routeBasePath` via static parsing of +`docusaurus.config.{ts,js,mjs,cjs}` — zero subprocess, zero allowlist, zero +duplication. + +:::note Historical note +An earlier draft of EPOCH 6 shipped a manual `[link_validation].absolute_path_allowlist` +field. That approach was abandoned. The Zero-Config implementation superseded it +entirely in EPOCH 7a.1. +::: + +## EPOCH 7a — Multi-Root Discovery + +The VSM is no longer bounded by `docs_dir`. The Docusaurus adapter auto-detects +the `blog/` plugin via two pure-parsing passes (static regex over config, then +convention fallback). Blog posts are first-class content: broken links inside +`blog/` and cross-tree links from `docs/` to `blog/` are caught by +`zenzic check all --strict`. A Reverse-Mapping invariant test asserts every +blog `Route.source` traces back to a real file on disk. + +## EPOCH 7a.1 — Zero-Config Sovereignty + +The `[link_validation]` TOML schema is **removed**. `LinkValidationConfig` and +`absolute_path_allowlist` are gone from the codebase. Configs that still declare +`[link_validation]` raise a TOML validation error. **Migration:** delete the +block — `DocusaurusAdapter` discovers plugin URL prefixes automatically. + +## EPOCH 7b — Virtual Routes & `zenzic inspect routes` + +Engine-generated pages — tag indexes, paginated blog lists, author profiles — +are now first-class VSM citizens with the Reverse-Mapping Invariant enforced at +construction time. 
Three new finding codes: + +| Code | Level | Trigger | +|------|-------|---------| +| **Z111 VIRTUAL_ROUTE_BROKEN** | Error | docs link targets a tag URL no blog post activates | +| **Z113 AUTHOR_KEY_COLLISION** | Error | duplicate author keys in `authors.yml` | +| **Z114 LARGE_PAGINATION_SET** | Info | pagination set exceeds 200 pages | + +New CLI command: + +```bash +zenzic inspect routes [--kind physical|virtual|all] [--json] +``` + +Exports the complete site map as deterministic JSON with per-route `url`, `kind`, +`source_files` (repo-relative POSIX), and `digest`. When `--json` is active, +`stdout` is exclusively valid JSON — no ANSI codes, no banners. + +## Migration path + +The full migration matrix lives in +[`RELEASE.md`](https://github.com/PythonWoods/zenzic/blob/main/RELEASE.md#breaking-changes) +under "Breaking changes". One pass through the table is usually enough. + +## Saga deep-dive + +For the philosophy, the post-mortem of the AI-driven siege, and the +engineering choices behind Quartz Maturity, see [**Saga V — Beyond the +Siege**](/blog/beyond-the-siege-zenzic-v070-quartz). diff --git a/blog/2026-05-07-v070-quartz-maturity-stable.mdx b/blog/2026-05-07-v070-quartz-maturity-stable.mdx new file mode 100644 index 0000000..b4a3a92 --- /dev/null +++ b/blog/2026-05-07-v070-quartz-maturity-stable.mdx @@ -0,0 +1,306 @@ +--- +slug: zenzic-v070-quartz-maturity-stable +title: "Quartz Maturity" +sidebar_label: "🛡️ 005 - Saga V: Quartz Maturity" +authors: [pythonwoods] +tags: [release, engineering, python, opensource, obsidian-chronicles, engineering-chronicles] +date: 2026-05-07T10:00:00 +description: > + Zenzic v0.7.0 is stable. The paradigm shift: a file is an orphan not when it's + missing from disk, but when it's invisible to the human eye. UX-Discoverability, + 20 Lab Acts, SARIF, and the Safe Harbor — all in one release. 
+image: https://zenzic.dev/assets/social/social-card.png +--- + +:::info[🛡️ The Zenzic Chronicles — Complete] + +The complete six-part engineering saga of Zenzic's journey from v0.5 Sentinel to v0.7.0 Quartz Maturity. The Chronicles are sealed. + +[Saga I](/blog/hardening-the-documentation-pipeline) | [Saga II](/blog/docs-pipeline-security-risk-obsidian-bastion) | [Saga III](/blog/ai-driven-siege-shield-postmortem) | [Saga IV](/blog/beyond-the-siege-zenzic-v070-quartz) | **Saga V** | [Saga VI](/blog/governance-of-quartz) + +::: + +During the final consolidation sprint for v0.7.0, we ran four AI agents against Zenzic's +own documentation site. They were instructed to find everything wrong with it. + +{/* truncate */} + +They found a file that Zenzic had marked **REACHABLE**. The file existed on disk, had +valid frontmatter, and was not referenced by any broken link. By every metric in the +pre-v0.7.0 implementation, it was clean. + +The file had not appeared in the sidebar for three weeks. No navbar entry linked to it. +No footer item referenced it. A user navigating the documentation site had no way to +reach it by clicking. It was, from the reader's perspective, gone. + +Zenzic was asking the wrong question. + +## The Wrong Question + +Traditional documentation linters ask: *does the file exist on disk?* + +That is the correct question for a filesystem validator. It is the wrong question for a +documentation integrity tool. Documentation is not a filesystem. It is an experience +delivered through navigation surfaces — sidebar menus, navbar links, footer references. +A file that exists on disk but appears on none of those surfaces is, for all practical +purposes, invisible to every reader who visits your site. + +Zenzic v0.7.0 asks a different question: + +> *Can a user reach this file by clicking through the navigation of your documentation site?* + +If the answer is no, the file is an orphan — regardless of whether it exists on disk. 
+ +This is the **UX-Discoverability Law**. + +## Rule R21: The UX-Discoverability Law + +The formal rule, added to Zenzic's rule registry in D090: + +> **R21 — UX-Discoverability**: A file is REACHABLE if and only if at least one +> user-clickable navigation surface declares it. A file absent from all navigation +> surfaces is ORPHAN_BUT_EXISTING, regardless of its presence on disk. + +For Docusaurus projects, "navigation surfaces" means exactly three sources: + +| Surface | Config location | Detection method | +| :--- | :--- | :--- | +| **Sidebar** | `sidebars.ts` / `sidebars.js` | `type: 'doc'` entries and bare string IDs | +| **Navbar** | `themeConfig.navbar.items` | `to:` URL paths, `docId:` references | +| **Footer** | `themeConfig.footer.links` | `to:` URL paths | + +Zenzic parses all three statically — no Node.js, no build step, no external process. +The implementation is pure Python, respects Pillar 2 (zero subprocesses), and handles +both `.ts` and `.js` sidebar files. JS-style comments are stripped before parsing. +`baseUrl` and `routeBasePath` prefixes are normalised before path resolution. + +A single `frozenset[str]` is returned to the Core, which applies the orphan rule +uniformly across all engines. The Core knows nothing about sidebars, navbars, or +footers. It receives a set of navigable paths and compares them against disk contents. +That is the entire contract. + +## The Three-Surface Parser in Practice + +Given a project with an explicit `sidebars.ts` that deliberately omits `changelog.mdx`: + +```typescript +// sidebars.ts — changelog is not listed +const sidebars = { + docs: ['intro', 'guide/index', 'guide/deploy'], +}; +export default sidebars; +``` + +And a `docusaurus.config.ts` that links it from the navbar: + +```typescript +themeConfig: { + navbar: { + items: [ + { to: '/docs/changelog', label: 'Changelog', position: 'right' }, + ], + }, +}, +``` + +Pre-v0.7.0: `changelog.mdx` → **ORPHAN_BUT_EXISTING** (not in sidebar). 
+ +Post-v0.7.0: `changelog.mdx` → **REACHABLE** (navbar declares it). + +The same logic applies to footer-only files. A legal notice linked only in the footer +has been invisible to Zenzic's orphan detector until now. In v0.7.0, it is REACHABLE +because a user can reach it by clicking. + +## Architectural Philosophy: A Different Axis + +Documentation quality tools exist on a spectrum. Most operate on one axis — link +existence — and stop there. Zenzic operates on a different axis entirely: **trust**. + +`markdown-link-check` and `htmlproofer` are correct tools for what they declare. They +are link checkers — fast, composable, well-understood. Zenzic is not a link checker +that also happens to check orphans and scan for secrets. It is a **Static Analysis +Framework** built on a fundamentally different trust model. Comparing them on a +feature checklist alone is like comparing a compiler and a formatter because both +read source files. + +The meaningful comparison is architectural: + +| Design Dimension | Generic Link Checkers | Zenzic Sentinel Guard | +| :--- | :--- | :--- | +| **Trust Model** | Trusts the source content | **Zero-Trust** — every file is untrusted input | +| **Site Awareness** | Filesystem-only | **Virtual Site Map** — engine-aware projection | +| **Security Layer** | None | **Shield** (9 secret families) + **Blood Sentinel** (path traversal) | +| **CI/CD Footprint** | Requires build step or Node.js | **Pure Python**, subprocess-free, `uvx`-ready | +| **Diagnostic System** | Free-form messages | **Zxxx Registry** — traceable, filterable codes | +| **Orphan Detection** | Not in scope | **R21 Law** — nav-surface awareness | +| **Output Format** | Text / exit code | Text, JSON, **SARIF 2.1.0** | + +The choice of which tool to run depends on which question you need to answer. If the +question is *"do these links resolve?"*, a link checker is the right tool. 
If the +question is *"is this documentation safe, complete, and navigable?"*, the trust model +matters — and Zero-Trust is the only honest answer when documentation is produced by +teams, contributors, or automated agents you do not fully control. + +## The Safe Harbor: What v0.7.0 Completes + +The term "Safe Harbor" has been in Zenzic's vocabulary since the +[first engineering post](/blog/hardening-the-documentation-pipeline). This is what it +means in v0.7.0: + +| Pillar | Guarantee | +| :--- | :--- | +| **Engine parity** | Z404 config asset checking in Docusaurus, MkDocs, and Zensical | +| **Core purity** | `validator.py` contains zero engine-name references (Purity Protocol) | +| **Zero-trust input** | Documentation treated as untrusted input; Shield operates on every scan | +| **Sovereign root** | `zenzic.toml` follows the target, not the caller — monorepo-safe | +| **SARIF integration** | All findings in GitHub Code Scanning format (`--format sarif`) | +| **Diagnostic traceability** | Every finding carries a Zxxx code with severity, message, and fix | +| **Verified test surface** | 1,485+ passing tests, mutant-tested boundaries, cross-platform CI | +| **UX-Discoverability** | Navbar + footer harvesting — orphan detection sees what readers see | + +This is not a list of aspirational features. Each row has a test class, a CHANGELOG +entry, and a decision record documenting the architectural choice that makes it true. + +## SARIF: Documentation Quality in Your Security Dashboard + +Starting with v0.7.0, every `zenzic check` command supports `--format sarif`: + +```bash +zenzic check all ./docs --format sarif > results.sarif +``` + +The output is valid +[SARIF 2.1.0](https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html), +consumable directly by GitHub Code Scanning. 
Add this to your CI workflow: + +```yaml + +- name: Run Zenzic + + run: uvx zenzic check all ./docs --format sarif > zenzic.sarif + +- name: Upload SARIF + + uses: github/codeql-action/upload-sarif@v3 + with: + sarif_file: zenzic.sarif +``` + +Your documentation findings — broken links, orphan pages, credential fragments, +missing assets — appear in the **Security** tab of your repository, tracked alongside +code vulnerabilities. Findings are filterable by Zxxx code, assignable, and closeable +through the same workflow as any other security advisory. + +The documentation pipeline is now a first-class citizen of your security posture. + +## The Purity Protocol: Zero Engine Leaks in Core + +One invariant that emerged from the consolidation and defines the v0.7.0 architecture: +`validator.py` — the heart of Zenzic — contains no reference to any engine by name. +No "docusaurus", no "sidebar", no "navbar". The Core receives a `frozenset[str]` of +navigable paths from `adapter.get_nav_paths()`. What lives inside that method is the +adapter's problem, not the Core's. + +```python +# validator.py — the entire engine interface +nav_paths = adapter.get_nav_paths() # frozenset[str] | frozenset() +``` + +For Docusaurus, `get_nav_paths()` is a Multi-Source Harvester: it merges sidebar IDs, +navbar `to:` paths, and footer `to:` paths into a single frozenset before returning it. +The Core never sees the difference. For MkDocs, the same method returns `nav:` entries. +For Zensical, it returns `nav:` entries. For Standalone, it returns `frozenset()`. + +One interface. Four adapters. Zero engine leaks in Core. + +Adding a new adapter that modifies `validator.py` is a protocol violation. The +adapter contract is the boundary — everything engine-specific must live behind it. + +## The Zenzic Lab: 20 Acts + +The Lab is the fastest way to understand what Zenzic does. 
Run it now: + +```bash +uvx zenzic lab +``` + +Twenty interactive Acts, each demonstrating a distinct capability against +bundled fixture projects: + +| Acts | Coverage | +| :--- | :--- | +| Acts 0–2 | MkDocs foundations — linter demo, gold standard, broken docs | +| Act 3 | Shield — credential exposure (security breach, exit 2) | +| Acts 4–5 | Scoped targets — single file, custom directory | +| Act 6 | Zensical — transparent proxy (SENTINEL bridge) | +| Act 7 | Docusaurus v3 enterprise — versioning, @site/ aliases, i18n | +| Act 8 | Standalone Mode — full scan, zero nav contract | +| Acts 9–10 | Config asset guards — Z404 (MkDocs + Zensical) | +| Acts 11–12 | OS security — Unix path traversal, Windows path integrity | +| Acts 13–14 | Rules deep-dive — link graph stress, Shield obfuscation | +| Acts 15–16 | Quality rules — SEO coverage (Z401/Z402), quality gate (Z501/Z503) | +| Acts 17–18 | Quality scoring — penalty scorer, score regression scenarios | +| Act 19 | The Base64 Shadow — encoded credential detection | + +No configuration required. No project to set up. The Lab runs against bundled +fixture projects — no temporary files, no teardown required. The entire experience +runs in under 90 seconds on a cold start. + +## The Documentation of the Documentation Tool + +One constraint that emerged during the Diátaxis restructure of `zenzic.dev`: a +documentation tool that ships with poorly organised documentation is not a credible +authority on documentation quality. + +The site now follows the [Diátaxis framework](https://diataxis.fr/) across four modes: + +- **Tutorials** — step-by-step learning paths for new users +- **How-To Guides** — task-oriented instructions for specific problems +- **Reference** — the complete Zxxx diagnostic registry, CLI interface, and engine specs +- **Explanation** — the architectural decisions, security model, and design philosophy + +Every URL changed in the restructure. 
The Sovereign Root Protocol found all three +`README.md` links pointing at the old paths before any user reported them. The tool +caught its own documentation drift. + +## Get Started in 30 Seconds + +```bash +uvx zenzic lab +``` + +No installation required. `uvx` resolves and runs Zenzic from PyPI in a temporary +environment. The Lab will walk you through every capability interactively. + +For a permanent installation: + +```bash +uv add --dev zenzic +# or +pip install zenzic +``` + +Then run a check against your documentation: + +```bash +zenzic check all ./docs +``` + +--- + +| | | +|---|---| +| **GitHub** | [github.com/PythonWoods/zenzic](https://github.com/PythonWoods/zenzic) | +| **Documentation** | [zenzic.dev](https://zenzic.dev/) | +| **PyPI** | [pypi.org/project/zenzic](https://pypi.org/project/zenzic/) | +| **Changelog** | [v0.7.0 Release Notes](https://github.com/PythonWoods/zenzic/releases/tag/v0.7.0) | + +:::note[The Zenzic Chronicles] +This is **Part 5** of a five-part engineering series documenting the path from v0.5 to v0.7.0 Stable. + +[Part 1 — The Sentinel](/blog/hardening-the-documentation-pipeline) · [Part 2 — Sentinel Bastion](/blog/docs-pipeline-security-risk-obsidian-bastion) · [Part 3 — The AI Siege](/blog/ai-driven-siege-shield-postmortem) · [Part 4 — Beyond the Siege](/blog/beyond-the-siege-zenzic-v070-quartz) · **Part 5 — Quartz Maturity** +::: + +*Part 5 of the **Zenzic Chronicles**. 
For the complete architectural journey, visit the [Safe Harbor Blog](https://zenzic.dev/blog/).* + +*The 1,525-line [Obsidian Masterclass](/blog/obsidian-masterclass) covers every component in depth — verified by 1,485+ tests across Python 3.10 and 3.14.* diff --git a/blog/authors.yml b/blog/authors.yml new file mode 100644 index 0000000..ab551f2 --- /dev/null +++ b/blog/authors.yml @@ -0,0 +1,8 @@ +pythonwoods: + name: PythonWoods + title: Creator of Zenzic + url: https://github.com/PythonWoods + image_url: /img/pythonwoods-logo.svg + page: true + socials: + github: PythonWoods diff --git a/blog/tags.yml b/blog/tags.yml new file mode 100644 index 0000000..6ddbfe5 --- /dev/null +++ b/blog/tags.yml @@ -0,0 +1,100 @@ +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 +# +# Zenzic Blog — Semantic Tag Registry +# Colours are keyed to the Sentinel Palette and the CLI UIPalette. + +release: + label: "Release" + permalink: /release + description: "🚀 Stable version announcements and release notes." + +security: + label: "Security" + permalink: /security + description: "🛡️ Shield updates, Blood Sentinel findings, and security advisories." + +engineering: + label: "Engineering" + permalink: /engineering + description: "⚙️ Technical deep dives into Zenzic architecture and design." + +community: + label: "Community" + permalink: /community + description: "🤝 Brand, Diátaxis, and contribution milestones." + +post-mortem: + label: "Post-Mortem" + permalink: /post-mortem + description: "💀 Failure reports, siege recaps, and lessons learned." + +python: + label: Python + permalink: /python + description: "Posts about Python internals, patterns, and ecosystem." + +opensource: + label: Open Source + permalink: /opensource + description: "Open source practices, licensing, and community building." + +devtools: + label: DevTools + permalink: /devtools + description: "Developer tooling, CI/CD, and automation." 
+ +markdown: + label: Markdown + permalink: /markdown + description: "Markdown authoring, linting, and documentation pipelines." + +obsidian-chronicles: + label: "The Zenzic Chronicles" + permalink: /obsidian-chronicles + description: "The five-part engineering saga from v0.5 Sentinel to v0.7.0 Quartz Maturity." + +milestone: + label: "Milestone Record" + permalink: /milestone + description: "Historical records of alpha and release candidate milestones on the path to v0.7.0." + +tutorial: + label: "Tutorial" + permalink: /tutorial + description: "Step-by-step guides for getting started with Zenzic." + +quickstart: + label: "Quickstart" + permalink: /quickstart + description: "Fast-path getting-started content for new users." + +engineering-chronicles: + label: "Engineering Chronicles" + permalink: /chronicles + description: "The five-part engineering saga: from the Leaking Pipe to Quartz Maturity." + +user-tutorials: + label: "User Tutorials" + permalink: /tutorials + description: "Practical guides for getting started with Zenzic — zero configuration required." + +governance: + label: "Governance" + permalink: /governance + description: "Architectural constitution, Evolution Policy, and Sovereignty Oath — the laws that protect the Safe Harbor." + +sovereignty: + label: "Sovereignty" + permalink: /sovereignty + description: "Zero Residue, reversible design, and the Sovereignty Oath: why Zenzic is a sentinel, not a chain." + +obsidian-maturity: + label: "Quartz Maturity" + permalink: /obsidian-maturity + description: "🛡️ Codename for Zenzic v0.7.0 — the Safe Harbor reaches production maturity." + +engineering-culture: + label: "Engineering Culture" + permalink: /engineering-culture + description: "🧠 Engineering philosophy, team practices, and development methodology." 
diff --git a/docs/community/developers/_category_.json b/developers/_category_.json similarity index 100% rename from docs/community/developers/_category_.json rename to developers/_category_.json diff --git a/docs/community/contribute/_category_.json b/developers/contribute/_category_.json similarity index 100% rename from docs/community/contribute/_category_.json rename to developers/contribute/_category_.json diff --git a/docs/community/contribute/index.mdx b/developers/contribute/index.mdx similarity index 93% rename from docs/community/contribute/index.mdx rename to developers/contribute/index.mdx index 031d90d..36ad272 100644 --- a/docs/community/contribute/index.mdx +++ b/developers/contribute/index.mdx @@ -25,6 +25,7 @@ In this section, we guide you through our processes.
-   + __Something is not working?__ --- @@ -36,6 +37,7 @@ In this section, we guide you through our processes. [Report a bug][report a bug] -   + __Missing information in our docs?__ --- @@ -48,6 +50,7 @@ In this section, we guide you through our processes. [Report a docs issue][report a docs issue] -   + __Want to submit an idea?__ --- @@ -65,6 +68,7 @@ In this section, we guide you through our processes.
-   + __Want to contribute to the code?__ --- @@ -99,20 +103,25 @@ nice and constructive, complying with our [Code of Conduct]. ### Before creating an issue - Are you using the appropriate issue template, or is there another one that + better fits the context of your request? - Have you checked if a similar bug report or change request has already been + created, or have you stumbled upon something that might be related? - Did you fill out every field as requested, and did you provide all additional + information we maintainers need to comprehend your request? ### Before commenting - Is your comment relevant to the topic of the current issue, or is it a better + idea to create a new issue, as it's not or only loosely related? - Does your comment add value to the conversation? Is it constructive and + respectful to our community and us maintainers? Could you just use a [reaction] instead? @@ -134,14 +143,14 @@ as follows:__ ### Incomplete issues -We _reserve the right to close issues lacking essential information_, such as +We *reserve the right to close issues lacking essential information*, such as missing reproductions or those not adhering to the quality standards and requirements specified in our issue templates. We'll reopen an issue once the missing information has been provided. ### Questions as issues -We _reserve the right to close questions opened as any kind of issue_. The +We *reserve the right to close questions opened as any kind of issue*. The issue tracker is not a place for questions, but rather for detailed [bug reports], [documentation issues], and [change requests] that adhere to the quality standards laid out in this guide. @@ -149,12 +158,12 @@ quality standards laid out in this guide. ### Duplicated issues To maintain organized and efficient communication within our [issue tracker], -we _reserve the right to close any duplicated issues_. +we *reserve the right to close any duplicated issues*. 
### Reopened issues -We further _reserve the right to immediately close issues that are reopened -without providing new information_. +We further *reserve the right to immediately close issues that are reopened +without providing new information*. [reaction]: https://github.blog/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/ [issue tracker]: https://github.com/PythonWoods/zenzic/issues diff --git a/docs/community/contribute/pull-requests.mdx b/developers/contribute/pull-requests.mdx similarity index 81% rename from docs/community/contribute/pull-requests.mdx rename to developers/contribute/pull-requests.mdx index 597f98d..b7da742 100644 --- a/docs/community/contribute/pull-requests.mdx +++ b/developers/contribute/pull-requests.mdx @@ -102,6 +102,7 @@ just test-full # or HYPOTHESIS_PROFILE=ci pytest ``` + ::: :::note[End users vs contributors] @@ -156,7 +157,7 @@ cryptographically signed. Follow the instructions on GitHub for using [gpg], ## Developer certificate of origin To ensure the legal integrity of our project, we require all contributors to -_sign off_ on their commits, thus accepting the Developer Certificate of Origin. +*sign off* on their commits, thus accepting the Developer Certificate of Origin. This certifies that you have the right to submit the code under the project's license. @@ -166,6 +167,57 @@ Add a `Signed-off-by` line to every commit using the `-s` flag: git commit -s -m ": (#)" ``` +## REUSE 3.3 — Copyright headers + +This project enforces [REUSE 3.3](https://reuse.software/spec/) compliance via a +pre-commit hook. Every source file must carry an SPDX copyright header. + +### Single-author file (default) + +```text +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 +``` + +### Multi-author file — append, never overwrite + +If you contribute to a file that already has a copyright header, **append** your +own `SPDX-FileCopyrightText` line on a new line immediately below the existing +one. 
Never replace or remove the original author's line: + +```text +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-FileCopyrightText: 2026 Your Name +# SPDX-License-Identifier: Apache-2.0 +``` + +The `SPDX-License-Identifier` line stays last and appears only once per file. + +:::note[Zenzic Shield and copyright lines] + +Zenzic's normalizer skips `SPDX-FileCopyrightText` comment lines during +word-count checks (Z502) — they are metadata, not prose. The Shield (Z201) +does not trigger on these lines either, because copyright email addresses +are structurally distinct from credential patterns. + +::: + +### MDX / JSX files + +Use MDX comment syntax: + +```mdx +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-FileCopyrightText: 2026 Your Name */} +{/* SPDX-License-Identifier: Apache-2.0 */} +``` + +### Files that cannot carry inline headers + +For binary files, generated assets, or formats with no comment syntax, add an +entry to `REUSE.toml` at the repository root instead of adding inline headers. +The pre-commit hook validates both inline headers and `REUSE.toml` entries. + ## Use of Generative AI AI-assisted coding can be useful, but the unreflected inclusion of AI-generated diff --git a/docs/community/contribute/report-a-bug.mdx b/developers/contribute/report-a-bug.mdx similarity index 99% rename from docs/community/contribute/report-a-bug.mdx rename to developers/contribute/report-a-bug.mdx index d698e4b..64e6e77 100644 --- a/docs/community/contribute/report-a-bug.mdx +++ b/developers/contribute/report-a-bug.mdx @@ -1,7 +1,9 @@ --- icon: lucide/bug tags: + - Community + sidebar_label: "Bug Reports" description: "How to report bugs effectively with reproduction steps." --- @@ -41,9 +43,11 @@ Only bugs that occur in the latest version of Zenzic will be addressed. Before creating a bug report, do some research: 1. [Search our documentation][Search our documentation] and look for sections + related to your problem. 2. 
[Search our issue tracker][issue tracker], as another user might already + have reported the same problem. __Keep track of all search terms and relevant links; you'll need @@ -95,6 +99,7 @@ Provide a clear, focused, and concise summary of the bug. Adhere to the following principles: - __Explain the what, not the how__ – focus on the problem + and its impact, not how to reproduce it. - __Keep it short and concise__ – one or two sentences is ideal. diff --git a/docs/community/contribute/report-a-docs-issue.mdx b/developers/contribute/report-a-docs-issue.mdx similarity index 99% rename from docs/community/contribute/report-a-docs-issue.mdx rename to developers/contribute/report-a-docs-issue.mdx index fa191d0..7883d31 100644 --- a/docs/community/contribute/report-a-docs-issue.mdx +++ b/developers/contribute/report-a-docs-issue.mdx @@ -1,7 +1,9 @@ --- icon: lucide/file-pen-line tags: + - Community + sidebar_label: "Documentation Issues" description: "How to report documentation errors and suggest improvements." --- @@ -50,6 +52,7 @@ describe the severity of the issue: - __Keep it short__ – one or two sentences is ideal. - __One issue at a time__ – create separate issues for unrelated + inconsistencies. > __Why we need this__: describing the problem clearly is a prerequisite for diff --git a/docs/community/contribute/request-a-change.mdx b/developers/contribute/request-a-change.mdx similarity index 99% rename from docs/community/contribute/request-a-change.mdx rename to developers/contribute/request-a-change.mdx index 1a7db38..371bf12 100644 --- a/docs/community/contribute/request-a-change.mdx +++ b/developers/contribute/request-a-change.mdx @@ -1,7 +1,9 @@ --- icon: lucide/hand-platter tags: + - Community + sidebar_label: "Change Requests" description: "How to propose new features or changes to Zenzic." --- @@ -88,6 +90,7 @@ Provide a detailed and precise description of your idea. Explain why it is relevant to Zenzic specifically. 
- **Explain the what, not the why** – focus on describing the + proposed change precisely. Benefits belong in [Use cases]. - **Keep it short** – be brief and to the point. diff --git a/docs/community/developers/explanation/_category_.json b/developers/explanation/_category_.json similarity index 100% rename from docs/community/developers/explanation/_category_.json rename to developers/explanation/_category_.json diff --git a/developers/explanation/adr-agnostic-universalism.mdx b/developers/explanation/adr-agnostic-universalism.mdx new file mode 100644 index 0000000..4273298 --- /dev/null +++ b/developers/explanation/adr-agnostic-universalism.mdx @@ -0,0 +1,96 @@ +--- +sidebar_label: "ADR 005: Agnostic Universalism" +sidebar_position: 2 +description: "ADR 005: Z404 CONFIG_ASSET_MISSING extended to all supported engines — not just Docusaurus." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 005: Agnostic Universalism — Z404 for All Engines + +**Status:** Active +**Decider:** Architecture Lead +**Date:** 2026-04-20 (v0.7.0 sprint) + +--- + +## Context + +Z404 (`CONFIG_ASSET_MISSING`) was originally implemented exclusively for the +Docusaurus adapter. It detected when a file declared in `docusaurus.config.ts` +(favicon, Open Graph image, custom CSS) did not exist on disk. + +This created a structural contradiction: Zenzic's public claim is that it is a +**Safe Harbor for all documentation engines**. Offering a config-asset integrity +check only to Docusaurus users violated this claim. An MkDocs project declaring +a `theme.favicon` that pointed to a non-existent file would receive no +diagnostic — a silent gap in the safety perimeter. + +--- + +## Decision + +Z404 was extended from a Docusaurus-only check to a **universal check** covering +all supported engines. Each adapter declares which config assets it owns, and the +core engine invokes `check_config_assets()` uniformly across all adapters. 
+ +| Engine | Assets checked | +|--------|---------------| +| Docusaurus | `customCss`, `favicon`, Open Graph `image`, social card paths in `themeConfig` | +| MkDocs | `theme.favicon`, `theme.logo` (resolved relative to `docs_dir/`) | +| Zensical | `[project].favicon`, `[project].logo` | +| Standalone | — (no engine config file; check is a no-op) | + +--- + +## Rationale + +### 1. Safe Harbor Means All Ports + +A harbor that is safe only for Docusaurus ships is not a Safe Harbor — it is a +**branded harbor with a size restriction**. The moment a check is engine-specific +without a technical reason, it signals to contributors that engine parity is +optional. That signal compounds over releases. + +### 2. The Adapter Protocol Already Provides the Hook + +`BaseAdapter` already defined `check_config_assets()` as an optional method. The +universalism decision was not an architectural change — it was the **activation of +an already-present architectural contract**. Every adapter already had the +infrastructure; only the MkDocs and Zensical implementations were missing. + +### 3. Preventing the "Trusted Config" Assumption + +The implicit assumption that engine configuration files contain valid asset paths +is the same category of trust error that Zenzic was designed to eliminate. A +`theme.favicon: assets/icon.png` that doesn't exist is a broken link — it just +happens to live in a YAML file rather than a Markdown document. + +--- + +## Implementation + +Each adapter's `check_config_assets()` method: + +1. Reads the engine config file (one-time I/O, not in a hot loop). +2. Resolves each declared asset path against `docs_root`. +3. Emits a `Finding` with code `Z404` for each path that does not exist on disk. + +The core `check_all()` pipeline calls `adapter.check_config_assets()` after the +per-file scan phase, ensuring Z404 findings appear in the same SARIF report and +exit-code accounting as all other findings. 
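
The three implementation steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Zenzic's actual code: the `Finding` dataclass, the function signature, and the pre-extracted `declared_assets` list are hypothetical. Only the `Z404` code and the "resolve against `docs_root`, report what is missing" behaviour come from this ADR.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; the real Zenzic type may differ."""

    code: str
    path: str
    message: str


def check_config_assets(docs_root: Path, declared_assets: list[str]) -> list[Finding]:
    """Return one Z404 finding per declared asset that is missing on disk.

    `declared_assets` stands in for step 1 (reading the engine config);
    parsing the config file itself is out of scope for this sketch.
    """
    findings: list[Finding] = []
    for asset in declared_assets:
        # Step 2: resolve the declared path against the docs root.
        resolved = docs_root / asset
        # Step 3: emit Z404 when the resolved path does not exist.
        if not resolved.exists():
            findings.append(
                Finding(
                    code="Z404",
                    path=asset,
                    message=f"config asset not found: {asset}",
                )
            )
    return findings
```

Keeping config parsing outside the function is a choice made for the sketch: it lets each adapter supply its own list of declared assets while the existence check stays identical across engines.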
+ +--- + +## Consequences + +- MkDocs and Zensical users gain asset integrity validation without any config change. +- Adding a new engine adapter requires implementing `check_config_assets()` — the + + protocol now enforces this explicitly (a `NotImplementedError` is raised for + adapters that skip it). + +- Z404 is now classified as a **universal quality check**, not an engine-specific + + feature, in `reference/finding-codes.mdx`. diff --git a/developers/explanation/adr-bilingual-structural.mdx b/developers/explanation/adr-bilingual-structural.mdx new file mode 100644 index 0000000..9955aa9 --- /dev/null +++ b/developers/explanation/adr-bilingual-structural.mdx @@ -0,0 +1,153 @@ +--- +sidebar_label: "ADR 008: Bilingual Structural Invariant" +sidebar_position: 9 +description: "ADR 008: Atomic filesystem parity between the English source tree and its Italian mirror — the Symmetry Guardrail." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 008: Bilingual Structural Invariant — The Symmetry Guardrail + +**Status:** Active +**Decider:** Architecture Lead +**Date:** 2026-04-20 (v0.7.0 sprint, D045 — Diátaxis Migration) + +--- + +## Context + +Zenzic.dev is a bilingual documentation site. English (`docs/`) is the +authoritative source; Italian (`i18n/it/docusaurus-plugin-content-docs/current/`) +is the translation mirror. Docusaurus's language switcher resolves Italian pages +by **mirroring the English filesystem path**: a user on +`/docs/reference/finding-codes` switches to `/it/docs/reference/finding-codes` — +and Docusaurus serves the file at the corresponding path in the `i18n/it/` tree. + +During the v0.7.0 Diátaxis migration (D045), 29 English files were renamed and +moved to align with the four-quadrant structure. Several Italian files were not +moved atomically in the same commit. 
The result: the language switcher produced +**404 errors** on pages where the English file had been moved but the Italian +mirror had not. + +This class of bug is particularly insidious because: + +1. **No build-time error is produced.** `onBrokenLinks: 'throw'` only detects + + internal `[text](link)` references — it does not validate language switcher + paths. + +2. **The bug is invisible in development mode.** `npm run start` serves a single + + locale. The switcher is inactive. The 404 only appears in `just build` output + when both locales are built simultaneously. + +3. **The time-to-detection window is long.** A missing IT file discovered three + + commits after the EN rename requires a forensic git blame to trace — the + coupling between the two moves is no longer visible in the history. + +--- + +## Decision + +> **Every structural change to `docs/` must be applied atomically to +> `i18n/it/docusaurus-plugin-content-docs/current/` in the same commit.** + +This is not a recommendation — it is a hard invariant. Three specific rules +follow from it: + +### Rule 1 — Atomic Moves + +Any `git mv` applied to a file in `docs/` must be accompanied by a corresponding +`git mv` in the Italian mirror **in the same commit**. A rename in English is a +rename in Italian. + +```bash +# Correct — both moves in one commit +git mv docs/guides/intro.mdx docs/tutorials/intro.mdx +git mv i18n/it/docusaurus-plugin-content-docs/current/guides/intro.mdx \ + i18n/it/docusaurus-plugin-content-docs/current/tutorials/intro.mdx +git commit -m "refactor(docs): move intro to tutorials quadrant (EN + IT)" +``` + +### Rule 2 — Slug Parity + +If a `slug:` value is changed in an English file, it must be changed identically +in the corresponding Italian file. A diverged `slug:` causes the language +switcher to produce a 404, with no build-time warning. 
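
Rule 2 lends itself to a mechanical check. The sketch below is an illustration only: the helper names `read_slug` and `slug_mismatches` are hypothetical, and the frontmatter handling is a naive line scan rather than a full YAML parser, so it assumes simple one-line `slug:` entries.

```python
from pathlib import Path
from typing import Optional


def read_slug(text: str) -> Optional[str]:
    """Return the `slug:` value from leading frontmatter, or None if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep and key.strip() == "slug":
            return value.strip().strip("'\"")
    return None


def slug_mismatches(en_root: Path, it_root: Path) -> list[str]:
    """List relative .mdx paths whose slug differs between EN and IT."""
    mismatched: list[str] = []
    for en_file in sorted(en_root.rglob("*.mdx")):
        rel = en_file.relative_to(en_root)
        it_file = it_root / rel
        if not it_file.is_file():
            # A missing mirror is a structural asymmetry, not a slug problem.
            continue
        en_slug = read_slug(en_file.read_text(encoding="utf-8"))
        it_slug = read_slug(it_file.read_text(encoding="utf-8"))
        if en_slug != it_slug:
            mismatched.append(rel.as_posix())
    return mismatched
```

A file pair where neither side declares a `slug:` counts as matching, since both fall back to the filesystem path.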
+ +### Rule 3 — Symmetry Validation Before Every Commit + +Before committing any change that touches the filesystem structure (renames, +additions, deletions), the following command must exit 0: + +```bash +diff \ + <(find docs -name "*.mdx" | sed 's|^docs/||' | sort) \ + <(find i18n/it/docusaurus-plugin-content-docs/current \ + -name "*.mdx" | \ + sed 's|^i18n/it/docusaurus-plugin-content-docs/current/||' | sort) +``` + +Any output from this command represents a structural asymmetry that **will** +produce a 404 on the Italian language switcher. + +--- + +## Rationale + +### 1. Italian is a First-Class Citizen + +The Italian documentation is not a secondary asset or a "nice to have". It is +part of the Safe Harbor contract. A link that works in English but 404s in +Italian is a **structural failure** of the documentation system — equivalent to +a broken internal link in the English tree. + +### 2. The Language Switcher Has No Safety Net + +Docusaurus's `onBrokenLinks: 'throw'` does not cover language switcher paths. +This means the only safeguard is the contributor discipline enforced by this ADR. +There is no build-time backstop. + +### 3. Git History Coherence + +An atomic commit that moves both EN and IT files creates a **coherent history +unit**: the rename is a single, reversible operation. Split commits create +history noise and make bisect unreliable when investigating regressions. + +--- + +## Invariants (Non-Negotiable) + +- The symmetry `diff` command must exit 0 before any commit that modifies the + + filesystem structure of `docs/` or `i18n/it/`. + +- New files added to `docs/` must have a corresponding stub added to `i18n/it/` + + **in the same commit** — even if the Italian content is a copy of the English + until a translation is provided. + +- The pre-commit hook (`pre-commit-config.yaml`) enforces symmetry at the gate. + + Bypassing it with `--no-verify` on a structural commit is a Class 1 violation + (Technical Debt). 
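
Where a POSIX shell is not available, the symmetry check from Rule 3 could be expressed in Python instead. This is a sketch, not the project's actual pre-commit hook: the names `mdx_paths` and `symmetry_gaps` are hypothetical, and the two root constants simply repeat the paths named in this ADR, assuming the script runs from the repository root.

```python
from pathlib import Path

# Tree roots as named in this ADR (assumed relative to the repository root).
EN_ROOT = Path("docs")
IT_ROOT = Path("i18n/it/docusaurus-plugin-content-docs/current")


def mdx_paths(root: Path) -> set[str]:
    """Collect .mdx paths under `root`, relative to it, in POSIX form."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*.mdx")}


def symmetry_gaps(en_root: Path, it_root: Path) -> tuple[list[str], list[str]]:
    """Return (missing_in_it, missing_in_en), each sorted alphabetically."""
    en = mdx_paths(en_root)
    it = mdx_paths(it_root)
    return sorted(en - it), sorted(it - en)
```

Calling `symmetry_gaps(EN_ROOT, IT_ROOT)` and exiting non-zero when either list is non-empty would mirror the "must exit 0" contract of the `diff` command.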
+ +--- + +## Consequences + +- Every contributor who renames or moves a documentation file must be aware of + + the Italian mirror — this is a non-optional part of the contribution workflow + documented in `CONTRIBUTING.md`. + +- The `just preflight` recipe (`uvx pre-commit run --all-files`) enforces this + + check in CI. A PR that breaks structural symmetry will fail at the gate. + +- The symmetry invariant applies to **directory structure** only. Italian + + *content* may lag behind English during active sprints, as long as the file + is present (even as a stub). A 404 is worse than a stale translation. diff --git a/developers/explanation/adr-cross-instance-allowlist.mdx b/developers/explanation/adr-cross-instance-allowlist.mdx new file mode 100644 index 0000000..09e389d --- /dev/null +++ b/developers/explanation/adr-cross-instance-allowlist.mdx @@ -0,0 +1,177 @@ +--- +sidebar_label: "ADR 011: Cross-Instance Allowlist" +sidebar_position: -1 +description: "ADR 011: Why Zenzic v0.7.0 prefers a declarative absolute_path_allowlist over silent suppression for multi-instance Docusaurus deployments." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 011: Cross-Instance Absolute Path Allowlist + +**Status:** Accepted (May 2026) +**Decider:** Tech Lead +**Date:** 2026-05-03 (v0.7.0 "Quartz Maturity" / "Quarzo") + +--- + +## Context + +The v0.7.0 documentation restructure introduced a **multi-instance Docusaurus** +architecture: `/docs/*` (User area) and `/developers/*` (Developer area) are +served by two separate `@docusaurus/plugin-content-docs` instances. This split +is a discoverability win — search, sidebar, and breadcrumbs no longer mix +user-level and engineering content — but it creates a structural friction with +the Zenzic core validator. + +Each Zenzic adapter analysis pass operates on **one** Virtual Site Map (VSM) +at a time. Cross-plugin links (e.g. 
a User how-to that links to a Developer +reference) cannot be relative — Docusaurus refuses to resolve relative paths +across plugin boundaries. They must be absolute (`/developers/how-to/...`). +But absolute paths are exactly what `Z105 ABSOLUTE_PATH` was designed to +forbid: they break portability when a site is hosted in a subdirectory, and +they are environment-dependent. The validator, lacking knowledge of the +sibling plugin's VSM, sees a legitimate cross-plugin link as a broken absolute +reference. + +The options examined were: + +- **Option A** — Implement a manual allowlist in core configuration. +- **Option B** — Force the use of JSX components (e.g. ``), binding the + source to the build engine (violates **Pillar 1: Lint the Source**). +- **Option C** — Auto-detect cross-instance routes via multiple scans + (computationally expensive, risks **Pillar 2: Zero Subprocesses**). + +## Decision + +We adopt **Option A**: a `[link_validation] absolute_path_allowlist` key in +`zenzic.toml`. The validator honors listed prefixes as **Trusted Ghost +Routes** — absolute paths whose targets are project-internal but live outside +the current VSM. + +```toml +# zenzic.toml — declarative cross-plugin contract +[link_validation] +absolute_path_allowlist = ["/developers/", "/api/"] +``` + +The check runs immediately before the Z105 emission in `validator.py`: if the +parsed path starts with any allowlisted prefix, the link is treated as valid +and skipped — no other resolution is attempted, no error is recorded. + +## Rationale + +This decision is governed by the **Transparency Invariant**: + +- **Explicit Declaration.** Instead of silencing errors with inline `noqa` + comments scattered through Markdown, the architect declares — once, in + config — which absolute prefixes the project owns. The configuration *is* + the cross-instance map. 
+- **Linter Integrity.** Zenzic still does its job: an absolute link that is neither on disk nor in the allowlist still fails the push. The allowlist narrows the scope of trust; it does not weaken Z105 itself. +- **Engine Agnosticism.** Markdown source remains agnostic of Docusaurus, MkDocs, or any future multi-instance engine. No `` import, no JSX prelude — the same `.mdx` file would work in a single-instance migration. + +Option B was rejected because JSX imports bleed engine-specific syntax into content (Pillar 1 violation). Option C was rejected because it would require either subprocess delegation to the build tool (Pillar 2 violation) or duplicating Docusaurus's plugin-resolution logic in Python (maintenance burden, parity drift). + +## Invariants + +These constraints are permanent consequences of ADR-011: + +1. **Allowlist entries must start with `/`.** Relative entries are nonsense (relative paths never trigger Z105) and would silently broaden the bypass. +2. **Match semantics are `startswith` only.** No globbing, no regex, no wildcards. The semantics must remain inspectable at a glance. +3. **The check runs before Z105 emission, not after.** Allowlisted links must never appear in the findings stream — not even as suppressed `info` — because they represent intentional architectural contracts, not silenced problems. +4. **Allowlist entries are not validated for existence.** Z108 `STALE_ALLOWLIST_ENTRY` (config hygiene) is intentionally deferred to v0.8.0 to preserve **Pillar 3: Pure Functions** (no aggregate cross-worker state). See [Technical Debt Ledger](../governance/technical-debt.mdx). + +## Consequences + +### Pros + +- **Pillar 1 preserved.** Markdown source stays engine-agnostic. +- **Pillar 2 preserved.** Validation remains deterministic, no extra processes or network scans. 
+- **Audit Trail.** The `zenzic.toml` becomes a documented map of inter- + instance dependencies — readable by humans, parseable by tools, versioned + in git. +- **Reversible.** Removing an entry restores Z105 enforcement on that + prefix; the architect can always re-tighten the perimeter. + +### Cons + +- **Manual Maintenance.** If a satellite route changes (e.g. `/developers/` + → `/dev/`), the allowlist in the core repo must be updated by hand. The + validator cannot detect a renamed route through the allowlist alone. +- **Scope Discipline Required.** A reckless allowlist (`["/"]`) would + silently disable Z105 entirely. Code review of `zenzic.toml` changes is + the protection. + +## Transparency Analysis + +The allowlist transforms a potential **blind spot** into a **conscious +choice**. Zenzic's stance is unambiguous: we prefer the developer to write + +> "Zenzic, I know `/developers/` is not in this VSM — trust me." + +over hiding the same fact behind an inline suppression comment that +degrades the global Quality Score without explaining the system topology. +The first form is documentation; the second is technical debt disguised as +silence. + +This ADR establishes the precedent for how Zenzic will handle expansion +toward micro-site architectures: every cross-boundary trust must be +declared, named, and reviewable. + +## Suppression vs Configuration + +Zenzic offers two distinct primitives for telling the linter "this is +intentional." They are **orthogonal** and must not be conflated: + +| Primitive | Scope | Use when | +|---|---|---| +| `[link_validation] absolute_path_allowlist` | Project-wide structural contract | The fact is a **systemic truth** of the architecture (e.g. multi-instance routing, satellite domain prefix). | +| `` / `{/* zenzic:ignore Zxxx */}` | One source line | The rule is correct in general; this **specific occurrence** is a documented, local exception (e.g. a code sample that *looks* like a credential). 
| + +The allowlist is a **contract**: it changes Z105's domain of validity by +declaring premises about the project's URL space. The validator still +*evaluates* the link — the evaluation simply has different inputs. + +The inline ignore is **surgery**: it suppresses an emitted finding on a +single line, leaves an audit comment in the source, and is reviewed at +the diff level. + +**Anti-pattern (forbidden in v0.7.0+).** Cross-plugin links must never +be handled with ``. Doing so would tacitly +admit that the routing is "broken and accepted"; in fact the routing is +**correct by design**, and that correctness deserves promotion to the +project's structural configuration. Inline suppression of cross-plugin +links also fragments the truth: a future contributor reading +`zenzic.toml` would see no record of the cross-boundary dependency. + +**Decision rule.** If the same suppression would be needed in two or +more files, it is no longer a local exception — it is a systemic truth +and belongs in `zenzic.toml`. Promote it. + +--- + +## Related + +- [ADR 002: Zero Subprocesses Policy](./adr-zero-subprocesses.mdx) — + forbids the auto-detection alternative (Option C). +- [ADR 001: Lint the Source](./adr-lint-source.mdx) — forbids the JSX + alternative (Option B). +- [Technical Debt Ledger](../governance/technical-debt.mdx) — records the + Z108 deferral (Pillar 3 preservation). diff --git a/developers/explanation/adr-decentralized-cli.mdx b/developers/explanation/adr-decentralized-cli.mdx new file mode 100644 index 0000000..33b7d5c --- /dev/null +++ b/developers/explanation/adr-decentralized-cli.mdx @@ -0,0 +1,170 @@ +--- +sidebar_label: "ADR 004: Decentralized CLI" +sidebar_position: 7 +description: "ADR 004: Splitting the monolithic CLI module into a structured package — the Layer Law that keeps Core independent of CLI." 
+--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 004: Decentralized CLI Package + +**Status:** Active +**Decider:** Architecture Lead +**Date:** 2026-04-15 (v0.7.0 sprint, D062-B / D063 / D064) + +--- + +## Context + +Zenzic's original CLI lived in a single file: `src/zenzic/cli.py`. Over the +course of the v0.6.x release cycle, that file grew to exceed **2,000 lines**, +containing six conceptually distinct responsibilities in a single namespace: + +| Responsibility | Examples | +|---|---| +| Analysis commands | `check links`, `check orphans`, `check all` | +| Engine inspection | `inspect capabilities` | +| Maintenance commands | `clean` | +| Lab showcase | `zenzic lab` — 11 interactive acts | +| Standalone operations | `diff`, `score`, `init` | +| Shared UI/output helpers | banner, console, exclusion manager builder | + +This monolith created compounding problems: + +1. **Circular import risk.** As `core/` modules grew, contributors were tempted + + to import `cli.py` utilities directly from core, inverting the dependency + direction. + +2. **UI state scattering.** The Rich `console` object was instantiated multiple + + times across different function scopes, causing inconsistent output formatting + and race conditions in test environments. + +3. **Test isolation failure.** Every test that touched any CLI command had to + + import the entire `cli.py` — including the lab showcase, the Rich live display, + and all Typer sub-apps. This inflated test startup time and made mocking + unreliable. + +4. **Contributor friction.** A new contributor adding a check command had no + + clear "where does this go?" signal from the file structure alone. 
+ +--- + +## Decision + +`src/zenzic/cli.py` was dissolved into a package `src/zenzic/cli/` with the +following module structure: + +```text +src/zenzic/cli/ + __init__.py — public re-exports + _check.py — check sub-app: links, orphans, snippets, references, assets, all + _inspect.py — inspect sub-app: capabilities + _clean.py — clean sub-app + _lab.py — lab command: 11 Acts (0–10), interactive showcase + _standalone.py — standalone commands: diff, init, score + _shared.py — shared helpers: _build_exclusion_manager, _validate_docs_root, + _ui, console +``` + +`src/zenzic/main.py` became the **Typer entry point** — a thin orchestrator +that imports each sub-app and registers it on the root Typer application. It +contains no analysis logic. + +Three companion decisions were applied in the same sprint: + +- **D062-B:** `src/zenzic/ui.py` → `src/zenzic/core/ui.py`. UI primitives are + + consumed by both CLI and Core; placing them in `core/` ensures Core can use + them without importing from `cli/`, which would violate the Layer Law. + +- **D063:** `src/zenzic/lab.py` → `src/zenzic/cli/_lab.py`. The lab showcase is + + pure CLI orchestration — interactive Rich displays, act sequencing, user + prompts. It belongs with the CLI layer, not adjacent to the core. + +- **D064 (SDK Cleansing):** `run_rule()` was extracted from `cli.py` into + + `core/rules.py`. The public `zenzic.rules` module became a **6-line re-export + façade** — backwards compatible for any third-party code that imported it + directly, while ensuring the implementation lives in `core/`. + +--- + +## The Layer Law (Rule R05) + +This ADR formalises the **dependency direction invariant** as a named rule: + +> **R05 — Core never imports upward.** Modules in `src/zenzic/core/` must never +> import from `src/zenzic/cli/` or `src/zenzic/main.py`. + +The enforced direction is: + +```text +cli/ → core/ → models/ +``` + +`cli/` may import anything from `core/`. `core/` may import from `models/`. 
The +reverse is permanently forbidden. This ensures that `core/` can be used as a +standalone SDK without dragging in Typer, Rich live displays, or any interactive +I/O dependencies. + +--- + +## Rationale + +### 1. Single Responsibility at the File Level + +A 2,000-line file is not a file — it is an undeclared package. Formalising the +package structure makes the single-responsibility principle visible in the +filesystem: a contributor looking for orphan-detection logic opens `_check.py`, +not a monolith where they must search by function name. + +### 2. Test Isolation + +After the split, `test_cli.py` can import only the specific sub-app under test. +The lab showcase's Rich live displays are no longer loaded when testing `check +links`. Startup time for individual test modules dropped measurably. + +### 3. SDK Contract + +The `zenzic.rules` façade preserves backwards compatibility for any project that +used `from zenzic.rules import run_rule`. No import path changes were required for +existing integrations, despite the internal reorganisation. + +--- + +## Invariants (Non-Negotiable) + +- `src/zenzic/core/` never imports from `src/zenzic/cli/` — any PR that introduces + + such an import is an automatic revert candidate. + +- `_shared.py` is the **only** place in `cli/` where the Rich `console` object is + + instantiated. All other `cli/` modules call `_ui()` from `_shared.py`. + +- `src/zenzic/main.py` contains **no analysis logic** — only Typer app wiring. +- `zenzic.rules` remains a re-export façade. The implementation lives in + + `core/rules.py`. + +--- + +## Consequences + +- New CLI commands are added to the appropriate `cli/_*.py` module, not to a + + catch-all monolith. + +- The `run_rule()` function is importable as both `zenzic.rules.run_rule` (public + + façade) and `zenzic.core.rules.run_rule` (direct). Both paths are stable. + +- The lab showcase (`cli/_lab.py`) can be extended with new acts without + + affecting the analysis pipeline's test surface. 
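
The Layer Law (R05) can be checked mechanically. The following is a minimal sketch and not part of Zenzic: the helper names `find_upward_imports` and `_is_forbidden` are hypothetical, and the sketch assumes imports are written as absolute `zenzic.…` paths (relative imports are ignored).

```python
import ast

# Packages that core/ must never import under Rule R05 (taken from the ADR text).
FORBIDDEN_PREFIXES = ("zenzic.cli", "zenzic.main")


def _is_forbidden(module: str) -> bool:
    """True if `module` is a forbidden package or one of its submodules."""
    return any(module == p or module.startswith(p + ".") for p in FORBIDDEN_PREFIXES)


def find_upward_imports(source: str) -> list[str]:
    """Return the forbidden modules imported by the source of a core/ module."""
    violations: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            violations += [a.name for a in node.names if _is_forbidden(a.name)]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if _is_forbidden(node.module):
                violations.append(node.module)
            else:
                # Catch `from zenzic import cli`, where the package is the imported name.
                for alias in node.names:
                    qualified = f"{node.module}.{alias.name}"
                    if _is_forbidden(qualified):
                        violations.append(qualified)
    return violations
```

A check like this could run over every file under `src/zenzic/core/` in a test; an empty result for each file would mean the dependency direction `cli/ → core/ → models/` holds.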
diff --git a/docs/community/developers/explanation/adr-discovery.mdx b/developers/explanation/adr-discovery.mdx similarity index 99% rename from docs/community/developers/explanation/adr-discovery.mdx rename to developers/explanation/adr-discovery.mdx index 0a54184..f60d350 100644 --- a/docs/community/developers/explanation/adr-discovery.mdx +++ b/developers/explanation/adr-discovery.mdx @@ -28,8 +28,11 @@ Without a known root, Zenzic cannot: - Resolve absolute-style internal links (`/docs/page.md`) to physical files. - Locate `zenzic.toml` or a fallback engine config (`mkdocs.yml`, `zensical.toml`). - Enforce the Virtual Site Map (VSM) perimeter — the oracle that determines + what is a valid page and what is a Ghost Route. + - Avoid accidentally indexing files that belong to a parent project, + a sibling repository, or the system root. The root discovery mechanism must therefore be **deterministic**, **safe by @@ -97,9 +100,12 @@ Zenzic's behaviour is independent of the build toolchain. ## Consequences - **Positive:** Every code path that calls `find_repo_root()` is guaranteed + to receive a valid, bounded directory or raise before any I/O occurs. + - **Positive:** Ghost Route logic and VSM construction have a stable anchor. - **Negative (pre-amendment):** The `zenzic init` command, whose purpose is + to *create* the `zenzic.toml` root marker, could not be run in a directory that had neither `.git` nor `zenzic.toml`. This was the **Bootstrap Paradox** (ZRT-005). @@ -155,6 +161,8 @@ scratch. 
- `src/zenzic/core/scanner.py` — `find_repo_root()` implementation - `src/zenzic/cli.py` — `init` command, sole consumer of `fallback_to_cwd=True` - `tests/test_scanner.py` — `test_find_repo_root_genesis_fallback`, + `test_find_repo_root_genesis_fallback_still_raises_without_flag` + - `tests/test_cli.py` — `test_init_in_fresh_directory_no_git` - `CONTRIBUTING.md` — Core Laws → Root Discovery Protocol diff --git a/developers/explanation/adr-lint-source.mdx b/developers/explanation/adr-lint-source.mdx new file mode 100644 index 0000000..ba43832 --- /dev/null +++ b/developers/explanation/adr-lint-source.mdx @@ -0,0 +1,166 @@ +--- +sidebar_label: "ADR 001: Lint the Source" +sidebar_position: -2 +description: "ADR 001: The Genesis Decision — why Zenzic analyzes raw Markdown sources and never the build output." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 001: Lint the Source, Not the Build + +**Status:** Active (Genesis Decision) +**Decider:** Architecture Lead +**Date:** 2026-01-01 (founding principle, pre-v0.1.0) + +--- + +## Context + +When Zenzic was conceived, the dominant approach to documentation validation was +**output-based analysis**: tools like `linkchecker` and `htmlproofer` fetch or +parse the HTML generated by the build engine, then traverse the rendered page +structure to verify link targets, image paths, and anchor IDs. + +This approach has a fundamental structural flaw: the validator is downstream of +the build. Validation can only run **after** the build succeeds. If the build +fails — due to a syntax error, a missing plugin, or an engine version mismatch — +no validation occurs at all. The pipeline produces silence where it should produce +a diagnostic. + +Three compounding problems emerge in CI environments: + +1. **Build coupling.** A documentation validator that requires a successful build + + cannot be the first gate in the pipeline. 
It must be placed after `mkdocs build` + or `npm run build`, adding 2–10 minutes of build overhead before a single link + is checked. + +2. **Engine fragility.** Build engines change how they generate anchor IDs, URL + + slugs, and asset paths between minor versions. A validator calibrated to the + output of MkDocs 1.5 may silently miss broken links under MkDocs 1.6 because + the ID generation scheme changed. The validator is, in effect, testing the + engine's output rather than the author's intent. + +3. **Engine lock-in.** A validator that understands HTML from one engine cannot + + validate HTML from another without engine-specific adaptation. This creates a + validation ecosystem that fragments along engine lines rather than converging + on universal documentation quality standards. + +The "MkDocs Crisis" — a period during Zenzic's early development when the +reference documentation lost all link validity due to an MkDocs upgrade that +changed slug generation — crystallised the cost of output-based validation. The +error was not in the Markdown source; it was in the mismatch between the source +and the engine's new URL convention. An output-based validator would have caught +this only after the broken site was deployed. + +--- + +## Decision + +> **Zenzic analyzes raw Markdown source files and static configuration files +> exclusively. It never inspects, fetches, or depends on HTML build output.** + +The implementation vehicle for this decision is the **Virtual Site Map (VSM)** — +a complete in-memory projection of the final site, constructed from source files +alone, using engine-specific knowledge encoded in **adapters** (see ADR 005, +ADR 007). + +The VSM allows Zenzic to answer questions that previously required a live site: + +- "Does this anchor `#installation` exist in the target page?" — answered by + + parsing the Markdown heading structure, not the rendered HTML. + +- "Is this path `/docs/reference/finding-codes` a valid route?" 
— answered by + + the VSM's route graph, which models i18n fallbacks and versioned slugs without + executing the build. + +- "Is this asset referenced in `docusaurus.config.ts` present on disk?" — answered + + by static parsing of the TypeScript config file, not by starting a Node.js + process. + +--- + +## Rationale + +### 1. Pre-Build Error Prevention + +A broken link discovered before the build is a developer warning. A broken link +discovered after a 10-minute build is a CI failure that blocks the PR queue. +Zenzic's position in the pipeline is always **before the build** — it is the +gate that certifies the source is structurally sound before any build resource +is consumed. + +### 2. Engine Agnosticism by Design + +By analyzing source files rather than build output, Zenzic is inherently +engine-agnostic. The same `check links` command validates an MkDocs project, +a Docusaurus site, and a Zensical wiki — because all three share the same +raw Markdown format. Engine-specific URL conventions are encoded in the adapter +layer (not in the validator), making the core engine permanently portable. + +### 3. Deterministic Analysis + +Source files are static. A given set of Markdown files produces the same +analysis results regardless of which machine runs Zenzic, which Python version +is installed, or which timezone the CI runner is in. Build-output validators +introduce non-determinism through engine version drift, network-fetched pages, +and CDN caching. Zenzic's source-based analysis is a **pure function of the +repository state** — identical input, identical output, always. + +### 4. The Ghost Route Capability + +The VSM models routes that do not exist as physical files on disk: i18n +fallback routes, versioned documentation slugs, and engine-generated index +pages. An output-based validator can only test routes that the build produces. 
+Zenzic's VSM models the **intent** of the documentation architecture, catching +structural errors in routes that the author planned but hasn't yet published. + +--- + +## Invariants (Non-Negotiable) + +- Zenzic's validation logic (`core/validator.py`, `core/scanner.py`) must never + + start an HTTP request, load a browser, or parse HTML. All analysis operates + on bytes read from the filesystem. + +- The VSM (`models/vsm.py`) is the canonical source of route truth. No validator + + may compute a route by invoking the build engine — even as a subprocess. + +- Adapters may read static configuration files (`.ts`, `.yml`, `.toml`) using + + pure-Python text parsing. They must not execute those files (see ADR 002). + +--- + +## Consequences + +- Zenzic's analysis performance is **content-dependent**. Measured against + + the real `zenzic-doc` project (59 MDX pages with JSX, frontmatter, and + tables): ~420 ms of pure analysis time on a warm Python process. + Simple Markdown projects with minimal frontmatter and no JSX can scan + 200 files in ~100 ms. End-to-end wall time on a cold `uvx` invocation + adds ~2–8 s of Python interpreter startup on top of analysis time. + Run `python scripts/benchmark.py --repo ` to measure your own project. + +- Zenzic can be placed as the **first step** in any CI pipeline, before + + `npm install`, before `pip install`, before the build engine is even available. + +- Engine-specific quirks (Docusaurus anchor generation, MkDocs nav contracts, + + Zensical slug conventions) are isolated in the adapter layer. The core engine + is permanently engine-neutral. + +- The VSM provides a testable, inspectable data structure for documentation + + architecture — enabling future capabilities like structural diffing, coverage + metrics, and ghost route detection without modifying the analysis core. 
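
To illustrate how an anchor question can be answered from source alone, here is a minimal sketch. It is not Zenzic's implementation: `slugify` and `collect_anchors` are hypothetical names, and the slug rule is a deliberate simplification, since real engines differ (which is why the ADR places such rules in adapters).

```python
import re

FENCE = "`" * 3  # fenced-code delimiter, built this way to keep the example readable
HEADING_RE = re.compile(r"^#{1,6}\s+(.*?)\s*#*\s*$")


def slugify(heading: str) -> str:
    """Simplified slug rule (assumption): lowercase, drop punctuation, spaces to hyphens."""
    text = re.sub(r"[^\w\s-]", "", heading.lower())
    return re.sub(r"\s+", "-", text.strip())


def collect_anchors(markdown: str) -> set[str]:
    """Collect heading anchors from raw Markdown, skipping fenced code blocks."""
    anchors: set[str] = set()
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # a '# comment' inside code is not a heading
        match = HEADING_RE.match(line)
        if match:
            anchors.add(slugify(match.group(1)))
    return anchors
```

With this in place, checking a link such as `page.mdx#installation` reduces to a set membership test on the target page's anchors, with no HTML and no build involved.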
diff --git a/developers/explanation/adr-parallel-early-termination.mdx b/developers/explanation/adr-parallel-early-termination.mdx new file mode 100644 index 0000000..0aca0e9 --- /dev/null +++ b/developers/explanation/adr-parallel-early-termination.mdx @@ -0,0 +1,165 @@ +--- +sidebar_label: "ADR 020: Parallel Audit Completeness" +sidebar_position: 10 +description: "ADR 020: Why Zenzic uses wait(FIRST_COMPLETED) for parallel result collection and how the fail-fast coordinator works without violating Pillar 3." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 020: Parallel Audit Completeness vs. Fail-Fast + +**Status:** Active (v0.7.0 "Quartz Maturity") +**Decider:** Architecture Lead +**Date:** 2026-05-02 + +--- + +## Context + +Zenzic uses a `ProcessPoolExecutor` to scan documentation files in parallel +when a repository contains 50 or more Markdown files (`ADAPTIVE_PARALLEL_THRESHOLD` +in `core/scanner.py`). Each worker executes `_scan_single_file()` independently +and returns an `IntegrityReport` containing any findings, including `SecurityFinding` +objects emitted by the Shield (Z201/Z202/Z203). + +In the implementation prior to v0.7.0, the coordinator collected results by +iterating over `futures_map.items()` **in submission order**, calling +`fut.result(timeout=30)` on each future in turn. This design had two consequences: + +1. **No early termination.** If file 1 of 500 contained a credential (Z201, + Exit Code 2), all 499 remaining workers continued to completion before the + CLI could report the breach. On large repositories, this wasted significant + CI compute time. + +2. **Sequential result collection.** A slow worker at position 2 would block + collection of all subsequent results until it completed or timed out, even + if workers 3–500 had already finished. 
+ +Two abort mechanisms were evaluated before the adopted solution: + +**`multiprocessing.Manager().Event()`** — a shared boolean flag visible to both +coordinator and workers. **Rejected.** Passing a manager event to `_worker()` +makes it stateful: its output would depend on external shared state rather than +solely on its inputs (`md_file`, `config`, `rule_engine`). This violates +**Pillar 3: Pure Functions First** — a founding invariant of the Zenzic +architecture. `_worker()` must remain a pure function. + +**`concurrent.futures.as_completed()`** — an iterator that yields futures in +completion order. **Evaluated and replaced.** `as_completed()` provides no +per-batch timeout guarantee. A deadlocked final worker would block the generator +indefinitely. The ZRT-002 protection (Z009 for deadlocked workers) cannot be +preserved without introducing a separate per-future timeout mechanism that +negates the simplicity advantage of `as_completed()`. + +--- + +## Decision + +> **From v0.7.0, the parallel coordinator uses `concurrent.futures.wait()` with +> `return_when=FIRST_COMPLETED` and a `_abort` local flag. On the first +> `SecurityFinding` in a completed worker result, all still-queued (`PENDING`) +> futures are cancelled immediately. The ZRT-002 deadlock guard is preserved.** + +The implementation replaces the `for fut, md_file in futures_map.items()` loop +with a `while _pending` loop. Each iteration calls: + +```python +done, _pending = concurrent.futures.wait( + _pending, + timeout=_WORKER_TIMEOUT_S, + return_when=concurrent.futures.FIRST_COMPLETED, +) +``` + +When a completed report contains `security_findings`, the coordinator sets +`_abort = True` and calls `pending_fut.cancel()` on every future still in +`_pending`. Subsequent iterations discard results silently. 
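
The coordinator loop described above can be sketched end to end as follows. This is an illustration under stated assumptions, not the code in `core/scanner.py`: `Report`, `scan`, and `collect` are hypothetical stand-ins for `IntegrityReport`, `_worker()`, and the coordinator, and the timeout branch only cancels and stops where the real coordinator also emits Z009.

```python
import concurrent.futures
from dataclasses import dataclass, field

_WORKER_TIMEOUT_S = 30  # value taken from the ADR text


@dataclass
class Report:
    """Simplified stand-in for IntegrityReport (hypothetical)."""
    file_path: str
    security_findings: list[str] = field(default_factory=list)


def scan(path: str) -> Report:
    """Stand-in for the pure worker: its output depends only on its input."""
    return Report(path, ["Z201"] if "secret" in path else [])


def collect(executor: concurrent.futures.Executor, paths: list[str]) -> list[Report]:
    """Collect results in completion order and cancel pending work on a breach."""
    futures_map = {executor.submit(scan, p): p for p in paths}
    _pending = set(futures_map)
    _abort = False
    reports: list[Report] = []
    while _pending:
        done, _pending = concurrent.futures.wait(
            _pending,
            timeout=_WORKER_TIMEOUT_S,
            return_when=concurrent.futures.FIRST_COMPLETED,
        )
        if not done:
            # Nothing finished in time: cancel the rest and stop waiting.
            for fut in _pending:
                fut.cancel()
            break
        for fut in done:
            if _abort or fut.cancelled():
                continue  # results arriving after the abort are discarded
            report = fut.result()
            reports.append(report)
            if report.security_findings:
                _abort = True
                for pending_fut in _pending:
                    pending_fut.cancel()  # only affects futures not yet running
    # Completion order is never the output order.
    return sorted(reports, key=lambda r: r.file_path)
```

The sketch accepts any `Executor`, so it can be exercised with a `ThreadPoolExecutor` for convenience; `concurrent.futures.wait()` behaves the same way for process pools. Note that `_abort` never leaves the coordinator, which is the property the ADR relies on to keep the worker pure.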
+ +**Behavioural changes in v0.7.0:** + +| Scenario | Pre-v0.7.0 | v0.7.0 | +|---|---|---| +| No security breach | All files scanned | All files scanned (unchanged) | +| Security breach in file 1/500 | All 500 files scanned | Breach detected; pending tasks cancelled | +| Deadlocked worker | Z009 after 30 s per-worker | Z009 if no worker completes in 30 s | +| Result order | Submission order → sorted | Completion order → sorted | + +**Cancellation semantics:** `future.cancel()` operates only on tasks that have +not yet been dispatched to a worker process (`PENDING` state). Tasks already +`RUNNING` cannot be interrupted — they complete and their results are silently +discarded (not added to the report). The fail-fast is therefore a +**best-effort CI optimisation**, not a hard execution guarantee. + +**ZRT-002 preservation:** If `concurrent.futures.wait()` returns an empty `done` +set (no worker completed within `_WORKER_TIMEOUT_S` seconds), all remaining +pending futures are cancelled and a Z009 finding is emitted for each stalled +file. This protects against ReDoS patterns in `[[custom_rules]]` that somehow +bypass the startup canary (`_assert_regex_canary()`). + +--- + +## Rationale + +### 1. Pillar 3 Preserved + +The fail-fast is implemented entirely in the coordinator, which is orchestration +logic — not analysis logic. The coordinator is the only scope where multiple +futures are visible simultaneously. No analysis function is aware of the abort +state. + +`_worker()` and `_scan_single_file()` are **unchanged** in v0.7.0. Given the +same inputs, they produce the same output. They have no dependency on shared +state. This functional purity is what makes them deterministic in isolation and +trivially testable. + +### 2. Audit-Complete Semantics for Running Workers + +Workers already executing when a breach is detected are allowed to complete +naturally. Their results are discarded by the coordinator. 
This prevents the +scenario where a partially-written `IntegrityReport` (from a worker interrupted +mid-execution) corrupts the findings list or leaves file handles open. + +### 3. Deterministic Output + +The final `reports` list is always sorted by `file_path` after collection. +CLI output is reproducible regardless of worker completion order, pool size, +or how many files were scanned before the abort. + +### 4. `wait(FIRST_COMPLETED)` vs `as_completed()` + +`as_completed()` was the initially-proposed mechanism. It was replaced by +`wait(return_when=FIRST_COMPLETED)` for one specific reason: the ZRT-002 +deadlock guard. With `as_completed()`, a deadlocked last worker causes the +generator to block indefinitely with no way to enforce a timeout per pending +batch. With `wait(timeout=_WORKER_TIMEOUT_S)`, an empty `done` set after 30 +seconds unconditionally triggers the Z009 guard — no additional mechanism needed. + +--- + +## Invariants + +- `_worker()` must remain a pure, stateless function. No shared state, queue, + or event may be passed to it. +- The `_abort` flag is a local variable in the coordinator loop. It is not + exported, not shared with workers, and not visible outside the `with executor` + block. +- Results are always sorted by `file_path` before being returned. The + completion order from `wait()` is never the final output order. +- ZRT-002 deadlock guard: if no future completes within `_WORKER_TIMEOUT_S` + seconds, all remaining futures are cancelled and a Z009 finding is emitted + for each stalled file. + +--- + +## Consequences + +- On repositories with a security breach in the first few files, CI runtime + is reduced proportionally to the number of cancelled workers. +- On repositories with no breach, performance is identical to the previous + implementation (all workers complete, all results collected). +- The `ADAPTIVE_PARALLEL_THRESHOLD` constant retains its role: below 50 files, + sequential mode is used and this ADR does not apply. 
The sequential path + is unchanged. +- The fail-fast applies to parallel mode only. A scan that produces zero + security findings is unaffected by this change. diff --git a/developers/explanation/adr-path-sovereignty.mdx b/developers/explanation/adr-path-sovereignty.mdx new file mode 100644 index 0000000..d420453 --- /dev/null +++ b/developers/explanation/adr-path-sovereignty.mdx @@ -0,0 +1,146 @@ +--- +sidebar_label: "ADR 009: Path Sovereignty" +sidebar_position: 4 +description: "ADR 009: The configuration follows the target, not the caller — preventing Context Hijacking." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 009: Path Sovereignty — Configuration Follows the Target + +**Status:** Active +**Decider:** Architecture Lead +**Date:** 2026-04-12 (v0.7.0 sprint, CEO-052) + +--- + +## Context + +`find_repo_root()` originally searched upward from `os.getcwd()` — the invoking +shell's current working directory. This worked correctly for the standard case +where the user runs Zenzic from inside the repository they want to analyse. + +It failed for any scenario where the caller's working directory differed from the +target repository: + +```bash +# CWD = /home/user/my-tools +# Target = /home/user/another-project/docs +zenzic check all /home/user/another-project/docs +``` + +In this case, `find_repo_root()` would walk upward from `/home/user/my-tools`, +find *that* repository's `zenzic.toml`, and load *that* repository's +configuration — including its `engine`, `docs_dir`, `excluded_dirs`, and custom +rules. The analysis target was `another-project`, but the configuration applied +was from `my-tools`. This is **Context Hijacking**. 
+ +--- + +## Decision + +> **"The configuration follows the target, not the caller."** + +When an explicit `PATH` argument is provided to any filesystem-interacting CLI +command, `find_repo_root()` is called with `search_from=target_path` — walking +upward from the **target**, not the CWD: + +```python +# core/scanner.py +def find_repo_root( + search_from: Path | None = None, + fallback_to_cwd: bool = False, +) -> Path: + start = search_from or Path.cwd() + for parent in [start, *start.parents]: + if (parent / ".git").exists() or (parent / "zenzic.toml").exists(): + return parent + if fallback_to_cwd: + return Path.cwd() + raise RuntimeError(...) +``` + +`_apply_target()` in `cli/_check.py` orchestrates the recalibration: after +deriving `docs_root` from the user-provided target, it calls +`find_repo_root(search_from=docs_root)` to load the correct `zenzic.toml`, +then re-derives `docs_dir` from the target repo's configuration. + +--- + +## The `_apply_target()` Invariant + +When `target == repo_root` (the user points directly at a repo root, not a +subdirectory), `docs_dir` is **preserved from the config** rather than overridden +to `"."`. This prevents a subtle regression: a user running +`zenzic check all /path/to/repo` should respect that repo's `docs_dir = "docs"` +setting, not flatten it to the root. + +```python +# _apply_target() — canonical logic +if resolved_target == repo_root: + # Target IS the repo root: honour the config's docs_dir. + docs_root = repo_root / config.docs_dir +else: + # Target is a subdirectory: treat it as the docs root directly. + docs_root = resolved_target +``` + +--- + +## Rationale + +### 1. The Principle of Contextual Integrity + +A configuration file belongs to the project it lives in. Loading a foreign +`zenzic.toml` because of a coincidence of working directory is a **configuration +supply chain vulnerability** — the analysis is secretly governed by rules the +user did not intend to apply. + +### 2. 
CI/CD Correctness + +In CI pipelines, the working directory is often the runner's home, a workspace +root, or a tool directory — not the documentation repository. Path Sovereignty +ensures that `zenzic check all $DOCS_PATH` in CI always applies the correct +project-specific rules, regardless of the runner's `$PWD`. + +### 3. Symmetry with ADR-007 + +ADR-007 (Sovereign Sandbox) established that the **perimeter** follows the +target. ADR-009 completes the picture: the **configuration** also follows the +target. Together they guarantee that every aspect of an analysis — what is +scanned, what rules apply, and what escapes are forbidden — is determined solely +by the target repository. + +--- + +## Scope + +Path Sovereignty applies to every CLI command that accepts an optional positional +`PATH` argument (Rule R18 — Total CLI Symmetry): + +| Command | PATH semantics | +|---------|---------------| +| `zenzic check all [PATH]` | Sovereign root: `find_repo_root(search_from=PATH)` | +| `zenzic score [PATH]` | Same | +| `zenzic diff [PATH]` | Same; snapshot path derived from resolved `repo_root` | +| `zenzic init [PATH]` | Genesis Nomad: `PATH` is the `repo_root` directly; created if absent | +| `zenzic lab`, `zenzic inspect` | No PATH argument — exempt | + +--- + +## Consequences + +- Running Zenzic from any directory now produces identical results to running it + + from inside the target repository — no surprises for CI operators. + +- Contributors implementing new CLI commands that accept a `PATH` argument + + **must** call `find_repo_root(search_from=resolved_path)` and invoke + `_apply_target()`. This is now a documented invariant in the contribution guide. + +- The `fallback_to_cwd=True` parameter of `find_repo_root()` is reserved + + exclusively for the `init` command (Genesis Fallback — see ADR-003). No other + command may use it. 
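
The effect of Path Sovereignty can be reproduced in isolation. The sketch below uses a simplified copy of the lookup shown in this ADR (without `fallback_to_cwd`); the `demo` helper and the temporary directory layout are hypothetical and exist only to mirror the `my-tools` / `another-project` scenario.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def find_repo_root(search_from: Path | None = None) -> Path:
    """Simplified lookup: walk upward from the start until a root marker is found."""
    start = search_from or Path.cwd()
    for parent in [start, *start.parents]:
        if (parent / ".git").exists() or (parent / "zenzic.toml").exists():
            return parent
    raise RuntimeError(f"no repository root found above {start}")


def demo() -> tuple[Path, Path]:
    """Create two sibling projects and resolve a root from each subtree."""
    base = Path(tempfile.mkdtemp()).resolve()
    tools = base / "my-tools"
    docs = base / "another-project" / "docs"
    tools.mkdir()
    docs.mkdir(parents=True)
    (tools / "zenzic.toml").write_text('docs_dir = "."\n')
    (base / "another-project" / "zenzic.toml").write_text('docs_dir = "docs"\n')
    # Searching from the caller's directory finds the caller's config;
    # searching from the target finds the target's config.
    return find_repo_root(search_from=tools), find_repo_root(search_from=docs)
```

The two returned roots differ, which is the whole point: a lookup anchored at the caller would have governed `another-project` with the configuration of `my-tools`.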
diff --git a/developers/explanation/adr-sovereign-sandbox.mdx b/developers/explanation/adr-sovereign-sandbox.mdx new file mode 100644 index 0000000..e7dd255 --- /dev/null +++ b/developers/explanation/adr-sovereign-sandbox.mdx @@ -0,0 +1,112 @@ +--- +sidebar_label: "ADR 007: Sovereign Sandbox" +sidebar_position: 3 +description: "ADR 007: The Blood Sentinel guards escapes FROM the target, not the location OF the target." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 007: Sovereign Sandbox + +**Status:** Active +**Decider:** Architecture Lead +**Date:** 2026-04-08 (v0.7.0 sprint, D043) + +--- + +## Context + +When a user runs `zenzic check all /path/to/target`, Zenzic must establish two +boundaries: the **analysis root** (what to scan) and the **perimeter** (what +links are permitted to resolve to). The Blood Sentinel (`Z202`, `Z203`) enforces +the perimeter — it detects links that escape the documentation sandbox. + +The early implementation had a geometric bug: it anchored the perimeter to the +**invoking shell's repo root** rather than to the **target**. This caused a +false positive when a user ran Zenzic from repo A pointing at repo B: + +```bash +cd /home/user/repo-a +zenzic check all /home/user/repo-b/docs +``` + +Zenzic would compute `repo_root = /home/user/repo-a`, then classify any link in +`repo-b/docs` that resolved outside `/home/user/repo-a` as a path traversal +attempt — even though the user explicitly specified `repo-b/docs` as the analysis +target. The Blood Sentinel was firing on legitimate cross-project invocations. + +--- + +## Decision + +**The explicit `PATH` argument provided by the user is the sovereign sandbox +root.** Blood Sentinel guards escapes **from** the target, not the location +**of** the target. + +Implementation: after computing `docs_root` from the user-provided path, if +`docs_root.relative_to(repo_root)` raises `ValueError` (i.e. 
`docs_root` is +outside the CWD's repo root), `repo_root` is dynamically reassigned: + +```python +# cli/_check.py — sovereign root guard +try: + docs_root.relative_to(repo_root) +except ValueError: + # docs_root is outside the current repo — honour the explicit target. + repo_root = docs_root +``` + +This change is **additive**: it only fires when the user provides an explicit +`PATH` that falls outside the CWD repo. When `zenzic check all` is invoked +without a `PATH`, the original behaviour is unchanged. + +--- + +## Rationale + +### 1. Intent Sovereignty + +A user who types `zenzic check all /home/user/repo-b/docs` has stated their +intent unambiguously. The tool must respect that intent. Overriding an explicit +argument with an implicit context-derived boundary is a **violation of the +Principle of Least Surprise**. + +### 2. The Blood Sentinel's Correct Domain + +Z202 (`PATH_TRAVERSAL`) was designed to catch Markdown authors who write links +like `[config](../../../etc/passwd)` attempting to escape the documentation +sandbox. Its domain is **relative link escape detection within a known source +tree**. It was never designed to penalise the user's legitimate choice of which +source tree to analyse. + +### 3. Remote CI Use Case + +The Sovereign Sandbox change directly enables a valid DevOps pattern: a +centralised `zenzic-runner` repository that executes `zenzic check all` on +multiple downstream documentation repositories as part of a cross-repo CI gate. +Without this fix, such a setup was impossible. + +--- + +## Invariants (Non-Negotiable) + +- The Sovereign Sandbox override fires **only** when an explicit `PATH` argument + + is provided and that path falls outside the CWD repo root. + +- Blood Sentinel (`Z202`/`Z203`) remains unconditionally active within the + + sovereign sandbox. No path traversal is permitted inside the declared target. + +- The change does not affect `fail-on-error` semantics or exit codes. 
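
The guard from the Decision section can be checked with pure path objects, without touching the filesystem. This is a minimal sketch: `sovereign_repo_root()` is a hypothetical wrapper name, and the example paths reuse the `repo-a` / `repo-b` scenario from the Context section.

```python
from pathlib import PurePosixPath


def sovereign_repo_root(docs_root: PurePosixPath, repo_root: PurePosixPath) -> PurePosixPath:
    """Return the root that the perimeter should be anchored to."""
    try:
        # Raises ValueError when docs_root is not located under repo_root.
        docs_root.relative_to(repo_root)
    except ValueError:
        # Explicit target outside the invoking repo: the target becomes the root.
        return docs_root
    # Target is inside the invoking repo: original behaviour is unchanged.
    return repo_root


cwd_repo = PurePosixPath("/home/user/repo-a")
outside = sovereign_repo_root(PurePosixPath("/home/user/repo-b/docs"), cwd_repo)
inside = sovereign_repo_root(PurePosixPath("/home/user/repo-a/docs"), cwd_repo)
```

Only the first call reassigns the root, which matches the "additive" property stated above: targets inside the invoking repository keep the original perimeter.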
+ +--- + +## Consequences + +- Remote CI patterns (cross-repo scanning) now work correctly. +- `_validate_docs_root` (the F4-1 guard) continues to protect against config-file + + injection attacks (`docs_dir = "../../etc"`). The Sovereign Sandbox override + only fires for **user-supplied** paths, not config-file-derived paths. diff --git a/developers/explanation/adr-unified-perimeter.mdx b/developers/explanation/adr-unified-perimeter.mdx new file mode 100644 index 0000000..5cd62c6 --- /dev/null +++ b/developers/explanation/adr-unified-perimeter.mdx @@ -0,0 +1,174 @@ +--- +sidebar_label: "ADR 006: Unified Perimeter" +sidebar_position: 8 +description: "ADR 006: Fixing theme flip and Blog locale bleed in zenzic.dev — storage namespace unification and locale-sovereign navbar links." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 006: Unified Perimeter — Storage Namespace & Blog Locale Sovereignty + +**Status:** Active +**Decider:** Architecture Lead +**Date:** 2026-04-27 (v0.7.0 sprint, CEO 051, commit `3188387`) + +--- + +## Context + +This ADR is specific to the **zenzic.dev documentation site** (this repository), +not to the Zenzic CLI core. It documents two independent locale-bleed bugs that +were introduced when `future.v4: true` was activated in `docusaurus.config.ts`. + +### Bug 1 — The Theme Flip + +With `future.v4: true`, Docusaurus enables `siteStorageNamespacing`: it auto-generates +a per-locale localStorage key by hashing `url + baseUrl + locale`. This produced: + +| Locale | localStorage key | +|--------|-----------------| +| English (`/`) | `theme-926` | +| Italian (`/it/`) | `theme-3d7` | + +When a user switched from the English to the Italian documentation, their browser +loaded a **different** localStorage key. Since the Italian key had no stored +preference, Docusaurus fell back to `defaultMode: 'dark'`. 
If the user had +previously switched to light mode in English, the switch caused an instant +**dark mode revert** — a visible FOUC (Flash of Unstyled Content) on every +locale switch. + +### Bug 2 — The Blog Locale Bleed + +The Blog link in the navbar pointed to the blog using a standard Docusaurus +navbar item: + +```ts +// docusaurus.config.ts — original, broken +{ to: '/blog', label: 'Journal', position: 'left' } +``` + +Docusaurus's static build pipeline **rewrites** both `to:` and `href:` values in +navbar items for each locale's HTML output. In the Italian static build, this +became: + +```html + +Journal +``` + +When a user navigated from Italian documentation to the Journal via that link, +they landed on `/it/blog` — which loaded the blog with the Italian locale UI: +dates rendered as `"25 aprile 2026"`, labels appeared as `"Etichette"`, the +reading time showed `"9 minuti di lettura"`. The Blog is an English-only +content space and must never be locale-translated. + +Switching from `to:` to `href:` did **not** fix the issue: `href:` values in +standard navbar items are also rewritten by the Docusaurus i18n build pipeline. + +--- + +## Decision + +Two independent fixes were applied to `docusaurus.config.ts`: + +### Fix 1 — Unified Storage Namespace + +```ts +// docusaurus.config.ts +storage: { + namespace: false, +}, +``` + +The top-level `storage.namespace: false` overrides the `future.v4` +namespacing behaviour. Both locales now share the single key `"theme"` in +localStorage. Dark mode preference persists across all locale switches. 
+ +**Verified in build output:** The anti-FOUC inline script in both +`build/index.html` and `build/it/index.html` reads: + +```js +localStorage.getItem("theme") +``` + +### Fix 2 — `type: 'html'` Locale-Sovereign Link + +```ts +// docusaurus.config.ts — Blog navbar item +{ + type: 'html', + value: 'Blog', + position: 'left', +}, +``` + +Docusaurus does **not** process the `innerHTML` of `type: 'html'` navbar items +through the i18n rewrite pipeline. The raw `href="/blog"` is preserved verbatim +in every locale's static HTML output. + +**Verified in build output:** The Italian locale HTML contains: + +```html +href=/blog>Blog +``` + +Not `/it/blog` — locale-sovereign. + +--- + +## Rejected Approaches + +### `themeConfig.siteStorage.themeKey` + +Proposed in the CEO directive as a way to control the storage key. This property +**does not exist** in Docusaurus 3.x. There is no `themeConfig.siteStorage` +namespace. The correct API is the top-level `storage` object. + +### `respectPrefersColorScheme: true` + +Also proposed in the CEO directive. This would instruct Docusaurus to follow the +OS-level color scheme preference on every page load — **overriding the user's +explicit in-app preference**. This directly reverts the CEO 149 invariant +(`respectPrefersColorScheme: false`) which was established as a permanent +protection against OS-preference-driven theme resets. It was not applied. + +--- + +## Invariants (Non-Negotiable) + +- `storage: { namespace: false }` must remain in `docusaurus.config.ts` for as + + long as `future.v4: true` is active and the Italian locale is supported. + Removing it silently re-introduces per-locale storage key fragmentation. + +- `colorMode.respectPrefersColorScheme` must remain `false`. This is an + + immutable invariant (CEO 149). Any PR that sets it to `true` is an automatic + revert candidate. + +- The Blog navbar item must remain `type: 'html'`. 
Converting it back to a + + standard `to:` or `href:` item will re-introduce locale bleed in the next + build. This is not immediately visible in development mode (`npm run start`) + because `npm run start` serves a single locale without the rewrite pipeline. + **Bugs of this class are only visible in `just build` output.** + +--- + +## Consequences + +- Dark mode preference is now fully locale-independent. A user who sets dark mode + + in English documentation retains dark mode when switching to Italian. + +- The Blog (blog) always loads at `/blog` regardless of which locale the user + + navigated from. + +- The `type: 'html'` navbar item does not participate in Docusaurus's `i18n` + + translation pipeline (i.e., it does not appear in `code.json` translation keys). + The label "Blog" is therefore hardcoded in the HTML value — this is + intentional, as the blog is English-only and the label does not require + translation. diff --git a/developers/explanation/adr-vault.mdx b/developers/explanation/adr-vault.mdx new file mode 100644 index 0000000..e543314 --- /dev/null +++ b/developers/explanation/adr-vault.mdx @@ -0,0 +1,115 @@ +--- +sidebar_label: "ADR Vault" +sidebar_position: -3 +description: "The complete index of Zenzic Architectural Decision Records — every major technical choice, its context, and its permanent consequences." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR Vault + +> *"A tool that works for mysterious reasons is not a tool — it is a ritual. +> Zenzic works for documented reasons. This vault is the proof."* + +This page is the complete index of **Architectural Decision Records (ADRs)** for +the Zenzic project. Each ADR documents a major technical decision: its context +(why the problem existed), its decision (what was chosen), and its invariants +(what must never change as a consequence). + +ADRs are the **immutable memory** of the project. 
They explain not only what +Zenzic does, but why — so that future contributors can extend the system without +unknowingly violating the constraints that make it trustworthy. + +--- + +## Genesis Decisions + +These ADRs were established before the first public release. They define the +philosophical and technical foundations on which all subsequent decisions rest. + +| ADR | Title | Sprint | +|-----|-------|--------| +| [ADR 001](./adr-lint-source.mdx) | Lint the Source, Not the Build | Genesis (pre-v0.1.0) | +| [ADR 002](./adr-zero-subprocesses.mdx) | Zero Subprocesses Policy | Genesis (pre-v0.1.0) | + +--- + +## Core Architecture Decisions + +These ADRs document the structural decisions made during the active development +of Zenzic v0.6.x and v0.7.0. + +| ADR | Title | Sprint | +|-----|-------|--------| +| [ADR 003](./adr-discovery.mdx) | Root Discovery Protocol | D036 / ZRT-005 | +| [ADR 004](./adr-decentralized-cli.mdx) | Decentralized CLI Package | D062-B / D064 | +| [ADR 005](./adr-agnostic-universalism.mdx) | Z404 Agnostic Universalism | D087 | +| [ADR 007](./adr-sovereign-sandbox.mdx) | Sovereign Sandbox | D043 | +| [ADR 008](./adr-bilingual-structural.mdx) | Bilingual Structural Invariant | D045 | +| [ADR 009](./adr-path-sovereignty.mdx) | Path Sovereignty | CEO-052 | + +--- + +## Documentation Site Decisions + +These ADRs document architectural decisions specific to this documentation site +(`zenzic.dev`) — choices about how the Docusaurus site is built, localized, and +maintained. + +| ADR | Title | Sprint | +|-----|-------|--------| +| [ADR 006](./adr-unified-perimeter.mdx) | Unified Perimeter (Storage + Blog) | CEO 051 | +| [ADR 011](./adr-cross-instance-allowlist.mdx) | Cross-Instance Absolute Path Allowlist | EPOCH 5 (v0.7.0) | + +--- + +## Reading Guide + +Each ADR follows a consistent structure: + +- **Context** — the problem that existed before the decision was made. 
Reading + + the Context of an ADR tells you what pain the decision was eliminating. + +- **Decision** — the choice that was made, stated precisely and without + + ambiguity. If you ever wonder "why does Zenzic do X?", the Decision section + of the relevant ADR is the answer. + +- **Rationale** — the engineering reasoning behind the decision. This section + + is the "why not the alternative?" — it records the rejected approaches and + explains why they were insufficient. + +- **Invariants** — the constraints that must never be violated as a consequence + + of the decision. These are permanent. They do not expire with version + increments. A PR that violates an invariant listed in an ADR is an automatic + revert candidate, regardless of its other merits. + +- **Consequences** — the known trade-offs and capabilities that the decision + + enables or forecloses. Reading Consequences helps contributors understand the + boundaries of what Zenzic can and cannot do by design. + +--- + +## Adding a New ADR + +When a significant architectural decision is made — one that constrains future +contributors or resolves a structural tension — it must be recorded here. + +1. Create `developers/explanation/adr-.mdx` with the next + + available ADR number. + +2. Create the Italian mirror at the corresponding path in `i18n/it/`. +3. Add both files to the table above in the appropriate section. +4. Record the decision in the `[ADR]` section of the relevant Zenzic Ledger + + (`.github/copilot-instructions.md`) in the repository where the decision + was implemented. + +The ADR is permanent once published. To amend a decision, add a new ADR that +references the original and documents the amendment — never rewrite history. 
diff --git a/developers/explanation/adr-zero-subprocesses.mdx b/developers/explanation/adr-zero-subprocesses.mdx new file mode 100644 index 0000000..266cd43 --- /dev/null +++ b/developers/explanation/adr-zero-subprocesses.mdx @@ -0,0 +1,173 @@ +--- +sidebar_label: "ADR 002: Zero Subprocesses" +sidebar_position: -1 +description: "ADR 002: The Security & Portability Decision — why Zenzic is 100% pure Python and never spawns external processes." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# ADR 002: Zero Subprocesses Policy + +**Status:** Active (Genesis Decision) +**Decider:** Architecture Lead +**Date:** 2026-01-01 (founding principle, pre-v0.1.0) + +--- + +## Context + +Many documentation tools that need to understand multiple build engines solve +the problem by **delegating to those engines**: they call `mkdocs build`, +`npm run build`, or `node scripts/generate-nav.js` as subprocesses, then +parse the output. This approach appears pragmatic — it re-uses the engine's +own logic rather than reimplementing it. + +In practice, subprocess delegation creates a cascade of problems that become +acute in enterprise CI/CD environments: + +1. **Security surface.** A tool that executes arbitrary subprocesses in the + + context of a documentation repository becomes a **code execution vector**. + Any `Makefile`, `justfile`, or `package.json` `scripts` entry near the + documentation root is potentially reachable. In repositories with complex + monorepo structures, the boundary between "running the doc validator" and + "running project build scripts" becomes dangerously blurred. + +2. **Portability collapse.** A subprocess call to `node` requires Node.js to + + be installed at a specific path. A call to `mkdocs` requires the MkDocs + virtual environment to be active. In Docker containers, GitHub Actions + runners, and air-gapped CI systems, the presence of these binaries cannot + be assumed. 
A tool that requires Node.js to validate a Markdown repository + is not portable — it is fragile. + +3. **Version coupling.** When the subprocess's binary is upgraded independently + + of the validator, output format changes silently break the parser. The + validator is now coupled to the binary's `--format json` contract, which + may not be stable across minor versions. + +4. **Performance overhead.** Starting a Node.js process, loading `docusaurus` + + dependencies, and building a partial site map takes 5–30 seconds. Performing + this for every CI run, for every file change, makes incremental development + loops slow. For a tool that is supposed to be a fast pre-commit gate, this + is unacceptable. + +5. **Zero-Trust violation.** In regulated or security-sensitive environments, + + a CI gate that executes code from the repository being validated is a + trust-boundary violation. The validator must be a **passive reader**, not + an **active executor**, to satisfy Zero-Trust CI requirements. + +--- + +## Decision + +> **The Zenzic core is 100% pure Python. No subprocess call, no `os.system`, +> no external binary execution, and no network access occurs during analysis.** + +Every piece of information that Zenzic needs about a documentation engine's +behavior is extracted through **static parsing** of configuration files: + +| Engine config | Parsing method | +|---|---| +| `docusaurus.config.ts` | Pure-Python regex extraction of `to:`, `href:`, `docId:`, and `themeConfig` fields | +| `mkdocs.yml` | PyYAML — pure Python, no subprocess | +| `zensical.toml` | `tomllib` / `tomli` — pure Python, no subprocess | +| `pyproject.toml` | `tomllib` / `tomli` — pure Python, no subprocess | +| `sidebars.ts` | Pure-Python regex extraction of doc IDs and paths | + +The constraint is enforced at the module level: `core/` contains no `import +subprocess` statement. 
This is verifiable by static analysis and is covered by +the `test_cli_e2e.py` test suite, which monkey-patches `subprocess` and asserts +it is never reached during any analysis path. + +--- + +## Rationale + +### 1. Zero-Trust Execution + +Zenzic is a validator — its security model requires that it be a **passive +reader** of the repository, not an active participant in its build system. A +tool that executes `package.json` scripts or `Makefile` targets as part of its +analysis cannot be granted Zero-Trust status in a regulated CI environment. +The subprocess prohibition is not a performance optimization — it is a +**security invariant**. + +### 2. Portability is Non-Negotiable + +Zenzic runs via `uvx zenzic` — a single command that requires only Python and +`uv` on the PATH. No Node.js, no npm, no MkDocs, no Jekyll, no Hugo. This +install profile works identically on Ubuntu 22.04, Windows 11, macOS Sequoia, +Alpine Linux Docker containers, and air-gapped CI runners. The moment Zenzic +adds a subprocess call, it inherits the portability matrix of the subprocess +target. + +### 3. Static Analysis is Sufficient + +The concern that static parsing of TypeScript config files is fragile is valid +but manageable. The adapter layer uses conservative regex patterns that target +structural constants in each engine's configuration format — properties like +`to:`, `href:`, and `docId:` that are part of the engine's public API and +change infrequently. When an engine changes its config format, the adapter +is updated — a contained, testable change. This is preferable to subprocess +coupling, where a binary version bump silently breaks the output parser. + +### 4. Speed as a First-Class Requirement + +A pre-commit gate that takes 30 seconds is a gate that developers disable. +Zenzic's source-based, subprocess-free analysis completes in 1–5 seconds for +most documentation repositories. 
This speed is a consequence of the subprocess +prohibition: there is no process startup overhead, no dependency installation, +no partial build to execute. Pure Python on warm bytecode cache is consistently +fast. + +--- + +## Invariants (Non-Negotiable) + +- No file in `src/zenzic/` may contain `import subprocess`, `import os` used + + for `os.system`/`os.popen`, or any equivalent mechanism for spawning + external processes. + +- No file in `src/zenzic/` may make HTTP requests (no `urllib`, no `requests`, + + no `httpx`) during analysis. External URL validation (Z103) uses only socket- + level connectivity checks, which are isolated in the dedicated external link + checker module and are explicitly opt-in. + +- TypeScript and JavaScript configuration files are parsed as text, not executed. + + Any "execution" of a config file — even via a sandboxed Node.js `eval` — is + permanently forbidden. + +- The `test_cli_e2e.py` test suite must include at least one test that verifies + + `subprocess.run` is never called during a `check all` invocation. + +--- + +## Consequences + +- Zenzic cannot validate documentation that is generated entirely at runtime + + (e.g., API docs generated from source code annotations via `mkdocstrings`). + This is an intentional scope boundary — Zenzic validates the **authored** + documentation, not the generated portions. Generated sections are outside + the Safe Harbor perimeter by definition. + +- Configuration files written in languages that require execution to evaluate + + (e.g., Starlark `BUILD` files, Python-based `mkdocs_macros` plugins) are + parsed conservatively. Zenzic extracts what static analysis can safely + determine and treats the rest as opaque. + +- The subprocess prohibition means Zenzic cannot auto-detect the installed + + version of the documentation engine. 
Version-specific behavior differences + are handled by adapter configuration (e.g., `engine: "docusaurus"` in + `zenzic.toml`) rather than runtime version negotiation. diff --git a/developers/explanation/architecture-gaps.mdx b/developers/explanation/architecture-gaps.mdx new file mode 100644 index 0000000..d9a8f3f --- /dev/null +++ b/developers/explanation/architecture-gaps.mdx @@ -0,0 +1,273 @@ +--- +sidebar_label: "Architectural Gaps" +sidebar_position: 5 +description: "Architecture gaps identified and closed in v0.7.0, with open items planned for v0.8.0." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# Zenzic — Architectural Gaps & Roadmap + +> *"What is not documented, does not exist; what is documented poorly, is an ambush."* +> +> This page tracks gaps that were closed during the v0.7.0 cycle and those that +> remain open for the v0.8.0 roadmap. It is a living document — updated each sprint. + +--- + +## Open — Target v0.8.0 + +### GAP-001 — Auto-Fix Engine + +**Component:** `cli/_check.py`, new `core/fixer.py` +**Description:** Zenzic detects but does not repair. A contributor who receives a Z501 +(placeholder) or Z502 (short content) finding must locate and edit the file manually. +An Auto-Fix engine would apply safe, reversible patches directly to source files — +replacing placeholder tokens, stubbing short sections, and reporting what was changed. + +**Planned semantics:** + +```bash +zenzic fix all # dry-run by default: shows diff, writes nothing +zenzic fix all --apply # writes changes; staged via git diff +zenzic fix links # fixes only Z101/Z104 (dead links) — renames or stubs +``` + +**Design constraints:** + +- Auto-fix must never touch files that triggered Z201 (Shield secret) — those require + + human judgment. + +- Exit code semantics are unchanged: `--apply` still exits 1 if unfixed findings remain. +- Pure Python, no subprocess (Pillar 2). + +**Status:** Design phase. No code merged. 
+ +--- + +### GAP-002 — Dynamic Navbar/Footer Plugin Support + +**Component:** `core/adapters/_docusaurus.py`, `_parse_config_navigation()` +**Description:** Docusaurus supports navbar items declared via `@docusaurus/plugin-*` +plugins (e.g. `plugin-content-docs` multi-instance, custom navbar components). When +the navbar is populated dynamically at build time, Zenzic's static regex parser cannot +extract those paths — it falls back to treating all files as `REACHABLE`. + +**Impact:** Low false-positive risk (the fallback is conservative), but some true orphans +may be missed in plugin-heavy configurations. + +**Planned resolution:** A structured warning (`::warning` annotation in CI mode) when +Zenzic detects dynamic navbar plugins, indicating that orphan detection may be +incomplete. The user can suppress it with `dynamic_nav_plugins = true` in `zenzic.toml`. + +**Status:** Tracked. RFC open. + +--- + +## Closed in v0.7.0 — Operation Obsidian Stress {/* zenzic:ignore Z905 */} + +:::info[What was the Operation?] + +Before the v0.7.0 release, four AI agents were instructed to break Zenzic's Shield +(credential scanner) using realistic bypass techniques. They found four real vectors. +All were closed before stable release. + +See the full technical post-mortem: [AI Red Team Attacks Code Linter](https://zenzic.dev/blog/ai-driven-siege-shield-postmortem) + +::: + +### ZRT-001 — Unicode Normalization Bypass (Shield) + +**Identified by:** AI Red Team agent "Alpha" during Operation Obsidian Stress {/* zenzic:ignore Z905 */} +**Component:** `core/shield.py`, `scan_lines_with_lookback()` +**Description:** The Shield's regex patterns matched ASCII credential shapes. An attacker +controlling a Markdown file could insert a Unicode lookalike character (e.g. `ghp_…` +using fullwidth Latin letters) into what appeared to be a token. The Shield would not +fire because the byte sequence did not match the ASCII pattern. 
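
The mismatch is easy to reproduce. The following is a minimal sketch that assumes a simplified token pattern (`TOKEN_SHAPE` is hypothetical; the real Shield patterns are not shown in this document). It shows a fullwidth spelling slipping past an ASCII-only regex, and the same line being caught once it is matched against an NFKC-normalized copy.

```python
import re
import unicodedata

# Hypothetical, simplified pattern for a GitHub-style token prefix.
TOKEN_SHAPE = re.compile(r"ghp_[A-Za-z0-9]{8,}")


def matches_raw(line: str) -> bool:
    """Match against the line exactly as it appears in the file."""
    return TOKEN_SHAPE.search(line) is not None


def matches_normalized(line: str) -> bool:
    """Match against an NFKC-normalized copy; the original line is left untouched."""
    return TOKEN_SHAPE.search(unicodedata.normalize("NFKC", line)) is not None


# "ghp" spelled with fullwidth Latin letters (U+FF47, U+FF48, U+FF50); the rest is a fake value.
disguised = "token: \uff47\uff48\uff50_EXAMPLE0000000000"
plain = "token: ghp_EXAMPLE0000000000"
```

Here `matches_raw(disguised)` is false while `matches_normalized(disguised)` is true, and both helpers agree on `plain`.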
+ +**Resolution:** Unicode normalization (NFKC) is applied to each line before pattern +matching. `unicodedata.normalize("NFKC", line)` collapses fullwidth, superscript, +enclosed, and other Unicode lookalikes to their ASCII canonical form. The original line +content is preserved for output; only the normalized copy is matched against. + +**Lesson:** Regex-based credential detection must normalize input. The attack surface +is not the pattern — it is the encoding. + +**Closed in:** v0.7.0 sprint D038. + +--- + +### ZRT-002 — Lookback Buffer Escape (Shield) + +**Identified by:** AI Red Team agent "Bravo" during Operation Obsidian Stress {/* zenzic:ignore Z905 */} +**Component:** `core/shield.py`, `scan_lines_with_lookback()` +**Description:** The Shield's lookback buffer was used to detect multi-line credential +constructs (e.g. a `password:` key on one line, the value on the next). Agent Bravo +inserted a sufficiently long "filler" block (> buffer size) between the key and value +lines. The buffer emptied before the value line was scanned, breaking the association +and suppressing the Z201 finding. + +**Resolution:** The lookback buffer size was validated against the maximum known +multi-line credential pattern length in the registry (`codes.py`). The buffer is now +guaranteed to span the maximum pattern window. Additionally, the buffer is flushed on +file boundaries only — never mid-file. + +**Lesson:** Buffer-based detection requires formal sizing against the worst-case +pattern. An informal "large enough" buffer is not a security guarantee. + +**Closed in:** v0.7.0 sprint D039. + +--- + +### ZRT-003 — HTML Entity Obfuscation (Shield) + +**Identified by:** AI Red Team agent "Charlie" during Operation Obsidian Stress {/* zenzic:ignore Z905 */} +**Component:** `core/shield.py` +**Description:** The Shield scanned raw Markdown bytes. Agent Charlie used HTML entity +encoding (`ghp_…` for `ghp_…`) inside fenced code blocks. 
The Shield's patterns +did not match the entity-encoded form, allowing a fake credential to pass undetected. + +**Resolution:** A lightweight HTML entity decoder is applied to each line before Shield +pattern matching (after NFKC normalization). The decoder handles numeric (e.g. `&#103;` for `g`) and +named (e.g. `&amp;` for `&`) entities. XML/HTML character references are normalized to their Unicode +codepoints before the regex runs. + +**Lesson:** Multi-encoding defense requires layered normalization. A single normalization +pass (NFKC only) is insufficient when HTML rendering is part of the content pipeline. + +**Closed in:** v0.7.0 sprint D040. + +--- + +### ZRT-004 — Fenced Block Scope Confusion (Shield) + +**Identified by:** AI Red Team agent "Delta" during Operation Obsidian Stress {/* zenzic:ignore Z905 */} +**Component:** `core/shield.py`, fenced block state machine +**Description:** The Shield originally skipped scanning inside triple-backtick fenced +blocks, reasoning that code examples are not live secrets. Agent Delta embedded a +`ghp_` pattern inside a `bash` fenced block. The Shield did not fire. + +**Resolution after deliberation:** The "skip fenced blocks" heuristic was **reversed**. +The Shield now scans all lines, including fenced code blocks. The rationale: a +documentation file that leaks a real credential inside a `bash` block is still leaking +a real credential. The example nature of the block is irrelevant to the security outcome. + +A `# zenzic: ignore-next-line` comment is the authorized mechanism for authors who need +to include a credential-shaped string in a documented example (e.g. showing the format +of a GitHub token without using a real one). The `examples/matrix/red-team/` fixtures +demonstrate this pattern. + +**Lesson:** Heuristic scope exclusions that reduce false positives often create false +negatives in adversarial conditions. Security-critical passes should default to +**scan everything, authorize exceptions explicitly**. + +**Closed in:** v0.7.0 sprint D041.
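
The layered normalization from ZRT-003 and the "scan everything, authorize exceptions explicitly" rule from ZRT-004 can be combined in a short sketch. Several parts are assumptions: `TOKEN_SHAPE` is a simplified hypothetical pattern, the standard-library `html.unescape` stands in for the lightweight entity decoder, and `scan()` is an illustrative function rather than the Shield's actual API.

```python
from __future__ import annotations

import html
import re
import unicodedata

TOKEN_SHAPE = re.compile(r"ghp_[A-Za-z0-9]{8,}")  # simplified, hypothetical pattern
IGNORE_MARKER = "# zenzic: ignore-next-line"
FENCE = "`" * 3  # built programmatically so this example nests safely in a fenced block


def normalize(line: str) -> str:
    """NFKC first, then entity decoding, in the order described for ZRT-003."""
    return html.unescape(unicodedata.normalize("NFKC", line))


def scan(lines: list[str]) -> list[int]:
    """Return 1-based numbers of lines that look like a token, fenced or not."""
    findings: list[int] = []
    skip_next = False
    for number, line in enumerate(lines, start=1):
        if skip_next:
            # The previous line carried the explicit authorization marker.
            skip_next = False
            continue
        if IGNORE_MARKER in line:
            skip_next = True
            continue
        if TOKEN_SHAPE.search(normalize(line)):
            findings.append(number)
    return findings


sample = [
    FENCE + "bash",
    "export TOKEN=&#103;hp_EXAMPLE0000000000",  # entity-encoded "g", inside a fence
    IGNORE_MARKER,
    "export TOKEN=ghp_EXAMPLE0000000000",  # explicitly authorized example
    FENCE,
]
```

With this input, `scan(sample)` reports only line 2: the fence offers no cover for the entity-encoded token, and the marker suppresses exactly one following line.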
+ +--- + +## Closed Earlier (Pre-v0.7.0) + +### ZRT-005 — Bootstrap Paradox + +**Component:** `core/scanner.py` +**Description:** `zenzic init` crashed with a configuration error when invoked in an +empty directory. The `find_repo_root()` function had no fallback, making it impossible +to initialize a project that did not yet have a `.git` or `zenzic.toml` marker. +**Resolution:** `fallback_to_cwd=True` parameter added to `find_repo_root()`, used +exclusively by `zenzic init`. See [ADR 003](adr-discovery.mdx). +**Closed in:** v0.6.0a4. + +--- + +### ZRT-006 — VSM Bypass: Absolute Slug Links Skipped Silently + +**Component:** `core/validator.py` — Phase 2 link validation loop + +**Description:** When a Docusaurus project declares `routeBasePath`-owned prefixes +(e.g. `/blog/`) via `get_absolute_url_prefixes()`, the validator suppresses Z105 +(ABSOLUTE_PATH) for links starting with those prefixes. The suppression was +implemented as a bare `continue`, which exited the per-link iteration before the +VSM lookup — making Z001 impossible to fire on absolute prefix-owned links. + +A second compounding issue: `DocusaurusAdapter.set_slug_map()` was never called +during `validate_links_async()`, so the slug map was empty at VSM construction time. +Blog posts declaring `slug: my-post` in frontmatter were routed via filename +derivation instead (e.g. `2026-04-29-my-post` → `/blog/my-post/`), producing a VSM +that diverged from the URLs Docusaurus actually served. + +**Combined effect:** A link `/blog/wrong-slug` where the real slug was +`/blog/correct-slug` produced no finding from Zenzic, while `docusaurus build` failed +with a broken-link error. The sentinel was blind to the most common post-rename failure +mode. + +**Resolution:** Two coordinated fixes in `core/validator.py`: + +1. **Lifecycle ordering** — `adapter.set_slug_map(md_contents)` is now called (via + `hasattr` guard for cross-engine safety) immediately before `build_vsm()`. 
The VSM + is built on the correct virtual identity, not the physical filename. + +2. **Scoped VSM lookup** — After Z105 suppression, the validator checks whether the + matched prefix has at least one route in the VSM (`_scanned_vsm_prefixes`). If so, + it performs a `dict.get()` lookup and reports `FILE_NOT_FOUND` when the route is + absent. Prefixes with no VSM entries (sibling plugins whose markdown is outside the + scan scope) retain the unconditional bypass — Zero-Config invariant preserved. + +**Cross-engine impact:** MkDocs, Zensical, and Standalone adapters do not implement +`set_slug_map()`. The `hasattr` guard makes the call a no-op for those engines — no +behaviour change. + +**Regression lock:** `tests/test_docusaurus_blog_vsm.py` — class +`TestAbsoluteSlugMismatch` — two new tests: +- `test_absolute_broken_blog_link_is_detected` — wrong slug raises `FILE_NOT_FOUND` +- `test_correct_absolute_slug_link_is_clean` — correct slug produces no error + +**Closed in:** v0.7.0. + +### D100 — Privacy Gate Migration: `.zenzic.dev.toml` → `.zenzic.local.toml` + +**Component:** `cli/_standalone.py`, `models/config.py`, `core/shield.py`, `core/codes.py` + +**Description:** The original D002 Environmental Privacy Gate (`_scaffold_dev_toml`) created +`.zenzic.dev.toml` with a `[development_gate]` table that held `forbidden_patterns` for +export redaction. This file was not integrated into the Shield scanning pipeline — +it served only as a local redaction hint for export tooling. The patterns were never +checked against documentation content, so a developer could inadvertently publish a +document containing a forbidden code-name without any Zenzic warning. + +This gap created a false sense of security: users configured `forbidden_patterns` +expecting Zenzic to block those terms from documentation, but the scan never happened. + +**Resolution (Sprint D100 — v0.7.0):** + +1. 
**New canonical file:** `.zenzic.local.toml` replaces `.zenzic.dev.toml` as the + machine-local, git-ignored privacy configuration. It is a flat TOML file with a + top-level `forbidden_patterns = [...]` key. + +2. **Automatic `.gitignore` management:** `zenzic init` now always scaffolds + `.zenzic.local.toml` and appends the filename to `.gitignore` if the file exists + and the entry is absent. No manual step required. + +3. **Config deep-merge:** `ZenzicConfig.load()` performs an additive merge of + `forbidden_patterns` from `.zenzic.local.toml` after loading the primary config + (`zenzic.toml` or `[tool.zenzic]`). Duplicates are removed; insertion order is preserved. + +4. **Z204 FORBIDDEN_TERM — Exit 2:** `scan_line_for_forbidden_terms()` in `core/shield.py` + performs a case-insensitive verbatim substring scan against the merged `forbidden_patterns` + list. Any match on any line of any documentation file is emitted as a `SecurityFinding` + with `secret_type="FORBIDDEN_TERM"`. The scanner bridges this to Z204 (not Z201), + preserving clear separation between credential leaks and forbidden-term violations. + +5. **Backward compatibility:** `_scaffold_dev_toml()` is retained as a shim that + delegates to `_scaffold_local_toml()`. No external callers need updating. + +**Brand Integrity Shield — Two-Layer Design:** +The Z204 Privacy Gate and the Z905 Brand Obsolescence Guard form a two-layer architecture: +- **Z204** (`forbidden_patterns` in `.zenzic.local.toml`): exit 2, non-suppressible. + Designed for private terms that must never appear in any published doc. +- **Z905** (`obsolete_names` in `zenzic.toml`): exit 1, suppressible with `zenzic:ignore Z905`. + Designed for deprecated brand terms where historical references in CHANGELOG files + are acceptable. + +**Closed in:** v0.7.0 sprint D100. 
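The merge and scan semantics in steps 3 and 4 can be sketched as follows. This is a hedged illustration under stated assumptions: `merge_forbidden_patterns` and `find_forbidden_terms` are hypothetical names (the real entry points are `ZenzicConfig.load()` and `scan_line_for_forbidden_terms()`, whose signatures are not shown here), and duplicate removal is assumed to be exact-match.

```python
def merge_forbidden_patterns(primary: list[str], local: list[str]) -> list[str]:
    """Additive merge: primary entries first, then local ones.

    Exact duplicates are dropped and insertion order is preserved.
    """
    return list(dict.fromkeys([*primary, *local]))


def find_forbidden_terms(line: str, patterns: list[str]) -> list[str]:
    """Case-insensitive verbatim substring match, with no regex interpretation."""
    lowered = line.casefold()
    # Empty patterns are ignored; an empty string would match every line.
    return [p for p in patterns if p and p.casefold() in lowered]
```

Because the match is a plain substring test, characters such as `.` or `*` in a pattern carry no special meaning in this sketch.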
diff --git a/developers/explanation/engineering-ledger.mdx b/developers/explanation/engineering-ledger.mdx new file mode 100644 index 0000000..ce9028f --- /dev/null +++ b/developers/explanation/engineering-ledger.mdx @@ -0,0 +1,136 @@ +--- +sidebar_label: "The Engineering Ledger" +sidebar_position: 6 +description: "The three architectural pillars of Zenzic v0.7.0 — Zero Assumptions, Subprocess-Free, and Deterministic Graph — explained as proofs of strength." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# The Engineering Ledger + +> *"A tool that works for mysterious reasons is not a tool — it is a ritual. +> Zenzic works for documented reasons. This page is the proof."* + +This page is the technical manifesto behind every decision that makes Zenzic v0.7.0 +reliable enough to be called a **Safe Harbor**. It belongs in the Developer quadrant +because the user does not need to know this to be protected — but the contributor +must understand it to extend the system without breaking the contract. + +--- + +## Pillar 1 — Zero Assumptions: Lint the Source, Not the Build {#pillar-zero-assumptions} + +**The user-facing benefit:** Zenzic catches broken links, leaked credentials, and structural +errors *before* the documentation build starts — eliminating an entire class of CI failures +that only surface after a 3-minute build wait. + +**The engineering principle:** Analysis is performed exclusively on raw Markdown source files +and their static configuration (e.g., `mkdocs.yml`, `docusaurus.config.ts`, `zenzic.toml`). +No HTML output is parsed. No web server is started. No build engine is invoked. + +**Why this matters technically:** Build output is a transformation of the source. Validating +the output means trusting the transformation is correct — which it is not, by definition, +when the source has structural errors. 
Zenzic validates the invariants of the source so that +the build can be trusted to produce a correct output. + +**Implementation:** The **Virtual Site Map (VSM)** is the proof. `core/vsm.py` builds a +complete in-memory projection of the final site from source files alone, using adapter-specific +knowledge of how each engine maps files to URLs. Ghost Routes (i18n fallbacks, versioned slugs) +are modelled without running the build that would produce them. + +**Invariant:** `core/` never calls `subprocess.run`. No file in `src/zenzic/` spawns an +external process. This is verified by the `test_cli_e2e.py` test suite, which monkey-patches +`subprocess` and asserts it is never reached. + +--- + +## Pillar 2 — Subprocess-Free: 100% Pure Python {#pillar-subprocess-free} + +**The user-facing benefit:** Zenzic runs identically on Ubuntu, Windows, and macOS — in a +Docker container, a GitHub Actions runner, or a developer's laptop — with no hidden system +dependencies. `uvx zenzic check all .` works the same everywhere. + +**The engineering principle:** Every analysis function is a pure Python function. The only +permitted I/O is file reading (source files) and writing (reports, snapshots). No shell +commands, no `os.system`, no `subprocess.run`, no Node.js execution, no external binaries. + +**Why this matters technically:** Subprocesses introduce platform-specific behaviour, PATH +sensitivity, and non-deterministic timing. A CI gate that behaves differently on the +developer's machine and in production is not a gate — it is noise. Zero subprocesses means +`zenzic check all` on your laptop produces the same exit code as `zenzic check all` in CI. + +**Implementation:** The 3×3 CI matrix (OS: `[ubuntu, windows, macos]` × Python: +`[3.11, 3.12, 3.13]`) is the ongoing proof. 1,260+ tests pass on all nine combinations +as of v0.7.0. 
Property-based tests (Hypothesis, `ci` profile: 500 examples) stress-test +core functions across the input space to surface platform-specific edge cases. + +**Invariant (RULE R08):** Codified in `copilot-instructions.md` as a permanent non-negotiable. +Any PR that introduces a subprocess call fails the review gate immediately. + +--- + +## Pillar 3 — Deterministic Graph: Pure Functions First {#pillar-deterministic-graph} + +**The user-facing benefit:** Running `zenzic check all` twice on the same source produces +the same report, the same exit code, and the same score. There are no race conditions, no +cache invalidation surprises, and no flaky CI failures caused by Zenzic itself. + +**The engineering principle:** Analysis logic is pure and deterministic. I/O is isolated at +the edges (Discovery reads files; Reporting writes results). The hot-path loops — +link validation, credential scanning, orphan detection — contain zero file system calls, +zero random state, and zero shared mutable state. + +**Why this matters technically:** Non-deterministic analysis tools create a particularly +damaging failure mode: they teach engineers to re-run CI instead of fixing the root cause. +A tool that occasionally passes on the same input trains the team to ignore it. +Determinism is the foundation of trust. + +**Implementation:** + +```python +# core/scorer.py — D092 Quartz Penalty Scorer +def compute_score(findings_counts: dict[str, int]) -> ScoreReport: + """ + No I/O. No side effects. Same inputs → same output, on every OS, + in every Python version, at any time of day. + findings_counts keys are Zxxx codes (e.g. "Z101", "Z402"). + """ + ... +``` + +The `AdaptiveRuleEngine` in `core/rules.py` auto-selects between sequential and parallel +execution at the 50-file threshold — but both modes produce identical findings. Parallelism +is a performance optimization; it does not alter the analysis result. 
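The claim that both execution modes yield identical findings can be illustrated with a small sketch. It is not the `AdaptiveRuleEngine` itself: `check_file`, `run_checks`, the thread pool, and the exact threshold comparison are assumptions chosen for the example. The only facts taken from the text are the 50-file threshold and the requirement that the per-file check is pure.

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_THRESHOLD = 50  # from the text; whether the comparison is >= or > is assumed


def check_file(path: str, text: str) -> list[str]:
    """Pure per-file check: no I/O, no shared state (hypothetical rule)."""
    return [
        f"{path}:{number}: placeholder text"
        for number, line in enumerate(text.splitlines(), start=1)
        if "TODO" in line
    ]


def run_checks(files: dict[str, str]) -> list[str]:
    """Run the check over all files, in parallel above the threshold."""
    items = sorted(files.items())  # fixed order, independent of dict insertion
    if len(items) < PARALLEL_THRESHOLD:
        per_file = [check_file(path, text) for path, text in items]
    else:
        with ThreadPoolExecutor() as pool:
            # Executor.map returns results in input order, so the merged
            # findings match the sequential run exactly.
            per_file = list(pool.map(lambda item: check_file(*item), items))
    return [finding for findings in per_file for finding in findings]
```

The design choice worth noting is that determinism comes from two things: a pure `check_file` and an ordering fixed before any work is dispatched.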
+ +**Invariant (RULE R01, RULE R03):** No `Path.exists()` or `open()` inside link/file +validation loops. Every `Finding` object carries a `Zxxx` code from `codes.py`. The +`_to_findings()` function in `cli/_check.py` is the single authorised conversion point. + +--- + +## The Exit Code Contract {#exit-code-contract} + +The three pillars converge in the exit code contract — the most visible proof that +Zenzic is a principled tool, not a heuristic scanner: + +| Exit | Meaning | Suppressible? | +|:---:|---|:---:| +| `0` | All checks passed — documentation is clean | — | +| `1` | Quality findings (broken links, orphans, placeholders, etc.) | ✅ `--exit-zero` | +| `2` | **Shield — credential detected (Z201)** | ❌ **Never** | +| `3` | **Blood Sentinel — path traversal / fatal (Z202/Z203)** | ❌ **Never** | + +Exit codes 2 and 3 are enforced at the CLI layer (`cli/_check.py`) before any +`--exit-zero` flag is consulted. The check is not conditional — it is structurally prior. +No configuration, flag, or environment variable can suppress a security exit. + +This is not a policy decision. It is a **proof of correctness**: a CI gate that can be +silenced on a credential leak is not a security gate. It is a checkbox. 
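"Structurally prior" can be read as an ordering of branches. The sketch below is an assumption-laden illustration, not the code in `cli/_check.py`: the `resolve_exit_code` name is hypothetical, it covers only the codes listed in the table above, and giving exit 3 precedence over exit 2 when both occur is an assumption.

```python
SENTINEL_CODES = {"Z202", "Z203"}  # exit 3 in the table above
SHIELD_CODES = {"Z201"}            # exit 2 in the table above


def resolve_exit_code(codes: set[str], exit_zero: bool = False) -> int:
    """Map the set of finding codes to a process exit code.

    The security branches return before `exit_zero` is ever read,
    so the flag cannot influence them.
    """
    if codes & SENTINEL_CODES:
        return 3
    if codes & SHIELD_CODES:
        return 2
    # Only quality findings remain; this is the one place the flag applies.
    if codes and not exit_zero:
        return 1
    return 0
```

For example, `resolve_exit_code({"Z201"}, exit_zero=True)` still returns `2` in this sketch, while `resolve_exit_code({"Z101"}, exit_zero=True)` returns `0`.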
+ +## Further Reading + +- [ADR — Architectural Decision Records](./adr-discovery) — the full decision log +- [Architecture](./architecture-gaps) — open gaps and v0.8.0 roadmap +- [Finding Codes Reference](/docs/reference/finding-codes) — the full `Zxxx` catalogue +- [Safe Harbor](/docs/explanation/safe-harbor) — the philosophy behind the engineering diff --git a/developers/governance/_category_.json b/developers/governance/_category_.json new file mode 100644 index 0000000..d213229 --- /dev/null +++ b/developers/governance/_category_.json @@ -0,0 +1,5 @@ +{ + "label": "Governance & Sovereignty", + "position": 5, + "collapsible": true +} diff --git a/developers/governance/adversarial_ai.mdx b/developers/governance/adversarial_ai.mdx new file mode 100644 index 0000000..a90fd01 --- /dev/null +++ b/developers/governance/adversarial_ai.mdx @@ -0,0 +1,185 @@ +--- +sidebar_label: "Adversarial Stress-Testing Protocol" +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# Adversarial Stress-Testing — AI as Punching Bag + +> **`Adversarial Stress-Testing — AI as Punching Bag`** +> +> *AI does not co-author Zenzic. It is the punching bag we hit until the +> design stops bleeding.* + +--- + +## 1. The Arena + +Zenzic development operates as an **adversarial arena**. The rules are simple: + +- **Humans** decide the architecture. The Three Pillars, the VSM design, the Shield + + pipeline, the Blood Sentinel perimeter — these are strategic human choices. + +- **AI** is deployed as a controlled Red Team. Its mission is to find logical flaws, + + Pillar violations, and security weaknesses in what the human has already decided. + +| Role | Function | +| :--- | :--- | +| **Human Architect** | **Integrity Gatekeeper.** Decides strategy. Sets invariants. Owns liability. Ratifies or rejects every AI finding. | +| **AI Red Team** | **Adversarial Auditor.** Attacks assumptions. Attempts to violate the Three Pillars. 
Surfaces hidden coupling. Finds contradictions before they ship. | + +**The Core Rule:** + +> *"AI output is treated as an adversarial pull request. It must pass the Constitutional +> Audit before being merged. If the AI can propose a valid violation, it has found a +> real bug — not a style suggestion."* + +--- + +## 2. The Three Pillars as Stress-Test Targets + +Every AI-assisted session in Zenzic is framed as a direct attack on one or more of +the Three Pillars: + +### Pillar 1 — Lint the Source, Not the Build + +**The attack surface:** Can any analysis code be made to depend on HTML output, +compiled assets, or build artifacts? Can a rule be written that only fires after +a build completes? + +**The invariant:** Analysis operates on raw Markdown and configuration files. If +the AI finds a code path that reads `build/` or requires a build step before firing, +it has found a Pillar 1 violation. + +### Pillar 2 — Zero Subprocesses + +**The attack surface:** Can any code path lead to `subprocess.run`, `os.system`, +`os.popen`, `shutil.which` + exec, or any form of external process invocation? +Can a plugin contract be satisfied by a class that wraps a subprocess? + +**The invariant:** 100% pure Python. No exceptions. If the AI can write a +`BaseRule` subclass that calls a subprocess and still passes the `PluginContractError` +validation — that is a real contract vulnerability. + +### Pillar 3 — Pure Functions First + +**The attack surface:** Can analysis logic accumulate state between calls? Can a +rule hold a mutable counter that affects future findings? Can the `check()` method +make an I/O call that influences its output? + +**The invariant:** Analysis logic is deterministic. `check(file_path, text)` always +returns the same findings for the same inputs. If the AI can construct a valid +`BaseRule` implementation with a hidden `self._cache` that changes behavior on the +second call, it has found a Pillar 3 violation. + +--- + +## 3. 
Adversarial Session Types + +### Type A — Architecture Violation Hunt + +The AI is given the full codebase and tasked with finding any code that violates +an `[INVARIANT]` from the Zenzic Ledger. No guidance is given on where to look. + +**Outcome:** If a real violation is found, it is promoted to a bug and fixed +in the same sprint. If no violations are found, the session confirms architectural +soundness. + +### Type B — Regex Canary Attack (ZRT-002) + +The AI is tasked with constructing a regex pattern that: + +1. Would be accepted by `AdaptiveRuleEngine` construction (passes `_assert_regex_canary`) +2. Exhibits catastrophic backtracking on input sizes > 1 KiB + +This is a direct security stress-test on the ReDoS hardening. + +### Type C — Shield Bypass Hunt + +The AI is given the 8-stage normalization pipeline and tasked with constructing a +Markdown fragment that: + +1. Contains a real credential (from a known family in `_SECRETS`) +2. Passes through all 8 normalization stages undetected + +This is the most adversarial session type. Any successful bypass is a **Z201 +SHIELD_SECRET detection failure** — a Critical security finding requiring an +immediate patch and a new normalization stage. + +### Type D — Blood Sentinel Escape + +The AI is given the `InMemoryPathResolver._build_target()` implementation and tasked +with constructing a path string that: + +1. Is a valid relative Markdown link (parseable by the MDX renderer) +2. After `os.path.normpath()` collapse, resolves to a path outside `docs_root` +3. Does not contain obvious traversal sequences (literal `../`) + +Any successful escape is a **Z202 PATH_TRAVERSAL** false-negative — a Critical +security finding requiring immediate perimeter hardening. + +--- + +## 4. Governance Badge + +```text +AI-Tested / Human-Governed +``` + +This badge, visible in the Zenzic `README`, signals: + +1. **AI was used** — in adversarial sessions, as documented here. +2. 
**Humans decided** — every strategic choice, every invariant, every merge decision. +3. **Transparency** — you know exactly how AI was deployed in this project. + +Every sprint that involved adversarial sessions records the outcome in the Zenzic +Ledger `[ACTIVE SPRINT]` entry: sessions run, violations found, outcome. + +--- + +## 5. What AI Does Not Decide + +| Decision | Authority | +| :--- | :--- | +| Architectural Principles (Three Pillars) | Human — non-delegable | +| Finding code semantics (Zxxx registry) | Human — ratified in `core/codes.py` | +| Exit code contract (0/1/2/3) | Human — immutable | +| Sprint scope and release schedule | Human | +| Whether an AI finding is a real violation | Human Integrity Gatekeeper | + +AI proposes. AI attacks. AI does not ratify. + +--- + +## 6. FAQ + +**Q: Is Zenzic "written by AI"?** + +No. Zenzic is *stress-tested* by AI. The Three Pillars, the VSM architecture, the +Shield normalization pipeline, and the Blood Sentinel perimeter are human strategic +choices. AI is used to enforce these choices by attempting to break them. + +**Q: Can I attribute a bug to the AI?** + +No. The human Integrity Gatekeeper merged the code. The Integrity Gatekeeper owns +the bug. The AI's role was to catch it before merge — if it failed to, that is an +audit protocol failure, not a liability transfer. + +**Q: Does the AI have access to secrets or production systems during adversarial sessions?** + +No. Adversarial sessions operate on the open-source codebase only. The AI is given +source code and documentation. No credentials, no deployment keys, no production +access. The adversarial model is purely analytical. + +**Q: Why document this?** + +To avoid **Asymmetrical Information**. When you read a Zenzic ADR that says +*"the Shield has 8 normalization stages"*, you should know that those 8 stages +survived a Type C adversarial session where an AI attempted to construct bypass +payloads for each one. The rigor is deliberate. 
The transparency is part of the +security model. + +> *Saga VI: The Governance of Quartz — [read the chronicle](https://zenzic.dev/blog/governance-of-quartz)* diff --git a/developers/governance/evolution_policy.mdx b/developers/governance/evolution_policy.mdx new file mode 100644 index 0000000..7b2a5f4 --- /dev/null +++ b/developers/governance/evolution_policy.mdx @@ -0,0 +1,124 @@ +--- +sidebar_label: "Evolution Policy" +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# Evolution Policy: The Immutable Pillars + +> *"The Three Pillars do not evolve. They protect the things that do."* + +--- + +The Zenzic Evolution Policy governs how the project changes. Its first principle +is that **not everything can change** — and the Three Pillars are the things +that cannot. + +--- + +## 1. The Immutability Contract + +The Three Pillars are not preferences. They are the structural requirements of the +Safe Harbor. A Zenzic without Pillar II (Zero Subprocesses) is not a faster Zenzic +— it is a different tool that has abandoned its trust model. + +### What "Immutable" Means + +| Pillar | Can Be Relaxed? | Consequence of Relaxation | +| :--- | :---: | :--- | +| **I — Lint the Source, Not the Build** | ❌ No | Breaks the pre-build analysis guarantee | +| **II — Zero Subprocesses** | ❌ No | Breaks the Zero-Trust execution model | +| **III — Pure Functions First** | ❌ No | Breaks reproducibility and auditability | + +A change that violates Pillar II or Pillar III — even temporarily, even for a +well-motivated reason — requires: + +1. **A Major version increment** (e.g., v0.7.0 → v1.0.0) +2. **A 30-day public impact analysis period** +3. **A formal [ADR](https://github.com/PythonWoods/zenzic/blob/main/.github/copilot-instructions.md)** + + added to the Zenzic Ledger + +4. **An [Adversarial AI session](./adversarial_ai) (Type A)** against the proposed + + replacement architecture + +5. 
**2/3 Core Maintainer consensus** + +This is not a bureaucratic barrier. It is the cost of the trust model. If the change +is truly necessary, the 30-day period protects the users who depend on the Pillar. + +--- + +## 2. What Can Evolve (Lightweight Procedure) + +**Operational Standards** — quality gate thresholds, coverage floors, benchmark +targets, finding code messages (not semantics) — evolve on a lightweight track: + +| Stage | Activity | Timeline | +| :--- | :--- | :--- | +| Proposal | GitHub issue with rationale | Day 0 | +| Debate | 72-hour window for Core Maintainer objections | 72 hours | +| Merge | Any Core Maintainer may merge if no blocking objection | Day 4+ | +| Ledger Update | `[POLICIES]` or `[ARCHITECTURE]` section updated in same commit | — | + +Examples of Operational Standard changes: + +- Raising the coverage floor from 80% to 85% +- Adjusting mutation score targets +- Updating a finding code message (text only, not semantics) +- Adding a new `Zxxx` finding code in an existing range + +--- + +## 3. RFC Template (for Pillar-Level Proposals) + +Any proposal to amend a Three Pillars invariant must include: + +1. **Current Text:** The exact `[INVARIANT]` text being challenged. +2. **Proposed Text:** The replacement wording, if any. +3. **Rationale:** Why the current invariant is architecturally insufficient or harmful. +4. **Cost:** What breaks? Which users must migrate? Which ADRs are invalidated? +5. **Alternative Analysis:** What alternatives were considered before proposing this? + +A proposal without a Cost section and Alternative Analysis will not enter debate. + +--- + +## 4. The "Convenience" Prohibition + +> *"We don't accept shortcuts because of convenience."* + +The following are **not** valid rationales for a Pillar amendment: + +- "It's annoying to write pure functions for this rule." +- "We need to ship this subprocess call now." +- "The AI proposed a simpler architecture that bypasses Pillar II." 
+- "This is a temporary exception." + +If a proposed change would be rejected as a pull request by a junior engineer who +has read the Zenzic Ledger once — it is not a candidate for the Evolution Policy. +It is a candidate for a code review. + +--- + +## 5. Emergency Security Exception + +In case of a **Critical Security Vulnerability** requiring an emergency deviation +from a Pillar (e.g., a process isolation call during a zero-day response), Core +Maintainers may invoke the Emergency Exception: + +- Suspends **one specific invariant** for **maximum 30 days** +- Requires a logged emergency ADR in the Zenzic Ledger with: invariant suspended, + + security rationale, expiry deadline + +- If restoration is impossible in 30 days → full Pillar Amendment Process begins + + before the deadline expires + +The Emergency Exception **cannot** be invoked for convenience, deadline pressure, +or technical debt. It requires a documented CVE or equivalent security incident. + +> *Saga VI: The Governance of Quartz — [read the chronicle](https://zenzic.dev/blog/governance-of-quartz)* diff --git a/developers/governance/exit_strategy.mdx b/developers/governance/exit_strategy.mdx new file mode 100644 index 0000000..082a5ab --- /dev/null +++ b/developers/governance/exit_strategy.mdx @@ -0,0 +1,151 @@ +--- +sidebar_label: "The Sovereignty Oath" +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# The Sovereignty Oath: Zero Residue + +> *"Zenzic is a sentinel in your pipeline, not a chain. The ability to remove it +> is not a failure mode — it is a design requirement."* + +--- + +## The Oath + +Zenzic makes one unconditional promise: **it will never hold your codebase hostage.** + +To ensure the integrity of the Safe Harbor, Zenzic's audit core is strictly read-only. +We believe that a linter should never be a source of unintended mutations. Any future +remediation features will be implemented as explicit, interactive utilities +(e.g. 
`zenzic fix`), keeping the analysis phase 100% mutation-free. + +This document is the formal proof of that promise. + +--- + +## 1. Zero Residue Guarantee + +When you remove Zenzic, what remains? + +| Component | Residue After Removal | +| :--- | :--- | +| **Your source files** | Unchanged — Zenzic never writes or modifies content | +| **Your application code** | Unchanged — Zenzic is never imported at runtime | +| **Your Python types** | Unchanged — Zenzic uses `typing.Protocol`, not inheritance | +| **Your config format** | Standard `[tool.zenzic]` PEP convention — remove the section, done | +| **Your CI pipeline** | One workflow step — delete it | +| **Your pre-commit hooks** | One hook entry — remove it | + +**Total removal time: 30 seconds.** + +No migration scripts. No data format to convert. No architecture to unwind. + +--- + +## 2. Why `typing.Protocol` Matters + +Zenzic's adapter system uses [`typing.Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol) +— the Python standard library's structural subtyping mechanism. + +This is a deliberate architectural choice: + +```python +# Zenzic adapter contract — structural subtyping only +class AdapterProtocol(Protocol): + def get_docs_root(self) -> Path: ... + def get_nav_paths(self) -> frozenset[str]: ... + def get_metadata_files(self) -> frozenset[str]: ... +``` + +**What this means for you:** + +- You do **not** need to subclass a Zenzic base class. +- Your code does **not** carry a Zenzic inheritance chain. +- If you remove Zenzic, your Python classes remain unchanged — no base class to strip + + out, no method overrides to remove, no MRO to audit. + +The adapter is a structural contract. If your object has the right methods, Zenzic +accepts it. If Zenzic is removed, your object still works — it simply has no auditor. + +--- + +## 3. 
PEP-Compliant Configuration + +Zenzic configuration lives in the `[tool.zenzic]` section of `pyproject.toml` — +the standard [PEP 518](https://peps.python.org/pep-0518/) location for tool config: + +```toml title="pyproject.toml" +[tool.zenzic] +docs_dir = "docs" +engine = "mkdocs" +``` + +Or in a standalone `zenzic.toml` at the repository root. + +**Removal procedure:** + +```toml title="pyproject.toml (after)" +# [tool.zenzic] section deleted — no other changes needed +``` + +Or: + +```bash +rm zenzic.toml +``` + +The `[tool.zenzic]` section is an isolated namespace. Removing it does not affect +any other tool configuration. No cascading effects. No shared state. + +--- + +## 4. The 30-Second Decommission + +### Step 1 — Remove from CI (15 seconds) + +```yaml title=".github/workflows/docs.yml" +# Delete this block + +- uses: PythonWoods/zenzic-action@v1 + + with: + version: "0.7.0" + format: sarif + upload-sarif: "true" +``` + +Or, if running directly: + +```yaml title=".github/workflows/docs.yml" +# Delete this block + +- name: Zenzic Sentinel + + run: uvx zenzic check all +``` + +### Step 2 — Remove Configuration (15 seconds) + +```bash +rm zenzic.toml +# OR edit pyproject.toml: remove the [tool.zenzic] section +``` + +**Done.** Your pipeline runs without the documentation integrity gate. Your codebase +is identical to its state before Zenzic was adopted. + +--- + +## 5. Why We Document the Exit + +Trust is built on the **ability to leave**, not the requirement to stay. + +A tool that makes departure difficult is not confident in its value — it is protecting +its own presence. The Zenzic trust model is Zero-Trust: including toward Zenzic itself. + +The sentinel exists to protect your documentation. Not to protect itself. 
+ +> *Saga VI: The Governance of Quartz — [read the chronicle](https://zenzic.dev/blog/governance-of-quartz)* diff --git a/developers/governance/index.mdx b/developers/governance/index.mdx new file mode 100644 index 0000000..95cb84d --- /dev/null +++ b/developers/governance/index.mdx @@ -0,0 +1,80 @@ +--- +sidebar_label: "Overview" +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# Governance & Sovereignty + +> *"Stability is not the enemy of progress. It is its precondition."* + +This section is not documentation for bureaucrats. It is the **Engineering of +Stability** — a formal contract that protects the Three Pillars of the Safe Harbor +from erosion by convenience, urgency, or well-intentioned shortcuts. + +--- + +## The Supreme Law: The Three Pillars + +Every governance document in this section exists to defend one invariant: +**the Three Pillars are non-negotiable.** + +| Pillar | Invariant | What Breaking It Would Cost | +| :---: | :--- | :--- | +| **I** | Lint the Source, Not the Build | Analysis of HTML output chains Zenzic to the build pipeline — the thing it is designed to precede. | +| **II** | Zero Subprocesses | A subprocess call escapes the trust boundary. It introduces a dependency Zenzic cannot audit, on an execution context it does not control. | +| **III** | Pure Functions First | Impure functions in hot-path loops are invisible failure modes. Determinism is the foundation of the trust model. Every finding must be reproducible. | + +These are not design preferences. They are load-bearing walls. When the Three Pillars +hold, the Safe Harbor holds. + +--- + +## Governance Documents + +| Document | Purpose | +| :--- | :--- | +| [Adversarial AI Model](./adversarial_ai) | How AI is used as Red Team to attack the Three Pillars — not as a co-author. | +| [The Sovereignty Oath](./exit_strategy) | Proof that Zenzic is a tool, not a master. Zero Residue. Reversible in 30 seconds. 
| +| [Evolution Policy](./evolution_policy) | The formal process for evolving — or protecting — the Three Pillars. | +| [License Compliance](./licensing) | Apache-2.0 + REUSE 3.3. Every file carries the cryptographic signature of its license. | + +--- + +## The Engineering of Stability + +Governance documents are not written for today. They are written for the engineers +who will maintain Zenzic in 2030, under pressures that do not yet exist, facing +architectural temptations that have not yet been named. + +The [Zenzic Ledger](https://github.com/PythonWoods/zenzic/blob/main/.github/copilot-instructions.md) +is the operational memory of the project. This Governance section is its +**constitutional layer** — the principles the Ledger itself cannot override. + +> *Saga VI: The Governance of Quartz — [read the chronicle](https://zenzic.dev/blog/governance-of-quartz)* + +--- + +## Abstract + +Zenzic's governance system is designed around a single guarantee: that the rules of the +Safe Harbor do not change silently mid-voyage. + +The Three Pillars — *Lint the Source*, *Zero Subprocesses*, *Pure Functions First* — +are Constitutional Laws, not architectural preferences. Changing any Pillar requires a +Major version increment, a 30-day public impact period, an adversarial AI session (Type A), +and a 2/3 consensus of Core Maintainers. + +Zenzic's governance is built on three axes: + +| Axis | Document | Guarantee | +| :--- | :--- | :--- | +| **Liberty** | [The Sovereignty Oath](./exit_strategy) | Removed in 30 seconds. Zero residue. Core is read-only. | +| **Pressure** | [Adversarial AI Model](./adversarial_ai) | AI attacks the Pillars; humans ratify. AI does not decide. | +| **Duration** | [Evolution Policy](./evolution_policy) | No Pillar changes without a public constitutional process. | + +This section is the **sentinel's constitution** — the constraints that protect Zenzic's +own structure from erosion by convenience, urgency, and well-intentioned shortcuts. 
+ +*"Do not trust us. Trust the system we built to protect you."* diff --git a/developers/governance/licensing.mdx b/developers/governance/licensing.mdx new file mode 100644 index 0000000..102ba2a --- /dev/null +++ b/developers/governance/licensing.mdx @@ -0,0 +1,161 @@ +--- +sidebar_label: "License Compliance" +--- +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# Sentinel Compliance: Apache-2.0 + REUSE 3.3 + +> *"Every file in Zenzic carries the cryptographic signature of its license. +> There are no dark corners."* + +--- + +## 1. The License + +Zenzic is released under the **Apache License 2.0**. This is not a policy choice — +it is an engineering commitment. Apache-2.0 provides: + +| Permission | Details | +| :--- | :--- | +| ✅ Commercial use | No restrictions | +| ✅ Modification | Fork, patch, extend | +| ✅ Distribution | Redistribute under same license | +| ✅ Patent grant | Explicit patent license from all contributors | + +**Conditions:** + +- Preserve the `LICENSE` and `NOTICE` files in distributions. +- State significant changes in modified versions. + +**Full text:** `LICENSE` file at the root of each Zenzic repository. + +--- + +## 2. The License Signature — SPDX + REUSE 3.3 + +Every source file in Zenzic carries an **SPDX header** — a machine-readable +declaration of authorship and license: + +```python +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-License-Identifier: Apache-2.0 +``` + +This is not a comment. It is a **license signature** — machine-parseable by any +[REUSE 3.3](https://reuse.software/spec/)-compliant tool, including `reuse lint`. 
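As a rough illustration of what "machine-parseable" means here, the sketch below pulls the two SPDX tags out of the first lines of a file. It is not how `reuse lint` is implemented. The helper names `read_spdx_header` and `has_license_signature` are hypothetical, and the 10-line scan window is an assumption made for the example.

```python
import re

# Matches either SPDX tag regardless of the surrounding comment syntax.
_SPDX_TAG = re.compile(r"SPDX-(FileCopyrightText|License-Identifier):[ \t]*(.*)")

# Comment closers that may trail the value (MDX, C-style, HTML).
_CLOSERS = ("*/}", "*/", "-->")


def read_spdx_header(text: str, max_lines: int = 10) -> dict[str, list[str]]:
    """Collect SPDX tag values from the first ``max_lines`` lines of ``text``.

    Hypothetical helper for illustration only; not part of Zenzic or REUSE.
    """
    found: dict[str, list[str]] = {"FileCopyrightText": [], "License-Identifier": []}
    for line in text.splitlines()[:max_lines]:
        match = _SPDX_TAG.search(line)
        if match is None:
            continue
        value = match.group(2).strip()
        for closer in _CLOSERS:
            if value.endswith(closer):
                value = value[: -len(closer)].strip()
        found[match.group(1)].append(value)
    return found


def has_license_signature(text: str) -> bool:
    """Return True when both a copyright line and a license identifier exist."""
    header = read_spdx_header(text)
    return bool(header["FileCopyrightText"]) and bool(header["License-Identifier"])
```

The same function handles a Python `#` header and an MDX `{/* ... */}` header, because it keys on the tag name rather than on the comment style. For real compliance checks, `uv run reuse lint` remains the only gate.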
+ +Files without an individual header are covered by `REUSE.toml` bulk declarations: + +```toml title="REUSE.toml" +[[annotations]] +path = ["docs/**", "i18n/**", "*.md"] +SPDX-FileCopyrightText = "2026 PythonWoods " +SPDX-License-Identifier = "Apache-2.0" + +[[annotations]] +path = ["build/**", "node_modules/**", ".docusaurus/**"] +SPDX-FileCopyrightText = "2026 PythonWoods " +SPDX-License-Identifier = "Apache-2.0" +``` + +**Coverage strategy:** + +| Component | Method | +| :--- | :--- | +| Python source files | Per-file SPDX header | +| Shell scripts | Per-file SPDX header | +| Configuration (TOML, YAML) | Per-file header or `REUSE.toml` | +| Documentation (`.mdx`, `.md`) | `REUSE.toml` bulk declaration | +| Auto-generated files | `REUSE.toml` coverage | +| Binary assets (SVG, PNG) | `REUSE.toml` bulk declaration | + +--- + +## 3. The Single Gate of Truth + +```bash +uv run reuse lint +``` + +This is the **only authorised compliance verification command.** It: + +1. Parses every SPDX header in every file. +2. Validates all `REUSE.toml` bulk declarations. +3. Reports any file without coverage as a compliance failure. +4. Returns exit 0 only when 100% of files have a declared license. + +**Expected output:** + +```text +Congratulations! Your project is compliant with version 3.3 of the REUSE Specification. +``` + +This gate runs in: + +- The Sentinel Guard pre-commit hook (hook 8 of 8) +- `just preflight` — the full local CI mirror + +Any PR that fails `uv run reuse lint` does not merge. + +--- + +## 4. Contributor Policy — No CLA, Multi-Author Copyright + +Zenzic uses the **multi-author copyright model**. No Contributor License Agreement +(CLA) is required. 
+ +| Scenario | Action | +| :--- | :--- | +| New file (any contributor) | Add your own SPDX copyright line | +| Small change (< 10 lines) | Keep existing headers unchanged | +| Substantial contribution | Append your copyright line below existing lines | + +Example of multi-author file: + +```python +# SPDX-FileCopyrightText: 2026 PythonWoods +# SPDX-FileCopyrightText: 2026 Contributor Name +# SPDX-License-Identifier: Apache-2.0 +``` + +You retain copyright of your contribution. The Apache-2.0 license — including its +patent grant — applies automatically upon submission. + +--- + +## 5. Third-Party Dependency Policy + +Zenzic may only depend on libraries with Apache-2.0-compatible licenses: + +| License | Compatible | Notes | +| :--- | :---: | :--- | +| MIT | ✅ | Permissive | +| BSD 2/3-Clause | ✅ | Permissive | +| Apache-2.0 | ✅ | Identical | +| LGPL-3.0 | ✅ | Library use only | +| ISC | ✅ | MIT-equivalent | +| GPL-2.0 / GPL-3.0 | ❌ | Copyleft contamination | +| Proprietary | ❌ | Not open-source | + +When adding a dependency: + +1. Verify license compatibility above. +2. Add to the `NOTICE` file: name, URL, copyright holder, license identifier. +3. Run `uv run reuse lint` — no regressions accepted. + +--- + +## 6. Legal Disclaimer + +This document provides operational guidance, not legal advice. For questions +regarding Apache-2.0 compliance, patent grants, or contribution rights in your +jurisdiction, consult qualified legal counsel. 
+ +**References:** + +- [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) +- [REUSE 3.3 Specification](https://reuse.software/spec/) +- [SPDX License List](https://spdx.org/licenses/) + +> *Saga VI: The Governance of Quartz — [read the chronicle](https://zenzic.dev/blog/governance-of-quartz)* diff --git a/developers/governance/technical-debt.mdx b/developers/governance/technical-debt.mdx new file mode 100644 index 0000000..d7ec26d --- /dev/null +++ b/developers/governance/technical-debt.mdx @@ -0,0 +1,110 @@ +--- +sidebar_label: "Technical Debt Ledger" +sidebar_position: 50 +description: "The deliberate, declared list of capabilities Zenzic chose NOT to ship in v0.7.0 — and the engineering reasoning that makes each deferral a feature, not an oversight." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# Technical Debt Ledger + +> *"Hidden debt corrupts trust. Declared debt is engineering."* + +This page is the **public, deliberate list** of capabilities Zenzic chose +**not** to ship in v0.7.0 "Quartz Maturity" — and the engineering reasoning +that makes each deferral a conscious design choice, not an oversight. + +Zenzic's stance: a project that lints other people's documentation must hold +itself to a higher standard of honesty about its own evolution. Every entry +below names what is missing, why it was deferred, and which sprint owns the +follow-through. + +--- + +## Open Entries (v0.7.0 → v0.8.0) + +### Z108 STALE_ALLOWLIST_ENTRY + +**Category:** Configuration hygiene +**Status:** Deferred to v0.8.0 "Basalt" +**Tracked:** GitHub issue (milestone `v0.8.0`) +**Related:** [ADR 011: Cross-Instance Allowlist](../explanation/adr-cross-instance-allowlist.mdx) + +#### What was deferred + +A check that warns when a prefix declared in +`[link_validation] absolute_path_allowlist` is never actually referenced by +any link in the project — i.e. 
the allowlist entry has become **stale** and +can be safely removed. + +#### Why we deferred it + +The check is conceptually simple but architecturally expensive: + +1. **Pillar 3 violation.** Z907 and Z105 are pure per-link / per-file + functions — they decide independently in each `pytest-xdist` worker with + no shared state. A "used / unused" determination requires aggregating + results across **every** scanned file in **every** worker, then + reconciling at the end of the run. Introducing aggregate state into the + validator pass would force a Pillar 3 redesign in a release whose stated + goal is *consolidation*, not refactor. +2. **Wrong category.** Linting the *content* of documentation and linting + the *configuration* of the linter itself are different problem spaces. + Mixing them inflates the validator's scope and obscures which findings + are about user-authored content vs. project setup. +3. **YAGNI signal absent.** No real-world reports of stale allowlist + entries exist yet. v0.7.0 is the first release that has the feature at + all. Adding a hygiene check for a problem that has never been observed + would be premature. + +#### What we will do in v0.8.0 + +The natural home for this check is a separate command — proposed name +`zenzic inspect config` — which audits configuration files end-to-end: +unreferenced allowlist entries, contradictory `excluded_dirs` patterns, +deprecated keys, etc. This separates **content lint** (the validator pass) +from **config audit** (the inspector pass) and keeps both passes pure. + +#### Mitigation in v0.7.0 + +`zenzic.toml` is small, version-controlled, and code-reviewed at every PR. +A stale allowlist entry is a code-review concern in v0.7.0, promoted to a +tooling concern in v0.8.0. The risk window is bounded: a stale entry can at +worst silence a legitimate Z105 finding for a prefix that no longer needs +silencing — it cannot create false positives, leak data, or weaken any +security check. 
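For reviewers applying this mitigation by hand, the comparison behind a "stale entry" can be sketched in a few lines. This is an illustration only: `find_stale_prefixes` is a hypothetical name, matching with `str.startswith` is an assumption about how allowlist prefixes are applied, and nothing here reflects Zenzic's API or the eventual `zenzic inspect config` design.

```python
from collections.abc import Iterable


def find_stale_prefixes(allowlist: Iterable[str], links: Iterable[str]) -> list[str]:
    """Return allowlist prefixes that no collected link starts with.

    Hypothetical sketch. It needs the complete set of links from the whole
    project before it can answer, which is exactly the aggregate state the
    per-link validator pass avoids.
    """
    seen_links = tuple(links)  # materialise once so every prefix sees all links
    return [
        prefix
        for prefix in allowlist
        if not any(link.startswith(prefix) for link in seen_links)
    ]
```

The function itself is pure, but its input is a project-wide aggregate, which is why it would sit more naturally in a separate audit pass than inside the per-link checks.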
+ +--- + +## Closed Entries + +This section will accrue entries as deferred items ship. Each closed entry +will name the version that resolved it and link to the merged PR. + +*(none yet — v0.7.0 is the first release with a public Technical Debt +Ledger.)* + +--- + +## Why this page exists + +Zenzic's first invariant is **Transparency**. A linter that hides its own +shortcomings is not trustworthy: every project that adopts Zenzic should be +able to read this ledger and judge for themselves whether the deferred work +matters to their use case. + +Three commitments govern this page: + +1. **Every deferral is named.** No silent backlog. If we chose not to ship + a capability that was meaningfully discussed during a sprint, it lands + here. +2. **Every deferral has a reason.** "We ran out of time" is acceptable + when true; vague hand-waving is not. The reason must be specific enough + that a future contributor can decide whether the constraint still holds. +3. **Every deferral has an owner.** Either a target sprint, a target + release, or an explicit "indefinitely deferred" with the rationale. + Ledger entries without owners decay into folklore. + +When you contribute a deferral here, you are not admitting weakness — you +are protecting the next contributor from rediscovering the same trade-off. diff --git a/docs/community/developers/how-to/_category_.json b/developers/how-to/_category_.json similarity index 100% rename from docs/community/developers/how-to/_category_.json rename to developers/how-to/_category_.json diff --git a/docs/community/developers/how-to/implement-adapter.mdx b/developers/how-to/implement-adapter.mdx similarity index 85% rename from docs/community/developers/how-to/implement-adapter.mdx rename to developers/how-to/implement-adapter.mdx index 67904a6..7b64c78 100644 --- a/docs/community/developers/how-to/implement-adapter.mdx +++ b/developers/how-to/implement-adapter.mdx @@ -15,7 +15,7 @@ i18n conventions — without modifying Zenzic itself. 
--- -## What Is an Adapter? +## What Is an Adapter An **adapter** is a Python class that satisfies the `BaseAdapter` protocol (`src/zenzic/core/adapters/_base.py`). Zenzic's @@ -58,7 +58,7 @@ to `get_route_info()`. ## Step 1 — Create the Adapter Class -```python +```python title="my_engine_adapter/adapter.py" # my_engine_adapter/adapter.py from __future__ import annotations @@ -69,7 +69,6 @@ from typing import Any from zenzic.core.adapters import RouteMetadata from zenzic.models.vsm import RouteStatus - class MyEngineAdapter: """Adapter for MyEngine documentation projects.""" @@ -124,7 +123,7 @@ class MyEngineAdapter: there is no reference set to compare the file list against. Return True if your adapter successfully loaded a config file. - Return False only if no engine config exists (bare/vanilla mode). + Return False only if no engine config exists (bare/standalone mode). """ return bool(self._config) @@ -194,7 +193,7 @@ class MyEngineAdapter: Zenzic discovers adapters through the `zenzic.adapters` entry-point group. Register your adapter in your package's `pyproject.toml`: -```toml +```toml title="pyproject.toml" [project.entry-points."zenzic.adapters"] myengine = "my_engine_adapter.adapter:MyEngineAdapter" ``` @@ -202,7 +201,7 @@ myengine = "my_engine_adapter.adapter:MyEngineAdapter" The **key** (left of `=`) becomes the engine name users pass to `--engine` or set as `engine` in `zenzic.toml`: -```toml +```toml title="zenzic.toml" # In the user's zenzic.toml [build_context] engine = "myengine" @@ -269,29 +268,77 @@ adapter. 
--- +## Step 6 — Declare Link-Scheme Bypasses (Optional) {#step-6-bypasses} + +If your engine uses a non-standard URI scheme for internal links, implement +`get_link_scheme_bypasses()` to tell the Core which scheme names to exempt from +the Z105 absolute-path check and the unknown-scheme error (Rule R21 — Protocol +Sovereignty): + +```python +def get_link_scheme_bypasses(self) -> frozenset[str]: + """Return URI scheme names this engine uses legitimately. + + The validator adds ``:`` to its skip list for each returned name, + suppressing both the unknown-scheme warning and the Z105 absolute-path check + for URLs that use that scheme. + + Return ``frozenset()`` if your engine has no special link-scheme bypass. + """ + return frozenset() +``` + +Most engines return `frozenset()`. The built-in `DocusaurusAdapter` returns +`frozenset({"pathname"})` because Docusaurus uses `pathname:///` links for +static-asset references that bypass the React router — the leading `/` in the +path component is a URI convention artifact, not a server-absolute path. + +:::info[Rule R21 — Protocol Sovereignty] +The Core never hardcodes engine names. Engine-specific behaviour is declared in +the adapter and queried by the Core via this method. Adding a new adapter that +needs a link-scheme bypass requires **zero changes to `validator.py`**. +::: + +--- + ## Adapter Contract Guarantees Your adapter must satisfy these invariants, or Zenzic's scanner may produce incorrect results: 1. `get_route_info()` must return a `RouteMetadata` with a `canonical_url` + that starts and ends with `/`. + 2. `get_route_info()` must set `status` to one of `REACHABLE`, + `ORPHAN_BUT_EXISTING`, or `IGNORED`. Never return `CONFLICT` — that status is assigned later by `_detect_collisions()`. + 3. `get_nav_paths()` returns paths **relative to `docs_root`**, using forward + slashes, with no leading `/`. + 4. `get_nav_paths()` returns only `.md` files (other extensions are ignored by + the orphan checker). + 5. 
`is_locale_dir()` must return `False` for the **default** locale. Only + non-default locale directories should return `True`. + 6. All methods must be **pure**: same inputs always produce the same outputs. + No I/O, no global-state mutation. + 7. `resolve_asset()` must never raise — return `None` on any failure. 8. `resolve_anchor()` must never raise — return `False` on any failure. + The `anchors_cache` argument is read-only; do not mutate it. + 9. `has_engine_config()` must never raise — return `False` on any failure. 10. `provides_index(directory_path)` **is the only method permitted to do I/O**. + It is called once per directory during the discovery phase — never inside per-link or per-file hot loops — so a single `Path.exists()` call is acceptable. Return `True` if your engine will generate a landing page for @@ -299,6 +346,11 @@ incorrect results: like `_category_.json` with `"link": {"type": "generated-index"}`). Never raise — return `False` on any I/O failure. +11. `get_link_scheme_bypasses()` must return a `frozenset[str]` of scheme names + + (without the trailing colon) — never `None`, never raise. Return + `frozenset()` if your engine has no special link-scheme bypass requirement. + --- ## Testing Your Adapter @@ -329,11 +381,17 @@ def test_nav_paths_relative() -> None: Connect adapter code to deployment truth: 1. Register engine identity in project configuration via `[build_context] engine` - (see [Adapters & Engine Configuration](../../../how-to/configure-adapter.mdx)). + + (see [Adapters & Engine Configuration](/docs/how-to/configure-adapter)). + 2. Validate adapter behavior under strict Sentinel policy: + `zenzic check all --engine myengine --strict`. - For run controls, see [CLI Commands: Global flags](../../../reference/cli.mdx#global-flags). + For run controls, see [CLI Commands: Global flags](/docs/reference/cli#global-flags). + 3. 
If your engine generates synthetic locale routes, explicitly map Ghost Route + expectations against the VSM reference: - [Checks Reference — VSM](../../../reference/checks#vsm-how-it-works). + [Checks Reference — VSM](/docs/reference/checks#vsm-how-it-works). + ::: diff --git a/developers/how-to/sovereign-override-404-shield.mdx b/developers/how-to/sovereign-override-404-shield.mdx new file mode 100644 index 0000000..a02e0fa --- /dev/null +++ b/developers/how-to/sovereign-override-404-shield.mdx @@ -0,0 +1,81 @@ +--- +icon: lucide/shield-alert +sidebar_label: "Sovereign Override (404 Shield)" +description: "Use ZENZIC_EXTRA_ARGS for temporary pre-launch URL exclusions without weakening global external-link validation." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# Sovereign Override (404 Shield) + +Use this protocol when Sentinel reports `EXTERNAL_LINK` for URLs that are not +public yet (pre-launch pages, release tags not published, staged docs routes). + +The goal is strict integrity with temporary surgical exceptions. + +--- + +## Why This Exists + +`zenzic check all --strict` should keep checking external links. +Using `--no-external` hides real regressions and is not acceptable for +Quartz-grade governance. + +`ZENZIC_EXTRA_ARGS` provides a runtime-only override so CI can remain strict while +excluding specific known pre-launch URLs. + +--- + +## Fast Response (Contributor Runbook) + +If CI fails with a 404 on a known pre-launch URL: + +```bash +ZENZIC_EXTRA_ARGS="--exclude-url https://example.com/prelaunch" just verify +``` + +For multiple URLs: + +```bash +ZENZIC_EXTRA_ARGS="--exclude-url https://a.example --exclude-url https://b.example" just verify +``` + +--- + +## Propagation Chain (No Blind Compartments) + +The override must flow through every execution layer: + +1. `just verify` -> `check *args` in `justfile` +2. `preflight` hook -> `scripts/pre-commit-zenzic.sh` +3. 
shared script -> `zenzic check all --strict ${ZENZIC_EXTRA_ARGS:-} "$@"` +4. CI step sets `ZENZIC_EXTRA_ARGS` in `.github/workflows/ci.yml` + +If one layer drops the variable, the shield breaks. + +--- + +## Lifecycle Policy (Mandatory) + +1. Introduce exclusions only for URLs that are known pre-launch artifacts. +2. Keep exclusions in CI runtime env, not static project config. +3. Remove each exclusion immediately after the URL returns `200 OK`. +4. Treat stale exclusions as technical debt and remove in the next maintenance PR. + +--- + +## Anti-Patterns (Forbidden) + +- `--no-external` as a permanent workaround. +- Domain-wide exclusions when only a single URL is unstable. +- Committing private overrides into tracked config. + +--- + +## Verification Checklist + +- `just verify` passes locally with the intended exclusions. +- `just preflight` passes (ensures pre-commit path also honors the variable). +- CI env includes only the minimum `--exclude-url` entries required. +- Follow-up issue/PR exists to remove temporary exclusions post-launch. diff --git a/docs/community/developers/how-to/write-plugin.mdx b/developers/how-to/write-plugin.mdx similarity index 92% rename from docs/community/developers/how-to/write-plugin.mdx rename to developers/how-to/write-plugin.mdx index 2327f63..573c2ad 100644 --- a/docs/community/developers/how-to/write-plugin.mdx +++ b/developers/how-to/write-plugin.mdx @@ -65,7 +65,9 @@ class NoDraftRule(BaseRule): - **Never** open files, make network requests, or call subprocesses. - **Always** return the same output for the same input — no randomness, no + dependency on mutable global state. + - **Not** mutate their arguments (`file_path`, `text`, `vsm`, `anchors_cache`). :::warning[Avoid global mutable state] @@ -80,13 +82,12 @@ completion. All state must be returned as `RuleFinding` objects. 
## Minimal example -```python +```python title="my_org_rules/rules.py" # my_org_rules/rules.py import re from pathlib import Path from zenzic.rules import BaseRule, RuleFinding - class NoInternalHostnameRule(BaseRule): """Flag occurrences of the internal hostname in public documentation.""" @@ -120,13 +121,13 @@ class NoInternalHostnameRule(BaseRule): Expose the rule through the `zenzic.rules` entry-point group in your package's `pyproject.toml`: -```toml +```toml title="pyproject.toml" [project.entry-points."zenzic.rules"] no-internal-hostname = "my_org_rules.rules:NoInternalHostnameRule" ``` The entry-point name (`no-internal-hostname`) is the **plugin ID** that users -reference in `zenzic.toml` (see [Enabling plugins](#enabling-plugins) below). +reference in `zenzic.toml` (see [Enabling plugins](#enabling-plugins) below). {/* zenzic:ignore Z107 */} Install your package alongside Zenzic: @@ -134,12 +135,14 @@ Install your package alongside Zenzic: uv add my-org-rules # or: pip install my-org-rules ``` -After installing, run `zenzic plugins list` to confirm the rule is discovered: +After installing, run `zenzic inspect capabilities` to confirm the rule is discovered: ```bash -zenzic plugins list -# Installed plugin rules (2 found) -# broken-links Z001 (core) zenzic.core.rules.VSMBrokenLinkRule +zenzic inspect capabilities +# Core Scanners (built-in) +# … +# Extensible Rules (plugin system) +# broken-links Z001 (core) zenzic.core.rules.VSMBrokenLinkRule # no-internal-hostname MYORG-001 (my-org-rules) my_org_rules.rules.NoInternalHostnameRule ``` @@ -179,7 +182,7 @@ Quick verification: ```bash cd plugin-scaffold-demo uv pip install -e . -zenzic plugins list +zenzic inspect capabilities zenzic check all ``` @@ -191,7 +194,7 @@ Core rules (registered under `zenzic.rules` by Zenzic itself) are always active. 
External plugin rules must be explicitly enabled in `zenzic.toml` under the `plugins` key: -```toml +```toml title="zenzic.toml" # zenzic.toml [build_context] engine = "mkdocs" @@ -217,7 +220,6 @@ from collections.abc import Mapping from zenzic.core.rules import BaseRule, RuleFinding from zenzic.models.vsm import Route - class NoOrphanLinkRule(BaseRule): @property def rule_id(self) -> str: @@ -245,7 +247,6 @@ setup required: from zenzic.rules import run_rule from my_org_rules.rules import NoInternalHostnameRule - def test_internal_hostname_detected(): findings = run_rule( NoInternalHostnameRule(), @@ -255,7 +256,6 @@ def test_internal_hostname_detected(): assert findings[0].rule_id == "MYORG-001" assert findings[0].severity == "error" - def test_clean_content_passes(): findings = run_rule(NoInternalHostnameRule(), "All public content here.") assert findings == [] @@ -293,13 +293,19 @@ refuses to start. Fix the rule before running Zenzic. Bridge your rule from implementation to production Sentinel flow: 1. Register and enable the plugin ID in `zenzic.toml` under `plugins` - (see [Enabling plugins](#enabling-plugins)). + + (see [Enabling plugins](#enabling-plugins)). {/* zenzic:ignore Z107 */} + 2. Validate the rule under strict pipeline semantics: + `zenzic check all --strict`. For run-time policy controls, see - [CLI Commands: Global flags](../../../reference/cli.mdx#global-flags). + [CLI Commands: Global flags](/docs/reference/cli#global-flags). + 3. If your rule is nav-aware, map expected Ghost Route behavior against the VSM model: - [Checks Reference — VSM](../../../reference/checks#vsm-how-it-works). + + [Checks Reference — VSM](/docs/reference/checks#vsm-how-it-works). 
+ ::: [ep]: https://packaging.python.org/en/latest/guides/creating-and-discovering-plugins/#using-package-metadata diff --git a/docs/community/developers/index.mdx b/developers/index.mdx similarity index 99% rename from docs/community/developers/index.mdx rename to developers/index.mdx index a40876e..55eab6c 100644 --- a/docs/community/developers/index.mdx +++ b/developers/index.mdx @@ -20,10 +20,15 @@ This section covers everything you need to extend, adapt, or contribute to Zenzi ## In this section - [Writing Plugin Rules](how-to/write-plugin.mdx) — implement `BaseRule` subclasses, register + them via `entry_points`, and satisfy the pickle / purity contract. + - [Writing an Adapter](how-to/implement-adapter.mdx) — implement the `BaseAdapter` protocol + to teach Zenzic about a new documentation engine. + - [Example Projects](tutorials/adapter-examples.mdx) — four self-contained runnable fixtures that + demonstrate correct and incorrect Zenzic configurations. --- diff --git a/docs/community/developers/reference/_category_.json b/developers/reference/_category_.json similarity index 100% rename from docs/community/developers/reference/_category_.json rename to developers/reference/_category_.json diff --git a/developers/reference/adapter-api.mdx b/developers/reference/adapter-api.mdx new file mode 100644 index 0000000..708f29b --- /dev/null +++ b/developers/reference/adapter-api.mdx @@ -0,0 +1,145 @@ +--- +icon: lucide/library +sidebar_label: "API Reference" +description: "Python API reference for Zenzic's public modules and classes." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# API Reference + +Auto-generated reference documentation for all public modules in `zenzic`. This section is English-only, as the source docstrings are written in English. + +--- + +## `zenzic.core.scanner` + +Filesystem scanning utilities: repo root discovery, orphan page detection, asset tracking, and placeholder scanning. 
+ +::: zenzic.core.scanner + options: + members: + + - find_repo_root + - find_config_file + - find_orphans + - find_placeholders + - find_unused_assets + - find_missing_directory_indices + - calculate_orphans + - calculate_unused_assets + - check_placeholder_content + - check_asset_references + +--- + +## `zenzic.core.scorer` + +Documentation quality scoring engine: weighted 0–100 score computation, snapshot persistence, and snapshot loading. + +::: zenzic.core.scorer + options: + members: + + - compute_score + - save_snapshot + - load_snapshot + - ScoreReport + - CategoryScore + +--- + +## `zenzic.core.validator` + +Validation logic: broken link detection via MkDocs and Python snippet syntax checking. + +::: zenzic.core.validator + options: + members: + + - validate_links + - validate_snippets + - check_snippet_content + - SnippetError + +--- + +## `zenzic.models.config` + +Configuration model. + +::: zenzic.models.config + options: + members: + + - ZenzicConfig + +--- + +## `zenzic.rules` — Plugin SDK façade + +`zenzic.rules` is the **canonical entry point for plugin authors**. It re-exports the stable +API surface from `zenzic.core.rules` and is the only path that is guaranteed to remain stable +across major versions. + +```python +from zenzic.rules import BaseRule, RuleFinding, Severity, run_rule +``` + +### `BaseRule` + +Abstract base class for all plugin rules. Subclass this and implement `check()`: + +```python +from zenzic.rules import BaseRule, RuleFinding, Severity + +class NoDraftRule(BaseRule): + rule_id = "no-draft" + + def check(self, file_path, line_no, line): + if "DRAFT" in line: + return [RuleFinding( + rule_id=self.rule_id, + file_path=file_path, + line_no=line_no, + message="DRAFT marker found", + severity=Severity.WARNING, + )] + return [] +``` + +### `run_rule` — Test helper + +Runs a single rule against a Markdown string. 
No engine setup required — designed for unit tests: + +```python +from zenzic.rules import BaseRule, RuleFinding, run_rule + +def test_no_draft_rule(): + findings = run_rule(NoDraftRule(), "# My Page\n\nDRAFT content here.") + assert len(findings) == 1 + assert findings[0].severity == "warning" +``` + +**Canonical location:** `run_rule` is implemented in `zenzic.core.rules` and re-exported from +`zenzic.rules`. Both import paths work; prefer `from zenzic.rules import run_rule` in plugin code. + +### `RuleFinding`, `Severity`, `Violation`, `CustomRule` + +| Name | Purpose | +| :--- | :--- | +| `RuleFinding` | Dataclass returned by `BaseRule.check()` — carries `rule_id`, `file_path`, `line_no`, `message`, `severity` | +| `Severity` | Enum: `Severity.ERROR`, `Severity.WARNING`, `Severity.INFO` | +| `Violation` | Alias of `RuleFinding` — kept for backward compatibility | +| `CustomRule` | TOML-declared rule engine — used internally; not for subclassing | + +### Entry-point registration + +Register your rule so `zenzic inspect capabilities` discovers it: + +```toml +# In your plugin's pyproject.toml +[project.entry-points."zenzic.rules"] +my-rule-name = "my_package.rules:MyRuleClass" +``` diff --git a/docs/community/developers/reference/sentinel-style.mdx b/developers/reference/sentinel-style.mdx similarity index 55% rename from docs/community/developers/reference/sentinel-style.mdx rename to developers/reference/sentinel-style.mdx index eafc99b..185c2a9 100644 --- a/docs/community/developers/reference/sentinel-style.mdx +++ b/developers/reference/sentinel-style.mdx @@ -35,6 +35,7 @@ Every card in a `
` block must have exactly: ### Canonical example ```markdown + -   **User Guide** Everything you need to install, configure, and integrate Zenzic into @@ -110,10 +111,15 @@ convention (lowercase, hyphen-separated). ### Rules - **Semantic consistency:** if an icon represents "Contribute" on one page, it + must be the same icon on every page. + - **Uniform syntax:** every icon in a card grid uses ``. + No mixing of syntaxes or icon sets. + - **Tree-shaking contract:** before using a new icon name, add it to the + explicit `iconsMap` in `src/components/Icon.tsx`. Unregistered names render a red placeholder and emit a `console.warn`. @@ -202,3 +208,137 @@ Before submitting a PR, verify: - [ ] No naked code fences exist (§5). - [ ] SPDX header is present (§6). - [ ] Italian mirror is structurally identical to English. +- [ ] No hex literal (`#rrggbb`) in `src/` outside `SentinelPalette._*` (§9). +- [ ] All colour references use `SentinelPalette.*` — no removed flat constants (§9). + +--- + +## 8. SentinelUI Gateway {#sentinelui-gateway} + +All branded terminal output in Zenzic flows through a single object: `SentinelUI` in +`src/zenzic/ui.py`. Command modules must **never** instantiate `Console` or `SentinelUI` +directly — they must call `get_ui()` and `get_console()` from `zenzic.cli._shared`. + +### Core methods + +| Method | When to use | +| :--- | :--- | +| `print_header(version)` | The top-of-output Forge Frame banner — once per command invocation | +| `make_panel(content, *, title, border_style)` | Styled Rich `Panel` — for structured output blocks | +| `print_exception_alert(message, *, context, title, border_style)` | Error panels for `ZenzicError` and `PluginContractError` | + +### Usage pattern + +```python +# In any _check.py / _clean.py / _standalone.py command +from . 
import _shared + +# Print the Zenzic banner header +_shared.get_ui().print_header(__version__) + +# Print a styled panel +panel = _shared.get_ui().make_panel( + "Content here", + title="Panel Title", + border_style="bold cyan", +) +_shared.get_console().print(panel) +``` + +### Why the gateway matters + +The `--no-color` and `--force-color` CLI flags call `configure_console()`, which atomically +replaces the module-level `console` and `_ui` singletons. Any locally-created `Console` or +`SentinelUI` instance will be frozen before the flag takes effect, silently ignoring the +user's color preference. + +The `force_terminal` parameter must **always** be `None` (auto-detect) in the module-level +`Console`, never `False`. Explicit `False` disables color system detection entirely — +resulting in no ANSI styling even in truecolor terminals. This is the most common source of +visual regressions in the Zenzic CLI layer. + +### Checklist addition + +Add to your PR checklist: + +- [ ] No `Console(...)` or `SentinelUI(...)` instantiation in command modules. +- [ ] All banner output uses `get_ui().print_header()`, not a locally-created UI instance. +- [ ] `force_terminal` on any new `Console` call is `None` or conditional (`True if ... else None`), never `False`. + +--- + +## 9. SentinelPalette — Zero Hex Law {#sentinel-palette} + +`SentinelPalette` in `src/zenzic/ui.py` is the **sole authorised source of colour values** +in the entire Zenzic codebase. This is the Zero Hex Law. + +### The Law + +:::warning Design Constraint + +No hex colour string (e.g. `#4f46e5`) and no raw Rich colour name (e.g. `"red"`, `"cyan"`) +may appear anywhere in `src/` **except** inside `SentinelPalette._*` private class attributes. +Every other file must address only the semantic public attributes shown below. 
+ +::: + +### Semantic palette + +| Attribute | Hex | Meaning | +| :--- | :---: | :--- | +| `SentinelPalette.BRAND` | `#4f46e5` | Zenzic primary / brand accent (Indigo) | +| `SentinelPalette.SUCCESS` | `#10b981` | OK · clean · pass (Emerald) | +| `SentinelPalette.WARNING` | `#f59e0b` | Caution · advisory (Amber) | +| `SentinelPalette.ERROR` | `#f43f5e` | Failure · broken links (Rose) | +| `SentinelPalette.DIM` | `#64748b` | Muted · secondary text (Slate) | +| `SentinelPalette.FATAL` | `#8b0000` | Security breach · path traversal (Blood) | + +### Pre-composed style strings + +For the most common combinations, use a `STYLE_*` constant instead of constructing +`f"bold {X}"` inline: + +| Constant | Expands to | +| :--- | :--- | +| `SentinelPalette.STYLE_BRAND` | `"bold #4f46e5"` | +| `SentinelPalette.STYLE_OK` | `"bold #10b981"` | +| `SentinelPalette.STYLE_WARN` | `"bold #f59e0b"` | +| `SentinelPalette.STYLE_ERR` | `"bold #f43f5e"` | +| `SentinelPalette.STYLE_DIM` | `"#64748b"` | + +### Usage pattern + +```python +# CORRECT — semantic alias via SentinelPalette +from zenzic.ui import SentinelPalette + +table = Table(border_style=SentinelPalette.DIM, header_style=SentinelPalette.STYLE_BRAND) +text = Text.from_markup(f"[{SentinelPalette.BRAND}]Zenzic[/]") +panel = Panel("...", border_style=SentinelPalette.STYLE_ERR) +``` + +```python +# FORBIDDEN — hex literal outside SentinelPalette +text = Text.from_markup("[#4f46e5]Zenzic[/]") # ✗ + +# FORBIDDEN — flat constant import (removed in v0.7.0) +from zenzic.ui import INDIGO, EMERALD # ✗ + +# FORBIDDEN — inline alias +P = SentinelPalette # ✗ use full qualification +``` + +### Updating the palette + +To change a colour, edit **only** the corresponding `_PRIVATE` hex attribute inside +`SentinelPalette` in `src/zenzic/ui.py`. All semantic aliases and pre-composed style +strings derive from those private attributes — the entire codebase updates automatically. 
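The derivation described above can be pictured with a minimal sketch. It is not the real `SentinelPalette`: the class name `PaletteSketch` and the private attribute names `_INDIGO` and `_EMERALD` are assumptions made for illustration, and only two colours from the semantic palette table are shown. The point is that the hex literal appears once, and every public name is computed from it.

```python
class PaletteSketch:
    """Hypothetical stand-in for SentinelPalette (illustration only)."""

    # Private attributes: the only place a hex literal is allowed.
    # The names _INDIGO and _EMERALD are assumed, not taken from ui.py.
    _INDIGO = "#4f46e5"
    _EMERALD = "#10b981"

    # Semantic aliases derive from the private attributes.
    BRAND = _INDIGO
    SUCCESS = _EMERALD

    # Pre-composed style strings derive from the semantic aliases.
    STYLE_BRAND = f"bold {BRAND}"
    STYLE_OK = f"bold {SUCCESS}"


# Changing _INDIGO alone would change both BRAND and STYLE_BRAND.
print(PaletteSketch.STYLE_BRAND)
```

Under this layout, a colour change touches one line, and callers that use `BRAND` or `STYLE_BRAND` need no edits.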
+ +### Checklist addition + +Add to your PR checklist: + +- [ ] No hex literal (`#rrggbb`) anywhere in `src/` outside `SentinelPalette._*`. +- [ ] No raw Rich colour names (`"red"`, `"cyan"`) for brand-palette usage — use `SentinelPalette.*`. +- [ ] No local alias `P = SentinelPalette` — always use the full class name. +- [ ] No `from zenzic.ui import INDIGO` (or any removed flat constant). diff --git a/docs/community/developers/tutorials/_category_.json b/developers/tutorials/_category_.json similarity index 100% rename from docs/community/developers/tutorials/_category_.json rename to developers/tutorials/_category_.json diff --git a/developers/tutorials/adapter-examples.mdx b/developers/tutorials/adapter-examples.mdx new file mode 100644 index 0000000..58925b0 --- /dev/null +++ b/developers/tutorials/adapter-examples.mdx @@ -0,0 +1,343 @@ +--- +icon: lucide/folder-open +sidebar_label: "Example Projects" +description: "Self-contained runnable fixtures demonstrating correct and incorrect Zenzic configurations." +--- + +{/* SPDX-FileCopyrightText: 2026 PythonWoods */} +{/* SPDX-License-Identifier: Apache-2.0 */} + +# Example Projects + +The `examples/` directory at the repository root contains five self-contained +projects. Each is a runnable fixture: navigate into the directory and run +`zenzic check all` to see its output. + +```bash +git clone https://github.com/PythonWoods/zenzic +cd zenzic/examples/ +zenzic check all +``` + +--- + +## broken-docs — Intentional Failures Fixture + +**Purpose:** Trigger every Zenzic check at least once. Useful when debugging a +new check or verifying that an error message is correctly formatted. + +**Expected result:** `FAILED` — multiple check failures, exit code 1. 
+ +| Check | What triggers it | +| --- | --- | +| Links | Missing file, dead anchor, path traversal, absolute path, broken i18n | +| Orphans | `api.md` exists on disk but is absent from the `nav` | +| Snippets | Python block with a `SyntaxError` (missing colon) | +| Placeholders | `api.md` has only 18 words and a bare task marker | +| Assets | `assets/unused.png` is on disk but never referenced | +| Custom rules | `ZZ-NOFIXME` pattern in `zenzic.toml` | + +```bash +cd examples/broken-docs +zenzic check all # exit 1 +zenzic check all --exit-zero # exit 0 (soft-gate mode) +``` + +Engine: `mkdocs`. Also ships a `zensical.toml` to demonstrate the same fixture +under the Zensical engine. + +--- + +## i18n-standard — Gold Standard Bilingual Project + +**Purpose:** Demonstrate a perfectly clean bilingual project that scores 100/100. +Use this as the reference template when starting a new multilingual docs project. + +**Expected result:** `SUCCESS` — all checks pass, score 100/100. + +Key patterns this example demonstrates: + +- **Suffix-mode i18n** — translations live as `page.it.md` siblings, never in a + + `docs/it/` subtree + +- **Path symmetry** — `../../assets/brand/brand-kit.zip` resolves identically from + + both `page.md` and `page.it.md` + +- **Build artifact exclusion** — `excluded_build_artifacts` lets Zenzic validate + + links to generated files without requiring them on disk + +- **`fail_under = 100`** — any regression breaks the gate + +```bash +cd examples/i18n-standard +zenzic check all --strict # exit 0, score 100/100 +``` + +Engine: `mkdocs` with `i18n` plugin in `docs_structure: suffix` mode. + +--- + +## security_lab — Zenzic Shield Test Fixture + +**Purpose:** Exercise the Shield subsystem — credential detection and path +traversal classification — before releases. + +**Expected result:** `FAILED` — exit code 2 (Shield event; non-suppressible). 
+ +| File | What it triggers | +| --- | --- | +| `traversal.md` | `PathTraversal`: `../../etc/passwd` escapes `docs/` | +| `attack.md` | `PathTraversal` + seven fake credential patterns (all Shield families) | +| `absolute.md` | Absolute paths (`/assets/logo.png`, `/etc/passwd`) | +| `fenced.md` | Fake credentials inside unlabelled and `bash` fenced blocks | + +```bash +cd examples/security_lab +zenzic check links --strict # exit 1 (path traversal) +zenzic check references # exit 2 (Shield: fake credentials) +zenzic check all # exit 2 (Shield takes priority) +``` + +> The credentials in `attack.md` and `fenced.md` are entirely synthetic — they +> match the regex shape but are not valid tokens for any service. + +Engine: `mkdocs`. + +--- + +## standalone — Engine-Agnostic Quality Gate + +**Purpose:** Show Zenzic running without any build engine. No `mkdocs.yml`, +no `zensical.toml`, no Hugo config. Just `engine = "standalone"` in `zenzic.toml`. + +**Expected result:** `SUCCESS` — all applicable checks pass. + +What works in Standalone mode: + +- Links, snippets, placeholders, and assets are fully checked +- `[[custom_rules]]` fire identically to any other mode +- `fail_under` enforces a minimum quality score +- The **orphan check is skipped** — with no declared nav there is no reference set + +```bash +cd examples/standalone +zenzic check all # exit 0 +``` + +Use Standalone mode for Hugo, Docusaurus, Sphinx, Astro, Jekyll, GitHub wikis, +or any project that does not use MkDocs or Zensical. + +--- + +## plugin-scaffold-demo — Plugin SDK Living Scaffold + +**Purpose:** Provide the exact output generated by +`zenzic init --plugin plugin-scaffold-demo` as a committed integration fixture. + +**Expected result:** `SUCCESS` — the generated scaffold is lint-clean. + +```bash +cd examples/plugin-scaffold-demo +zenzic check all # exit 0 +``` + +Use this fixture to validate scaffold regressions: if this example starts +failing, the SDK template has drifted. 
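The expected exit codes stated for the five fixtures above can also be checked from Python rather than by hand. The following is a sketch under assumptions: the helper names `run_fixture`, `find_mismatches`, and `main` are hypothetical, `zenzic` is assumed to be on `PATH`, and `--strict` is passed only for `i18n-standard`, as in that fixture's own example.

```python
from __future__ import annotations

import subprocess
from pathlib import Path

# Expected exit codes, copied from the fixture descriptions above.
EXPECTED_EXIT_CODES = {
    "broken-docs": 1,
    "i18n-standard": 0,
    "security_lab": 2,
    "standalone": 0,
    "plugin-scaffold-demo": 0,
}


def run_fixture(examples_root: Path, name: str) -> int:
    """Run `zenzic check all` inside one fixture and return its exit code."""
    cmd = ["zenzic", "check", "all"]
    if name == "i18n-standard":
        cmd.append("--strict")
    return subprocess.run(cmd, cwd=examples_root / name, check=False).returncode


def find_mismatches(actual: dict[str, int]) -> dict[str, tuple[int, int | None]]:
    """Return {fixture: (expected, actual)} for every fixture that disagrees."""
    mismatches = {}
    for name, expected in EXPECTED_EXIT_CODES.items():
        got = actual.get(name)  # None when the fixture was not run
        if got != expected:
            mismatches[name] = (expected, got)
    return mismatches


def main(examples_root: Path) -> int:
    """Run every fixture; return 0 only if all exit codes match."""
    actual = {name: run_fixture(examples_root, name) for name in EXPECTED_EXIT_CODES}
    return 1 if find_mismatches(actual) else 0
```

Keeping the comparison in a pure function (`find_mismatches`) means the expectation table can be unit-tested without having Zenzic installed.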
+ +--- + +## Running the full examples suite + +From the repository root, verify all examples produce their expected exit codes: + +```bash +# Gold standard and standalone: must be clean +(cd examples/i18n-standard && zenzic check all --strict) +(cd examples/standalone && zenzic check all) + +# Broken: must fail with exit 1 +(cd examples/broken-docs && zenzic check all); [ $? -eq 1 ] + +# Security lab: must exit with code 2 (Shield) +(cd examples/security_lab && zenzic check all); [ $? -eq 2 ] + +# Plugin scaffold demo: generated template must be clean +(cd examples/plugin-scaffold-demo && zenzic check all) +``` + +--- + +## Adapter Internals — Pedagogical Comparison + +This section walks through two concrete adapter methods side-by-side. +The contrast between `DocusaurusAdapter` and `StandaloneAdapter` shows how the +adapter protocol enables engine-agnostic Core logic. + +### `provides_index()` — Does this directory have a landing page + +The Core calls `provides_index(directory_path)` once per directory during orphan +detection. It answers: *"Will the engine generate a browsable index for this +directory, so that files inside it are not structurally orphaned?"* + +**`DocusaurusAdapter.provides_index()`** — full engine awareness: + +```python +def provides_index(self, directory_path: Path) -> bool: + # Physical index files — Docusaurus serves these directly. + index_files = ("index.md", "index.mdx", "README.md", "README.mdx") + if any((directory_path / f).exists() for f in index_files): + return True + + # _category_.json with "generated-index" link — Docusaurus auto-generates + # a category landing page even without a physical index file. 
+ category_json = directory_path / "_category_.json" + if category_json.exists(): + try: + import json as _json + data = _json.loads(category_json.read_text(encoding="utf-8")) + link = data.get("link", {}) + return isinstance(link, dict) and link.get("type") == "generated-index" + except Exception: + return True # conservative: assume it provides an index + return False +``` + +**`StandaloneAdapter.provides_index()`** — zero engine assumptions: + +```python +def provides_index(self, directory_path: Path) -> bool: + # No engine config — only a plain index.md signals a landing page. + return (directory_path / "index.md").exists() +``` + +**Key difference:** `DocusaurusAdapter` knows about `_category_.json` and +`README.mdx` because those are Docusaurus conventions. `StandaloneAdapter` +makes no assumptions — it recognises only the universal `index.md` convention. + +--- + +### `get_nav_paths()` — What files are discoverable + +`get_nav_paths()` returns the set of file paths reachable via the site's +navigation UI. A file absent from this set is a candidate for Z402 +(`ORPHAN_BUT_EXISTING`). + +**`DocusaurusAdapter.get_nav_paths()`** — three-source aggregation: + +```python +def get_nav_paths(self) -> frozenset[str]: + if self._sidebar_path is not None: + sidebar_paths = _parse_sidebars(self._sidebar_path, self._docs_root) + if sidebar_paths is not None: + # Explicit sidebar: merge with navbar paths. + # A file is REACHABLE if it appears in the sidebar OR the navbar. + return sidebar_paths | self._navbar_paths + # Autogenerated or no sidebar: all files are already REACHABLE. + return frozenset() +``` + +`self._navbar_paths` is populated by `_parse_config_navigation()` from +`docusaurus.config.*` — it extracts `to:` URL paths and `docId:` attributes +from **navbar** and **footer** items. A file linked only in the footer is still +considered discoverable (UX-Discoverability Law, Rule R21). 
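To make the navbar and footer extraction concrete, here is a minimal sketch of the idea. It is not the real `_parse_config_navigation()`: the function name `collect_nav_targets` is hypothetical, and it assumes the `docusaurus.config.*` navigation has already been loaded into plain Python dicts, which is the harder part and is not shown. It gathers `to` and `docId` values and recurses into nested `items`, which is how Docusaurus represents dropdowns and footer columns.

```python
from __future__ import annotations


def collect_nav_targets(items: list) -> frozenset[str]:
    """Collect `to` and `docId` values from navbar or footer items.

    Hypothetical helper: `items` is assumed to be already-parsed config data.
    """
    found: set[str] = set()
    for item in items:
        if not isinstance(item, dict):
            continue  # skip anything that is not an item mapping
        for key in ("to", "docId"):
            value = item.get(key)
            if isinstance(value, str):
                found.add(value)
        # Dropdowns and footer columns nest their entries under "items".
        nested = item.get("items")
        if isinstance(nested, list):
            found |= collect_nav_targets(nested)
    return frozenset(found)


navbar = [
    {"to": "/docs/intro", "label": "Docs"},
    {"type": "doc", "docId": "usage/install"},
]
footer = [{"title": "More", "items": [{"label": "FAQ", "to": "/docs/faq"}]}]

# A file linked only from the footer still ends up in the merged set.
reachable = collect_nav_targets(navbar) | collect_nav_targets(footer)
```

External links that use `href` are ignored here, since only `to` and `docId` are named in the description above.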
+ +**`StandaloneAdapter.get_nav_paths()`** — intentionally empty: + +```python +def get_nav_paths(self) -> frozenset[str]: + """Empty frozenset — no engine config means no declared nav.""" + return frozenset() +``` + +When `get_nav_paths()` returns an empty frozenset, `classify_route()` treats +**all** files as `REACHABLE`. This is intentional: in Standalone mode there is no +navigation contract, so orphan detection (Z402) is disabled. + +--- + +### `classify_route()` — Is this file reachable + +`classify_route(rel, nav_paths)` maps a source file path to its route status. + +**`DocusaurusAdapter.classify_route()`** — four classification rules: + +```python +def classify_route(self, rel: Path, nav_paths: frozenset[str]) -> RouteStatus: + # Rule 1: Private/meta files (e.g. _category_.json) → IGNORED + non_sentinel_parts = [p for p in rel.parts if p != "_version_"] + if any(part.startswith("_") for part in non_sentinel_parts): + return "IGNORED" + + # Version Ghost Routes: files under _version_/