Summary
Add a W3C Web Annotation-compliant annotator for HTML files, complementing the existing line-based annotator (packages/codev/templates/open.html). Annotations are anchored to text ranges (not line numbers) and stored inline in the HTML file as embedded JSON-LD, so the file remains a single source of truth — no sidecar annotation server required.
Motivation
The current annotator is line-based and stores comments as inline source comments (<!-- REVIEW: ... --> for HTML/MD, // REVIEW: for JS, etc.). This works well for code review but is awkward for prose-style HTML content where annotations need to attach to arbitrary text ranges, not whole lines.
The W3C Web Annotation Data Model provides the right primitives for this:
- TextQuoteSelector with prefix/exact/suffix tolerates minor drift
- TextPositionSelector as a fallback for disambiguation
- JSON-LD is the canonical serialization
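To make the selector pair concrete, here is a minimal sketch of what one stored annotation could look like and how the two selectors relate to a selection. It works on a plain string and character offsets rather than a DOM Range. The function names, the 32-character context length, and the choice of a `source` string are assumptions for illustration, not part of this proposal; the field names follow the W3C data model.

```typescript
// Selector shapes from the W3C Web Annotation Data Model.
interface TextQuoteSelector {
  type: "TextQuoteSelector";
  exact: string;   // the selected text itself
  prefix?: string; // text immediately before the selection
  suffix?: string; // text immediately after the selection
}

interface TextPositionSelector {
  type: "TextPositionSelector";
  start: number; // offset of the first selected character (0-based, inclusive)
  end: number;   // offset just past the last selected character (exclusive)
}

interface Annotation {
  "@context": "http://www.w3.org/ns/anno.jsonld";
  id: string;
  type: "Annotation";
  body: { type: "TextualBody"; value: string };
  target: {
    source: string;
    selector: [TextQuoteSelector, TextPositionSelector];
  };
}

// Assumption: how much surrounding text to keep as prefix/suffix.
const CONTEXT_CHARS = 32;

// Hypothetical helper: derive both selectors from offsets into the text content.
function describeSelection(
  text: string,
  start: number,
  end: number
): [TextQuoteSelector, TextPositionSelector] {
  if (start < 0 || end > text.length || start >= end) {
    throw new RangeError("selection offsets are out of range");
  }
  return [
    {
      type: "TextQuoteSelector",
      exact: text.slice(start, end),
      prefix: text.slice(Math.max(0, start - CONTEXT_CHARS), start),
      suffix: text.slice(end, end + CONTEXT_CHARS),
    },
    { type: "TextPositionSelector", start, end },
  ];
}

// Hypothetical helper: wrap a comment and its selectors in an Annotation.
function makeAnnotation(
  id: string,
  source: string,
  comment: string,
  text: string,
  start: number,
  end: number
): Annotation {
  return {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    id,
    type: "Annotation",
    body: { type: "TextualBody", value: comment },
    target: { source, selector: describeSelection(text, start, end) },
  };
}
```

The quote selector carries enough context to find the text again after it has moved, while the position selector records where it was, which helps pick between identical quotes.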
Storing the annotation set inline (in a <script type="application/ld+json"> block at the end of <body>) keeps the file self-contained and editable in any text editor. This is the same ergonomic win as the existing inline-comment approach, just with W3C-standard data structures.
Goals
- Render HTML files in the existing sandboxed iframe with annotated ranges highlighted (<mark> overlays applied from JSON-LD on load).
- Capture user selections in the iframe via window.getSelection() and convert ranges to W3C TextQuoteSelector + TextPositionSelector.
- Persist annotations to the embedded <script type="application/ld+json" id="annotations"> block in the HTML file, using the same write path as the current annotator.
- Re-anchor annotations on load, tolerating minor edits to the underlying HTML. Use Apache Annotator (@apache-annotator/dom) for matching; do not roll our own.
- Reuse UI: the annotations panel, comment dialog, triple-enter-to-submit, etc., from open.html should carry over largely unchanged. Only the anchoring layer (line → text-range) and the storage layer (inline comment → embedded JSON-LD) change.
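The persistence goal can be sketched as a pair of string transformations on the file contents. This is a minimal illustration only: `readAnnotations` and `writeAnnotations` are hypothetical names, the regex assumes the id attribute is written with double quotes, and the actual write path of the current annotator is not shown here.

```typescript
// Matches the embedded block by its id and captures the JSON inside it.
// Assumption: the attribute is written exactly as id="annotations".
const BLOCK_RE =
  /<script\b[^>]*\bid="annotations"[^>]*>([\s\S]*?)<\/script>/i;

// Hypothetical helper: return the stored annotations, or [] if there are none.
function readAnnotations(html: string): unknown[] {
  const raw = BLOCK_RE.exec(html)?.[1] ?? "";
  if (raw.trim() === "") return [];
  const parsed: unknown = JSON.parse(raw);
  return Array.isArray(parsed) ? parsed : [parsed];
}

// Hypothetical helper: replace the block, or add one at the end of <body>.
function writeAnnotations(html: string, annotations: unknown[]): string {
  // Escape "<" so a comment containing "</script>" cannot close the block
  // early. "<" only occurs inside JSON strings, so the result is still
  // valid JSON and parses back to the original text.
  const json = JSON.stringify(annotations, null, 2).replace(/</g, "\\u003c");
  const block =
    `<script type="application/ld+json" id="annotations">\n${json}\n</script>`;

  if (BLOCK_RE.test(html)) {
    // A replacer function avoids "$" in the JSON being read as a pattern.
    return html.replace(BLOCK_RE, () => block);
  }
  const bodyClose = html.lastIndexOf("</body>");
  if (bodyClose === -1) return html + "\n" + block + "\n";
  return html.slice(0, bodyClose) + block + "\n" + html.slice(bodyClose);
}
```

Escaping "<" is the one detail that is easy to miss: without it, a comment body that mentions a closing script tag would truncate the block when the file is next parsed.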
Non-goals
- An annotation server / sync protocol. This is purely a file-based annotator.
- Annotating non-HTML files with this new model. The existing line-based annotator keeps owning MD/code files.
- Cross-document annotation graphs.
- Author identity / multi-user merge resolution beyond what the W3C model implies for free.
End-to-end usability check
Before approving, a human must be able to:
- Open an HTML file via afx open path/to/file.html
- Select a text range in the rendered iframe
- Enter a comment
- Save, then verify that the JSON-LD <script> block now contains the annotation
- Close and reopen the file — the annotation is re-anchored and highlighted
- Edit the surrounding HTML in any text editor and reopen — the annotation either re-anchors (minor edit) or is reported as orphaned (major edit)
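The last two steps distinguish a re-anchored annotation from an orphaned one. The sketch below only illustrates those two outcomes on plain strings, for example when building fixtures for this check. It is not a substitute for @apache-annotator/dom, which the Goals name as the matcher, and its matching rules are an assumption. `reanchor`, `StoredSelectors`, and `AnchorResult` are hypothetical names.

```typescript
type AnchorResult =
  | { status: "anchored"; start: number; end: number }
  | { status: "orphaned" };

// Flattened view of one TextQuoteSelector + TextPositionSelector pair.
interface StoredSelectors {
  exact: string;
  prefix: string;
  suffix: string;
  start: number;
  end: number;
}

function reanchor(text: string, s: StoredSelectors): AnchorResult {
  if (s.exact === "") return { status: "orphaned" };

  // 1. Unchanged file: the stored position still holds the quoted text.
  if (text.slice(s.start, s.end) === s.exact) {
    return { status: "anchored", start: s.start, end: s.end };
  }

  // 2. Text moved: find the quote together with its surrounding context.
  const withContext = text.indexOf(s.prefix + s.exact + s.suffix);
  if (withContext !== -1) {
    const start = withContext + s.prefix.length;
    return { status: "anchored", start, end: start + s.exact.length };
  }

  // 3. Context edited: accept the bare quote only if it occurs exactly once.
  const first = text.indexOf(s.exact);
  if (first !== -1 && first === text.lastIndexOf(s.exact)) {
    return { status: "anchored", start: first, end: first + s.exact.length };
  }

  // 4. Quote missing or ambiguous: report rather than guess.
  return { status: "orphaned" };
}
```

Inserting a sentence before the annotated text exercises step 2, and rewriting the annotated sentence itself should end in the orphaned result.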
This is the headline path. Unit tests alone are insufficient — verify by hand before tagging.
Open questions for the spec phase
- File-extension trigger: is this the default for .html in afx open, or opt-in via a flag / file-extension list? (The current annotator already handles .html in line mode.)
- How are orphaned annotations surfaced when re-anchoring fails?
- Should the JSON-LD block include author info (pulled from git config user.name, as the current annotator presumably does)?
References
packages/codev/templates/open.html