Knowledge Graph: Citation/Authority Graph + Interactive Visualization #61

Description

@Number531

Summary

Add a Knowledge Graph feature that extracts citations, legal authorities, sections, agents, and their relationships from completed pipeline sessions, stores them in PostgreSQL, and renders an interactive force-directed graph in a new center-panel tab alongside Chat.

Spec: docs/pending-updates/knowledge-graph.md

Motivation

The Citation Chat answers "what does the memo say about X?" but cannot answer "how are these citations connected?" or "which authorities support the CFIUS analysis and where did they come from?" A knowledge graph fills this gap by making relationships, provenance, and cross-section dependencies visible and queryable.

Architecture Decisions (validated via Exa research, March 2026)

| Decision | Choice | Rationale |
| --- | --- | --- |
| Graph storage | PostgreSQL adjacency tables | Already running pgvector:pg16. LightRAG (29K stars) validates this pattern. Apache AGE as future upgrade path. |
| Extraction | Hybrid: rule-based 80% + LLM 20% | SAP Practical GraphRAG: dependency parsing achieves 94% of LLM quality. Our pipeline already produces structured citation-map.json. |
| Visualization | force-graph v1.51 (Canvas) | 140K weekly npm downloads, handles 75K elements, CDN-loadable, MIT. |
| Retrieval | Reciprocal Rank Fusion | Proven 62% → 84% precision lift when merging graph + vector results. |
| Tab placement | Center panel (like Chat) | Same session-scoped, post-pipeline lifecycle. |

Roadmap

Phase 1 — Schema & Extraction (Backend)

  • Add ensureKnowledgeGraphSchema() to postgres.js — 5 tables: kg_nodes, kg_edges, kg_evolution, kg_provenance, kg_messages
  • Add KNOWLEDGE_GRAPH feature flag to featureFlags.js
  • Create knowledgeGraphExtractor.js — post-session extraction:
    • Rule-based: section nodes, agent nodes, source_doc nodes from DB
    • Citation parsing: parse consolidated-footnotes.md → citation nodes + verification tags
    • Citation map: parse citation-map.json → CITES edges (section → citation)
    • LLM classification: single Sonnet call to classify authorities (case/statute/regulation)
    • Similarity edges: cosine similarity > 0.85 between existing report_embeddings
  • Wire hookDBBridge.js SessionEnd → setImmediate(buildSessionKnowledgeGraph) (fire-and-forget)
  • Add schema init + router mount to claude-sdk-server.js
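
As a rough illustration of the similarity-edge step above, the sketch below compares embeddings pairwise and emits an edge whenever cosine similarity exceeds 0.85. The `{ nodeId, embedding }` item shape, the `SIMILAR_TO` edge type, and the function names are assumptions for illustration, not part of the spec; the same comparison could likely be pushed into SQL with pgvector's cosine distance operator instead of being done in memory.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as "not similar".
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: `items` is an array of { nodeId, embedding } objects
// already loaded from report_embeddings. Edge type name is an assumption.
function buildSimilarityEdges(items, threshold = 0.85) {
  const edges = [];
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      const score = cosineSimilarity(items[i].embedding, items[j].embedding);
      if (score > threshold) {
        edges.push({
          source: items[i].nodeId,
          target: items[j].nodeId,
          type: 'SIMILAR_TO',
          weight: score,
        });
      }
    }
  }
  return edges;
}
```

The pairwise loop is quadratic in the number of embeddings, which is probably acceptable for a single session but would need an index-backed query for cross-session use.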

Phase 2 — API Endpoints

  • GET /api/db/sessions/:key/kg/graph — full graph in {nodes, links} format
  • GET /api/db/sessions/:key/kg/neighbors/:nodeId — 1-hop neighbors
  • GET /api/db/sessions/:key/kg/evolution — graph construction timeline
  • GET /api/db/sessions/:key/kg/provenance/:nodeId — extraction audit trail
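
To make the `{nodes, links}` response of the graph endpoint concrete, here is a minimal sketch of turning database rows into that shape. The column names (`node_type`, `source_id`, `target_id`, `edge_type`, `confidence`) are assumed for illustration; the real kg_nodes and kg_edges columns may differ.

```javascript
// Hypothetical row shapes; the actual kg_nodes / kg_edges schema may differ.
function toForceGraphPayload(nodeRows, edgeRows) {
  const nodes = nodeRows.map((row) => ({
    id: row.id,
    label: row.label,
    type: row.node_type,
  }));

  // Drop edges that point at nodes not present in this session's graph,
  // since dangling links can break a force-directed layout.
  const knownIds = new Set(nodes.map((node) => node.id));
  const links = edgeRows
    .filter((row) => knownIds.has(row.source_id) && knownIds.has(row.target_id))
    .map((row) => ({
      source: row.source_id,
      target: row.target_id,
      type: row.edge_type,
      confidence: row.confidence,
    }));

  return { nodes, links };
}
```

Keeping this as a pure function separate from the route handler would let the unit tests in Phase 5 cover it without a database.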

Phase 3 — Frontend Visualization

  • Add force-graph CDN to index.html
  • Add "Graph" tab button (locked until session available, same pattern as Chat)
  • Add graph tab content: toolbar (layout mode, confidence slider), canvas container, detail panel, composer
  • Add styles.css: toolbar, canvas, detail panel, node type colors (reuse design tokens)
  • Add app.js: state vars, enableGraph()/disableGraph(), renderForceGraph(), node click → neighbors + provenance, layout switching, confidence filtering
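
The confidence filtering mentioned above could be a small pure function applied to the fetched graph before it is handed to the renderer. This is only a sketch: the `confidence` field on links, the default of 1 for links without a score, and the choice to hide nodes that lose all their links are assumptions, not spec decisions.

```javascript
// Link endpoints may be plain ids or node objects, because force-directed
// renderers commonly replace ids with object references after layout starts.
function endpointId(end) {
  return typeof end === 'object' && end !== null ? end.id : end;
}

// Returns a new { nodes, links } graph containing only links at or above
// the threshold, plus the nodes those links still touch.
function filterByConfidence(graph, minConfidence) {
  const links = graph.links.filter(
    (link) => (link.confidence ?? 1) >= minConfidence
  );

  const connected = new Set();
  for (const link of links) {
    connected.add(endpointId(link.source));
    connected.add(endpointId(link.target));
  }

  // Assumption: nodes left with no visible links are hidden too.
  const nodes = graph.nodes.filter((node) => connected.has(node.id));
  return { nodes, links };
}
```

On each slider change, the filtered result could be passed to the force-graph instance's `graphData()` setter, with the slider value converted to a number first.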

Phase 4 — Graph-Aware Q&A (RRF Hybrid Retrieval)

  • Create knowledgeGraphRouter.js — SSE streaming endpoint
    • Keyword match on node labels + embedding similarity on query
    • 2-hop graph traversal via recursive CTE
    • Vector similarity via existing searchSimilar()
    • Reciprocal Rank Fusion merge (k=60)
    • Messages API streaming with Anthropic Citations
  • Wire frontend: openKgStream(), handleKgEvent(), highlight relevant nodes on query
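
For reference, Reciprocal Rank Fusion scores each candidate as the sum over result lists of 1 / (k + rank), where rank is the candidate's 1-based position in that list. A minimal sketch with k = 60 follows; the function name and the use of plain id arrays for the graph and vector results are assumptions for illustration.

```javascript
// rankings: array of ranked id lists, best match first in each list.
// Returns [{ id, score }] sorted by fused score, highest first.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Example: fuse a graph-traversal ranking with a vector-similarity ranking.
const graphHits = ['n1', 'n2', 'n3'];
const vectorHits = ['n2', 'n4', 'n1'];
const fused = reciprocalRankFusion([graphHits, vectorHits]);
console.log(fused.map((entry) => entry.id)); // [ 'n2', 'n1', 'n4', 'n3' ]
```

Because RRF uses only rank positions, the graph and vector retrievers do not need comparable score scales, which is the usual reason for choosing it over weighted score averaging.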

Phase 5 — Tests & Polish

  • Unit tests: schema creation, citation parsing, edge creation, RRF fusion, API endpoints
  • Integration test: live pipeline → verify graph populated → verify force-graph renders
  • Evolution timeline playback (scrubber to replay graph growth across phases)
  • Cross-session authority dedup via canonical_key (same case across memos)
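
One possible shape for the `canonical_key` used in cross-session dedup is a normalized string built from the authority type and citation text. The normalization rules below (lowercase, strip punctuation and symbols, collapse whitespace) and the input record shape are purely an assumption for illustration; real legal citations would likely need a more careful parser.

```javascript
// Hypothetical normalization: the spec does not define how canonical_key is derived.
function canonicalKey(authorityType, citationText) {
  const normalized = citationText
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, '') // drop punctuation and symbols such as "." and "§"
    .trim()
    .replace(/\s+/g, '-'); // collapse whitespace runs into single hyphens
  return `${authorityType}:${normalized}`;
}

// Groups authority records from several sessions by canonical key.
// Each record is assumed to look like { type, label, sessionId }.
function dedupeAuthorities(records) {
  const byKey = new Map();
  for (const record of records) {
    const key = canonicalKey(record.type, record.label);
    if (!byKey.has(key)) {
      byKey.set(key, { canonicalKey: key, label: record.label, sessionIds: [] });
    }
    const entry = byKey.get(key);
    if (!entry.sessionIds.includes(record.sessionId)) {
      entry.sessionIds.push(record.sessionId);
    }
  }
  return [...byKey.values()];
}
```

With these rules, "50 U.S.C. § 4565" and "50 USC 4565" collapse to the same key, so the same statute cited in two memos would map to one authority entry.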

Files

New (3): knowledgeGraphExtractor.js, knowledgeGraphRouter.js, knowledge-graph.test.js

Modified (8): postgres.js, featureFlags.js, claude-sdk-server.js, hookDBBridge.js, dbFrontendRouter.js, index.html, styles.css, app.js

Docker: Zero changes — CDN-loaded lib, auto-created schema, no new npm deps.

Dependencies


🤖 Generated with Claude Code
