Discovery readers: JSONL, JSON, and SQLite storage reader helpers #14

@leefaus

Part of #11

Summary

Create shared reader helpers in discovery/reader.rs that abstract the three storage backends used by agent traces: JSONL files, JSON files, and SQLite databases.

Storage Backends

| Backend | Agents | Format |
| --- | --- | --- |
| JSONL | Claude Code, Codex, Gemini CLI, Copilot | One JSON object per line |
| JSON | Cline, Amp, OpenClaw, Pi | Single JSON file per session/task |
| SQLite | Cursor (state.vscdb), Hermes (state.db), OpenCode (opencode.db) | SQL queries on agent DBs |

API

/// Read a JSONL file line by line, returning one parsed JSON value per line.
pub fn read_jsonl(path: &Path) -> AgentResult<Vec<serde_json::Value>>;

/// Read a JSONL file incrementally from a byte offset (for cursor-based scanning).
pub fn read_jsonl_since(path: &Path, offset: u64) -> AgentResult<(Vec<serde_json::Value>, u64)>;

/// Read and parse a JSON file.
pub fn read_json(path: &Path) -> AgentResult<serde_json::Value>;

/// Open a read-only SQLite connection.
pub fn open_sqlite_readonly(path: &Path) -> AgentResult<rusqlite::Connection>;
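
The offset-based signature above suggests a resumable scan. A minimal sketch of that idea follows, using only the standard library. The name `read_lines_since` is hypothetical, it returns raw lines rather than `serde_json::Value`, and it uses `io::Result` instead of `AgentResult`; the real helper may differ. One assumption worth calling out: a trailing line without a newline is treated as still being written, so the returned offset stops before it.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, Seek, SeekFrom};
use std::path::Path;

/// Hypothetical helper (not the issue's final API): read complete lines
/// starting at byte `offset`, returning them with the offset to resume from.
pub fn read_lines_since(path: &Path, offset: u64) -> io::Result<(Vec<String>, u64)> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    let mut reader = BufReader::new(file);

    let mut lines = Vec::new();
    let mut next_offset = offset;
    let mut buf = String::new();
    loop {
        buf.clear();
        let n = reader.read_line(&mut buf)?;
        if n == 0 {
            break; // end of file
        }
        if !buf.ends_with('\n') {
            // Assumption: an unterminated last line is a partial write,
            // so leave it for the next scan and do not advance past it.
            break;
        }
        next_offset += n as u64;
        let line = buf.trim_end_matches(|c| c == '\r' || c == '\n');
        if !line.is_empty() {
            lines.push(line.to_string());
        }
    }
    Ok((lines, next_offset))
}
```

A caller would persist the returned offset per file and pass it back on the next scan, parsing each returned line as JSON in between.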

New Dependency

rusqlite = { version = "0.31", features = ["bundled"] }

Acceptance Criteria

  • JSONL reader handles large files efficiently (streaming, not full-file read)
  • JSONL reader tolerates malformed lines (skip + warn, don't abort)
  • SQLite reader opens databases read-only (no writes to agent DBs)
  • JSON reader handles common edge cases (BOM, trailing comma)
  • Unit tests with fixture files
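
As an illustration of the "skip + warn" and BOM criteria, here is a rough sketch that streams lines from any `BufRead` and drops the ones that fail to parse. Both `strip_bom` and `parse_lines_lenient` are hypothetical names, and the parser is passed in as a closure so the sketch stays free of external crates; in the real reader the closure would presumably call something like `serde_json::from_str`. Trailing-comma handling is not covered here.

```rust
use std::fmt::Display;
use std::io::{BufRead, ErrorKind};

/// Strip a leading UTF-8 byte-order mark, if present.
pub fn strip_bom(text: &str) -> &str {
    text.strip_prefix('\u{feff}').unwrap_or(text)
}

/// Hypothetical helper: parse each non-empty line with `parse`,
/// warning on stderr and skipping any line that fails.
pub fn parse_lines_lenient<R, T, E, F>(reader: R, parse: F) -> Vec<T>
where
    R: BufRead,
    E: Display,
    F: Fn(&str) -> Result<T, E>,
{
    let mut out = Vec::new();
    for (idx, line) in reader.lines().enumerate() {
        let line = match line {
            Ok(l) => l,
            // A line that is not valid UTF-8 is skipped like any other bad line.
            Err(e) if e.kind() == ErrorKind::InvalidData => {
                eprintln!("warning: line {}: {}, skipping", idx + 1, e);
                continue;
            }
            // Other I/O errors may repeat forever, so stop reading here.
            Err(e) => {
                eprintln!("warning: line {}: {}, stopping", idx + 1, e);
                break;
            }
        };
        // Only the first line can carry a BOM.
        let text = if idx == 0 { strip_bom(&line) } else { line.as_str() };
        let text = text.trim();
        if text.is_empty() {
            continue;
        }
        match parse(text) {
            Ok(value) => out.push(value),
            Err(e) => eprintln!("warning: line {}: {}, skipping", idx + 1, e),
        }
    }
    out
}
```

Because lines are read one at a time through the buffered reader, only the parsed results are held in memory, not the whole file. A fixture-based unit test could feed a file with one deliberately broken line and assert that the remaining values come back.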

Estimated Effort

1-2 days

Labels: agent-discovery (Agent trace discovery and import into provenance graph), enhancement (New feature or request)
