A three-part toolkit for browser session capture, automated crawling, and offline reconstruction of modern web applications.
This repository is not about bypassing authentication or attacking websites. It works with what is already available inside a browser session: DOM, network activity, runtime state, storage, routes, telemetry, and the traces of real user interaction.
The monorepo is built around one pipeline:
Capture -> Crawl -> Reconstruct -> Review -> Export
SiteSnapshotter is the browser-side capture engine, injected into a live page through DevTools or browser automation.
- Records DOM, CSS, assets, requests, responses, routes, storage, IndexedDB, Cache Storage, and Service Worker signals.
- Supports session recording via watch().
- Produces JSON capture packages for downstream analysis.
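To make the idea of a JSON capture package concrete, here is a minimal sketch of how a downstream tool might read and summarize one. The `CapturePackage` and `CapturedRequest` shapes, their field names, and `summarizeCapture` are hypothetical and chosen for illustration only; they are not the toolkit's actual schema.

```typescript
// Hypothetical capture package shape. Field names are assumptions,
// not the schema SiteSnapshotter actually emits.
interface CapturedRequest {
  method: string;
  url: string;
  status: number;
}

interface CapturePackage {
  capturedAt: string; // ISO 8601 timestamp
  route: string; // page URL at capture time
  requests: CapturedRequest[];
  storage: Record<string, string>; // e.g. localStorage key/value pairs
}

interface CaptureSummary {
  byMethod: Record<string, number>;
  storageKeys: string[];
}

// Count requests per HTTP method and list the storage keys in sorted order.
function summarizeCapture(pkg: CapturePackage): CaptureSummary {
  const byMethod: Record<string, number> = {};
  for (const req of pkg.requests) {
    const method = req.method.toUpperCase();
    byMethod[method] = (byMethod[method] ?? 0) + 1;
  }
  return { byMethod, storageKeys: Object.keys(pkg.storage).sort() };
}

const examplePackage: CapturePackage = {
  capturedAt: "2024-01-01T00:00:00.000Z",
  route: "https://app.example.com/home",
  requests: [
    { method: "get", url: "https://app.example.com/api/users", status: 200 },
    { method: "GET", url: "https://app.example.com/api/users/1", status: 200 },
    { method: "POST", url: "https://app.example.com/api/login", status: 200 },
  ],
  storage: { token: "redacted", theme: "dark" },
};

// Round-trip through JSON, as a tool reading the package from disk would.
const parsed = JSON.parse(JSON.stringify(examplePackage)) as CapturePackage;
const summary = summarizeCapture(parsed);
console.log(summary);
```

Keeping the package as plain JSON means any later stage can consume it without depending on the browser-side code.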
Docs:
SiteCrawlerSnapshotter is the Playwright orchestration layer that runs one continuous browser session against a target site.
- Uses a persistent Chromium profile.
- Supports auth none|manual|auto|profile.
- Survives redirects and login flows.
- Discovers internal URLs and records one final session capture.
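Internal URL discovery can be sketched as a same-origin filter over the links found on a page. The function below is an illustration under stated assumptions, not the crawler's actual logic: `discoverInternalUrls` is a hypothetical name, and it treats "internal" as "same origin as the current page", which may be narrower or broader than what SiteCrawlerSnapshotter does.

```typescript
// Hypothetical helper: resolve hrefs against the current page and keep
// only http(s) links on the same origin, de-duplicated and sorted.
function discoverInternalUrls(pageUrl: string, hrefs: string[]): string[] {
  const base = new URL(pageUrl);
  const found = new Set<string>();

  for (const href of hrefs) {
    let resolved: URL;
    try {
      resolved = new URL(href, base); // handles relative and absolute links
    } catch {
      continue; // skip hrefs that cannot be parsed at all
    }
    if (resolved.protocol !== "http:" && resolved.protocol !== "https:") continue;
    if (resolved.origin !== base.origin) continue;

    resolved.hash = ""; // fragments do not identify a separate page
    found.add(resolved.href);
  }

  return Array.from(found).sort();
}

const internal = discoverInternalUrls("https://app.example.com/home", [
  "/settings",
  "profile#top",
  "https://other.example.com/x",
  "mailto:someone@example.com",
  "/settings",
]);
console.log(internal);
```

Dropping the fragment before de-duplication keeps in-page anchors from inflating the URL list; a single-page application with hash-based routing would need the opposite choice.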
Docs:
SiteReconstructor is the offline analysis and reporting layer that converts capture packages into structured reports and export artifacts.
- Builds architecture, API, scenario, telemetry, storage, entity, and security views.
- Generates HTML reports, Postman collections, OpenAPI drafts, TypeScript SDK drafts, and mock servers.
- Works with manual captures and crawler sessions.
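One way to picture how observed traffic could become an API draft is to collapse concrete request URLs into templated paths and group the HTTP methods seen on each. The sketch below is an assumption about the general technique, not SiteReconstructor's implementation: `ObservedCall`, `templatePath`, and `draftEndpoints` are hypothetical names, and the heuristic only recognizes numeric and UUID path segments.

```typescript
// Hypothetical input: one entry per request observed in a capture.
interface ObservedCall {
  method: string;
  url: string;
}

const UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Replace segments that look like identifiers with an {id} placeholder.
function templatePath(pathname: string): string {
  return pathname
    .split("/")
    .map((seg) => (/^\d+$/.test(seg) || UUID.test(seg) ? "{id}" : seg))
    .join("/");
}

// Group lower-cased HTTP methods by templated path, loosely resembling
// the keys of an OpenAPI "paths" object.
function draftEndpoints(calls: ObservedCall[]): Record<string, string[]> {
  const grouped = new Map<string, Set<string>>();
  for (const call of calls) {
    const path = templatePath(new URL(call.url).pathname);
    let methods = grouped.get(path);
    if (!methods) {
      methods = new Set<string>();
      grouped.set(path, methods);
    }
    methods.add(call.method.toLowerCase());
  }

  const draft: Record<string, string[]> = {};
  grouped.forEach((methods, path) => {
    draft[path] = Array.from(methods).sort();
  });
  return draft;
}

const observed: ObservedCall[] = [
  { method: "GET", url: "https://api.example.com/users/1" },
  { method: "GET", url: "https://api.example.com/users/2" },
  { method: "POST", url: "https://api.example.com/users" },
  { method: "DELETE", url: "https://api.example.com/users/42" },
];
const draft = draftEndpoints(observed);
console.log(draft);
```

A heuristic like this can only describe the traffic that was actually seen, which is one reason the generated artifacts are best treated as drafts to review rather than as authoritative specifications.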
Docs:
- Capture a browser session manually with SiteSnapshotter or automatically with SiteCrawlerSnapshotter.
- Feed the resulting capture folder into SiteReconstructor.
- Review the generated HTML portal and exported artifacts.
- Use the result as engineering intelligence, integration prep, migration support, or technical due diligence input.
Intended audience:
- reverse engineers
- integration teams
- frontend and platform engineers
- security-minded analysts
- migration and due-diligence teams
SiteSnapshotter/ browser capture engine
SiteCrawlerSnapshotter/ Playwright automation and session crawling
SiteReconstructor/ offline report and export pipeline
scripts/ auxiliary build and generation scripts
docs/ internal notes and working materials
As an open-source engineering toolkit, this repository is already meaningful. It has real depth in session capture, browser-state analysis, IndexedDB intelligence, and reconstruction of technical artifacts.
As a commercial out-of-the-box platform, it is not there yet. The main gaps are product polish, repeatability of categorization, and the trust level required for generated exports such as SDKs and mock servers.
The practical sweet spot today is:
- strong open-core toolkit
- expert-led paid analysis
- niche B2B/internal platform potential
This repository is distributed under the MIT License.



