Setup Data Warehouse

Pull all your brand data from live APIs into a single queryable SQLite database — in one command.

v1.4.0 — 8 channels · incremental pull · identity resolution · 28 CRO views

Built by eComX — the Context-First AI development methodology for serious ecommerce operators.

What This Is

A workflow + scripts that connect to every channel in your stack (Shopify, Klaviyo, Meta Ads, GA4, Gorgias, Gmail, and more), pull data directly from their APIs, and load it into a local warehouse.db SQLite database you can query instantly.

Philosophy: Your data is already yours. This workflow just puts it all in one place — a single file you can query, feed to AI agents, and use to generate your BIOS intelligence specs.

No platform exports. No third-party data tools. No monthly fees. One pull. One database.


Prerequisites

  • /setup-environment completed — credentials must be in environments/
  • Node.js ≥ 18
  • npm install (installs better-sqlite3 and other deps)

Quick Start

git clone https://github.com/ecomxco/setup-data-warehouse.git
cd setup-data-warehouse

# Bootstrap (installs deps, creates env dirs, validates)
./setup.sh

# Fill in credentials
# → environments/{channel}/.env (see .env.example)

# Pull all live channels (incremental — only new/updated records)
npm run pull

# Build the SQLite warehouse (+ identity graph)
npm run build

# Query it
npm run summary
npm run query -- "SELECT email, total_spent FROM shopify_customers ORDER BY CAST(total_spent AS REAL) DESC LIMIT 10"
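
The CAST(total_spent AS REAL) in the last query suggests that total_spent is stored as text (Shopify's API returns money amounts as strings). The sketch below is illustrative only, with made-up rows, and shows why the cast matters: text ordering and numeric ordering disagree.

```javascript
// Hypothetical rows; amounts are strings, as assumed for shopify_customers.
const customers = [
  { email: "a@example.com", total_spent: "95.00" },
  { email: "b@example.com", total_spent: "1200.50" },
  { email: "c@example.com", total_spent: "310.00" },
];

// Descending text order compares character by character,
// so "95.00" ranks above "1200.50".
const byText = [...customers].sort((x, y) =>
  x.total_spent < y.total_spent ? 1 : x.total_spent > y.total_spent ? -1 : 0
);

// Descending numeric order, the JavaScript analogue of
// ORDER BY CAST(total_spent AS REAL) DESC.
const byNumber = [...customers].sort(
  (x, y) => Number(y.total_spent) - Number(x.total_spent)
);

console.log(byText.map((c) => c.email));
console.log(byNumber.map((c) => c.email));
```

Without the cast, a "top spenders" query on a text column could put a 95.00 customer ahead of a 1200.50 customer.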

What Gets Built

data-warehouse/
├── shopify/           orders, customers, products, line items
├── klaviyo/           profiles, campaigns, flows, messages
├── meta-ads/          campaigns, adsets, ads, creative
├── facebook-organic/  posts, comments, messages
├── instagram-organic/ posts, comments, DMs
├── ga4/               traffic, ecommerce
├── gorgias/           tickets, customers
├── gmail/             inbox
├── loox-reviews/      product reviews
└── warehouse.db       ← 22+ tables, fully queryable
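
Because the same person can appear in several of these channel folders, the identity resolution step has to link their records. How build-db.js actually does this is not documented here. The sketch below is one simple approach, assuming records are matched on a normalized email address; the record shape and function names are hypothetical.

```javascript
// Hypothetical records; field names are illustrative, not the real schema.
const records = [
  { channel: "shopify", id: "s1", email: "Jane@Example.com " },
  { channel: "klaviyo", id: "k9", email: "jane@example.com" },
  { channel: "gorgias", id: "g4", email: "sam@example.com" },
];

// Normalize so that case and stray whitespace do not split one person in two.
const normalizeEmail = (email) => email.trim().toLowerCase();

// Group every record under its normalized email (assumed matching key).
function resolveIdentities(rows) {
  const people = new Map();
  for (const row of rows) {
    const key = normalizeEmail(row.email);
    if (!people.has(key)) people.set(key, []);
    people.get(key).push(`${row.channel}:${row.id}`);
  }
  return people;
}

const identities = resolveIdentities(records);
console.log(identities.get("jane@example.com"));
```

A real identity graph would likely also match on phone numbers or platform IDs; email-only matching is the simplest case.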

Scripts Included

Script                          Purpose
data-warehouse/pull-all.js      Master pull: 8 channels, incremental, retry, manifest
data-warehouse/build-db.js      Loads JSONL → SQLite with 25+ tables, 28 views, and identity resolution
data-warehouse/query.js         SQL CLI tool
data-warehouse/convert-csv.js   CSV → JSONL for non-API channels
data-warehouse/types.ts         Shared enums and channel identifiers
data-warehouse/cursors.json     Incremental pull state (gitignored)
data-warehouse/manifest.json    Run audit trail (gitignored)
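
To make the CSV → JSONL step concrete, here is a minimal sketch of such a conversion. It is not the code in convert-csv.js: the function name csvToJsonl is hypothetical, and it assumes a header row and no quoted fields or embedded commas.

```javascript
// Minimal CSV to JSONL conversion (illustrative only).
// Assumes a header row and plain comma-separated cells; quoted fields
// would need a proper CSV parser.
function csvToJsonl(csvText) {
  const lines = csvText.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return "";
  const headers = lines[0].split(",").map((h) => h.trim());
  return lines
    .slice(1)
    .map((line) => {
      const cells = line.split(",");
      const row = {};
      headers.forEach((header, i) => {
        row[header] = (cells[i] ?? "").trim();
      });
      return JSON.stringify(row); // one JSON object per output line
    })
    .join("\n");
}

const sample = "product,rating\nTrail Mug,5\nCamp Kettle,4\n";
const jsonl = csvToJsonl(sample);
console.log(jsonl);
```

Each output line is a standalone JSON object, so a file can be loaded line by line without reading it all into memory.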

Refresh Schedule

# Incremental refresh (default — only new/updated records)
node data-warehouse/pull-all.js && node data-warehouse/build-db.js

# Full re-pull (ignores cursors, re-downloads everything)
node data-warehouse/pull-all.js --full && node data-warehouse/build-db.js

# Preview what would be pulled (no API calls)
node data-warehouse/pull-all.js --dry-run

# Single channel
node data-warehouse/pull-all.js shopify
node data-warehouse/pull-all.js gorgias
node data-warehouse/pull-all.js gmail

Channels: shopify · klaviyo · meta-ads · comments · messages · ga4 · gorgias · gmail
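
The incremental and --full modes above can be pictured with a small cursor sketch. The real format of cursors.json is not documented here, so this assumes one ISO 8601 UTC timestamp per channel and records that carry an updated_at field; selectNewRecords is a hypothetical helper.

```javascript
// Assumed cursor shape: { shopify: "2026-01-03T00:00:00Z", ... }.
// ISO 8601 UTC strings in the same format sort correctly as plain text.
function selectNewRecords(records, cursors, channel) {
  const since = cursors[channel] ?? null; // null: no prior run, take everything
  const fresh = records.filter(
    (r) => since === null || r.updated_at > since
  );
  // Advance the cursor to the newest timestamp seen in this run.
  const newest = fresh.reduce(
    (max, r) => (r.updated_at > max ? r.updated_at : max),
    since ?? ""
  );
  return { fresh, cursors: { ...cursors, [channel]: newest || since } };
}

const orders = [
  { id: 1, updated_at: "2026-01-01T00:00:00Z" },
  { id: 2, updated_at: "2026-01-03T00:00:00Z" },
];

// First run: empty cursors, so both orders are pulled.
const first = selectNewRecords(orders, {}, "shopify");

// Second run: only the order updated after the stored cursor is pulled.
const later = [...orders, { id: 3, updated_at: "2026-01-05T00:00:00Z" }];
const second = selectNewRecords(later, first.cursors, "shopify");

console.log(first.fresh.map((r) => r.id), second.fresh.map((r) => r.id));
```

Under this model, a full re-pull corresponds to calling the helper with an empty cursors object.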


BIOS Integration

After pulling, check which BIOS intelligence specs are now unlockable:

node data-warehouse/query.js --bios-check

Then generate your specs with /setup-bios.


Support


License

MIT — © 2026 eCom XP LLC
