pixieengine.com — Static Site Architecture (design, in progress)

Companion to aws-inventory.md. This is the detailed design for the static rebuild. DECISION POINTS are marked ⟐ — those need sign-off before build. Started 2026-06-01.

Build status (2026-06-01)

Phase 1 generator working in static-site/ (generator source, tracked; data workspace + generated public/ are gitignored — see static-site/README.md):

  • build-dataset.mjs — merges DB metadata (db_sprites.ndjson) + CF-log-recovered slugs (../cf-recovery/sprite_slugs.tsv) + S3 id list → build/dataset.ndjson (248,175 records). Titles: 20,874 from the 2017 DB + 91,443 recovered from CloudFront logs = ~112k titled (incl. 86% of the post-2017 gap); 135,858 never seen in logs → "Sprite #id". Log-recovered rows also get a real created date (first-seen) and the keyword slug for the canonical URL.
  • generate.mjs — custom Node generator. Full build = 248,175 sprite pages + 4,137 gallery pages + 1,075 tune pages + 18 tune-gallery pages + 2,890 tag pages + editor landing + 410 + robots + sitemap index (252,140 URLs, 6 shards), in ~25s. Preview: node generate.mjs (newest 5k); full: LIMIT=all node generate.mjs. Tune data exported via build/tunes.ndjson.
  • Matches the original look: emits the site's real markup (<header>, <content>, <card>, <sprites>, <creator>, <tags>, .pagination) and serves the real compiled screen.css (copied from public/assets/screen-*.css) verbatim — purple #673ab7 chrome, amber tags, etc.
  • Verified via local static server: sprite detail (title/meta/canonical/og/JSON-LD + derivation card + tags + actions), gallery (newest-first, 60/page), tag pages, sitemap, robots render & resolve.
  • Tunes (gallery + detail) and per-page contextual tags bar (weblike browsing) done; logarithmic pagination (×10, First/Prev/Last) done; tag links restyled via /site.css.
  • Replays work — verified against the live pixie3 editor. Generator emits postmaster.js + a single /load/ editor page that reads the sprite id (from /sprites/<id>/load path or ?id=), HEAD-checks replay.json, and calls loadReplayFromURL (replays if present, else just loads the image). Sprite pages have a "▶ Watch replay" button. (Editor is framable; replay.json is CORS-open. Coverage is partial by nature: older sprites have no replay [404]; re-uploads load but don't animate; genuine drawings animate.)
  • Tunes play — verified end-to-end. Tune pages have "▶ Play in Composer" → /tunes/load/?sha=, which embeds the new composer (danielx.net/composer/) with #url-<encoded content URL>. Required two small fixes in composer-app/composer/source/persistence.civet (committed there, redeployed to prod): (1) loadFromSlug accepts a url- prefix → reuses loadURL; (2) fromBlob JSON check broadened to application/(.*\+)?json (our content is application/whimsy.composer.v0+json). Iframe needs allow-same-origin (composer uses its real origin). Tune content: s3.amazonaws.com/images.pixie.strd6.com/data/<content_sha256>, CORS-open.
  • CloudFront Function (infra/functions/rewrite.js) written + unit-tested (id/slug rewrite, /sprites/<id>/load → /load/, ?tagged= → /tags/ 301, dir→index; npm test in infra/). Deploy runbook in deploy.md; infra-as-code in infra/ (CDK).
  • Pending: CF-log slug enrichment (running) → re-merge to fill ~227k untitled; profile stubs; deploy — the greenfield CDK stack (infra/) provisions a new bucket + new CloudFront dist + function (admin creds, run from CloudShell), then the pixieengine-deploy user does the s3 sync.
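
The broadened JSON check mentioned in the Tunes item above can be illustrated with a small, hedged sketch. Only the pattern application/(.*\+)?json comes from the build notes; the helper name isJsonMediaType, the ^...$ anchoring, and the tolerance for trailing parameters such as "; charset=utf-8" are assumptions of this example, not the composer's actual code.

```javascript
// Sketch only: `isJsonMediaType` is a hypothetical helper name.
// The core pattern application/(.*\+)?json is quoted from the build notes;
// anchoring and the optional "; param" suffix are assumptions added here.
const JSON_MEDIA_TYPE = /^application\/(.*\+)?json(\s*;.*)?$/i;

function isJsonMediaType(contentType) {
  // Accept plain JSON and any "+json" structured-syntax suffix type.
  return typeof contentType === "string" &&
    JSON_MEDIA_TYPE.test(contentType.trim());
}
```

With this shape, both application/json and application/whimsy.composer.v0+json are accepted, while text/plain is rejected.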

Goals & principles

  1. Preserve SEO — same URLs, return 200, pre-rendered crawlable HTML, fast (CDN).
  2. Stupid simple — no servers, no DB, no framework lock-in; a plain generator + S3 + CloudFront.
  3. Progressive enhancement — static core works with JS off; accounts/social hydrate on top.
  4. Consolidation-ready — dynamic layer is a thin client of whimsy.space's existing platform.
  5. Reproducible builds — one dataset in → deterministic static tree out.

Two-layer model

Layer 1  STATIC CONTENT  (the SEO asset; ~all of the site)
         pre-rendered HTML per sprite/tune + gallery/tag indexes
         → S3 (private) → CloudFront (pixieengine.com) → users + Googlebot
         images already served from *.pixiecdn.com (unchanged)

Layer 2  DYNAMIC ENHANCEMENT  (optional, loads after; zero SEO weight)
         small JS → whimsy Cognito (login) + api-whimsy-space (favorites/comments)
                  + per-user Briefcase S3 (writes)
         reads: static JSON from CDN  |  writes: through whimsy Lambda (authed)

Page inventory & URL map

| URL | Type | Source | Notes |
|---|---|---|---|
| / | static | template | editor shell (iframe); save disabled in archive |
| /sprites/<id>(-slug) | static (1 per sprite) | archive DB + CF-log slugs + S3 | primary SEO asset, ~248k pages |
| /sprites + /sprites/page/<n> | static (paginated) | dataset | newest-first; 4,137 pages @60 |
| /tags/<tag> (+ pages) | static | taggings | replaces ?tagged= (301 the old form) |
| /tunes/<id>(-slug) | static | archive DB + logs | ~1k+ pages; player is client-side |
| /<username> (profiles) | static stub | usernames seen in CF logs | minimal stub (200, preserves backlinks); no sprite listing in v1 |
| /pixel-editor | static shell | iframe → danielx.net/pixel-editor/pixie3/ | already client-side |
| /sitemap.xml (+ index) | generated | dataset | split (≤50k URLs each) |
| /robots.txt | static | | allow crawl |
| /404, /410 | static | | for removed/missing |
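
As a rough check on the paginated rows, the page counts follow directly from the 60-per-page size reported in the build status. The sketch below is illustrative only: pageCount, pageSlice and galleryPath are hypothetical names, and mapping page 1 to /sprites/ (with later pages under /sprites/page/<n>/) is an assumed reading of the URL map.

```javascript
// Sketch only: helper names are hypothetical, not taken from generate.mjs.
const PAGE_SIZE = 60; // page size reported in the build status

// Number of gallery pages needed for `total` items (at least one page).
function pageCount(total, pageSize = PAGE_SIZE) {
  return Math.max(1, Math.ceil(total / pageSize));
}

// Items for a 1-based page number, given an array already sorted newest-first.
function pageSlice(items, page, pageSize = PAGE_SIZE) {
  const start = (page - 1) * pageSize;
  return items.slice(start, start + pageSize);
}

// Assumed URL scheme: page 1 lives at /sprites/, later pages at /sprites/page/<n>/.
function galleryPath(page) {
  return page === 1 ? "/sprites/" : `/sprites/page/${page}/`;
}
```

For the figures in the build status, pageCount(248175) gives 4137 sprite gallery pages and pageCount(1075) gives 18 tune-gallery pages.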

Data: source of truth → build inputs

Single merged dataset (one row per sprite/tune) assembled from three sources, written as NDJSON (or a read-only SQLite file) — the generator's only input:

  • 2017 metadata (pixie_archive DB): title, tags, description, dims, created_at for ids ≤153,218.
  • CF-log recovery (cloudfront-log-recovery.md output): titles/slugs for post-2017 ids.
  • S3 image list: authoritative existence + dims (the ~248k that actually have original.png).
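
The three-source merge could look roughly like the sketch below. It treats the S3 id list as the authority on which sprites exist, prefers 2017 DB metadata, falls back to log-recovered titles, and otherwise uses the "Sprite #id" placeholder. The function name mergeSprites and the input field names (created_at, firstSeen, and so on) are assumptions for illustration; build-dataset.mjs may differ.

```javascript
// Sketch only: `mergeSprites` and the input field names are hypothetical.
// s3Ids:   array of sprite ids that actually have an image in S3
// dbRows:  [{ id, title, tags, created_at }] from the 2017 archive DB
// logRows: [{ id, slug, title, firstSeen }] recovered from CloudFront logs
function mergeSprites(s3Ids, dbRows, logRows) {
  const db = new Map(dbRows.map((r) => [r.id, r]));
  const log = new Map(logRows.map((r) => [r.id, r]));

  // Only ids present in S3 produce a record; rows without an image are dropped.
  return s3Ids.map((id) => {
    const d = db.get(id);
    const l = log.get(id);
    if (d) {
      return { id, slug: l ? l.slug : undefined, title: d.title,
               tags: d.tags || [], created: d.created_at, source: "db2017" };
    }
    if (l) {
      return { id, slug: l.slug, title: l.title,
               tags: [], created: l.firstSeen, source: "s3log" };
    }
    // Never seen in the DB or the logs: placeholder title, no slug.
    return { id, title: `Sprite #${id}`, tags: [], source: "s3only" };
  });
}
```

Keying everything off the S3 list means a DB or log row without an image never yields a page.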

Record shape (sprites):

{"id":257680,"slug":"ugly-killman-donkey-ass","title":"ugly killman donkey ass",
 "tags":["killman"],"w":64,"h":64,"created":"2024-09-01",
 "img":"https://0.pixiecdn.com/sprites/257680/original.png","source":"s3log"}

source ∈ {db2017, s3log, s3only} — drives badges, sitemap inclusion, and moderation gating.
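
To show how a record of this shape feeds a page, here is a minimal sketch that parses NDJSON and derives the display title, the <title> text and the canonical path. The helper names are hypothetical, the sample record is invented for illustration, and falling back to /sprites/<id> when a record has no slug is an assumption.

```javascript
// Sketch only: helper names and the sample record are hypothetical.

// Parse NDJSON text (one JSON object per line, blank lines ignored).
function parseNdjson(text) {
  return text.split("\n").filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Display title with the "Sprite #<id>" fallback for untitled records.
function spriteTitle(rec) {
  return rec.title || `Sprite #${rec.id}`;
}

// <title> text in the format given under "SEO specifics".
function pageTitle(rec) {
  return `${spriteTitle(rec)} – pixel art by Anonymous | Pixie Engine`;
}

// Canonical path; the slug-less form is an assumption of this sketch.
function canonicalPath(rec) {
  return rec.slug ? `/sprites/${rec.id}-${rec.slug}` : `/sprites/${rec.id}`;
}

const sample = parseNdjson(
  '{"id":123,"slug":"red-dragon","title":"red dragon","source":"s3log"}\n' +
  '{"id":124,"source":"s3only"}\n'
);
```

Here canonicalPath(sample[0]) yields /sprites/123-red-dragon, and spriteTitle(sample[1]) yields Sprite #124.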

Build pipeline (the generator)

Tooling (locked): a small custom Node generator — full control over SEO markup, no framework, parallel workers, simple hand-written templates. (No SSG.)

dataset.ndjson ──▶ generate.mjs (worker pool) ──▶ public/
                       │   templates: sprite.html, gallery.html, tag.html, tune.html
                       └─ emits: /sprites/<id>/index.html, /sprites/page/<n>/index.html,
                                  /tags/<tag>/index.html, /tunes/<id>/index.html,
                                  sitemap-*.xml, data/*.json (for client search)
  • Incremental builds: hash each record; only regenerate changed/new pages (so adding sprites later is cheap, not a 248k re-render).
  • Parallel: shard by id range across workers; the full 248k-page build currently runs in ~25s (see Build status).
  • No per-page JS required — pages are complete HTML; the enhancement bundle is one shared <script defer>.
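
The incremental-build idea (hash each record, regenerate only what changed) might be sketched as follows. The function names and the manifest shape (a plain id → hash object persisted between builds) are assumptions; the sketch also assumes flat records like the example above, because the key-sorting trick used here does not reach into nested objects.

```javascript
// Sketch only: names and manifest shape are hypothetical.
// In an .mjs file this would be: import { createHash } from "node:crypto";
const { createHash } = require("node:crypto");

// Stable content hash of one record. Passing the sorted key list as the
// replacer makes the output independent of property order (flat records only).
function recordHash(rec) {
  const stable = JSON.stringify(rec, Object.keys(rec).sort());
  return createHash("sha256").update(stable).digest("hex");
}

// Compare records against the previous manifest ({ [id]: hash }).
// Returns the records that need re-rendering plus the manifest to save.
function changedRecords(records, previousManifest) {
  const manifest = {};
  const dirty = [];
  for (const rec of records) {
    const hash = recordHash(rec);
    manifest[rec.id] = hash;
    if (previousManifest[rec.id] !== hash) dirty.push(rec);
  }
  return { dirty, manifest };
}
```

On a first run with an empty manifest every record is dirty; on later runs only new or edited records are returned. Template changes would still require a full rebuild, which this sketch does not detect.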

Hosting & edge routing

  • S3 (private) pixieengine-static holding the public/ tree, CloudFront OAC (CDK-owned, RETAIN). Populated by aws s3 sync from static-site/public. (The hand-made staging bucket pixieengine-com-site used for the initial upload was deleted 2026-06-02.)
  • New, greenfield CloudFront distribution (infra/, PixieStaticStack) — entirely separate from the legacy E3BKYYG8EH9O0K, which is never touched. Cutover = move the pixieengine.com DNS alias to the new dist (-c withDomain=true adds a new ACM cert + alias + Route 53 A/AAAA); rollback = move the alias back. 503 ends the moment the alias points at the populated new dist. (Earlier draft repointed the legacy dist's origin in place — superseded.)
  • CloudFront Function (viewer-request) for routing (≈20 lines, model: existing CloudFrontSubdomainPath Lambda):
    • /sprites/<id> or /sprites/<id>-<anyslug> → /sprites/<id>/index.html (parse leading int)
    • append /index.html to any directory request
    • /sprites?tagged=<t> → 301 → /tags/<t>/ ; strip other legacy query forms
    • canonical host: www.pixieengine.com → pixieengine.com (existing redirect dist already does www)
    • unknown sprite id → /410/index.html served as 404 (CloudFront can't emit 410 in custom-error responses) so Google drops it cleanly
  • Cache: long TTL on /sprites/* and images; short/none on data/*.json.
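
The routing rules above can be sketched as a viewer-request CloudFront Function. This is an illustration under assumptions, not the contents of infra/functions/rewrite.js: it assumes the /load/ page is stored at /load/index.html, passes the tagged value through exactly as received, and leaves the www redirect and the missing-id error page to the distribution, as described above. It sticks to var and basic string methods to stay within the restricted CloudFront Functions runtime.

```javascript
// Sketch only: a viewer-request handler in the CloudFront Functions style.
function handler(event) {
  var request = event.request;
  var uri = request.uri;
  var qs = request.querystring || {};

  // Legacy /sprites?tagged=<t> -> 301 to /tags/<t>/ (value passed through as-is).
  if ((uri === "/sprites" || uri === "/sprites/") && qs.tagged && qs.tagged.value) {
    return {
      statusCode: 301,
      statusDescription: "Moved Permanently",
      headers: { location: { value: "/tags/" + qs.tagged.value + "/" } }
    };
  }

  // /sprites/<id>/load -> the single editor page (assumed at /load/index.html).
  if (/^\/sprites\/\d+\/load\/?$/.test(uri)) {
    request.uri = "/load/index.html";
    return request;
  }

  // /sprites/<id> or /sprites/<id>-<anyslug> -> /sprites/<id>/index.html.
  var m = uri.match(/^\/sprites\/(\d+)(?:-[^\/]*)?\/?$/);
  if (m) {
    request.uri = "/sprites/" + m[1] + "/index.html";
    return request;
  }

  // Directory-style requests get index.html appended.
  if (uri.charAt(uri.length - 1) === "/") {
    request.uri = uri + "index.html";
  } else if (uri.lastIndexOf(".") <= uri.lastIndexOf("/")) {
    // No file extension in the last segment: treat it as a directory.
    request.uri = uri + "/index.html";
  }
  return request;
}
```

Because the rewrite happens at the edge, the browser URL is unchanged, so a client-side page can still read the sprite id from the original path.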

SEO specifics (per sprite page)

  • <title>: "<title> – pixel art by Anonymous | Pixie Engine" (fallback Sprite #<id>).
  • <meta name=description> from description/tags; <link rel=canonical> → /sprites/<id>-<slug>.
  • Open Graph + Twitter card → og:image = CDN original; schema.org/ImageObject JSON-LD.
  • The image (<img> from *.pixiecdn.com, image-rendering:pixelated), tag links, remix/parent links.
  • Sitemap index + sharded sitemaps (≤50k URLs), prioritized by CF-log traffic.
  • 301 legacy ?tagged= and ?page= forms; 404 (via /410/index.html) for moderated-out/missing ids.
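
Sitemap sharding under the 50,000-URL-per-file limit can be sketched as below. The function names and the sitemap-<n>.xml numbering are assumptions (the design only specifies sitemap-*.xml), and the traffic-based prioritization mentioned above is not modeled here.

```javascript
// Sketch only: names and file numbering are hypothetical.
const MAX_URLS_PER_SITEMAP = 50000; // per-file limit in the sitemap protocol

// Split a flat URL list into shards of at most `size` entries.
function shardUrls(urls, size = MAX_URLS_PER_SITEMAP) {
  const shards = [];
  for (let i = 0; i < urls.length; i += size) {
    shards.push(urls.slice(i, i + size));
  }
  return shards;
}

// Build the sitemap index XML that points at each shard file.
function sitemapIndexXml(origin, shardCount) {
  const entries = Array.from({ length: shardCount }, (_, i) =>
    `  <sitemap><loc>${origin}/sitemap-${i + 1}.xml</loc></sitemap>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</sitemapindex>",
  ].join("\n");
}
```

With the 252,140 URLs from the build status, shardUrls produces 6 shards (five full ones and a final shard of 2,140), matching the reported shard count.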

Progressive enhancement layer (whimsy contract)

One shared enhance.js (defer-loaded). On a sprite page it:

  1. Checks whimsy Cognito session (User Pool us-east-1_cfvrlBLXG); shows "Sign in with whimsy".
  2. Favorites — read count from a static/CDN JSON; on click, write through api-whimsy-space (authed) to the user's Briefcase S3 scope; optimistic UI.
  3. Comments — render from sprites/<id>/comments/*.json (CDN); post via the Lambda.
  4. Pattern: read from CDN, write through one authed Lambda — reads scale free, writes are rare.

Empty mount points (<div data-favorites>, <div data-comments>) sit in the static HTML; JS hydrates them. With JS off / crawler, the page is still complete and indexable.

DEFERRED: Layer 2 is not built in Phase 1 (decision #6). No enhance.js, Cognito, or mount points ship in the static archive — this section is the design to revisit once it's live.

Deployment & rebuild

  • aws s3 sync public/ s3://pixieengine-static --delete + cloudfront create-invalidation (scoped to changed paths) — via the scoped pixieengine-deploy IAM user.
  • ⟐ Build trigger: one-time for archive launch; later, a rebuild step when new content lands (manual, scheduled, or whimsy-write-driven).

Phasing (recommended)

  1. Phase 1 — Anonymous static archive (no login). 2017 + CF-log-recovered content, read-only, SEO-complete. Gets pixieengine.com returning 200s ASAP — the urgent SEO win. Stupid simple.
  2. Phase 2 — Progressive enhancement. Add whimsy login + favorites (+ comments) as JS layer.
  3. Phase 3 — Living site (optional). New creations via editor → whimsy → incremental rebuild (or client-rendered new pages + periodic static regen). Only if you want write/growth back.

Decisions (locked 2026-06-01)

  1. Archive-only to start. Build Phase 1 (anonymous, read-only, no login). Living layer is later.
  2. Publish all. No moderation gate — all ~248k sprites (incl. unmoderated S3-only) are listed + in the sitemap. Risk accepted; moderation becomes a reactive takedown process post-launch (the flag viewer + a removal/410 list), not a pre-publish gate.
  3. Landing / stays the editor (iframe → danielx.net/pixel-editor/pixie3/). Note: in the archive there is no write backend, so the editor's "save to gallery" is disabled (draw/export only) until the dynamic layer (Phase 2). Gallery lives at /sprites.
  4. Profiles = minimal stub. /<username> returns 200 (preserves backlinks) with a lightweight page: the (already-public) username + link to the gallery. Sub-decision: whether to also list that user's sprites requires the b008 user→sprite map (mild de-anon, display_name only, no email/hash) — default no listing for now to stay fully anonymized.
  5. Custom Node generator; pages kept simple. No SSG/framework. Minimal, fast, hand-controlled markup.
  6. Dynamic layer (Layer 2) deferred: design it after the static site is live. Phase 1 ships no login/favorites/comments.
  7/8. Host on pixieengine.com via a new, greenfield CloudFront distribution (CDK stack in infra/) → private static bucket pixieengine-static (CloudFront OAC). The legacy dist E3BKYYG8EH9O0K is never touched; cutover = move the DNS alias to the new dist, rollback = move it back. Not a whimsy Briefcase. (Revised from the original "reuse E3BKYYG8EH9O0K, repoint origin" idea; the greenfield + DNS-cutover approach is safer and fully reversible.)

Implications for Phase 1 build

  • Generator emits: /sprites/<id>/index.html (×~248k), /sprites/page/<n>/, /tags/<tag>/, /tunes/<id>/, profile stubs for usernames seen in CF logs, / (editor shell), sitemaps, robots.
  • No enhance.js, no Cognito, no favorites/comments mount points in v1 (add in Phase 2).
  • CloudFront Function: id/slug rewrite, dir→index.html, ?tagged= → /tags/; missing → 404 via the /410/index.html error page.