pixieengine.com — AWS Inventory & Static-Site Migration Plan

Living reference. Started 2026-06-01. Companion to heroku-recovery.md (which has the blow-by-blow recovery log). This doc = the AWS layout, what we found, and the plan to relaunch pixieengine.com as a static site + consolidate with whimsy.space.

AWS account: 186123361267, region us-east-1 (everything lives here).


TL;DR / current status

  • pixieengine.com has been down (HTTP 503) since ~2025-02 — Heroku detached its Postgres DB (root cause, see Heroku doc). It is NOT a code problem; the app boots fine.
  • Newest DB backup is 2017-03-05 (Heroku PG capture b008). Restored & verified locally. Nothing newer exists on any disk or in S3.
  • Sprite images survived in S3: ~248,175 sprites (ids 1–259,230). ~105k are post-2017 with images but no DB row.
  • CloudFront access logs survived in S3 (logs-dumper), 2017-02 → 2025-02 — these can recover post-2017 titles/slugs, traffic, and backlinks.
  • whimsy.space is already a full serverless platform (Cognito + identity-scoped S3 + Lambda API/WebSocket + Stripe). It is the natural backend for pixieengine's dynamic features.
  • Plan: relaunch pixieengine.com as pre-rendered static HTML (SEO-preserving) on S3+CloudFront, with dynamic features (accounts/comments/favorites) as progressive enhancement against whimsy's existing Cognito + Lambda + Briefcase S3.

1. AWS inventory

1.1 S3 buckets (relevant)

| Bucket | Purpose |
| --- | --- |
| images.pixie.strd6.com | Sprite images. sprites/<id>/{original,large,medium,thumb}.png + replay.json. ~248,175 sprites, ids 1–259,230. S3 access logging: off. |
| dev.pixie.strd6.com / pixie.strd6.com / test.pixie.strd6.com | older/parallel pixie image buckets (sync-dev-bucket mirrors images→dev) |
| addressable | content-addressable store; origin for a*.pixiecdn.com (tune/data blobs) |
| trinket | origin for t*.pixiecdn.com |
| backups.pixieengine.com | only projects.zip (2013, 1.9 GB) + replays.zip (2014, 813 MB). No DB backup. |
| logs-dumper | Log aggregation. cloudfront/ (per-domain CF logs incl. pixieengine.com/), s3/whimsy-fs/, whimsy.space/, danielx.net/, etc. whimsy-fs S3 access logs land here. |
| pixieengine-redirect | S3-website redirect bucket; origin for www.pixieengine.com |
| pixieengine-{s3bucket,incomingbucket,logbucket}-* | old CloudFormation stack (~2015). data.pixieengine.com origin (dist disabled). logbucket = S3 access logs of incomingbucket (just console noise now). |
| projects.pixieengine.com / staging.pixieengine.com | old project/staging buckets |
| whimsy-fs | whimsy.space filesystem / static host. Per-user scopes keyed by Cognito identity id (<id-pool>:<uuid>/public/<domain>/..., the "Briefcase"). Serves whimsy.space, danielx.net, pegged.us, etc. |
| whimsy-fs-incoming / whimsyspace-{databucket,logbucket}-* | whimsy support buckets |
| cloudcontentaddressablestorage-{s3,incoming,log}bucket-* | CAS stack (Lambda-backed) |
| danielx.net / strd6.com / mariopaint.net / 69signals.com / git_info / distri-tactics / RedIce | other sites/projects |
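
The per-sprite key layout in images.pixie.strd6.com can be written down as a small helper, which is handy when checking which ids have a complete set of objects. This is only a sketch: the function name and the returned dict shape are illustrative, and the only thing taken from the inventory is the key pattern itself (sprites/<id>/{original,large,medium,thumb}.png plus replay.json).

```python
# Image variants stored per sprite, as listed in the bucket inventory.
SIZES = ("original", "large", "medium", "thumb")


def sprite_keys(sprite_id: int) -> dict[str, str]:
    """Return the S3 object keys expected for one sprite id.

    The helper name and return shape are hypothetical; only the key
    pattern comes from the inventory.
    """
    if sprite_id < 1:
        raise ValueError("sprite ids start at 1")
    prefix = f"sprites/{sprite_id}"
    keys = {size: f"{prefix}/{size}.png" for size in SIZES}
    keys["replay"] = f"{prefix}/replay.json"
    return keys
```

Comparing these expected keys against a bucket listing is one way to spot ids that have images but no replay, or the reverse.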

1.2 CloudFront distributions

| Dist ID | Aliases | Origin | Notes |
| --- | --- | --- | --- |
| E3BKYYG8EH9O0K | pixieengine.com, api.pixieengine.com | pixieengine.herokuapp.com | DEAD origin (Heroku down). Logs → logs-dumper/cloudfront/pixieengine.com/. Legacy dist, left untouched; cutover moves its pixieengine.com alias to the new static dist (CDK, infra/). |
| E1YLL4NVXDIDA5 | www.pixieengine.com | pixieengine-redirect (S3) | www→ redirect |
| E30UBGU2BPKA0U | 0-3.pixiecdn.com, images0-3.pixieengine.com | images.pixie.strd6.com | sprite image CDN |
| E2711IWAXSUJIS | a0-3.pixiecdn.com | addressable | CAS/data CDN |
| EWGAEOQIY9FQ8 | t0-3.pixiecdn.com | trinket | |
| E1KIHFJIZL22TA | projects.pixieengine.com | projects bucket | |
| E796WPR8GDD60 | data.pixieengine.com | pixieengine-s3bucket | disabled |
| E37PE6H6KE2965 | whimsy.space, www.whimsy.space | whimsy-fs | whimsy main |
| E1PWHFQBR2WODL / EMMDKCF6USARK | *.whimsy.space / *.staging.whimsy.space | whimsy-fs | wildcard user sites |
| E1OSKWNHO8IZLL / ETBD5WPYKJ764 | danielx.net,www / data.danielx.net | whimsy-fs | |
| E3IMLOG47LB30T | fs.whimsy.space | whimsy-fs (REST) | |
| EVSNG2814B0F7 | pegged.us | whimsy-fs | a Briefcase-hosted site |
| E3LWRZEDR2FWB3 / EK4AB0LJXE9I0 | contrasaur.us / jadelet.com | strd6.github.io | |
| E1X6IVRF377PPD | lsystems.danielx.net | flicker-swoop.glitch.me | |
| E2V8AB0I1ZIKPO | blog.danielx.net | wordpress | disabled |
| ESU2T599EW9JX | 69signals.com | 69signals bucket | |

CloudFront standard logging is reported as disabled on every distribution, yet logs are still delivered to logs-dumper/cloudfront/<domain>/ via a custom mechanism. The pixieengine.com logs are confirmed present for 2017-02 → 2025-02, in the standard CF log format (E3BKYYG8EH9O0K.<date-hour>.<hash>.gz).

1.3 Lambda (us-east-1) — the whimsy.space backend

api-whimsy-space (REST API) · websocket-whimsy-space (realtime) · CognitoIdentityPoolAuth · CloudFrontSubdomainPath (hostname→S3-path router, model for our id/slug rewrite) · incoming-email-whimsy-space · whimsy-space-s3-incoming · stripe-checkout + stripe-webhook (payments, 2024) · CloudContentAddressableStorage · paint-composer-uploads + paint-composer-s3-data-incoming · PixieEngine (2015, legacy).

1.4 Cognito (whimsy identity)

  • User Pool us-east-1_cfvrlBLXG ("Whimsy.Space") — accounts/login.
  • Identity Pool us-east-1:4fe22da5-bb5e-4a78-a260-74ae0a140bf9 ("WhimsySpace") — federated identity → per-user scoped AWS creds → per-user S3 prefix in whimsy-fs (the Briefcase).

1.5 Route 53 hosted zones

pixieengine.com (ZPF3QICRGSLCF), whimsy.space (Z37HC4BHZ115H3), pixiecdn.com (Z3K4FLWKRKKJIY), danielx.net, jadelet.com, contrasaur.us, strd6.com, mariopaint.net, civet.dev, 69signals.com, pegged.us. All DNS is in this account (we control it).

1.6 IAM users

| User | Role | Status |
| --- | --- | --- |
| stay-pegged (local [default]) | broad-ish S3 read; the prod app keys | rotate (exposed in transcript) |
| composer (local [composer-deploy]) | S3 read | keep |
| pixieengine-audit (local [pixieengine-audit]) | discovery: ReadOnly/custom pixieengine-discovery-ro | DELETE after discovery |
| pixieengine-deploy (local [pixieengine-deploy]) | scoped: read images+logs, rw static bucket, CF invalidation | DELETE: superseded by the pixie-deploy role; image/log reads were migration-only (drop, re-add if needed) |
| pixie-ops (local [pixie-ops]) | sts:AssumeRole on the CDK bootstrap roles + the pixie-deploy role, nothing else | created (the one static deploy credential; runs cdk deploy --profile pixie-ops) |
| pixie-deploy role (CDK-managed; assumed by pixie-ops) | rw pixieengine-static, cloudfront:CreateInvalidation, recovery/* on backups | created by the stack; keyless; content s3 sync + invalidation assume it |

2. Findings so far

2.1 Why it's down

Heroku detached the Postgres DB on 2025-02-03 (release v185, by kpremkumar@heroku.com; plan heroku-postgresql:essential-1, addon postgresql-concentric-34855). No DATABASE_URL → prod eager_load fails at boot → every request is H10 App crashed / 503. App code is fine (boots locally in dev and production mode on Ruby 3.1.3).

2.2 Data recovery

  • DB: newest backup is b008 (2017-03-05), the Heroku PG addon's own capture. Downloaded → ~/pixie-backups/b008-2017-03-05.dump, restored & verified: sprites 142,634 (ids 1–153,218), users 20,409, collections 20,262, taggings 438,825, tunes 1,075, comments 6,870. Everything 2017→2025 is gone from the DB. No newer backup exists anywhere (all local disks + S3 searched).
  • Images: intact in images.pixie.strd6.com. ~248,175 sprites, ids 1–259,230. ~105k post-2017 sprites have images but no DB row (recoverable as synthesized rows).
  • CloudFront logs: logs-dumper/cloudfront/pixieengine.com/, 2017-02 → 2025-02, ~10k files/month. Contain full slugged request paths (/sprites/257680-ugly-killman-donkey-ass), referers, timestamps. ⇒ can recover post-2017 titles/slugs, traffic ranking, backlinks.
  • Activity rate: ~28 sprites/day in the final month (Nov–Dec 2024); last sprite id 259230 on 2024-12-20 (site went quiet ~6 weeks before the formal DB detachment).
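
The log-mining idea in the CloudFront bullet can be sketched as follows. CloudFront standard logs are gzipped, tab-separated files with a `#Fields:` header naming each column; `cs-uri-stem`, `sc-status` and `cs(Referer)` are standard field names. Everything else here is an assumption for illustration: the function names, counting only 200 responses, taking the most frequently requested slug per id as the recovered one, and treating any referer that does not mention pixieengine.com as a backlink.

```python
import gzip
import re
from collections import Counter, defaultdict

# /sprites/<id>-<slug>; the slug is whatever follows the first hyphen.
SPRITE_PATH = re.compile(r"^/sprites/(\d+)-([^/]+)$")


def parse_log_lines(lines):
    """Yield one dict per request, keyed by the names in the #Fields header."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            yield dict(zip(fields, line.split("\t")))


def mine(records):
    """Aggregate recovered slug, hit count and external referers per sprite id."""
    slug_votes = defaultdict(Counter)
    hits = Counter()
    referers = defaultdict(Counter)
    for rec in records:
        m = SPRITE_PATH.match(rec.get("cs-uri-stem", ""))
        if m is None or rec.get("sc-status") != "200":
            continue
        sprite_id = int(m.group(1))
        hits[sprite_id] += 1
        slug_votes[sprite_id][m.group(2)] += 1
        ref = rec.get("cs(Referer)", "-")  # "-" means no referer was sent
        if ref != "-" and "pixieengine.com" not in ref:
            referers[sprite_id][ref] += 1
    # Any slug resolves to the same id, so keep the most requested one.
    slugs = {i: votes.most_common(1)[0][0] for i, votes in slug_votes.items()}
    return slugs, hits, referers


def read_gz(path):
    """Stream parsed records from one gzipped log file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        yield from parse_log_lines(fh)
```

Reading the column names from the header instead of hard-coding positions keeps the parser tolerant of fields being appended to the format over the eight-year span.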

2.3 whimsy.space platform (the consolidation target)

A complete serverless stack already in this account: Cognito (auth) + identity-pool-scoped S3 (whimsy-fs Briefcase, /public hosting) + Lambda REST (api-whimsy-space) & WebSocket (websocket-whimsy-space) + Stripe + a CloudFront subdomain→path router. Proven hosting sites: whimsy.space, danielx.net, pegged.us. This is the "stupid-simple backend" for pixieengine's dynamic layer — reuse, don't rebuild.


3. Local working state (this machine)

  • Toolchain: Ruby 3.1.3 via rbenv (~/.rbenv), Bundler 2.3.26, Postgres 16, Redis, ImageMagick.
  • App boots in dev and production mode on Ruby 3.1.3 (no code-level crash).
  • pixie_archive Postgres DB: b008 restored → anonymized (sprite/tune user_id NULL'd, users/comments/emails/follows/invites/activities dropped) → migration 20170305171646 applied → schema matches code. Plus 9 sample S3-recovered post-2017 sprites.
  • Archive mode flag added (uncommitted): config.x.archive via ARCHIVE=1 (config/application.rb + app/controllers/sprites_controller.rb) — drops the where.not(user_id: nil) and 5-year-recency gallery filters so anon 2017 sprites list. Verified: gallery + sprite pages render "Anonymous" with live CDN images.
  • Sprite viewer: tmp/sprite-viewer/viewer.html (+ ids.js, 248,175 ids) — static, paginated 100/page grid for fast visual review; has a "Moderate mode" (click-to-flag → flagged-sprites.json).
  • Static export PoC: generator in static-site/ (tracked source; gitignored build/ data + public/ output). Full build = ~252k pages in ~25s.

4. Static-site migration plan

4.1 Architecture (SEO-preserving)

  • Pre-rendered static HTML for every sprite/tune/profile page (NOT a client-rendered SPA) — real <title>, meta description, canonical, image, tags in the markup.
  • Host: private S3 bucket pixieengine-static (CloudFront OAC) + a new, greenfield CloudFront distribution (CDK stack in infra/, PixieStaticStack) — entirely separate from the legacy E3BKYYG8EH9O0K, which is never touched. Cutover = move the pixieengine.com DNS alias to the new dist (a new ACM cert is provisioned for it); rollback = move the alias back. The 503 stops the moment the alias points at the populated new dist. (Revised from the original "reuse E3BKYYG8EH9O0K, repoint its origin" plan — greenfield + DNS-cutover is safer and fully reversible.)
  • CloudFront Function to rewrite /sprites/<id>(-slug) → /sprites/<id>/index.html by parsing the leading integer id (Rails resolved ids that way). Recovers every historical slug URL with one file per id. Same for /tunes/; profiles /<username> → /users/<name>/index.html. (CloudFrontSubdomainPath Lambda is a working reference for edge path rewriting.)
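
The rewrite rule in the last bullet is easy to get subtly wrong, so here is the mapping sketched in Python purely to pin down the logic; the deployed function itself runs in the CloudFront Functions runtime, not Python. The reserved top-level names, the trailing-slash handling and the username character set are assumptions for illustration, not details taken from the real function.

```python
import re

# /sprites/<id> or /sprites/<id>-<anything>, optionally with a trailing slash.
ID_ROUTE = re.compile(r"^/(sprites|tunes)/(\d+)(?:-[^/]*)?/?$")
# A single path segment with no dot, so files like /robots.txt are skipped.
PROFILE_ROUTE = re.compile(r"^/([A-Za-z0-9_-]+)/?$")
# Hypothetical: top-level names that are pages, not usernames.
RESERVED_TOP_LEVEL = {"sprites", "tunes", "users"}


def rewrite_uri(uri: str) -> str:
    """Map a public URL path to the static object that should serve it."""
    m = ID_ROUTE.match(uri)
    if m:
        section, raw_id = m.groups()
        # Only the leading integer matters; the slug is ignored entirely.
        return f"/{section}/{int(raw_id)}/index.html"
    m = PROFILE_ROUTE.match(uri)
    if m and m.group(1) not in RESERVED_TOP_LEVEL:
        return f"/users/{m.group(1)}/index.html"
    if uri.endswith("/"):
        return uri + "index.html"
    return uri
```

Because the slug never takes part in the lookup, every historical slug variant of a sprite resolves to the same single file, which is what keeps the export at one page per id.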

4.2 URL / SEO preservation (the google juice)

The value is bound to the domain + its backlinks, decayed after ~16 mo of 503 but recoverable.

  • Restore exact original URLs returning 200 (edge rewrite handles id/slug variants).
  • Use real 2017 metadata (titles/tags/slugs) for ids ≤153,218; recover post-2017 titles/slugs from the CloudFront logs.
  • rel=canonical, single host https://pixieengine.com, 301 http→https + www→apex.
  • Regenerate sitemap.xml; submit in Google Search Console; robots.txt allows crawl.
  • 404 (a /410/index.html "gone" page) for sprites deliberately not restored (moderated/missing). (Originally specced as 410, but CloudFront custom-error responses don't permit a 410 code; both de-index cleanly.)
  • Static + CDN = strong Core Web Vitals (ranking tailwind).
  • Do NOT migrate to a new domain unless brand-critical — link equity is domain-bound; a move costs a second ranking dip and requires keeping pixieengine.com as a 301 redirector forever.
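
On the sitemap bullet: the sitemap protocol caps a single file at 50,000 URLs, so an archive of roughly 250k pages needs a sitemap index plus several child files. A minimal sketch of that split follows, assuming the generator can hand over the list of canonical paths; the function names and the sitemap-N.xml file naming are hypothetical.

```python
from xml.etree import ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit in the sitemap protocol
BASE = "https://pixieengine.com"


def chunk(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def urlset(paths):
    """Serialize one child sitemap listing the given paths."""
    root = ET.Element("urlset", xmlns=NS)
    for path in paths:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = BASE + path
    return ET.tostring(root, encoding="unicode")


def build_sitemaps(paths):
    """Return {filename: xml}: numbered child files plus the index."""
    files = {}
    index = ET.Element("sitemapindex", xmlns=NS)
    for n, part in enumerate(chunk(list(paths), MAX_URLS), start=1):
        name = f"sitemap-{n}.xml"  # hypothetical naming scheme
        files[name] = urlset(part)
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{BASE}/{name}"
    files["sitemap.xml"] = ET.tostring(index, encoding="unicode")
    return files
```

The output omits the XML declaration; prepend one when writing the files if a consumer requires it.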

4.3 Progressive enhancement + whimsy consolidation

  • Static HTML core ranks & renders standalone. Dynamic features hydrate after, client-side, so they never block crawlability:
    • Accounts/login → whimsy Cognito (User Pool us-east-1_cfvrlBLXG).
    • Favorites / comments / friends → api-whimsy-space Lambda + per-user Briefcase S3 (identity-pool-scoped whimsy-fs prefixes); realtime via websocket-whimsy-space if wanted.
    • Reads come straight from S3/CDN JSON; writes go through the (authenticated) Lambda. "Read from CDN, write through one function": reads are absorbed by the CDN rather than a backend, and writes are rare.
  • Net: pixieengine becomes a static content site on its own domain (SEO asset) + a thin client of whimsy's existing platform (dynamic features). No new DB, no new auth, no Rails/Postgres.

4.4 Anonymized archive + moderation

  • Recovered sprites are shown as "Anonymous"; the 2017 users table (emails + scrypt hashes) is never published. The app natively renders "Anonymous" for a null user.
  • ~105k post-2017 sprites have no recorded deletion/suppression state (that DB is gone) → bulk re-publishing risks resurfacing removed/abusive content. Mitigations: bounded import, an "unverified" tier (URL-reachable but unlisted), an automated image-safety pass, or manual review via the flag viewer. Decide before any full import.

    Update 2026-06-02: the published archive is the full set, and the reactive moderation toolkit now exists (text + replay/upload + pixel-empty scanners → a local review tool; removed.tsv / adult.tsv honored by the generator). First pass done (~1,740 removals + 12 adult-gates). The "unverified tier" = the adult-gate / hold-out pattern. CSAM hash-matching still unbuilt. See static-site/README.md + docs/replay-format.md.
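
How a generator might honor removed.tsv and adult.tsv can be sketched as below. The file layout is an assumption: this treats the first column of each TSV as the sprite id and ignores any other columns and any non-numeric header row. The function names and the three disposition labels are likewise hypothetical; static-site/README.md is the authority on the real format.

```python
import csv
import io


def load_ids(tsv_text: str) -> set[int]:
    """Collect sprite ids from the first column of TSV text.

    Assumes column one is the id; rows whose first cell is not a
    number (such as a header) are skipped.
    """
    ids = set()
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if row and row[0].strip().isdigit():
            ids.add(int(row[0]))
    return ids


def disposition(sprite_id: int, removed: set[int], adult: set[int]) -> str:
    """Decide how one sprite is emitted; removal wins over the adult gate."""
    if sprite_id in removed:
        return "removed"
    if sprite_id in adult:
        return "adult-gate"
    return "publish"
```

Checking the removal list first means a sprite that appears in both files is never emitted behind a gate by accident.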


5. Open decisions / next steps

  1. CloudFront log mining — local parse of recent logs (2023→2025) to build id → title/slug + hit-count + backlink list, OR set up Athena for the full 8-year sweep (needs IAM additions). (Recommended: start local; feeds real titles into the generator.) → Full task spec: cloudfront-log-recovery.md.
  2. Finish the static export prototype + the CloudFront id/slug Function. (Done — generator working; function in infra/functions/rewrite.civet, unit-tested.)
  3. Deploy the greenfield CDK stack (infra/, PixieStaticStack): new bucket pixieengine-static + new dist + function (Phase 1, no alias), populate via s3 sync, verify on *.cloudfront.net, then cut over by moving the DNS alias (-c withDomain=true). Keyless pixie-deploy role (assumed by pixie-ops) handles the sync/invalidation. Full runbook: deploy.md / infra/README.md.
  4. Moderation strategy for the ~105k post-2017 sprites before bulk publish.
  5. Provision decision for any live writable features (vs pure archive) → whimsy integration scope.
  6. Security: rotate exposed prod secrets (AWS keys / GitHub token / ADMIN_CODE); delete pixieengine-audit user after discovery.

6. Credentials / access (working notes)

  • Local AWS profiles: [default]=stay-pegged, [composer-deploy]=composer, [pixieengine-audit]=discovery (ReadOnly). Use --profile pixieengine-audit for discovery scans.
  • Deploy identity (current): the pixie-deploy role (CDK-managed, assumed by pixie-ops) — rw pixieengine-static, cloudfront:CreateInvalidation, recovery/* on backups. Keyless. The pixie-ops user (the one static credential) is scoped to sts:AssumeRole on the CDK bootstrap roles + pixie-deploy, nothing else.
  • The legacy pixieengine-deploy user is superseded and slated for deletion. Its hand-made inline policy also granted read on images.pixie.strd6.com + the log bucket — those were migration-only and are intentionally not carried into the role (re-add scoped reads if a build step ever needs them). Cutover: cdk deploy --profile pixie-ops → add the [pixie-deploy] profile → verify a sync → in CloudShell delete the old user's access key, inline policy, then the user.
  • Do not commit AWS keys; keep them only in ~/.aws/credentials.