feat(scripts): add Stripe admin backfills #2002

Merged: riderx merged 1 commit into main from codex/stripe-admin-backfill-scripts on May 1, 2026

Conversation

@riderx
Member

@riderx riderx commented May 1, 2026

Summary (AI generated)

  • Added a dry-run-first backfill for the admin Org Conversion Rate Trend metric.
  • Added a dry-run-first Stripe customer country sync for the Top Billing Countries dashboard.
  • Added shared script helpers and focused unit coverage for the backfill logic.

Motivation (AI generated)

The admin dashboard has historical Stripe-backed analytics fields that can be empty or stale for older rows. These scripts let us safely backfill the derived conversion trend and billing country data without changing runtime API behavior.

Business Impact (AI generated)

More complete admin analytics improve visibility into conversion health and customer geography, which helps reporting, growth analysis, and billing operations.

Test Plan (AI generated)

  • bunx eslint --no-ignore scripts/admin_stripe_backfill_utils.ts scripts/backfill_org_conversion_rate_trend.ts scripts/backfill_stripe_customer_countries.ts tests/admin-stripe-backfill-scripts.unit.test.ts
  • bunx tsc --ignoreConfig --noEmit --allowImportingTsExtensions --moduleResolution bundler --module ESNext --target ES2020 --lib DOM,ESNext --types bun,vitest --strict --esModuleInterop --skipLibCheck scripts/admin_stripe_backfill_utils.ts scripts/backfill_org_conversion_rate_trend.ts scripts/backfill_stripe_customer_countries.ts tests/admin-stripe-backfill-scripts.unit.test.ts
  • bunx vitest run tests/admin-stripe-backfill-scripts.unit.test.ts
  • Commit hook ran bun run cli:build && vue-tsc --noEmit; it exited successfully while printing existing Vue macro plugin resolution warnings.

Generated with AI

Summary by CodeRabbit

Chores

  • Added new admin backfill scripts for Stripe data synchronization—one to reconstruct organization conversion rate trends and another to populate customer country information from Stripe records.
  • Added accompanying utility module with helpers for backfill operations, environment configuration, and data processing.

Tests

  • Added unit tests for backfill script calculations and data normalization.

@coderabbitai
Contributor

coderabbitai Bot commented May 1, 2026

Walkthrough

Introduces Stripe backfill automation infrastructure comprising two new CLI scripts that reconstruct database records: one backfills org conversion rates from org creation timestamps and global stats, the other backfills Stripe customer countries from the Stripe API. Includes shared utility functions and comprehensive unit tests.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| NPM Scripts | package.json | Added two new Bun-based TypeScript script entries: stripe:backfill-org-conversion-rate and stripe:backfill-customer-countries. |
| Shared Utilities | scripts/admin_stripe_backfill_utils.ts | New utility module with 10+ helper functions for CLI argument parsing, environment loading, Supabase/Stripe client initialization, async task pooling with concurrency limits, and customer ID validation. |
| Backfill Scripts | scripts/backfill_org_conversion_rate_trend.ts, scripts/backfill_stripe_customer_countries.ts | Two new backfill automation scripts: one reconstructs org conversion rate trends by deriving org counts from timestamps within configurable date ranges; the other fetches and normalizes customer countries from Stripe API, supporting dry-run and apply modes, concurrent operations, and failure logging. |
| Unit Tests | tests/admin-stripe-backfill-scripts.unit.test.ts | New Vitest suite verifying calculation logic, data normalization, country code handling, customer ID validation, and update decision rules for both backfill scripts. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI as Backfill Script
    participant Supabase as Supabase<br/>(Database)
    participant Stripe as Stripe API
    
    User->>CLI: Run with --from/--to dates (--apply flag)
    CLI->>Supabase: Load global_stats rows in date range
    CLI->>Supabase: Load orgs.created_at timestamps
    CLI->>CLI: Paginate & transform:<br/>For each global_stats row,<br/>calculate next_rate & orgs count
    CLI->>CLI: Identify changed rows<br/>(delta > 0.0001)
    alt Dry-run mode
        CLI-->>User: Report sample changes,<br/>row counts
    else Apply mode
        CLI->>Supabase: Update changed rows<br/>(concurrent batches)
        Supabase-->>CLI: Confirm updates
        CLI-->>User: Report final totals
    end
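
The "identify changed rows" step in the diagram above could look roughly like the following. This is a minimal sketch, not the script's code: the `RateRow` shape, the `selectChangedRows` name, and the choice to treat a missing stored rate as changed are all assumptions made for illustration. Only the 0.0001 threshold comes from the diagram.

```typescript
// Hypothetical row shape; the real global_stats columns may differ.
interface RateRow {
  date_id: string
  current_rate: number | null
  next_rate: number
}

// Threshold taken from the walkthrough diagram (delta > 0.0001).
const RATE_EPSILON = 0.0001

// A row counts as changed when it has no stored rate yet (an assumption),
// or when the recomputed rate differs from the stored one by more than epsilon.
function selectChangedRows(rows: RateRow[], epsilon: number = RATE_EPSILON): RateRow[] {
  return rows.filter(row =>
    row.current_rate === null || Math.abs(row.next_rate - row.current_rate) > epsilon,
  )
}
```

Comparing against a small threshold rather than using strict equality keeps floating-point noise from producing spurious updates in apply mode.
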
sequenceDiagram
    actor User
    participant CLI as Backfill Script
    participant Supabase as Supabase<br/>(Database)
    participant Stripe as Stripe API
    participant FileSystem as File System
    
    User->>CLI: Run with optional --customer-id, --limit (--apply flag)
    CLI->>Supabase: Load stripe_info rows<br/>(scope by customer if provided)
    loop Concurrent batches (configurable)
        CLI->>Stripe: Fetch customer address for each ID
        Stripe-->>CLI: Return customer data or error
        CLI->>CLI: Extract & normalize<br/>country code to ISO-2
        CLI->>CLI: Decide: update if missing<br/>or refresh mode enabled
    end
    CLI->>FileSystem: Write failures to<br/>./tmp/backfill_failures.json
    alt Dry-run mode
        CLI-->>User: Report sample from→to changes
        alt Any failures
            CLI-->>CLI: Throw error
        end
    else Apply mode
        CLI->>Supabase: Write updates (concurrent)
        Supabase-->>CLI: Confirm writes
        CLI-->>User: Report update totals
        alt Any failures
            CLI-->>CLI: Throw error
        end
    end
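
The normalize and decide steps in the second diagram can be sketched as below. Both function names and the exact rules are hypothetical; the PR page does not show the real helpers. In particular, this sketch only checks that the value has the shape of a two-letter code and does not verify it against the ISO 3166-1 list.

```typescript
// Hypothetical helper: normalize a raw country value to an uppercase
// two-letter code, or null when it does not look like one.
function normalizeCountryCode(raw: string | null | undefined): string | null {
  const candidate = raw?.trim().toUpperCase() ?? ''
  return /^[A-Z]{2}$/.test(candidate) ? candidate : null
}

// Hypothetical decision rule mirroring the diagram: write when the stored
// value is missing, or when refresh mode is on and the value would change.
function shouldUpdateCountry(
  stored: string | null,
  fetched: string | null,
  refresh: boolean,
): boolean {
  if (!fetched)
    return false // nothing usable came back from Stripe
  if (!stored)
    return true // fill the gap
  return refresh && stored !== fetched
}
```

Under these assumptions, a default run only fills empty rows, and overwriting an existing country requires opting in to refresh mode.
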

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hopping through databases with Stripe in tow,
Converting rates and countries, watching data flow,
Backfill scripts dance in concurrent delight,
Normalizing, paginating—making all things right! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: adding Stripe admin backfill scripts. It is concise, specific, and directly reflects the primary objective of the changeset. |
| Description check | ✅ Passed | The PR description includes all required template sections: Summary, Test Plan, and Checklist (partially completed). It provides sufficient context about the changes and demonstrates testing was performed. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codspeed-hq
Contributor

codspeed-hq Bot commented May 1, 2026

Merging this PR will not alter performance

✅ 28 untouched benchmarks


Comparing codex/stripe-admin-backfill-scripts (7c53aee) with main (8284e3b)

@sonarqubecloud

sonarqubecloud Bot commented May 1, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7c53aeef28

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +153 to +154
.order('created_at', { ascending: true })
.range(offset, offset + DEFAULT_PAGE_SIZE - 1)

P2: Make org pagination deterministic for backfill reads

Ordering orgs only by created_at and paginating with range(offset, ...) can skip or duplicate rows when multiple orgs share the same timestamp, because tie ordering is not stable across pages. In that case the reconstructed denominator becomes incorrect and org_conversion_rate updates are wrong for affected dates. This is especially likely on historical bulk inserts where many rows have identical created_at values; include a unique secondary sort key (or keyset pagination) to guarantee deterministic paging.
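
To illustrate the keyset option mentioned in this comment, here is a minimal in-memory sketch. It is not the script's query: `OrgRow`, `fetchPage`, and `readAll` are hypothetical names, the `id` column is assumed to be a unique key on orgs, and timestamps are assumed to be ISO strings in one consistent format so that string comparison matches chronological order.

```typescript
interface OrgRow {
  id: string
  created_at: string
}

// Compare by created_at first, then by the unique id as a tie-breaker,
// so rows that share a timestamp always sort in the same order.
function compareOrgs(a: OrgRow, b: OrgRow): number {
  if (a.created_at !== b.created_at)
    return a.created_at < b.created_at ? -1 : 1
  if (a.id === b.id)
    return 0
  return a.id < b.id ? -1 : 1
}

// Stand-in for a database read equivalent to
// ORDER BY created_at, id with a filter of (created_at, id) > cursor.
function fetchPage(table: OrgRow[], cursor: OrgRow | null, pageSize: number): OrgRow[] {
  return [...table]
    .sort(compareOrgs)
    .filter(row => cursor === null || compareOrgs(row, cursor) > 0)
    .slice(0, pageSize)
}

// Read every row page by page, resuming after the last row seen.
function readAll(table: OrgRow[], pageSize: number): OrgRow[] {
  const out: OrgRow[] = []
  let cursor: OrgRow | null = null
  for (;;) {
    const page = fetchPage(table, cursor, pageSize)
    if (page.length === 0)
      return out
    out.push(...page)
    cursor = page[page.length - 1] ?? null
  }
}
```

With offset-based paging, the smaller change is probably to add a second sort key to the existing query, for example by chaining another `.order(...)` call on a unique column. The keyset variant above additionally stays correct if rows are inserted while the backfill is reading.
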

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/admin_stripe_backfill_utils.ts`:
- Around line 111-113: The isActionableStripeCustomerId helper currently treats
any trimmed, non-`pending_` value as valid; change it to only accept real Stripe
customer IDs by ensuring the trimmed value both startsWith('cus_') and is not
prefixed with 'pending_'. Update the function isActionableStripeCustomerId to
return true only when customerId is non-null/defined, trimmed,
startsWith('cus_'), and does not startsWith('pending_') so malformed IDs like
'sub_' or 'acct_' are rejected before calling stripe.customers.retrieve.

In `@scripts/backfill_org_conversion_rate_trend.ts`:
- Around line 237-245: The asyncPool worker currently lets a single rejected
updateConversionRate() reject the whole pool and leaves in-flight tasks
untracked; instead, wrap the per-row call to updateConversionRate(supabase, row)
in a try/catch inside the asyncPool callback, push failed row identifiers (e.g.,
row.date_id or the full row) into a local failures array, increment updated only
on success, and continue so other in-flight tasks can finish; after asyncPool
resolves, log or throw a summary using failures and updated to surface which
date_id(s) failed and ensure the script exits non-zero if you need a failing CI
signal.
- Around line 46-49: assertDateId currently only checks DATE_ID_REGEX; update it
to also reject impossible calendar dates by parsing the YYYY-MM-DD parts and
validating them: split value into year, month, day, construct a Date (or use a
reliable date library) and confirm the Date is valid and that its year/month/day
match the parsed components (to catch things like 2026-02-31), and throw the
same Error(`${label} must use YYYY-MM-DD and be a valid date`) when invalid;
keep the original function name assertDateId and leave DATE_ID_REGEX in place
for the shape check so fetchGlobalStatsRows will never receive nonsensical date
strings.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7f80493b-b9fa-481a-bd0b-3df460e3e580

📥 Commits

Reviewing files that changed from the base of the PR and between 8284e3b and 7c53aee.

📒 Files selected for processing (5)
  • package.json
  • scripts/admin_stripe_backfill_utils.ts
  • scripts/backfill_org_conversion_rate_trend.ts
  • scripts/backfill_stripe_customer_countries.ts
  • tests/admin-stripe-backfill-scripts.unit.test.ts

Comment on lines +111 to +113
export function isActionableStripeCustomerId(customerId: string | null | undefined) {
  const trimmedCustomerId = customerId?.trim()
  return !!trimmedCustomerId && !trimmedCustomerId.startsWith('pending_')

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only accept real Stripe customer IDs here.

This helper currently treats any non-empty, non-pending_ value as actionable. In scripts/backfill_stripe_customer_countries.ts, that is the last gate before stripe.customers.retrieve(...), so malformed IDs like sub_... or acct_... become avoidable failures and can abort the run. Filter explicitly to cus_ IDs.

Suggested fix
 export function isActionableStripeCustomerId(customerId: string | null | undefined) {
   const trimmedCustomerId = customerId?.trim()
-  return !!trimmedCustomerId && !trimmedCustomerId.startsWith('pending_')
+  return !!trimmedCustomerId
+    && trimmedCustomerId.startsWith('cus_')
+    && !trimmedCustomerId.startsWith('pending_')
 }
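
As a standalone sketch of the stricter check, the helper could be written as follows. One observation, offered tentatively: once the value must start with `cus_`, it cannot also start with `pending_`, so the second condition in the suggested fix appears redundant and a single prefix test should behave the same.

```typescript
// Accept only values that look like Stripe customer IDs (cus_ prefix).
// Placeholder values such as pending_... and other object IDs such as
// sub_... or acct_... are rejected by the same prefix check.
function isActionableStripeCustomerId(customerId: string | null | undefined): boolean {
  const trimmedCustomerId = customerId?.trim()
  return !!trimmedCustomerId && trimmedCustomerId.startsWith('cus_')
}
```

This checks the prefix only; it does not confirm that the customer exists in Stripe, so the retrieve call can still fail for deleted or unknown customers.
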

Comment on lines +46 to +49
function assertDateId(value: string, label: string) {
  if (!DATE_ID_REGEX.test(value))
    throw new Error(`${label} must use YYYY-MM-DD`)
  return value

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject impossible calendar dates, not just YYYY-MM-DD strings.

assertDateId() only checks the shape. Inputs like 2026-02-31 still pass, and --from is later used as a plain string filter in fetchGlobalStatsRows(), so the script can target the wrong range instead of failing fast.

Suggested fix
 function assertDateId(value: string, label: string) {
   if (!DATE_ID_REGEX.test(value))
     throw new Error(`${label} must use YYYY-MM-DD`)
+
+  const parsed = new Date(`${value}T00:00:00.000Z`)
+  if (Number.isNaN(parsed.getTime()) || getDateId(parsed) !== value)
+    throw new Error(`${label} must be a real UTC calendar date`)
+
   return value
 }
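
A self-contained version of the suggested check is sketched below so it can be run in isolation. The `DATE_ID_REGEX` pattern and the body of `getDateId` are assumptions here, since the page does not show how the script defines them; `getDateId` is taken to format a `Date` as a UTC `YYYY-MM-DD` string.

```typescript
// Assumed shape check; the script's actual pattern is not shown on this page.
const DATE_ID_REGEX = /^\d{4}-\d{2}-\d{2}$/

// Assumed behaviour of getDateId: format a Date as YYYY-MM-DD in UTC.
function getDateId(date: Date): string {
  return date.toISOString().slice(0, 10)
}

function assertDateId(value: string, label: string): string {
  if (!DATE_ID_REGEX.test(value))
    throw new Error(`${label} must use YYYY-MM-DD`)

  const parsed = new Date(`${value}T00:00:00.000Z`)
  // Depending on the JavaScript engine, an impossible date such as 2026-02-31
  // may parse as an Invalid Date or roll over into the next month. The NaN
  // check covers the first case and the round-trip comparison the second.
  // The NaN check must come first because toISOString throws on an Invalid Date.
  if (Number.isNaN(parsed.getTime()) || getDateId(parsed) !== value)
    throw new Error(`${label} must be a real UTC calendar date`)

  return value
}
```

Failing fast on the argument means a mistyped `--from` or `--to` stops the run before any rows are read.
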

Comment on lines +237 to +245
let updated = 0
await asyncPool(concurrency, changedRows, async (row) => {
  await updateConversionRate(supabase, row)
  updated++
  if (updated % 100 === 0 || updated === changedRows.length)
    console.log(`Updated ${updated}/${changedRows.length}`)
})

console.log(`Done. Updated ${updated}/${changedRows.length} org conversion rate rows.`)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle per-row update failures before continuing.

A single rejected updateConversionRate() makes asyncPool() reject immediately, but the other in-flight updates have already started. That can leave a partially applied backfill with no record of which date_ids failed. Catch and collect row-level failures here, then report them at the end.

Suggested fix
   let updated = 0
+  const failures: Array<{ date_id: string, error: string }> = []
   await asyncPool(concurrency, changedRows, async (row) => {
-    await updateConversionRate(supabase, row)
-    updated++
-    if (updated % 100 === 0 || updated === changedRows.length)
-      console.log(`Updated ${updated}/${changedRows.length}`)
+    try {
+      await updateConversionRate(supabase, row)
+      updated++
+      if (updated % 100 === 0 || updated === changedRows.length)
+        console.log(`Updated ${updated}/${changedRows.length}`)
+    }
+    catch (error) {
+      failures.push({
+        date_id: row.date_id,
+        error: error instanceof Error ? error.message : String(error),
+      })
+    }
   })
 
+  if (failures.length > 0)
+    throw new Error(`Org conversion rate backfill completed with ${failures.length} failures`)
+
   console.log(`Done. Updated ${updated}/${changedRows.length} org conversion rate rows.`)
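
To make the failure-collection pattern runnable on its own, here is a minimal sketch. The `asyncPool` below is a hypothetical stand-in with the same call shape as the shared helper (concurrency, items, callback); the real implementation in scripts/admin_stripe_backfill_utils.ts is not shown on this page and may differ. `applyAll` and `Failure` are also illustrative names.

```typescript
// Hypothetical stand-in for the shared asyncPool helper: run fn over items
// with at most `concurrency` calls in flight at once.
async function asyncPool<T>(
  concurrency: number,
  items: T[],
  fn: (item: T) => Promise<void>,
): Promise<void> {
  let next = 0
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const item = items[next++] as T
      await fn(item)
    }
  }
  const workerCount = Math.max(1, Math.min(concurrency, items.length))
  await Promise.all(Array.from({ length: workerCount }, () => worker()))
}

interface Failure {
  date_id: string
  error: string
}

// Apply `update` to every row, recording failures instead of rejecting early,
// so one bad row does not hide the outcome of the others.
async function applyAll(
  rows: Array<{ date_id: string }>,
  update: (row: { date_id: string }) => Promise<void>,
  concurrency: number,
): Promise<{ updated: number, failures: Failure[] }> {
  let updated = 0
  const failures: Failure[] = []
  await asyncPool(concurrency, rows, async (row) => {
    try {
      await update(row)
      updated++
    }
    catch (error) {
      failures.push({
        date_id: row.date_id,
        error: error instanceof Error ? error.message : String(error),
      })
    }
  })
  return { updated, failures }
}
```

A caller can then log each failed `date_id` and throw only after the pool has drained, which keeps the non-zero exit while leaving a record of exactly which rows need a rerun.
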

@riderx riderx merged commit ffbfe25 into main May 1, 2026
50 checks passed
@riderx riderx deleted the codex/stripe-admin-backfill-scripts branch May 1, 2026 11:47