[codex] add stripe emulator vitest coverage and playwright ci#1940
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a dedicated Playwright CI job and Bun-based test runner, orchestrates a Supabase-backed backend with readiness gating and retries, tightens Playwright CI config and concurrency, adds test IDs and E2E helpers, hardens multiple backend tests, and introduces a Stripe-emulator integration test suite. Changes
Sequence DiagramsequenceDiagram
participant GH as "GitHub Actions"
participant Runner as "CI runner"
participant BunScript as "bun scripts/run-playwright-tests.ts"
participant Backend as "serve-backend-playwright.ts"
participant SupabaseCLI as "supabase/setup-cli"
participant Functions as "Functions server"
participant Playwright as "Playwright runner"
GH->>Runner: start test_playwright job
Runner->>BunScript: execute runner
BunScript->>BunScript: remove old readiness file, kill stale backends
BunScript->>Backend: spawn backend (env: PLAYWRIGHT_READY_FILE)
Backend->>SupabaseCLI: query worktree/status & start supabase (retries/backoff)
SupabaseCLI-->>Backend: return config (API_URL, ports, keys)
Backend->>Functions: start functions serve
Functions-->>Backend: respond /functions/v1/ok
Backend->>BunScript: write readiness file
BunScript->>BunScript: poll readiness file (<=360s)
BunScript->>Playwright: spawn Playwright tests (SKIP_BACKEND_START=true)
Playwright-->>BunScript: exit status
BunScript->>Backend: send SIGTERM (cleanup)
Backend->>SupabaseCLI: stop supabase
BunScript->>Runner: propagate exit code/result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub. |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/cron_stat_refresh_completion.test.ts (1)
63-105:⚠️ Potential issue | 🟡 MinorUse
getEndpointUrl('/triggers/cron_stat_app')instead ofBASE_URLfor consistency with coding guidelines.Lines 64 and 99 hit
/triggers/cron_stat_appviaBASE_URL. Per the coding guidelines ("UsegetEndpointUrl(path)test helper to route to correct worker based on endpoint"), these should usegetEndpointUrl('/triggers/cron_stat_app')instead. This is already the pattern in tests liketests/version-name-stats.test.ts(line 158) andtests/build_time_tracking.test.ts(line 234).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/cron_stat_refresh_completion.test.ts` around lines 63 - 105, Replace direct BASE_URL usage with the test helper getEndpointUrl('/triggers/cron_stat_app') for both fetch calls that create firstResponse and secondResponse; update the two fetch invocations in the test "updates app freshness immediately and only marks the org fresh after the last pending app completes" to call fetch(getEndpointUrl('/triggers/cron_stat_app'), { ... }) so the requests follow the project's routing helper convention used elsewhere (refer to the fetch calls that set firstResponse and secondResponse).
🧹 Nitpick comments (3)
src/pages/ApiKeys.vue (1)
335-338: Make the row delete selector unique.Every delete button now renders the same
data-test="delete-key", which becomes ambiguous as soon as the table has multiple keys.TableAction.testIdalready supports a function, so deriving it from the row keeps the selector deterministic.💡 Suggested fix
{ icon: IconTrash, onClick: (key: Database['public']['Tables']['apikeys']['Row']) => deleteKey(key), - testId: 'delete-key', + testId: (key: Database['public']['Tables']['apikeys']['Row']) => `delete-key-${key.id}`, },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pages/ApiKeys.vue` around lines 335 - 338, The delete button's data-test attribute is not unique because TableAction.testId is set to the same string for every row; change the testId to a function that derives a deterministic unique selector from the row (e.g., use the row's id) so each delete button is distinct. Locate the TableAction entry that currently uses IconTrash and onClick: (key) => deleteKey(key) and replace the static testId with a function like (key) => `delete-key-${key.id}` (or another unique row field) so test selectors are unambiguous.scripts/run-playwright-tests.ts (1)
29-34: AddcwdtospawnSyncfor consistency with the backend script.The matching function in
scripts/serve-backend-playwright.ts(line 86-92) includescwd: repoRoot. Whilepkilldoesn't depend on the working directory, adding it maintains consistency and makes the behavior explicit.🔧 Minor consistency improvement
function stopExistingPlaywrightBackend() { spawnSync('pkill', ['-f', 'supabase-functions.playwright.env'], { + cwd: process.cwd(), stdio: 'ignore', env: process.env, }) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/run-playwright-tests.ts` around lines 29 - 34, In stopExistingPlaywrightBackend(), add cwd: repoRoot to the options object passed to spawnSync (alongside stdio and env) so the pkill invocation matches the serve-backend-playwright.ts behavior; locate the spawnSync call inside function stopExistingPlaywrightBackend and include cwd: repoRoot in that options literal for consistency with the backend script..github/workflows/tests.yml (1)
202-233: Consider uploading Playwright artifacts on failure for easier CI debugging.The job configuration looks solid with proper Supabase CLI pinning and browser installation. However, when Playwright tests fail in CI, having the HTML report and traces available would significantly ease debugging.
📦 Proposed addition: Upload Playwright artifacts
Add these steps after line 233:
- name: Upload Playwright report uses: actions/upload-artifact@v4 if: failure() with: name: playwright-report path: playwright-report/ retention-days: 7🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/tests.yml around lines 202 - 233, The test_playwright job doesn't upload Playwright artifacts on failure which hinders CI debugging; add post-test steps after the "Run Playwright tests" step to upload the Playwright HTML report and traces using actions/upload-artifact@v4 with if: failure(), e.g. upload the playwright-report/ directory (and optionally playwright-report/traces or traces/) with a descriptive artifact name like "playwright-report" and a short retention-days value so failed-run reports and traces are available for inspection.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/serve-backend-playwright.ts`:
- Around line 16-20: The Supabase health-check can misreport valid outputs
because the SupabaseStatus interface only lists API_URL, ANON_KEY, and
SERVICE_ROLE_KEY and the health-check logic only checks those exact keys; update
the interface SupabaseStatus to also include PUBLISHABLE_KEY and SECRET_KEY
(aliases for ANON_KEY and SERVICE_ROLE_KEY) and modify the health-check code
that validates ANON_KEY and SERVICE_ROLE_KEY (the check around lines ~74-75) to
accept either the primary keys or their aliases (use ANON_KEY || PUBLISHABLE_KEY
and SERVICE_ROLE_KEY || SECRET_KEY when determining health) so payloads using
alternate names are treated as healthy.
In `@tests/queue_load.test.ts`:
- Around line 133-141: The stress test's Promise.all burst can fail fast on
thrown fetch() exceptions because fetchQueueSync currently only retries non-202
responses; modify fetchQueueSync (the function invoked in
tests/queue_load.test.ts) to wrap its fetch calls in a try/catch and retry on
thrown errors using the same retry loop/logic used in
webhook-queue-processing.test.ts (the try/catch + backoff/retry pattern present
around lines 27-50), honoring the existing maxRetries argument (e.g., maxRetries
= 6) so transient network exceptions are retried instead of aborting the whole
test.
In `@tests/stripe-emulator.test.ts`:
- Line 78: The test's global variable emulator should be nullable so teardown
doesn't throw if setup failed: change the declaration of emulator
(Awaited<ReturnType<typeof createEmulator>>) to allow undefined/null, ensure
beforeAll assigns it as before, and modify the afterAll teardown to check
emulator before calling its close/stop method (e.g., if (emulator) await
emulator.close() or similar). Update any usages in tests to handle the nullable
type where needed.
- Around line 54-75: getFreePort currently closes the probe server and returns
the port, creating a TOCTOU where another process can take the port before the
emulator binds; change this by either (A) returning a still-bound server (keep
the server returned by createServer/listen open and expose its port so the
emulator reuses that listener) or (B) implement a retry-on-EADDRINUSE loop when
binding the emulator: preserve the probe server until the emulator binds or
catch EADDRINUSE and retry getFreePort/bind several times before failing. Update
the code paths referencing getFreePort (the createServer/listen logic) so the
probe socket is not closed before the emulator's listen call (or ensure robust
retries on port collision).
- Around line 4-5: ESLint import ordering fails because the 'emulate' import
must come before the 'vitest' named imports; reorder the top imports so the line
"import { createEmulator } from 'emulate'" appears before "import { afterAll,
afterEach, beforeAll, describe, expect, it, vi } from 'vitest'" to satisfy
perfectionist/sort-imports.
---
Outside diff comments:
In `@tests/cron_stat_refresh_completion.test.ts`:
- Around line 63-105: Replace direct BASE_URL usage with the test helper
getEndpointUrl('/triggers/cron_stat_app') for both fetch calls that create
firstResponse and secondResponse; update the two fetch invocations in the test
"updates app freshness immediately and only marks the org fresh after the last
pending app completes" to call fetch(getEndpointUrl('/triggers/cron_stat_app'),
{ ... }) so the requests follow the project's routing helper convention used
elsewhere (refer to the fetch calls that set firstResponse and secondResponse).
---
Nitpick comments:
In @.github/workflows/tests.yml:
- Around line 202-233: The test_playwright job doesn't upload Playwright
artifacts on failure which hinders CI debugging; add post-test steps after the
"Run Playwright tests" step to upload the Playwright HTML report and traces
using actions/upload-artifact@v4 with if: failure(), e.g. upload the
playwright-report/ directory (and optionally playwright-report/traces or
traces/) with a descriptive artifact name like "playwright-report" and a short
retention-days value so failed-run reports and traces are available for
inspection.
In `@scripts/run-playwright-tests.ts`:
- Around line 29-34: In stopExistingPlaywrightBackend(), add cwd: repoRoot to
the options object passed to spawnSync (alongside stdio and env) so the pkill
invocation matches the serve-backend-playwright.ts behavior; locate the
spawnSync call inside function stopExistingPlaywrightBackend and include cwd:
repoRoot in that options literal for consistency with the backend script.
In `@src/pages/ApiKeys.vue`:
- Around line 335-338: The delete button's data-test attribute is not unique
because TableAction.testId is set to the same string for every row; change the
testId to a function that derives a deterministic unique selector from the row
(e.g., use the row's id) so each delete button is distinct. Locate the
TableAction entry that currently uses IconTrash and onClick: (key) =>
deleteKey(key) and replace the static testId with a function like (key) =>
`delete-key-${key.id}` (or another unique row field) so test selectors are
unambiguous.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7516dff2-16d3-469e-82d2-902e519e2c00
📒 Files selected for processing (15)
.github/workflows/tests.ymlpackage.jsonplaywright.config.tsplaywright/e2e/apikeys.spec.tsplaywright/e2e/register.spec.tsscripts/run-playwright-tests.tsscripts/serve-backend-playwright.tssrc/components.d.tssrc/components/DataTable.vuesrc/components/comp_def.tssrc/pages/ApiKeys.vuetests/audit-logs.test.tstests/cron_stat_refresh_completion.test.tstests/queue_load.test.tstests/stripe-emulator.test.ts
There was a problem hiding this comment.
🧹 Nitpick comments (5)
playwright/e2e/apikeys.spec.ts (2)
39-41: Prefer resilient deletion verification over exact toast wording.The row-removal assertion already validates behavior; exact success-copy matching is brittle for i18n/content updates.
Suggested change
- const toast = page.locator('[data-test="toast"]') - await expect(toast).toContainText('API key has been successfully deleted') + await expect(page.locator('[data-test="toast"]')).toBeVisible() await expect(page.locator('tr', { hasText: keyName })).toHaveCount(0)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@playwright/e2e/apikeys.spec.ts` around lines 39 - 41, The test currently asserts an exact toast message which is brittle; replace the precise text check on the toast (the locator stored in toast via page.locator('[data-test="toast"]')) with a resilient presence/visibility assertion (e.g., await expect(toast).toBeVisible() or toHaveCount(1)) and keep the row-removal assertion (await expect(page.locator('tr', { hasText: keyName })).toHaveCount(0)); update the assertion referencing toast and remove the toContainText('API key has been successfully deleted') check.
9-9: Avoid hard-coded localized toast copy in this helper.This text is translation-driven (
toast.success(t('add-api-key'))insrc/pages/ApiKeys.vue), so exact matching can cause avoidable CI flakes on locale/copy updates.Suggested change
- await expect(page.locator('[data-test="toast"]')).toContainText('Added new API key successfully') + await expect(page.locator('[data-test="toast"]')).toBeVisible()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@playwright/e2e/apikeys.spec.ts` at line 9, The test currently asserts an exact localized toast message which is brittle; instead, update the assertion on page.locator('[data-test="toast"]') to avoid hard-coded translation text (see ApiKeys.vue where toast.success(t('add-api-key')) is used). Replace the exact toContainText('Added new API key successfully') check with a locale-agnostic assertion such as verifying the toast is visible and contains non-empty text or matches a stable pattern (e.g., success state/class plus presence of any text) so the test no longer depends on the translated copy.tests/queue_load.test.ts (2)
66-73: Considerit.concurrent()for tests that don't touch shared queue state.The health check (line 66) and invalid request tests (line 79) don't interact with the shared
queueNamestate and could run concurrently, per coding guidelines.♻️ Example for health check
- it('should handle queue consumer health check', async () => { + it.concurrent('should handle queue consumer health check', async () => {As per coding guidelines: "Use
it.concurrent()instead ofit()when possible to run tests in parallel within the same file, maximizing parallelism for faster CI/CD"Also applies to: 79-112
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/queue_load.test.ts` around lines 66 - 73, The health-check test "should handle queue consumer health check" (and the other tests that don't touch shared queueName state between lines 79-112, e.g., the invalid request tests) should be converted from serial tests using it(...) to parallel tests using it.concurrent(...); locate the test declarations by their titles in tests/queue_load.test.ts and replace the it(...) calls with it.concurrent(...) so they can run in parallel without altering test logic or shared state handling.
15-19: Redundant DELETE statements on freshly created queue.Since the queue is newly created with a unique name per run, the tables
pgmq.q_${queueName}andpgmq.a_${queueName}are guaranteed empty. These DELETE statements can be removed.♻️ Suggested simplification
beforeAll(async () => { await pool.query('SELECT pgmq.create($1)', [queueName]) - await pool.query(`DELETE FROM pgmq.q_${queueName}`) - await pool.query(`DELETE FROM pgmq.a_${queueName}`) })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/queue_load.test.ts` around lines 15 - 19, The beforeAll setup in tests/queue_load.test.ts redundantly issues DELETEs after creating a fresh queue; remove the two pool.query calls that delete from `pgmq.q_${queueName}` and `pgmq.a_${queueName}` so the block only calls `pool.query('SELECT pgmq.create($1)', [queueName])` (leave `beforeAll`, `pool.query`, and `queueName` unchanged)..github/workflows/tests.yml (1)
202-244: Optional: extract duplicated CI bootstrap steps into a reusable unit.
test_allandtest_playwrightnow duplicate cache/checkout/bun install/dependency setup and Supabase template linking. Consider a reusable workflow or composite action to keep these in sync.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/tests.yml around lines 202 - 244, The test_playwright job duplicates CI bootstrap steps present in test_all (cache/checkout/setup bun/install dependencies/install Supabase CLI/link templates); extract those shared steps (e.g., the steps named "Cache Deno dependencies", "Checkout capgo", "Setup bun", "Install dependencies", "Install Supabase CLI", "Link Supabase templates") into a reusable workflow or composite action (invocable via workflow_call or a composite action in .github/actions) and replace the duplicated sequences in both jobs with a single call to that reusable unit, leaving job-specific steps like "Install Playwright browser" and "Run Playwright tests" in test_playwright so each job can still add or override steps as needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.github/workflows/tests.yml:
- Around line 202-244: The test_playwright job duplicates CI bootstrap steps
present in test_all (cache/checkout/setup bun/install dependencies/install
Supabase CLI/link templates); extract those shared steps (e.g., the steps named
"Cache Deno dependencies", "Checkout capgo", "Setup bun", "Install
dependencies", "Install Supabase CLI", "Link Supabase templates") into a
reusable workflow or composite action (invocable via workflow_call or a
composite action in .github/actions) and replace the duplicated sequences in
both jobs with a single call to that reusable unit, leaving job-specific steps
like "Install Playwright browser" and "Run Playwright tests" in test_playwright
so each job can still add or override steps as needed.
In `@playwright/e2e/apikeys.spec.ts`:
- Around line 39-41: The test currently asserts an exact toast message which is
brittle; replace the precise text check on the toast (the locator stored in
toast via page.locator('[data-test="toast"]')) with a resilient
presence/visibility assertion (e.g., await expect(toast).toBeVisible() or
toHaveCount(1)) and keep the row-removal assertion (await
expect(page.locator('tr', { hasText: keyName })).toHaveCount(0)); update the
assertion referencing toast and remove the toContainText('API key has been
successfully deleted') check.
- Line 9: The test currently asserts an exact localized toast message which is
brittle; instead, update the assertion on page.locator('[data-test="toast"]') to
avoid hard-coded translation text (see ApiKeys.vue where
toast.success(t('add-api-key')) is used). Replace the exact toContainText('Added
new API key successfully') check with a locale-agnostic assertion such as
verifying the toast is visible and contains non-empty text or matches a stable
pattern (e.g., success state/class plus presence of any text) so the test no
longer depends on the translated copy.
In `@tests/queue_load.test.ts`:
- Around line 66-73: The health-check test "should handle queue consumer health
check" (and the other tests that don't touch shared queueName state between
lines 79-112, e.g., the invalid request tests) should be converted from serial
tests using it(...) to parallel tests using it.concurrent(...); locate the test
declarations by their titles in tests/queue_load.test.ts and replace the it(...)
calls with it.concurrent(...) so they can run in parallel without altering test
logic or shared state handling.
- Around line 15-19: The beforeAll setup in tests/queue_load.test.ts redundantly
issues DELETEs after creating a fresh queue; remove the two pool.query calls
that delete from `pgmq.q_${queueName}` and `pgmq.a_${queueName}` so the block
only calls `pool.query('SELECT pgmq.create($1)', [queueName])` (leave
`beforeAll`, `pool.query`, and `queueName` unchanged).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d42ed338-b8fc-43c8-ba0e-65dc02192e5f
📒 Files selected for processing (8)
.github/workflows/tests.ymlplaywright/e2e/apikeys.spec.tsscripts/run-playwright-tests.tsscripts/serve-backend-playwright.tssrc/pages/ApiKeys.vuetests/cron_stat_refresh_completion.test.tstests/queue_load.test.tstests/stripe-emulator.test.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- src/pages/ApiKeys.vue
- tests/stripe-emulator.test.ts
- scripts/run-playwright-tests.ts
- scripts/serve-backend-playwright.ts
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dee1421f81
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ffc11c2633
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5f93d3200b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| while (Date.now() - startedAt < timeoutMs) { | ||
| try { | ||
| const response = await fetch(targetUrl) | ||
| if (response.ok) | ||
| return |
There was a problem hiding this comment.
Fail fast when functions serve exits before readiness
This readiness loop only polls targetUrl and never checks whether the spawned functions serve process has already exited. If that child dies early (for example from a bad env file or port conflict), the script waits the full PLAYWRIGHT_BACKEND_TIMEOUT_MS (default 360000 ms) before failing, which adds long false hangs to CI/local failures. Check the child exit/signal state inside the loop and throw immediately when it is no longer running.
Useful? React with 👍 / 👎.
| try { | ||
| await waitForFunctionsReady(functionsReadyTimeoutMs) | ||
| childStarted = true |
There was a problem hiding this comment.
Register signal forwarding before waiting for readiness
The script waits for backend readiness before installing SIGINT/SIGTERM forwarding, so cancellation during startup can terminate this wrapper without propagating cleanup to the spawned functions serve child. That leaves stale edge-runtime processes bound to ports and can break subsequent runs in the same environment. Move signal handler registration to immediately after spawn (or wrap startup in guaranteed child cleanup on signal).
Useful? React with 👍 / 👎.



Summary (AI generated)
bun run test:frontstart a fresh seeded Supabase backend before PlaywrightMotivation (AI generated)
Stripe emulator support had only been exercised through browser specs, while CI still skipped Playwright entirely. That left the emulator-specific checkout fallbacks under-tested and let the end-to-end billing flows drift outside the required PR validation path.
Business Impact (AI generated)
This reduces release risk on subscription and credit-purchase flows, which are directly tied to conversion and revenue. It also makes CI enforce the browser billing path instead of relying on manual verification.
Test Plan (AI generated)
bun lintbun lint:backendbun typecheckbunx vitest run tests/stripe-emulator.test.ts tests/stripe-redirects.unit.test.tsbun run test:front playwright/e2e/subscription-checkout.spec.ts playwright/e2e/credits-top-up.spec.tsGenerated with AI
Summary by CodeRabbit
Tests
Chores