fix(encrypt-upload-client): ensure playwright tests exit cleanly in CI #634

Closed
gitsofaryan wants to merge 30 commits into storacha:main from gitsofaryan:fix/playwright-ci-hang

@gitsofaryan

Context

Browser tests for encrypt-upload-client were hanging in CI even after all tests passed, causing jobs to be cancelled after several hours (see #629). CI browser tests were temporarily disabled in #630.

What this does

Adds an explicit CI-only teardown to ensure Playwright tests exit cleanly and do not keep the Node event loop alive.

Why

This addresses the underlying hang while keeping the fix minimal and safe. It should allow browser tests to be safely re-enabled in CI in a follow-up PR.

Scope

  • CI-only behavior
  • No runtime or encryption logic changes

@gitsofaryan requested a review from travis as a code owner on January 13, 2026 05:32
@gitsofaryan
Author

Hi @travis 👋
When you have a moment, could you please take a look at this?

This adds a small CI-only Playwright teardown to address the hanging
browser tests described in #629, without changing runtime behavior.
Happy to adjust if you’d prefer a different approach.

travis
travis previously approved these changes Jan 14, 2026
Contributor

@travis travis left a comment

this looks fine, but fwiw it does not seem to solve the core issue here - I tried re-enabling the browser tests in CI (see #635 and https://github.com/storacha/upload-service/actions/runs/20974056083), which demonstrated the same "hanging forever" behavior

that said, it seems reasonable to exit the process here, so I've approved

@gitsofaryan
Author

Thanks for catching that, @travis. You're right. I realized test.afterAll only runs in the worker process, so process.exit(0) there wasn't killing the main process where the secure server lives.

Should we move the explicit exit to global-teardown.js instead? Since that runs in the main orchestration process, it should properly force the entire Node process to terminate in CI. Shall I go ahead with that update?

@gitsofaryan
Author

Update: Moved the process.exit(0) call from the test worker (afterAll hook) to globalTeardown.

Why: The previous attempt only exited the worker process, leaving the main Playwright orchestration process (and the secure server) alive, which caused the CI job to hang. globalTeardown runs in the main process, so calling process.exit(0) there ensures the entire Node environment is terminated.

Verification:

  • Verified locally with CI=true npx playwright test: observed the explicit "[Global Teardown] CI detected, forcing process exit" log.
  • Verified a standard local run: tests pass without the forced exit.
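
A minimal sketch of the CI-only forced exit described in this update, assuming a CommonJS global-teardown.js. The factory name createGlobalTeardown, the stopServer hook, and the injectable env and exit parameters are hypothetical and exist only to make the sketch easy to test; the actual file in this PR may look different. The log text is the one quoted in the verification step.

```javascript
// Sketch only: createGlobalTeardown, stopServer, env, and exit are
// hypothetical names, not taken from the PR.
function createGlobalTeardown({
  env = process.env,
  exit = (code) => process.exit(code),
  stopServer = async () => {}, // assumed hook that closes the secure test server
} = {}) {
  // Playwright invokes the exported teardown function once, in the main
  // runner process, after the test run has finished.
  return async function globalTeardown() {
    // Release resources first so a normal local run can exit on its own.
    await stopServer();

    // Only force the exit in CI, where a lingering handle would hang the job.
    if (env.CI) {
      console.log('[Global Teardown] CI detected, forcing process exit');
      exit(0);
    }
  };
}

// In a real global-teardown.js this would be the file's export:
// module.exports = createGlobalTeardown();
```

One thing worth checking with this pattern is the exit status: an unconditional process.exit(0) may hide failing tests, so a real implementation might prefer to preserve the runner's own exit code.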

@travis
Contributor

travis commented Jan 14, 2026

this is still hanging after re-enabling the encrypt-upload-client job: #636

could you please pull 07a99c0 into this PR so I don't have to keep creating extra PRs?

@gitsofaryan
Author

Acknowledged, @travis.
I’m investigating this in my fork with workflows enabled so I can fully reproduce and resolve the CI hang.
I’ll update this PR and tag you once the fix is verified.

@travis travis dismissed their stale review January 15, 2026 17:40

more changes have been added

@travis
Contributor

travis commented Jan 15, 2026

I’m investigating this in my fork with workflows enabled

hey thanks! really appreciate you looking into this! let me know when it's ready and I will take another look.

@gitsofaryan
Author

Hey @travis ,

I've successfully fixed the CI hang and verified it in my forked repo. (Note: I'm not 100% sure about the 'billing plan' tests passing since the credentials aren't set up in my fork, but the hangs are definitely gone.)

Here is my run: https://github.com/gitsofaryan/upload-service/actions/runs/21056512559

I also fixed some lint errors and validated the lockfiles. Please take a look at this run, and if it looks good, we can run it on the main repo.

Thanks a lot!

Contributor

this test is still disabled in CI - this should fix that and prove this is working

Suggested change
"test:browser": "npm run test:setup-certs && npx playwright test"

- Simplified all test wrapper scripts to minimal timeout-based pattern
- Removed unnecessary output parsing and complex exit logic
- Added global timeout (5min) to Playwright config for CI
- Improved browser launch options for CI stability
- Enhanced server cleanup in global teardown
- Removed fix/** branch trigger from CI workflow

This approach relies on:
1. Global teardown for proper server cleanup
2. Simple timeout protection in wrappers
3. Playwright's globalTimeout as safety net
4. Clean process.exit() to prevent hanging

Signed-off-by: Aryan Jain <aryan.jain.csbs22@ggits.net>
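
The two Playwright settings named in this commit message could look roughly like the following. globalTimeout and globalTeardown are real Playwright config options; the buildConfig helper and the teardown file path are assumptions made for illustration, and the 5-minute value is taken from the commit message.

```javascript
// Sketch only: buildConfig and the teardown path are hypothetical.
const FIVE_MINUTES_MS = 5 * 60 * 1000;

function buildConfig(env = process.env) {
  const isCI = Boolean(env.CI);
  return {
    // Cap the whole run in CI as a safety net; 0 disables the limit locally.
    globalTimeout: isCI ? FIVE_MINUTES_MS : 0,
    // Path to the teardown module that cleans up the server after the run.
    globalTeardown: './global-teardown.js',
  };
}

// In a real playwright.config.js this would be the file's export:
// module.exports = buildConfig();
```

Building the config from an env argument keeps the CI-only behavior easy to verify without starting a browser.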
…ent CI hangs

- Add 5s timeout per fetch in revocation check to prevent hanging on slow endpoints
- Use AbortSignal.any() to compose global abort and per-fetch timeouts
- Add queue.onIdle() in finally block to ensure all queue tasks complete
- Simplify test commands to direct Playwright/test runner invocation
- Remove redundant test wrapper scripts from CLI, encrypt-upload-client, and UI packages
- Keep Playwright global teardown with process.exit(0) in CI to force clean exit
- Revert unnecessary lint ignore patterns and focus on actual hang fix
- Add test:browser to CI workflow
- Remove playwright from tsconfig exclude
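
The per-fetch timeout described in the first two bullets can be sketched as follows, assuming a runtime that provides AbortSignal.any() and AbortSignal.timeout() (for example a recent Node.js 20 release). composeSignal and fetchWithTimeout are hypothetical names; the actual revocation-check code is not shown in this conversation.

```javascript
// Sketch only: composeSignal and fetchWithTimeout are hypothetical helpers.

// Combine an optional caller-provided abort signal with a per-request timeout.
// The returned signal aborts as soon as either source aborts.
function composeSignal(globalSignal, timeoutMs = 5000) {
  const timeoutSignal = AbortSignal.timeout(timeoutMs);
  return globalSignal
    ? AbortSignal.any([globalSignal, timeoutSignal])
    : timeoutSignal;
}

// Issue one request that cannot outlive timeoutMs or the global signal.
// fetchImpl is injectable so the helper can be exercised without a network.
async function fetchWithTimeout(
  url,
  { signal, timeoutMs = 5000, fetchImpl = fetch } = {}
) {
  return fetchImpl(url, { signal: composeSignal(signal, timeoutMs) });
}
```

A call such as fetchWithTimeout(url, { signal: controller.signal }) would then reject once the 5-second budget elapses or the caller aborts, so a slow endpoint cannot keep a pending request open indefinitely.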