fix(android): drain JNI stdio pumps before returning from node::Start#60
Conversation
Closes #19. The pump threads were detached and stop_redirecting_stdout_stderr closed only pipe_stdout[1]/pipe_stderr[1], leaving STDOUT_FILENO and STDERR_FILENO (dup2 aliases) open. read() never saw EOF and the JNI return raced the pump flush, so a Node uncaught-exception traceback could be lost from logcat (CI saw `__NODE_EXIT__:1` with no error). Stop detaching the pumps, close all four write-end refs, and pthread_join before closing the read ends so the buffer drains before JNI returns.
There was a problem hiding this comment.
Pull request overview
This PR fixes an Android teardown race where stdout/stderr pump threads could be terminated before draining the pipe buffer, causing Node.js crash tracebacks (notably uncaught exceptions) to be lost from logcat.
Changes:
- Stops detaching the stdout/stderr pump threads so they can be joined deterministically.
- Closes the write-end file descriptors (including
STDOUT_FILENO/STDERR_FILENOdup2 aliases) to force EOF and allow pump threads to drain before JNI returns. - Joins pump threads before closing read ends.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Close all write-end refs (STDOUT_FILENO is a dup2 alias of pipe_stdout[1]) so | ||
| // the pumps see EOF, then join so the buffer drains before JNI returns. | ||
| close(STDOUT_FILENO); | ||
| close(pipe_stdout[1]); | ||
| close(pipe_stderr[0]); | ||
| close(STDERR_FILENO); | ||
| close(pipe_stderr[1]); |
There was a problem hiding this comment.
Doesn't apply to this codebase: startNodeWithArguments is invoked exactly once per OS process, so the fd-1/2 reuse path can't be reached.
NodeJSService.start()rejects any state other thanSTOPPED, andERRORis per-instance terminal — recovery isdestroy()+ new instance, not a secondstart()on the same instance (NodeJSService.kt:96-102, start guard at :353-359).- The Node thread runs in the
:ComapeoCoreforeground-service process. Stop kills that process, and the Android service framework launches a fresh process for the next start (the instrumented test does an explicitkillon the FGS pid for the same reason). - Independently, nodejs-mobile's
node::Start()is one-shot — it can't be called twice in the same process.
So fds 1/2 are only ever closed at JNI return, immediately before the process exits. There's no second pipe()/dup2() to race with.
| pthread_join(thread_stdout, nullptr); | ||
| pthread_join(thread_stderr, nullptr); |
There was a problem hiding this comment.
Validated and fixed in b3311e2 — track creation success per pump and only pthread_join the threads that were actually started, so a pthread_create failure no longer leads to UB on the uninitialized pthread_t.
Note on the partial-init flag (first pthread_create succeeds, second fails, dup2'd stderr has no pump): kept as-is. It's pre-existing behavior — the previous detached-thread version had the same dup2-without-pump exposure on pthread_create failure — and given how rare pthread_create failures are on this device class, "stdout/stderr writes may block until OOM-kill" felt like a niche enough corner to leave alone rather than refactor start_redirecting_stdout_stderr for full atomic rollback.
Address PR #60 review: with the previous diff, an exceedingly rare pthread_create failure inside start_redirecting_stdout_stderr would leave the corresponding pthread_t uninitialized; the unconditional pthread_join in stop_redirecting_stdout_stderr would then be UB. Track creation success per pump and only join the threads that were actually started.
## Optic Release Automation This **draft** PR is opened by Github action [optic-release-automation-action](https://github.com/nearform-actions/optic-release-automation-action). A new **draft** GitHub release [v1.0.0-pre.2](https://github.com/digidem/comapeo-core-react-native/releases/tag/untagged-c499977757c9745e56b2) has been created. Release author: @gmaclennan #### If you want to go ahead with the release, please merge this PR. When you merge: - The GitHub release will be published - The npm package with tag pre will be published according to the publishing rules you have configured - No major or minor tags will be updated as configured #### If you close the PR - The new draft release will be deleted and nothing will change ## What's Changed * Android Testing Infrastructure & Bug Fixes by @gmaclennan in #3 * chore: prebuild example/android; harden instrumented tests by @gmaclennan in #10 * Integrate @comapeo/core via IPC over Unix sockets by @gmaclennan in #5 * chore: adjust repo setup by @achou11 in #12 * chore: minor fixes based on expo-doctor by @achou11 in #13 * Add iOS support & test infrastructure by @gmaclennan in #6 * chore: add architecture docs & plans by @gmaclennan in #11 * update some native deps used in backend by @achou11 in #14 * iOS Phase 1: unified JS bundle + smoke test (simulator-only) by @gmaclennan in #15 * iOS Phase 2: xcframework Embed & Sign for native addons by @gmaclennan in #16 * Phase 2 Android: jniLibs packaging + unified rollup loader plugin by @gmaclennan in #17 * chore: post-Phase-2 cleanup — comments, plan docs, agents.md by @gmaclennan in #33 * android: read abiFilters from reactNativeArchitectures (#30) by @gmaclennan in #35 * refactor: simplify build-backend.ts; rollup writes directly to native asset trees by @gmaclennan in #34 * chore: fix eslint configuration by @achou11 in #41 * android: audit 16 KB page alignment on every shipped .so by @gmaclennan in #43 * Add rootkey persistence and lifecycle state management by @gmaclennan in #36 * chore: move example app into apps directory by @achou11 in #18 * refactor: per-component lifecycle state with derived ComapeoState by @gmaclennan in #47 * android: fold waitForFile into connect retry loop by @gmaclennan in #52 * chore: add e2e testing app by @achou11 in #49 * fix(android): drop setUnlockedDeviceRequired from rootkey wrapper key by @gmaclennan in #57 * fix(backend): cache stopping/error frames for late joiners by @gmaclennan in #58 * fix(ios-tests): wait for STOPPING before signalling node exit by @gmaclennan in #59 * fix(android): drain JNI stdio pumps before returning from node::Start by @gmaclennan in #60 * Sentry integration: Phase 1 + Phase 2a + Phase 2b by @gmaclennan in #54 * feat(backend): polywasm-backed undici on iOS, re-enable maps plugin by @gmaclennan in #62 * ci: drop unreliable Android emulator snapshot caching by @gmaclennan in #64 * feat(sentry): land Phase 3 — backend loader + RPC tracing by @gmaclennan in #63 * fix(ios-tests): serialise STOPPING/STOPPED observers in testFullLifecycleStateTransitions by @gmaclennan in #71 * use npm list instead of custom traversal to get native module versions by @achou11 in #70 * feat(sentry): land Phases 6 + 7a — Android exit reasons & iOS MetricKit app-exit telemetry by @gmaclennan in #72 * fix(sentry): make exit telemetry lossless and stop cross-process clobbering by @gmaclennan in #84 * chore(e2e): add e2e tests on browserstack via Maestro by @achou11 in #56 * feat(sentry): migrate to @sentry/react-native v8; exit telemetry as Application Metrics by @gmaclennan in #73 * Map server integration by @gmaclennan in #86 * chore(deps): upgrade to Expo SDK 56 (React Native 0.85) by @gmaclennan in #87 * chore(ci): add release workflow by @gmaclennan in #90 * chore: fix npm script and release build script by @gmaclennan in #91 * chore(pack): don't try to package build files by @gmaclennan in #92 * fix: start fastify listening by @gmaclennan in #93 * perf(backend): switch bundler from rollup to rolldown by @gmaclennan in #94 * fix(ci): ignore-scripts in ios npm installs by @gmaclennan in #96 * fix(ci): replace --ignore-scripts with npm strict-allow-scripts allowlist by @gmaclennan in #106 * feat(config): let the consuming app supply the default project config by @gmaclennan in #95 * chore(release): merge prerelease branch. by @gmaclennan in #110 ## New Contributors * @achou11 made their first contribution in #12 **Full Changelog**: https://github.com/digidem/comapeo-core-react-native/commits/v1.0.0-pre.2 <!-- <release-meta>{"id":342868678,"version":"v1.0.0-pre.2","npmTag":"pre","opticUrl":"https://optic-zf3votdk5a-ew.a.run.app/api/generate/"}</release-meta> -->
## Optic Release Automation This **draft** PR is opened by Github action [optic-release-automation-action](https://github.com/nearform-actions/optic-release-automation-action). A new **draft** GitHub release [v1.0.0-pre.2](https://github.com/digidem/comapeo-core-react-native/releases/tag/untagged-352a6c41c12fd02dec37) has been created. Release author: @gmaclennan #### If you want to go ahead with the release, please merge this PR. When you merge: - The GitHub release will be published - The npm package with tag pre will be published according to the publishing rules you have configured - No major or minor tags will be updated as configured #### If you close the PR - The new draft release will be deleted and nothing will change <!-- Release notes generated using configuration in .github/release.yml at 7fe80b4 --> ## What's Changed ### 🚀 Features * Integrate @comapeo/core via IPC over Unix sockets by @gmaclennan in #5 * Add iOS support & test infrastructure by @gmaclennan in #6 * iOS Phase 1: unified JS bundle + smoke test (simulator-only) by @gmaclennan in #15 * iOS Phase 2: xcframework Embed & Sign for native addons by @gmaclennan in #16 * Phase 2 Android: jniLibs packaging + unified rollup loader plugin by @gmaclennan in #17 * android: read abiFilters from reactNativeArchitectures (#30) by @gmaclennan in #35 * Add rootkey persistence and lifecycle state management by @gmaclennan in #36 * Sentry integration: Phase 1 + Phase 2a + Phase 2b by @gmaclennan in #54 * feat(backend): polywasm-backed undici on iOS, re-enable maps plugin by @gmaclennan in #62 * feat(sentry): land Phase 3 — backend loader + RPC tracing by @gmaclennan in #63 * feat(sentry): land Phases 6 + 7a — Android exit reasons & iOS MetricKit app-exit telemetry by @gmaclennan in #72 * feat(sentry): migrate to @sentry/react-native v8; exit telemetry as Application Metrics by @gmaclennan in #73 * Map server integration by @gmaclennan in #86 * feat(config): let the consuming app supply the default project config by @gmaclennan in #95 ### 🐛 Bug Fixes * fix(android): drop setUnlockedDeviceRequired from rootkey wrapper key by @gmaclennan in #57 * fix(backend): cache stopping/error frames for late joiners by @gmaclennan in #58 * fix(ios-tests): wait for STOPPING before signalling node exit by @gmaclennan in #59 * fix(android): drain JNI stdio pumps before returning from node::Start by @gmaclennan in #60 * fix(ios-tests): serialise STOPPING/STOPPED observers in testFullLifecycleStateTransitions by @gmaclennan in #71 * fix(sentry): make exit telemetry lossless and stop cross-process clobbering by @gmaclennan in #84 * fix: start fastify listening by @gmaclennan in #93 * fix(ci): ignore-scripts in ios npm installs by @gmaclennan in #96 * fix(ci): replace --ignore-scripts with npm strict-allow-scripts allowlist by @gmaclennan in #106 * fix(release): stop `npm pack --dry-run` leaking dry-run into backend install by @gmaclennan in #129 ### ⚡ Performance * perf(backend): switch bundler from rollup to rolldown by @gmaclennan in #94 ### ⬆️ Dependencies * update some native deps used in backend by @achou11 in #14 * chore(deps): upgrade to Expo SDK 56 (React Native 0.85) by @gmaclennan in #87 ### 🏗️ Maintenance * Android Testing Infrastructure & Bug Fixes by @gmaclennan in #3 * chore: prebuild example/android; harden instrumented tests by @gmaclennan in #10 * chore: adjust repo setup by @achou11 in #12 * chore: minor fixes based on expo-doctor by @achou11 in #13 * chore: add architecture docs & plans by @gmaclennan in #11 * chore: post-Phase-2 cleanup — comments, plan docs, agents.md by @gmaclennan in #33 * refactor: simplify build-backend.ts; rollup writes directly to native asset trees by @gmaclennan in #34 * chore: fix eslint configuration by @achou11 in #41 * android: audit 16 KB page alignment on every shipped .so by @gmaclennan in #43 * chore: move example app into apps directory by @achou11 in #18 * refactor: per-component lifecycle state with derived ComapeoState by @gmaclennan in #47 * android: fold waitForFile into connect retry loop by @gmaclennan in #52 * chore: add e2e testing app by @achou11 in #49 * ci: drop unreliable Android emulator snapshot caching by @gmaclennan in #64 * use npm list instead of custom traversal to get native module versions by @achou11 in #70 * chore(e2e): add e2e tests on browserstack via Maestro by @achou11 in #56 * chore(ci): add release workflow by @gmaclennan in #90 * chore: fix npm script and release build script by @gmaclennan in #91 * chore(pack): don't try to package build files by @gmaclennan in #92 * chore(release): merge prerelease branch. by @gmaclennan in #110 * ci(e2e): retry BrowserStack builds on infra-class flakes by @gmaclennan in #113 ### Other Changes * ci: derive changelog labels from PR titles + add Dependabot by @gmaclennan in #114 ## New Contributors * @achou11 made their first contribution in #12 * @optic-release-automation[bot] made their first contribution in #112 **Full Changelog**: https://github.com/digidem/comapeo-core-react-native/commits/v1.0.0-pre.2 <!-- <release-meta>{"id":342970724,"version":"v1.0.0-pre.2","npmTag":"pre","opticUrl":"https://optic-zf3votdk5a-ew.a.run.app/api/generate/"}</release-meta> -->
Summary
jni-bridge.cpp; close all four write-end refs (includingSTDOUT_FILENO/STDERR_FILENOdup2 aliases) soread()sees EOF, thenpthread_joinbefore closing the read ends. The pump buffer now drains before JNI returns fromnode::Start, so an uncaught-exception traceback can no longer be lost on teardown.Test plan
__NODE_EXIT__:1with no preceding errorpthread_join)🤖 Generated with Claude Code