Skip to content

refactor(android): stream multipart bundle bodies straight to disk #57043

Open
AndreiCalazans wants to merge 3 commits into
facebook:mainfrom
AndreiCalazans:andrei/android-bundle-download-optimizations
Open

refactor(android): stream multipart bundle bodies straight to disk #57043
AndreiCalazans wants to merge 3 commits into
facebook:mainfrom
AndreiCalazans:andrei/android-bundle-download-optimizations

Conversation

@AndreiCalazans
Copy link
Copy Markdown
Contributor

@AndreiCalazans AndreiCalazans commented Jun 2, 2026

Summary:

We have a very large developer bundle which on a stock Android emulator often causes OOMs while loading the bundle (#52818). You will see below that there is a spike in memory when we are downloading the bundle that seems to go up 1.4x the bundle size and exacerbates the problem. This PR streams the multipart bundle bodies straight to disk to avoid this peak memory.

Before:

Note in the video the spike to 433.8 MB

image
before-stream-to-disk-react-native.mov

After:

With this implementation memory is contained and during the download phase does not peak past 83.4 MB only taking 316.6 MB after the download phase is finished.

after-stream-to-disk-react-native.mov

Note that we have been using this prior version that I made PR for in January because we are on older RN version.

Changelog:

[ANDROID] [CHANGED] Refactored stream multipart bundle bodies to write straight to disk

Test Plan:

Create the bundle padding mechanism to create a bundle with artificial 100 MB in size.

diff --git a/packages/rn-tester/metro.config.js b/packages/rn-tester/metro.config.js
@@ -14,6 +14,36 @@ const {getDefaultConfig} = require('@react-native/metro-config');
 const {mergeConfig} = require('metro-config');
 const path = require('path');

+const padMB = parseInt(process.env.RN_PAD_BUNDLE_MB || '0', 10);
+// Metro 0.81+ enforces strict package `exports`; `./private/*` re-exports `./src/*.js`.
+const baseJSBundle = require('metro/private/DeltaBundler/Serializers/baseJSBundle');
+const bundleToString = require('metro/private/lib/bundleToString');
+const baseJSBundleFn = baseJSBundle.default || baseJSBundle;
+const bundleToStringFn = bundleToString.default || bundleToString;
+
+function buildPaddedSerializer(padBytes) {
+  const overhead = '\n/* RN_PAD_BUNDLE  */\n'.length;
+  const filler = 'A'.repeat(Math.max(0, padBytes - overhead));
+  const suffix = '\n/* RN_PAD_BUNDLE ' + filler + ' */\n';
+  return (entryPoint, preModules, graph, options) => {
+    const bundle = baseJSBundleFn(entryPoint, preModules, graph, options);
+    const result = bundleToStringFn(bundle);
+    return {code: result.code + suffix, map: result.map};
+  };
+}
+
 const config = {
   watchFolders: [ /* ... */ ],
   resolver: { /* ... */ },
+  ...(padMB > 0
+    ? {serializer: {customSerializer: buildPaddedSerializer(padMB * 1024 * 1024)}}
+    : null),
 };
  1. yarn android
  2. Stop the Metro bundle and start it yourself with the padding
    RN_PAD_BUNDLE_MB=100 yarn start --reset-cache
  3. Run app

Run tests as well

./gradlew :packages:react-native:ReactAndroid:testDebugUnitTest \     
     -PrunPerfTests=true -Preact.internal.useHermesNightly=true \
     --tests "*PerfTest"

…partStreamReader

Adds opt-in JUnit performance tests that exercise BundleDownloader and
MultipartStreamReader against a 100 MB synthetic JS bundle and capture
today's per-thread allocation and peak heap usage. These serve as a
baseline so a follow-up streaming refactor of the multipart/download
path can be validated and locked in via tightened budgets.

New files (packages/react-native/ReactAndroid/src/test/java/com/facebook/react/devsupport/):
- PerformanceTest.kt        JUnit @category marker for the new tests
- AllocationProbe.kt        Reflection wrapper for HotSpot
                            ThreadMXBean.getThreadAllocatedBytes and JMX
                            heap-pool peak usage (reflection because
                            android.jar strips java.lang.management.*)
- LargeMultipartSource.kt   Lazy okio.Source that synthesizes a valid
                            multipart/mixed stream of arbitrary payload
                            size without buffering it
- MultipartStreamReaderPerfTest.kt  Reader-only: 100 MB payload, asserts
                                    correctness + allocation/peak-heap
                                    upper bounds
- BundleDownloaderPerfTest.kt       End-to-end via an OkHttp Interceptor
                                    (no socket); asserts file-on-disk
                                    size + per-thread/all-threads
                                    allocation + peak heap

Build wiring (ReactAndroid/build.gradle.kts):
  tasks.withType<Test> now reads -PrunPerfTests=true and switches
  useJUnit { include/excludeCategories } accordingly. When opted in,
  bumps maxHeapSize to 2g and adds -XX:+AlwaysPreTouch. Default test
  runs are unchanged: the perf tests are excluded.

Run:
  ./gradlew :packages:react-native:ReactAndroid:testDebugUnitTest \
    -PrunPerfTests=true -Preact.internal.useHermesNightly=true \
    --tests "*PerfTest" --info

Baseline on this machine (100 MB payload):
  MultipartStreamReader: ~135 ms, ~101 MB thread-allocated, ~131 MB peak heap
  BundleDownloader:      ~147 ms, ~108 MB all-threads-allocated, ~115 MB peak heap

Assertion budgets are intentionally loose (3-4x payload) so they capture
the current behaviour without flaking; tighten after the streaming
refactor lands.
Refactors MultipartStreamReader and BundleDownloader so the JS bundle
chunk of a Metro multipart response is written directly into the tmp
file via a BufferedSink, instead of being buffered in heap and then
drained. For a 100 MB bundle on a dev machine, peak heap drops:

  MultipartStreamReader:  131 MB -> 36 MB  (3.6x)
  BundleDownloader:       115 MB -> 33 MB  (3.5x)

Wall-clock time is unchanged within noise. Cumulative bytes allocated
on the worker thread are also unchanged (~100 MB); that's dominated by
okio's per-thread SegmentPool capping at 64 KB and is independent of
whether the reader retains the body. The streaming property is proved
by peak heap + the body being delivered as null in onChunkComplete.

MultipartStreamReader (packages/react-native/ReactAndroid/src/main/.../MultipartStreamReader.kt)
- New ChunkListener API:
    onChunkHeader(headers): BufferedSink?
      Return a sink to stream the body to it; return null to receive
      the body buffered as a Buffer via onChunkComplete.
    onChunkComplete(headers, body: Buffer?, isLastChunk)
      `body` is non-null iff onChunkHeader returned null.
    onChunkProgress is unchanged.
- New algorithm: working buffer bounded by READ_CHUNK_SIZE + maxDelim
  (~16 KB + 21 B); reads upstream incrementally, scans for the next
  delimiter, transfers safe-to-flush bytes to the sink (or accumulator)
  using okio segment-move semantics. Lookahead window equals
  `maxDelimLen - 1` so delimiters spanning two reads are still detected.
- The reader cannot know which chunk is the last until it sees the next
  delimiter. Listeners must route on headers (Content-Type,
  X-Http-Status), not on isLastChunk position. Routing-from-headers
  matches what BundleDownloader needs and is more robust anyway.
- Chunks with no `\r\n\r\n` header separator are still supported
  (mirrors prior behaviour; required by testMultipleParts).

BundleDownloader (packages/react-native/ReactAndroid/src/main/.../BundleDownloader.kt)
- processMultipartResponse uses a new inner StreamingBundleChunkListener
  that returns Okio.buffer(Okio.sink(tmpFile)) when the chunk is
  application/javascript AND has effective status 200 (X-Http-Status
  header overrides response.code()). Progress JSON and error bodies are
  buffered in memory as before, so JSONObject parsing and
  DebugServerException diagnostics still work.
- On stream success, the listener closes the sink, populates BundleInfo
  from X-Metro-Files-Changed-Count, atomically renames tmp -> output,
  and fires callback.onSuccess(). On stream failure (premature upstream
  EOF, IOException), the outer code deletes the half-written tmp file
  before surfacing the error.
- closeOpenSinkQuietly ensures the file handle never leaks if
  readAllParts throws mid-stream.

Tests
- MultipartStreamReaderTest: adapted to the new ChunkListener API.
  All 4 existing correctness cases (simple, multiple parts, no
  delimiter, no close delimiter) still pass; behaviour around
  headerless chunks is preserved.
- MultipartStreamReaderPerfTest: now exercises the streaming path
  (returns a CountingDiscardingSink) and asserts the body is delivered
  via the sink (not buffered) plus peak heap < 64 MB. Drops the
  thread-allocated assertion; it's an okio SegmentPool artifact, not a
  reader property.
- BundleDownloaderPerfTest: budgets retargeted to peak heap < 80 MB.

Test commands unchanged:
  ./gradlew :packages:react-native:ReactAndroid:testDebugUnitTest \
    -Preact.internal.useHermesNightly=true \
    --tests "com.facebook.react.devsupport.*"

  ./gradlew :packages:react-native:ReactAndroid:testDebugUnitTest \
    -PrunPerfTests=true -Preact.internal.useHermesNightly=true \
    --tests "*PerfTest" --info
…lication/javascript; charset=UTF-8"

Regression introduced in the streaming refactor (e9e200a).

Metro actually serves the JS bundle chunk with
  Content-Type: application/javascript; charset=UTF-8
not the bare `application/javascript` my refactor was matching against.
With strict equality, `onChunkHeader` returned null (so the streaming
sink was never opened), the body was buffered into the in-memory
accumulator, and `onChunkComplete` hit the `else -> Unit` branch and
silently dropped it. Symptoms in dev on the RNTester emulator: heap
spiked to ~200 MB during the download, then dropped back, but the dev
loading view sat at "Bundling 99%" forever because callback.onSuccess()
was never fired.

Fix: parse Content-Type properly. New private helper `mediaType(...)`
strips the parameter portion (`; charset=...`) and lower-cases the
result, and the two routing predicates (`isJsBundleChunk` and
`isProgressChunk`) compare against the parsed media type. Also: the
else branch now logs a warning instead of silently dropping, so a
future Metro change that introduces yet another Content-Type variant
will be visible in logcat rather than stranding the user at 99%.

Updated `LargeMultipartSource` to emit the real Metro Content-Type
(`application/javascript; charset=UTF-8`) so the perf tests guard
against this exact regression class going forward. The perf tests pass
unchanged \u2014 the streaming path is still taken, output-size still equals
payloadBytes, and peak heap is still bounded.

Verified against a live Metro dev server: with the synthetic source
now matching production, `BundleDownloaderPerfTest` would have caught
this bug. `MultipartStreamReaderTest` (the 4 existing correctness
cases) is unaffected and still green.
@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 2, 2026
@facebook-github-tools facebook-github-tools Bot added the Shared with Meta Applied via automation to indicate that an Issue or Pull Request has been shared with the team. label Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Shared with Meta Applied via automation to indicate that an Issue or Pull Request has been shared with the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant