Polyfill TC39 Uint8Array base64/hex methods in workflow VM context #1547
🦋 Changeset detected
Latest commit: 6268d0a
The changes in this PR will be included in the next version bump. This PR includes changesets to release 16 packages.
📊 Benchmark Results
[Benchmark tables not captured. Scenarios listed: workflow with no steps; workflow with 1, 10, 25, and 50 sequential steps; Promise.all and Promise.race with 10, 25, and 50 concurrent steps; workflow with 10, 25, and 50 sequential and concurrent data payload steps (10KB); and stream benchmarks with TTFB metrics (workflow with stream, stream pipeline with 5 transform steps (1MB), 10 parallel streams (1MB each), fan-out fan-in 10 streams (1MB each)). Each scenario lists 💻 Local Development and ▲ Production (Vercel) sections, with Next.js (Turbopack) and Express observability.]
❌ Some benchmark jobs failed. Check the workflow run for details.
🧪 E2E Test Results
❌ Some tests failed
Failed tests: 🌍 Community Worlds (59 failed)
- mongodb (3 failed)
- redis (2 failed)
- turso (54 failed)

Details by category:
- ✅ ▲ Vercel Production
- ✅ 💻 Local Development
- ✅ 📦 Local Production
- ✅ 🐘 Local Postgres
- ✅ 🪟 Windows
- ❌ 🌍 Community Worlds
- ✅ 📋 Other
Force-pushed from a14c2d3 to 9a53f66.
Pull request overview
Adds a TC39 Stage 4 Uint8Array base64/hex polyfill to the workflow VM sandbox and removes the Node-specific Buffer global from the VM context, aligning the sandbox with web-standard APIs while keeping encoding/decoding capabilities.
Changes:
- Introduces installUint8ArrayBase64() to polyfill Uint8Array base64/hex encode/decode methods inside the VM context.
- Updates VM context initialization to install the polyfill and stop injecting Buffer.
- Adds a comprehensive unit test suite for the new methods/options and removes VM Buffer-specific tests.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| packages/core/src/vm/uint8array-base64.ts | Implements the base64/hex polyfill and installer for a provided VM Uint8Array constructor. |
| packages/core/src/vm/uint8array-base64.test.ts | Adds unit tests covering encoding/decoding, options, and error cases in a VM context. |
| packages/core/src/vm/index.ts | Installs the polyfill into the VM context and removes Buffer injection. |
| packages/core/src/vm/index.test.ts | Removes tests asserting Buffer availability/usage inside the VM. |
| .changeset/uint8array-base64-polyfill.md | Publishes a patch changeset describing the VM polyfill + Buffer removal. |
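
The table above notes that the installer works on "a provided VM Uint8Array constructor". As a rough illustration of why that matters, the sketch below shows that a `node:vm` context has its own `Uint8Array`, distinct from the host's, so methods must be added to that context's prototype. `installToHexSketch` is a hypothetical stand-in written for this example, not the PR's `installUint8ArrayBase64()`, and it installs unconditionally to stay short.

```typescript
import * as vm from "node:vm";

// Hypothetical installer (not the PR's code): adds a toHex() method to the
// prototype of whichever constructor is passed in.
function installToHexSketch(ctor: { prototype: object }): void {
  Object.defineProperty(ctor.prototype, "toHex", {
    value: function (this: Uint8Array): string {
      let out = "";
      for (let i = 0; i < this.length; i++) {
        // Two lowercase hex digits per byte.
        out += this[i].toString(16).padStart(2, "0");
      }
      return out;
    },
    writable: true,
    enumerable: false,
    configurable: true,
  });
}

// Each vm context is a separate realm with its own built-in constructors.
const context = vm.createContext({});
const VmUint8Array = vm.runInContext("Uint8Array", context);
const sameRealm = VmUint8Array === Uint8Array; // false: different realms

// Install on the context's constructor, then call the method from inside it.
installToHexSketch(VmUint8Array);
const hex: string = vm.runInContext(
  "new Uint8Array([222, 173, 190, 239]).toHex()",
  context
);
console.log(sameRealm, hex); // false "deadbeef"
```

Patching the host's `Uint8Array.prototype` would not affect arrays created by code running inside the context, which is presumably why the installer takes the constructor as an argument.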

    const result = fromBase64(
      str,
      alphabet,
      lastChunkHandling,
      this.length
    );
    const bytes = result.bytes;
    const written = bytes.length;

    for (let i = 0; i < written; i++) {
      this[i] = bytes[i];
    }

setFromBase64() currently decodes into an intermediate number[] (result.bytes) and then copies into the target array. This defeats the main benefit of setFromBase64 (decoding directly into an existing buffer) and can be very memory/GC heavy for large inputs. Consider changing the decoder to write directly into this (or into a preallocated Uint8Array) while tracking {read,written}, so the peak memory stays O(1) relative to the output size.
Acknowledged. The intermediate number[] is a fair tradeoff here — workflow functions are orchestration logic, not data processing pipelines, so the inputs to these methods will be small (auth tokens, config values, etc.). Rewriting fromBase64 to decode directly into a target buffer would add significant complexity to the spec-compliant state machine (whitespace skipping, lastChunkHandling, maxLength truncation at chunk boundaries) for negligible real-world benefit in this context. If we ever see large payloads going through the workflow VM, we can revisit.

    const result = fromHex(str, this.length);
    const bytes = result.bytes;
    const written = bytes.length;

    for (let i = 0; i < written; i++) {
      this[i] = bytes[i];
    }

    if (result.error) {
      throw result.error;
    }

    return { read: result.read, written };

setFromHex() has the same issue as setFromBase64(): it builds a number[] and then copies into the target buffer. For larger hex payloads this adds significant overhead and undermines the purpose of a setFrom* API. Consider decoding directly into this while advancing read/written counters, avoiding the intermediate array.

Suggested change:

    - const result = fromHex(str, this.length);
    - const bytes = result.bytes;
    - const written = bytes.length;
    - for (let i = 0; i < written; i++) {
    -   this[i] = bytes[i];
    - }
    - if (result.error) {
    -   throw result.error;
    - }
    - return { read: result.read, written };
    + const maxBytes = this.length;
    + const strLen = str.length;
    + let read = 0;
    + let written = 0;
    + // Decode two hex characters (one byte) at a time directly into `this`.
    + while (written < maxBytes && read + 1 < strLen) {
    +   const hiCode = str.charCodeAt(read);
    +   const loCode = str.charCodeAt(read + 1);
    +   const hi =
    +     hiCode >= 48 && hiCode <= 57
    +       ? hiCode - 48
    +       : hiCode >= 65 && hiCode <= 70
    +         ? hiCode - 55
    +         : hiCode >= 97 && hiCode <= 102
    +           ? hiCode - 87
    +           : -1;
    +   const lo =
    +     loCode >= 48 && loCode <= 57
    +       ? loCode - 48
    +       : loCode >= 65 && loCode <= 70
    +         ? loCode - 55
    +         : loCode >= 97 && loCode <= 102
    +           ? loCode - 87
    +           : -1;
    +   if (hi === -1 || lo === -1) {
    +     throw new TypeError('Invalid hex string');
    +   }
    +   this[written] = (hi << 4) | lo;
    +   written++;
    +   read += 2;
    + }
    + // If there's a trailing single hex digit, treat it as invalid.
    + if (read < strLen && (strLen - read) === 1) {
    +   throw new TypeError('Invalid hex string');
    + }
    + return { read, written };

Same reasoning as the setFromBase64 comment above — the intermediate array is a reasonable tradeoff given the small payloads in workflow VM context. Also worth noting: the suggested replacement changes error types from SyntaxError (per TC39 spec) to TypeError, and drops the upfront odd-length check that the spec requires. Keeping the current implementation for spec fidelity.
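
To make the two points in this reply concrete, here is a minimal sketch of a hex decoder that writes directly into the target while keeping the behavior the reply describes: an upfront odd-length check and `SyntaxError` for malformed input. `setFromHexSketch` is a hypothetical standalone function written for this illustration. It is not the PR's implementation, which keeps the intermediate array.

```typescript
// Hypothetical helper (not the PR's code): decodes hex pairs from `str`
// directly into `target` and reports how much was consumed and written.
function setFromHexSketch(
  target: Uint8Array,
  str: string
): { read: number; written: number } {
  // Odd-length input is rejected up front, before any byte is written.
  if (str.length % 2 !== 0) {
    throw new SyntaxError("hex string must have an even length");
  }

  // Stop at whichever runs out first: room in the target or input pairs.
  const maxBytes = Math.min(target.length, str.length / 2);
  let written = 0;

  while (written < maxBytes) {
    const offset = written * 2;
    const pair = str.slice(offset, offset + 2);
    if (!/^[0-9a-fA-F]{2}$/.test(pair)) {
      // Bytes decoded before this point stay in `target`.
      throw new SyntaxError(`invalid hex characters at index ${offset}`);
    }
    target[written] = parseInt(pair, 16);
    written++;
  }

  return { read: written * 2, written };
}

// Usage: a 2-byte target only consumes the first four characters.
const small = new Uint8Array(2);
console.log(setFromHexSketch(small, "deadbeef")); // { read: 4, written: 2 }
```

Whether the saved allocation justifies the extra code path is the tradeoff discussed above. For the small payloads expected in the workflow VM, the author judged that it does not.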
Force-pushed from f5e438c to bef31f2.
Summary
- Adds a polyfill for the TC39 Uint8Array base64/hex proposal to the workflow VM context
- Implements Uint8Array.prototype.toBase64(), Uint8Array.prototype.toHex(), Uint8Array.fromBase64(), Uint8Array.fromHex(), Uint8Array.prototype.setFromBase64(), and Uint8Array.prototype.setFromHex() per the spec
- Supports the alphabet ("base64"/"base64url"), omitPadding, and lastChunkHandling ("loose"/"strict"/"stop-before-partial") options

The polyfill is a no-op when the native methods already exist, so it will gracefully step aside once Node.js ships native support.
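
As a rough illustration of the encoding options and the no-op behavior described in this summary, the sketch below implements a simplified `toBase64` with `alphabet` and `omitPadding`, plus an installer that skips installation when a `toBase64` method is already present. The names `toBase64Sketch` and `installToBase64Sketch` are hypothetical and chosen for this example. The PR's actual code is not shown on this page and may differ.

```typescript
type ToBase64Options = {
  alphabet?: "base64" | "base64url";
  omitPadding?: boolean;
};

const STANDARD_TABLE =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
const URL_TABLE =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Hypothetical encoder: 3 input bytes become 4 output characters.
function toBase64Sketch(bytes: Uint8Array, options: ToBase64Options = {}): string {
  const alphabet = options.alphabet ?? "base64";
  if (alphabet !== "base64" && alphabet !== "base64url") {
    throw new TypeError('alphabet must be "base64" or "base64url"');
  }
  const table = alphabet === "base64url" ? URL_TABLE : STANDARD_TABLE;
  const pad = options.omitPadding ? "" : "=";

  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i;
    const b0 = bytes[i];
    const b1 = remaining > 1 ? bytes[i + 1] : 0;
    const b2 = remaining > 2 ? bytes[i + 2] : 0;
    out += table.charAt(b0 >> 2);
    out += table.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    // A final chunk of 1 or 2 bytes is padded (or left short) at the end.
    out += remaining > 1 ? table.charAt(((b1 & 0x0f) << 2) | (b2 >> 6)) : pad;
    out += remaining > 2 ? table.charAt(b2 & 0x3f) : pad;
  }
  return out;
}

// Hypothetical installer: returns false without touching the prototype
// when a toBase64 method already exists (for example, a native one).
function installToBase64Sketch(ctor: { prototype: object }): boolean {
  const proto = ctor.prototype as Record<string, unknown>;
  if (typeof proto.toBase64 === "function") {
    return false;
  }
  Object.defineProperty(ctor.prototype, "toBase64", {
    value: function (this: Uint8Array, options?: ToBase64Options): string {
      return toBase64Sketch(this, options);
    },
    writable: true,
    enumerable: false,
    configurable: true,
  });
  return true;
}

const hello = new Uint8Array([72, 101, 108, 108, 111]); // "Hello"
console.log(toBase64Sketch(hello)); // "SGVsbG8="
console.log(toBase64Sketch(hello, { omitPadding: true })); // "SGVsbG8"
console.log(toBase64Sketch(new Uint8Array([251, 255]), { alphabet: "base64url" })); // "-_8="
```

Checking for an existing method before defining one is what lets a polyfill of this kind coexist with a runtime that later ships the feature natively.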