CBOR Transport support #25
Signed-off-by: marcopiraccini <marco.piraccini@gmail.com>
Adds the server/client pieces needed for Vercel's e2e suite to pass against us when they un-skip the Platformatic community world:
- Server: run_started now bootstraps the run from eventData when no prior run_created exists (resilient start path). The SDK v5 contract expects this so a failed run_created can be recovered via the queue payload.
- Client: runs.get now names 404 errors WorkflowRunNotFoundError so the SDK's resilient-start retry loop recognizes them as retryable. streams.get copies chunks into standalone ArrayBuffers, because undici's pooled buffers aren't detachable, which broke the SDK's byte-stream transfer. streams.getChunks does the same for paginated chunks.
- E2E: bump next build with WORKFLOW_PUBLIC_MANIFEST=1 so the manifest is served from public/, matching Vercel's CI setup. Add four adapted tests to vercel-e2e.test.ts: resilient start (#2255), getTailIndex after stream completes, getTailIndex on nonexistent namespace, getChunks paginates. wellKnownAgentWorkflow stays skipped (upstream @workflow/next plugin does not discover dot-prefixed app dirs).

Verified: 61/59/0/2 (tests/pass/fail/skipped) in 220s against a clean local run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: marcopiraccini <marco.piraccini@gmail.com>
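The client-side change described in this commit (giving 404 errors a name that a retry loop can match on) could look roughly like the sketch below. It is an illustration, not the code in this PR: `HttpStatusError`, `getRun`, `fetchRun` and `isRetryableNotFound` are hypothetical names, and the only detail taken from the commit message is that a 404 ends up with its `name` set to `WorkflowRunNotFoundError`.

```typescript
// Hypothetical error shape: an Error that carries the HTTP status code.
class HttpStatusError extends Error {
  statusCode: number

  constructor (message: string, statusCode: number) {
    super(message)
    this.name = 'HttpStatusError'
    this.statusCode = statusCode
  }
}

// Wraps a lookup so that 404s are renamed; every other error passes through untouched.
async function getRun<T> (runId: string, fetchRun: (id: string) => Promise<T>): Promise<T> {
  try {
    return await fetchRun(runId)
  } catch (err) {
    if (err instanceof HttpStatusError && err.statusCode === 404) {
      err.name = 'WorkflowRunNotFoundError'
    }
    throw err
  }
}

// Stand-in for the check a retry loop might perform on the caught error.
function isRetryableNotFound (err: unknown): boolean {
  return err instanceof Error && err.name === 'WorkflowRunNotFoundError'
}
```

Renaming in place keeps the original stack and status code, so anything that inspected the error before the change still sees the same object.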
transport.ts is a direct adaptation of packages/world-vercel/src/queue.ts — same wire contract, same 3 classes, same CBOR-first/JSON-fallback logic. Add an inline comment at the top of the file pointing to NOTICE, and list the file in NOTICE alongside the e2e suite ports. Also note the two new vercel-e2e.test.ts ports (resilient start + outputStream getTailIndex/getChunks) in NOTICE. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: marcopiraccini <marco.piraccini@gmail.com>
mcollina left a comment
The semversiness of this is unclear to me.
| - Queue messages between client and server use CBOR framing. CBOR preserves `Uint8Array` natively (JSON does not), so binary workflow input survives the queue round-trip without base64 wrapping. | ||
| - `createQueueHandler` accepts both CBOR and JSON inbound via a dual transport. A v3 client can be deployed against a v2-only server during rollout; a v2 client can be deployed against a v3 server. | ||
| | ||
| Peer dependency: `@workflow/world` ≥ 5.0.0-beta.1 (the first release exporting `SPEC_VERSION_SUPPORTS_CBOR_QUEUE_TRANSPORT`). |
This makes this a semver-major for us too. Can we do this in a way that also supports old clients?
I would prefer if we had integration tests for both v4 and v5 too.
We actually are still in 0.x but OK.
To be clear, a user would need to use the v5 beta of workflow?
Updated, now we support both v4 and v5, PTAL
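For context on the changelog line quoted in this thread ("CBOR preserves `Uint8Array` natively (JSON does not)"), the JSON half of that claim can be checked with built-ins alone. The sketch below is illustrative only: `toBase64` and `fromBase64` are hypothetical helpers showing the kind of wrapping JSON would require, and the CBOR half is not demonstrated because it needs a CBOR library.

```typescript
const input = new Uint8Array([1, 2, 3])

// JSON serializes a typed array as an index-keyed object, so the bytes come
// back as a plain object rather than a Uint8Array.
const viaJson: unknown = JSON.parse(JSON.stringify({ input })).input

// What a JSON-only transport would need instead: explicit base64 wrapping.
function toBase64 (bytes: Uint8Array): string {
  let binary = ''
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i])
  }
  return btoa(binary)
}

function fromBase64 (text: string): Uint8Array {
  const binary = atob(text)
  const out = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) {
    out[i] = binary.charCodeAt(i)
  }
  return out
}

const wrapped = JSON.stringify({ input: toBase64(input) })
const unwrapped = fromBase64(JSON.parse(wrapped).input)
```

Here `viaJson instanceof Uint8Array` is `false`, while `unwrapped` is a real `Uint8Array` again, at the cost of an encode and decode step on each side of the queue.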
| } | ||
| | ||
| async postRaw (path: string, body: Buffer, contentType: string, query?: Record<string, string | undefined>): Promise<any> { | ||
| let fullPath = `${this.#baseUrl}${path}` |
Are you making sure that delimiters (like /) are set? I think we should validate against //.
| async postRaw (path: string, body: Buffer, contentType: string, query?: Record<string, string | undefined>): Promise<any> { | ||
| let fullPath = `${this.#baseUrl}${path}` | ||
| if (query) { | ||
| const url = new URL(`http://localhost${fullPath}`) |
Shouldn't we use baseUrl here?
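One way to address both review points (reject bad delimiters, and build the URL from the real base rather than a `http://localhost` placeholder) is sketched below. This is a sketch under assumptions, not the fix that landed: `assertValidPath` and `buildUrl` are hypothetical names, and it assumes `baseUrl` is an absolute URL, which the diff does not confirm.

```typescript
// Reject paths that would produce a malformed URL once appended to the base.
function assertValidPath (path: string): void {
  if (!path.startsWith('/')) {
    throw new TypeError(`path must start with "/": ${path}`)
  }
  if (path.includes('//')) {
    // Typically the result of interpolating an empty segment, e.g. `/runs/${id}/events`.
    throw new TypeError(`path must not contain "//": ${path}`)
  }
}

// Joins base and path, then appends only the query values that are defined.
function buildUrl (baseUrl: string, path: string, query?: Record<string, string | undefined>): string {
  assertValidPath(path)
  // Strip trailing slashes from the base so the join never introduces "//".
  const url = new URL(baseUrl.replace(/\/+$/, '') + path)
  for (const [key, value] of Object.entries(query ?? {})) {
    if (value !== undefined) url.searchParams.set(key, value)
  }
  return url.toString()
}
```

With these definitions, `buildUrl('http://localhost:3042/api/', '/queue', { limit: '10', cursor: undefined })` yields `http://localhost:3042/api/queue?limit=10`, and `buildUrl('http://localhost:3042', '/runs//events')` throws before any request is made.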
| | ||
| const cborTransport = new CborTransport() | ||
| const jsonTransport = new JsonTransport() | ||
| const dualTransport = new DualTransport() |
| }) | ||
| const result = useCbor | ||
| ? await client.postRaw('/queue', transport.serialize(envelope), transport.contentType) | ||
| : await client.post('/queue', envelope) |
Are you sure it's not better to handle this in the client itself?
| transform (chunk, controller) { | ||
| const copy = new Uint8Array(chunk.byteLength) | ||
| copy.set(chunk) | ||
| controller.enqueue(copy) |
| const src = Buffer.from(chunk.data, 'base64') | ||
| const copy = new Uint8Array(src.byteLength) | ||
| copy.set(src) | ||
| return { index: chunk.index, data: copy } |
This is allocating memory twice for no particular reason. Can we avoid it?
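A possible way to drop the extra allocation in both snippets is sketched below. `toStandalone` and `decodeBase64Standalone` are hypothetical helper names; the sketch assumes Node.js and well-formed base64 input, and relies on `Buffer.from(arrayBuffer)` creating a view over existing memory rather than a copy.

```typescript
import { Buffer } from 'node:buffer'

// Copy a (possibly pooled) view into a Uint8Array that owns its ArrayBuffer.
// The typed-array constructor allocates and copies in one step.
function toStandalone (chunk: Uint8Array): Uint8Array {
  return new Uint8Array(chunk)
}

// Decode base64 straight into a standalone buffer. Buffer.from(data, 'base64')
// may return a slice of Node's shared pool for small inputs, so instead the
// destination is allocated first and the decoder writes into it.
function decodeBase64Standalone (data: string): Uint8Array {
  const out = new Uint8Array(Buffer.byteLength(data, 'base64'))
  const written = Buffer.from(out.buffer).write(data, 'base64')
  // byteLength can overestimate for input containing whitespace or other
  // non-base64 characters; trim with a copy only in that case.
  return written === out.byteLength ? out : out.slice(0, written)
}
```

In the common case each function performs a single data allocation, and the returned array starts at offset 0 of an `ArrayBuffer` that is exactly as long as the data.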
| } | ||
| } | ||
| | ||
| export async function drainStream (stream: ReadableStream<Uint8Array>): Promise<Buffer> { |
| total += value.byteLength | ||
| } | ||
| } | ||
| const buf = Buffer.allocUnsafe(total) |
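The two fragments above show only the signature of `drainStream`, the running `total`, and the final `Buffer.allocUnsafe(total)`. A complete function consistent with those fragments might look like the sketch below; the reader loop and the copy step are assumptions filled in for illustration, not the code from the diff.

```typescript
import { Buffer } from 'node:buffer'

async function drainStream (stream: ReadableStream<Uint8Array>): Promise<Buffer> {
  const reader = stream.getReader()
  const chunks: Uint8Array[] = []
  let total = 0
  try {
    // First pass: collect chunks and measure the total size.
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      chunks.push(value)
      total += value.byteLength
    }
  } finally {
    reader.releaseLock()
  }
  // allocUnsafe skips zero-filling; that is fine here because every byte
  // is overwritten by the copy loop below.
  const buf = Buffer.allocUnsafe(total)
  let offset = 0
  for (const chunk of chunks) {
    buf.set(chunk, offset)
    offset += chunk.byteLength
  }
  return buf
}
```

Measuring first and allocating once avoids repeated reallocation while the stream is being read, at the cost of holding all chunks in memory until the stream ends.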
Summary
Adds spec version 3 support to `@platformatic/world` and `@platformatic/workflow`: CBOR queue transport + resilient start + `streams.*` interface compat. Brings full parity with Vercel's upstream e2e suite so Platformatic can be un-skipped in Vercel's community-worlds CI.

Follow-up to vercel/workflow#1450, which landed Platformatic as `if: false` pending CBOR. The upstream change that made CBOR a requirement is vercel/workflow#1627, "Gate CBOR queue transport on specVersion", which switched queue messages from JSON to CBOR for `specVersion >= 3` so `runInput.input` (a `Uint8Array`) survives the wire. Worlds that want to handle v3 runs must speak CBOR on the queue. Community worlds opt in via vercel/workflow#1658.

Workflow SDK compatibility

Our World API is typed against `@workflow/world@4.1.1` stable (which already exports `SPEC_VERSION_SUPPORTS_CBOR_QUEUE_TRANSPORT`, `RunInput`, `QueueOptions.specVersion`, `World.specVersion`, `getStreamChunks`/`getStreamInfo`; CBOR support was backported into the stable 4.1.x line). The same world instance also works at runtime against the `@workflow/core@5.0.0-beta` line.

`workflow` SDK:
- `workflow@4.2.x` (stable): `writeToStream`, `getStreamChunks`, ...
- `workflow@5.0.0-beta.x`: `world.streams.*` (nested namespace, different arg order). Exposed at runtime alongside v4 methods.

Both are exercised in CI: `e2e-v5/` runs against `workflow@5.0.0-beta.2` (mirrors Vercel's main-branch CI), `e2e-v4/` runs against `workflow@4.2.4` stable.

Scope

Client (`@platformatic/world`)
- `HttpClient.post(path, body, query?, encoding?)` handles both JSON and CBOR bodies via an `encoding: 'json' | 'cbor'` parameter. `encode` from `cbor-x` is called inline.
- `queue()` picks CBOR when `opts.specVersion >= 3`, JSON otherwise. Falls back to CBOR when `opts.specVersion` is missing.
- `createQueueHandler` does CBOR-first decode with a JSON fallback inline (no shared Transport abstraction).
- `HttpClient` validates paths at the boundary: must start with `/`, must not contain `//` (catches empty-interpolation bugs before they hit the server).
- `streams.*` are thin adapters that delegate to it. Satisfies v4.1.1's `Streamer` type; `streams` is a runtime-only addition for v5 SDKs.
- `streams.get` copies each chunk into a standalone `ArrayBuffer` via `new Uint8Array(chunk)`: undici's pooled buffers aren't detachable and break the SDK's byte-stream transfer.
- `storage.runs.get` renames 404 errors to `WorkflowRunNotFoundError` so the SDK's resilient-start retry loop recognizes them as retryable.
- `specVersion: SPEC_VERSION_SUPPORTS_CBOR_QUEUE_TRANSPORT`.

Server (`@platformatic/workflow`)
- `002.do.sql`: `workflow_queue_messages` gains `payload_bytes BYTEA` + `payload_encoding TEXT ('json' | 'cbor')` with an XOR constraint. Undo migration is forward-only.
- `application/cbor` content-type parser. `plugins/queue.ts` branches on `Content-Type`, stores in arrival encoding.
- `queue/dispatcher.ts` forwards with matching `Content-Type`; re-enqueues preserve encoding.
- `GET /runs/:runId/streams/:name/chunks` (paginated) and `/info` (tailIndex + done).
- `run_started` event handler: idempotent (otherwise v5 replay fails with "Unconsumed event in event log") AND bootstraps the run from `eventData` when no prior `run_created` exists (resilient-start recovery).
- `correlation_id` so hooks created in the same millisecond return in workflow order.

E2E structure
- `e2e-v5/`: workbench on `workflow@5.0.0-beta.2` (matches Vercel's main-branch CI). Build uses `WORKFLOW_PUBLIC_MANIFEST=1` so the manifest is served at `/.well-known/workflow/v1/manifest.json`.
- `e2e-v4/`: workbench on `workflow@4.2.4` stable. Covers the v4 runtime path beyond the basic happy path: sleep, hook resume, step retry-until-success, FatalError bubbling, output stream writes.
- `packages/core/e2e/e2e.test.ts` into `e2e-v5/test/vercel-e2e.test.ts`:
  - `resilient start: addTenWorkflow completes when run_created returns 500` (line 2255)
  - `outputStreamWorkflow: getTailIndex returns correct index after stream completes` (line 711)
  - `outputStreamWorkflow: getTailIndex returns -1 before any chunks are written` (line 728)
  - `outputStreamWorkflow: getChunks returns same content as reading the stream` (line 745)
- `getWorld()` was called without `await` in the queue-based health check.
- `webhookWorkflow: HTTP-triggered resume with 3 webhook types`: v5 fixes the `respondWith: 'manual'` cross-process issue.
- `wellKnownAgentWorkflow: step discovery in dot-prefixed directory`: side-effect import from `app/api/trigger-e2e/route.ts` makes the file reachable from the module graph, mirroring Vercel's own workbench.
- `cbor-e2e.test.ts` (3 tests): DB-level assertion that `payload_encoding = 'cbor'` after a real run, a ground-truth signal that CBOR is engaged.

Root scripts
- `pnpm test:e2e:v5`: v5 SDK smoke suite
- `pnpm test:e2e:v4`: v4 SDK smoke suite
- `pnpm test:e2e:vercel`: full Vercel-ported suite

Rollback is forward-only: drain the queue before running `002.undo.sql`.

Test plan

Known upstream issue tracked: vercel/workflow#1735. We drain streams fully before decode, same as upstream.
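The encoding rule stated under Scope (CBOR when `opts.specVersion >= 3`, JSON otherwise, CBOR when the version is missing) and the server's branching on `Content-Type` can be sketched as three small pure functions. All names below are hypothetical, and `CBOR_MIN_SPEC_VERSION` is a local stand-in for `SPEC_VERSION_SUPPORTS_CBOR_QUEUE_TRANSPORT`, with the value 3 assumed from the summary; treating any non-CBOR content type as JSON is likewise an assumption.

```typescript
type QueueEncoding = 'json' | 'cbor'

// Assumed value, taken from the "specVersion >= 3" wording in the summary.
const CBOR_MIN_SPEC_VERSION = 3

// Client side: choose the wire encoding for an outgoing queue message.
function pickQueueEncoding (specVersion?: number): QueueEncoding {
  if (specVersion === undefined) return 'cbor' // missing version defaults to CBOR
  return specVersion >= CBOR_MIN_SPEC_VERSION ? 'cbor' : 'json'
}

function contentTypeFor (encoding: QueueEncoding): string {
  return encoding === 'cbor' ? 'application/cbor' : 'application/json'
}

// Server side: classify an inbound message by its Content-Type header so it
// can be stored (and later forwarded) in the encoding it arrived in.
function encodingFromContentType (header: string | undefined): QueueEncoding {
  const mime = ((header ?? '').split(';', 1)[0] ?? '').trim().toLowerCase()
  return mime === 'application/cbor' ? 'cbor' : 'json'
}
```

Keeping the decision in pure functions like these makes the v2/v3 matrix easy to cover with unit tests, separate from the HTTP client and the database.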