From 3a951ad1ba57ecd3c7ff8f01aab684bd28086e1e Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Thu, 11 Jun 2026 13:58:22 +0200 Subject: [PATCH 01/12] Add SequentialQueue current-state reference doc --- contributingGuides/SEQUENTIAL_QUEUE.md | 521 +++++++++++++++++++++++++ 1 file changed, 521 insertions(+) create mode 100644 contributingGuides/SEQUENTIAL_QUEUE.md diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md new file mode 100644 index 000000000000..e3674af34d46 --- /dev/null +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -0,0 +1,521 @@ +# SequentialQueue + +## Overview + +The SequentialQueue is how the app keeps its **offline-first promise**: a user can "do their thing" regardless of connectivity, and every change they make is delivered to the server **once, in order, at the right time** — even across going offline, refreshing the tab, or restarting the app. + +Concretely, `SequentialQueue` is a **single-flight, FIFO, leader-only engine** that drains every `API.write()` request one at a time, with exponential backoff on failure, surviving offline periods and (for most requests) app restarts. It is a coordinator over four collaborators: + +| Collaborator | Role | +|---|---| +| **PersistedRequests** | The request store — an in-memory mirror of the on-disk queue (`persistedRequests` array + a single `ongoingRequest`). | +| **RequestThrottle** | Per-instance exponential backoff with jitter; produces the "give up" signal. | +| **QueuedOnyxUpdates** | A separate in-memory buffer holding WRITE responses' Onyx updates until the whole queue drains (anti-flicker). | +| **The middleware chain** (`Request.processWithMiddleware`) | Where a single request actually runs end-to-end (XHR + auth + response persistence). | + +> **The one inversion to internalize before reading further:** within this subsystem, **in-memory state is authoritative and disk is a lagging backup** — the opposite of the usual Onyx contract. `PersistedRequests` deliberately ignores most of its own Onyx subscription echoes. Roughly half of that module is defensive code protecting this invariant. A reader who misses this will misread much of it as redundant. See [Why in-memory is authoritative](#why-in-memory-is-authoritative). + +This document covers **how the queue works today**. For sibling concerns: +- **How the app decides it is offline** (the hard-stop model, failure tracking, reachability) → [Network State Detection](NETWORK_STATE_DETECTION.md). +- **How features should behave when offline** (optimistic UX patterns A/B/C/D) → [Offline UX Patterns](philosophies/OFFLINE.md). + +This is an observational reference. Where current behavior diverges from apparent intent, that is noted neutrally in a **Sharp edges** subsection on the relevant block, and genuinely intent-ambiguous items are collected in [Open Questions](#open-questions-needs-maintainer-confirmation). All references use module and function **names**, not line numbers, so the doc survives refactors — read it in terms of blocks and their relationships. + +## Contents + +- [Overview](#overview) +- [Architecture Diagram](#architecture-diagram) +- [Lifecycle of One Request](#lifecycle-of-one-request) +- [Restart Recovery](#restart-recovery) +- Building Blocks: [SequentialQueue](#sequentialqueue-the-coordinator) · [PersistedRequests](#persistedrequests-the-store) · [Where the request hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt) · [RequestThrottle](#requestthrottle-backoff) · [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata) · [Public Contract](#public-contract-the-api-layer) · [Inbound Consumers](#inbound-consumers-who-calls-the-queue) +- [The State Machine](#the-state-machine) +- [Error Handling](#error-handling) +- Runtime Dimensions: [Multi-Tab & Leader Election](#multi-tab--leader-election) · [Offline Behavior](#offline-behavior) · [Pausing & Data-Gap Sync](#pausing--data-gap-sync) +- [The Middleware Chain](#the-middleware-chain-boundary) +- [Conflict Resolution](#conflict-resolution) +- [Test Coverage](#test-coverage) +- [Configuration Constants](#configuration-constants) +- [Key Modules Reference](#key-modules-reference) +- [Open Questions](#open-questions-needs-maintainer-confirmation) +- [Relationship to Other Docs](#relationship-to-other-docs) + +## Architecture Diagram + +``` + feature code + │ API.write(command, params, onyxData, conflictResolver) + ▼ +┌────────────────────────────────────────────────────────────┐ +│ API.index — prepareRequest │ +│ • apply optimisticData to Onyx (UI updates NOW) │ +│ • evaluate conflictResolver #1 (read-only: suppress optim.?)│ +│ • stamp requestIndex++ (per-tab counter; requestID legacy) │ +│ • strip optimisticData from the persisted request │ +└───────────────┬──────────────────────────────────────────────┘ + write │ read → waitForWrites() → SequentialQueue.waitForIdle() + ▼ (blocks on isReadyPromise) +┌────────────────────────────────────────────────────────────┐ +│ SequentialQueue.push() │ +│ • conflictResolver #2 (authoritative: mutate the queue) │ +│ • PersistedRequests.save → in-memory SYNC, disk ASYNC │ +│ • dispatch: │ +│ offline → persist only, return (reconnect flushes) │ +│ running → defer: isReadyPromise.then(flush) │ +│ idle → flush(true) NOW │ +└───────────────┬──────────────────────────────────────────────┘ + ▼ +┌────────────────────────────────────────────────────────────┐ +│ flush(shouldResetPromise) │ +│ guards (in order): paused? · offline? · already running? · │ +│ all empty? (queue · ongoing · buffered │ +│ onyx updates) · NOT the leader? │ +│ → isSequentialQueueRunning = true │ +│ → (maybe) reset isReadyPromise │ +│ → load PERSISTED_REQUESTS, then process() │ +└───────────────┬──────────────────────────────────────────────┘ + ▼ +┌────────────────────────────────────────────────────────────┐ +│ process() (recurses until queue empty) │ +│ • processNextRequest(): head → ongoingRequest │ +│ (atomic multiSet of both keys; returns LOCAL ref) │ +│ • processWithMiddleware(request, isFromSequentialQueue=true) │ +│ makeXHR → Logging → … → SaveResponseInOnyx → │ +│ FraudMonitoring │ +│ • .then → (pause if shouldPauseQueue) → remove request → │ +│ save queueFlushedData → throttle.clear → RECURSE │ +│ • .catch → error ladder (drop / treat-as-success / │ +│ retry+backoff / give-up) │ +└───────────────┬──────────────────────────────────────────────┘ + ▼ +┌────────────────────────────────────────────────────────────┐ +│ process().finally (the drain-end block, inside flush) │ +│ • isSequentialQueueRunning = false │ +│ • resolve isReadyPromise IFF offline OR queue empty │ +│ • if queue empty: flush QueuedOnyxUpdates, │ +│ then apply + clear queueFlushedData │ +└────────────────────────────────────────────────────────────┘ + + boundary collaborators (contract only — linked, not owned here): + • ActiveClientManager → isClientTheLeader() (web multi-tab) + • NetworkState → getIsOffline() (see NETWORK_STATE_DETECTION.md) + • OnyxUpdateManager → resolves shouldPauseQueue data gaps, then unpause() + + disk writes — PERSISTED_* keys; cache + subscribers update SYNC first, + the returned Onyx promise resolves on STORAGE COMMIT (IDB tx / SQLite WAL): + • #1 save() → PERSISTED_REQUESTS (in push; NOT awaited before flush) + • #2 processNextRequest → PERSISTED_REQUESTS + ONGOING (promotion; fire-and-forget) + • end / rollback / updateOngoing (fire-and-forget) + • conflict update / delete (awaited by handleConflictActions) + ✗ makeXHR (network) gates on auth readiness only — never on a queue disk write +``` + +## Lifecycle of One Request + +This is the connective narrative — how the blocks interact at runtime for a single `API.write()`. + +1. **Prepare (synchronous, in `API.index.prepareRequest`).** `optimisticData` is applied to Onyx immediately, so the UI reflects the change at once. The conflict resolver is evaluated **read-only** here, solely to decide whether to *suppress* the optimistic data (`conflictAction.type === 'noAction'`); the rest of the conflict action is ignored at this site. A `requestID` is stamped from a per-tab counter (the field is literally `requestIndex`; `requestID` is read only as a legacy fallback in `getClientRequestIndex` — this doc keeps the conventional name), and `optimisticData` is stripped from the version that will be persisted (it has already been applied). + +2. **Push (`SequentialQueue.push`).** The conflict resolver runs a **second** time, now **authoritatively**, against the live persisted requests — this is the evaluation that may mutate the queue (push / replace / delete). The resolver function is then deleted off the request (it cannot be serialized), and `PersistedRequests.save` mutates the in-memory array **synchronously** while issuing the disk write (captured as `persistencePromise`, resolved asynchronously on disk commit and **not** awaited before firing). `push` then dispatches on three conditions: **offline** → persist only and return (reconnect will flush later); **already running** → defer via `isReadyPromise.then(flush)`; **idle** → `flush(true)` now. + +3. **Flush (`flush`).** Five guards run in order — paused, offline, already-running, all-empty (three-legged: queue, `ongoingRequest`, **and** the `QueuedOnyxUpdates` buffer — see [the coordinator](#sequentialqueue-the-coordinator)), **not-leader**. If all pass, `isSequentialQueueRunning` is set, `isReadyPromise` is optionally reset, and a throwaway `PERSISTED_REQUESTS` connection guarantees disk is loaded before `process()` runs (it opts out of connection reuse, because `PersistedRequests` already holds a live `PERSISTED_REQUESTS` connection and reusing it would fire extra callbacks). + +4. **Process (`process`).** Guards again (paused / offline / empty), then `processNextRequest` promotes the head to `ongoingRequest` and runs the middleware chain. On success it removes the request and **recurses**; on error it walks the [error ladder](#error-handling). + +5. **Finally (the drain-end `process().finally`, chained inside `flush`).** The running flag clears, `isReadyPromise` resolves **iff offline or the queue is empty**, and — only when the queue is fully drained — `QueuedOnyxUpdates` is flushed and `queueFlushedData` is applied and cleared. + +## Restart Recovery + +The persisted queue exists so an interrupted WRITE survives an app kill. The pieces are described across the blocks below; here is the one cold-start path that strings them together. + +1. **Disk load.** On startup, the `PERSISTED_REQUESTS` and `PERSISTED_ONGOING_REQUESTS` connect callbacks hydrate the in-memory mirror — `persistedRequests` from the array, `ongoingRequest` from the single key. +2. **Head dedup.** If the app was killed _after_ a request was promoted to `ongoingRequest` but _before_ it was removed, the same request can sit both as `ongoingRequest` and as the head of the array. The init path strips the array head when it `deepEqual`s the rehydrated `ongoingRequest` — this is the reason that structural dedup exists. +3. **Flush.** `onPersistedRequestsInitialization` fires `flush()` once there is recovered work (and, separately, `Network/index` fires a startup flush gated on `ActiveClientManager.isReady()`, so the leader gate is meaningful on the first drain). The registered callback is not startup-only: the `PERSISTED_ONGOING_REQUESTS` connect callback re-fires it post-init whenever a *new* ongoing request appears (e.g. observed from another tab) — see [Inbound Consumers](#inbound-consumers-who-calls-the-queue). +4. **Re-drive.** `processNextRequest` sees a non-null `ongoingRequest` and returns it directly (rather than promoting a new head), so the **same** interrupted request is re-sent. + +**What does _not_ recover:** + +- A **File/Blob ongoing request** — `PERSISTED_ONGOING_REQUESTS` holds `null` on disk (the live object was memory-only), so there is nothing to re-drive; the upload is lost. +- Anything still inside the **persist-before-fire crash window** — fired (or about to fire) but not yet committed to disk; it is not reloaded and not re-driven, even though the server may already have acted. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). + +**On sign-out (a different reset):** `Session.cleanupSession()` aborts the in-flight cancellable XHR via `HttpUtils.cancelPendingRequests()` — surfacing as `REQUEST_CANCELLED` to `process()`'s catch (dropped, no data applied) — and clears the persisted queue via `PersistedRequests.clear()`. This is the only inbound path that _aborts_ an in-flight request rather than letting it complete. A bare `Onyx.clear()` (without the abort) resets the mirror to `[]`/`null` through the carve-out path, but because `isInitialized` is already true the init callback does **not** re-fire, so clearing on its own schedules no flush; an in-flight `process()` chain holds local references and is unaffected by the memory wipe. + +--- + +# Building Blocks + +Each block is described as **the problem it solves → how it works today → sharp edges** (neutral observations where current behavior is surprising or diverges from apparent intent). For crux/connective blocks (e.g. the disk-persistence and runtime-dimension sections), the sharp edges are woven into the prose rather than a labeled subsection. + +## SequentialQueue (the coordinator) + +**Problem it solves.** Serialize every WRITE so requests reach the server one at a time, in submission order, exactly once — with automatic retry on transient failure and correct ordering of dependent READs. + +**How it works today.** +- **Single-flight.** `isSequentialQueueRunning` is the re-entrancy guard: set at the top of `flush()` after the guards pass, cleared in `process().finally`. While set, new pushes defer rather than start a parallel drain. +- **`push()` dispatch.** As above: offline → persist only; running → defer behind `isReadyPromise.then(flush)`; idle → `flush(true)`. `push` returns the disk-persistence promise but **does not await it** — the queue fires off the synchronous in-memory state. +- **`flush()` guards, in order.** `isQueuePaused` → `isOfflineNetwork()` → `isSequentialQueueRunning` → all-empty → `!isClientTheLeader()`. The all-empty guard is three-legged — no queued requests, no `ongoingRequest`, **and** an empty `QueuedOnyxUpdates` buffer — so `flush()` doubles as the drain path for buffered updates on a request-empty queue. Leadership is checked **exactly once**, before the running flag flips; the recursive `process()` chain re-checks only `isQueuePaused` and `isOfflineNetwork()`, never leadership. +- **`process()` recursion.** Pull the next request, run middleware, on success recurse; the recursion terminates when the queue and ongoing request are both empty. +- **READ-after-WRITE gate.** `waitForIdle()` returns `isReadyPromise`; READs block on it so a READ response cannot clobber optimistic data from in-flight WRITEs on the same keys. See [the state machine](#the-state-machine) for the resolution rule. + +**Sharp edges.** +- `push()` triggers `flush()` from the **synchronous in-memory** save and never awaits `persistencePromise`; the network dispatch (`makeXHR`) gates only on auth readiness, never on the queue's disk write. So a request can reach the server while still absent from disk — a crash in that window loses the queue record even though the server may already have acted. This is the named target of the persist-before-fire work; see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). +- Leadership being checked once per flush means a leadership change *during* an in-flight drain is not re-evaluated by the running cycle (see [Multi-Tab & Leader Election](#multi-tab--leader-election)). + +## PersistedRequests (the store) + +**Problem it solves.** Hold the queue durably enough to survive offline periods and app restarts, while remaining correct under (a) Onyx's synchronous, sometimes out-of-order subscription callbacks and (b) multiple browser tabs sharing one queue. + +**How it works today.** +- **Two Onyx keys, one mirror.** `PERSISTED_REQUESTS` (the array) and `PERSISTED_ONGOING_REQUESTS` (the single in-flight request) back an in-memory mirror (`persistedRequests`, `ongoingRequest`). After initialization the in-memory copies are authoritative. +- **The ongoing-request model.** `processNextRequest` captures the head as a **local** reference, sets `ongoingRequest`, trims the queue (`slice(1)`), and writes **both keys in a single atomic `Onyx.multiSet`** — so a crash cannot leave the request both "in the queue" and "ongoing." `rollbackOngoingRequest` and `endRequestAndRemoveFromQueue` likewise update both keys together. +- **Why `processNextRequest` returns the local reference.** Onyx's post-`multiSet` callback fires synchronously and would overwrite the module-level `ongoingRequest` with a JSON-serialized copy — which strips `File`/`Blob` payloads. Returning the captured local reference preserves the live objects for the request about to run. (The `knownOngoingRequestIDs` own-write guard ignores that echo for requests carrying a `requestIndex` — the local-ref return is load-bearing for those without one.) +- **File/Blob _ongoing_ requests are not durably persisted.** `shouldPersistOngoingRequest` returns false when any value in `request.data` is a `File`/`Blob`, so `null` is written to `PERSISTED_ONGOING_REQUESTS` and the live object is kept only in the in-memory `ongoingRequest` mirror (a `File`/`Blob` cannot be structured-cloned to survive a restart anyway). Note this null-out guard applies **only to the ongoing key** — the `PERSISTED_REQUESTS` array writes the full request, File/Blob included; IndexedDB can store it on web, while SQLite reduces it to `{}` on native. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). +- **Structural removal.** `endRequestAndRemoveFromQueue` and the init-time dedup match by `deepEqual` (structural), **not** by `requestID`. `requestID` matching is used only for cross-tab merge and leader-deletion reconciliation (via `knownRequestIDs`). +- **`getLength()` counts the ongoing request.** It returns the array length plus one when `ongoingRequest` is non-null, so `getLength() === 0` means the queue is truly idle — `API.index` reads this to decide whether a READ must wait for writes. + +### Why in-memory is authoritative + +Onyx (since 3.0.46) fires `set`/`multiSet` subscription callbacks **synchronously and sometimes out of order**. A naive subscriber that adopted every echoed value could let a stale, older echo overwrite correct in-memory state — losing or duplicating queued requests (the failure mode behind that era's lost/duplicated-request bug). So `PersistedRequests` ignores its own write echoes: + +- The `PERSISTED_REQUESTS` callback, once initialized, does **not** blindly adopt disk. It absorbs only requests whose `requestID` is **not** in `knownRequestIDs` (genuine cross-tab writes — which it re-persists and reports via the cross-tab callback), and it reconciles leader-tab deletions only while `pendingOnyxWrites === 0`. +- `pendingOnyxWrites` is a hand-maintained counter: `trackOnyxWrite` wraps the `Onyx.set`/`multiSet` calls the module issues (across both keys) so the callback can tell "this echo is my own in-flight write" from "this is a real external change." One write escapes the wrapper: `clear()` issues its `Onyx.set(PERSISTED_ONGOING_REQUESTS, null)` bare (only the companion array write is tracked). +- **Two carve-outs** still adopt disk: a `null` value (e.g. from `Onyx.clear()`) falls through to the disk-load/re-init path (the guard is gated on `val != null`); and genuinely-new cross-tab requests identified by an unknown `requestID`. + +**Sharp edges.** +- The `PERSISTED_ONGOING_REQUESTS` callback **is guarded against own-write echoes, but differently** from the array: it early-returns when the value's `requestIndex` is in `knownOngoingRequestIDs` (a set fed by `processNextRequest` and `updateOngoingRequest`). It never consults `pendingOnyxWrites`, and two paths are unguarded — a `null` echo and a value with an unknown (or missing) `requestIndex` are adopted unconditionally (see [Open Questions](#open-questions-needs-maintainer-confirmation)). The same callback also fires `triggerInitializationCallback()` (= `flush`) post-init when a new ongoing request appears — a mid-session flush trigger (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). +- The `pendingOnyxWrites` counter is a hand-rolled ordering mechanism reimplementing a guarantee the data layer doesn't provide. It is correct only as long as every own write is wrapped and increments/decrements stay balanced (`clear()`'s bare ongoing-key write is the one exception — see above). +- Structural `deepEqual` removal is keyed on object shape, not identity. If a request object is mutated mid-flight by middleware (e.g. optimistic-ID rewriting, a `shouldRetry` flag, `File` stripping on serialization, an added rollback marker), a later `deepEqual` can fail to match the original. + +## Where the request actually hits disk (and where it doesn't) + +This is the crux for any crash-safety or persist-before-fire reasoning, so it is worth pinning exactly. + +**A WRITE persists through one function.** `PersistedRequests.save()` — invoked by `SequentialQueue.push()` directly, or via `handleConflictActions` for the conflict paths. Every mutator in `PersistedRequests` shares one shape: the in-memory module state (`persistedRequests`, `ongoingRequest`, `knownRequestIDs`) is updated **synchronously first**, then the Onyx write is issued with the already-updated value, wrapped in `trackOnyxWrite`. + +**The Onyx promise is a real disk-commit handle, not a cache handle.** In `OnyxUtils` (`setWithRetry`/`multiSetWithRetry`), the cache is updated and subscribers are notified **synchronously** via `broadcastUpdate`/`keyChanged` — there is a source comment, _"This approach prioritizes fast UI changes without waiting for data to be stored in device storage"_ — and the function then returns the `storage.setItem`/`storage.multiSet` chain. That returned promise resolves only after the storage provider commits: an IndexedDB transaction `complete` event on web, or a SQLite WAL commit on native (`journal_mode=WAL`, `synchronous=NORMAL` — durable across a JS/app crash, weaker than an fsync against OS-level power loss). So `push()`'s `persistencePromise` **does** represent durable persistence. Three caveats bound that statement: a **`null` write** returns `Promise.resolve()` immediately (the storage delete runs fire-and-forget — see below), so for null/delete payloads the promise is *not* a commit handle, and a `multiSet` containing a `null` value covers only its non-null pairs (e.g. `endRequestAndRemoveFromQueue`'s "atomic" promise covers only the array write, not the ongoing-key delete); Onyx's internal `retryOperation` **swallows storage failures** after `MAX_STORAGE_OPERATION_RETRY_ATTEMPTS` (5), resolving anyway; and on web Onyx can silently **degrade to `MemoryOnlyProvider`** when the IndexedDB store cannot be created, after which every "commit" is memory-only. + +**But the disk write does not gate the network.** `push()` assigns `persistencePromise`, then on the idle-online path calls `flush(true)` **synchronously on the same tick** before `return persistencePromise` — `flush()` is never chained off `persistencePromise.then(...)`. Because `save()` already updated the in-memory array synchronously, `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` sees the request immediately and dispatches it. `makeXHR` gates the actual `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) — never on the `PERSISTED_REQUESTS` disk write. The request can therefore leave the device while `persistencePromise` is still pending. + +``` +push() (online · idle · no conflict) + │ + ├─ save(req) + │ ├─ persistedRequests.push(req) ── in-memory mirror: SYNC, authoritative now + │ └─ persistencePromise = Onyx.set(PERSISTED_REQUESTS, arr) + │ ├─ cache + subscriber broadcast ── SYNC (same tick) + │ └─ storage.setItem(...) ── async → resolves on DISK COMMIT ▒▒ #1 + │ + ├─ flush(true) ── SYNC, same tick · NOT chained off persistencePromise.then + │ └─ process() + │ ├─ processNextRequest → Onyx.multiSet(PERSISTED_REQUESTS, ONGOING) ▒▒ #2 (not awaited) + │ └─ processWithMiddleware → makeXHR + │ └─ gate: hasReadRequiredDataFromStorage() (auth only — no disk gate) + │ └─ HttpUtils.xhr(...) ───────────────────────► NETWORK FIRES + │ + └─ return persistencePromise (write() awaits only this; resolves to void) + + ╔══════════════════════════════════════════════════════════════════╗ + ║ CRASH WINDOW: NETWORK FIRES ──► ... ──► #1 disk commit ║ + ║ the request can reach (and be acted on by) the server while still ║ + ║ absent from disk → a crash/reload here loses the queue record and ║ + ║ it is NOT recovered on restart. ║ + ╚══════════════════════════════════════════════════════════════════╝ +``` + +**Persist points** (all update the in-memory mirror synchronously first; the Onyx promise resolves on disk commit): + +| When | Function | Key(s) written | Awaited by the control flow? | +|---|---|---|---| +| Enqueue | `save()` | `PERSISTED_REQUESTS` | No — returned as `persistencePromise`, not awaited before `flush()` | +| Promote head → ongoing | `processNextRequest()` | both (atomic `multiSet`) | No — fire-and-forget | +| Success / non-retriable drop | `endRequestAndRemoveFromQueue()` | both (atomic `multiSet`) | No — fire-and-forget | +| Retryable failure | `rollbackOngoingRequest()` | both (atomic `multiSet`) | No — fire-and-forget | +| Conflict REPLACE | `update()` | `PERSISTED_REQUESTS` | **Yes** — `handleConflictActions` awaits it | +| Conflict DELETE | `deleteRequestsByIndices()` | `PERSISTED_REQUESTS` | **Yes** — `handleConflictActions` awaits it | +| In-flight update (e.g. reauth) | `updateOngoingRequest()` | `PERSISTED_ONGOING_REQUESTS` | No | + +> `save()`'s disk-write `.catch` is log-only and does not re-throw, so `persistencePromise` resolves to `void` whether the storage commit **succeeded or failed** — even a consumer that awaits it (e.g. `API.write`) cannot detect a failed persist. (Mirrors the missing `.catch` on the `queueFlushedData` chain.) Onyx itself can also swallow a failed commit — the retry caveat above — so the silence has two layers. + +**Where a request does _not_ hit disk:** + +- **READ commands and `makeRequestWithSideEffects`** never route through `save()` — nothing is persisted or recovered for them. +- **`save()` before initialization** parks the request in `pendingSaveOperations` and returns `Promise.resolve()`; the real write is deferred to the init connect callback. +- **A File/Blob _ongoing_ request** writes `null` to `PERSISTED_ONGOING_REQUESTS` (kept in memory only) and is unrecoverable on restart — but see the asymmetry below: the queue _array_ does write it. +- **`Onyx.set(key, null)`** (e.g. clearing the ongoing key) is routed to a storage **delete** (`remove()`), not a value write — and its returned promise is `Promise.resolve()`, resolved before the delete commits. +- **Subscriber broadcasts** are the synchronous cache/callback loop; they run before — and independently of — the storage write. +- **Asymmetry:** the File/Blob null-out guard applies **only to the ongoing key**. `PERSISTED_REQUESTS` array writes carry no such guard and pass the File/Blob through to the storage provider intact — IndexedDB structured-clone stores a `File` on web, while SQLite's `JSON.stringify` reduces it to `{}` (data lost) on native. File/Blob is therefore **not** uniformly kept off disk. + +## RequestThrottle (backoff) + +**Problem it solves.** Keep retrying a transiently-failing request without hammering a degraded backend — spread load with jittered exponential backoff — and provide a definite "give up" signal when retries are exhausted. + +**How it works today.** +- **`getRequestWaitTime`** seeds the first wait with a random jitter in `[MIN_RETRY_WAIT_TIME_MS, MAX_RANDOM_RETRY_WAIT_TIME_MS]` = `[10, 100]` ms, then **doubles** the prior wait on each retry, capped at `MAX_RETRY_WAIT_TIME_MS` (10 s). +- **`sleep`** increments the retry count and picks the cap — `MAX_OPEN_APP_REQUEST_RETRIES` (2) for the `OPEN_APP` command, else `MAX_REQUEST_RETRIES` (10). Within the cap it resolves after the wait; once exceeded it **rejects with no argument** — this argument-less rejection is the give-up signal that `process()`'s catch consumes. +- **`clear`** resets the wait, the retry count, and any pending timeout — called on success and on any non-retriable outcome. +- The queue uses a single shared instance, `sequentialQueueRequestThrottle`. + +**Sharp edges.** +- After the retry cap, a request is **permanently dropped** (see the [give-up row](#error-handling)) — for non-`OPEN_APP` commands with no user-facing modal. +- `clear()` fires on every success, so a burst that interleaves successes with failures keeps resetting backoff to the floor, weakening the intended exponential spacing against a degraded backend. + +## QueuedOnyxUpdates and queueFlushedData + +These are **two distinct deferral mechanisms** that are easy to confuse. + +**Problem `QueuedOnyxUpdates` solves.** Prevent UI flicker: if each WRITE's response Onyx updates were applied immediately as the queue drained, the UI would replay intermediate states. Instead, WRITE responses are buffered and applied **after the whole queue drains**. + +**How `QueuedOnyxUpdates` works.** It is a module-level **in-memory** buffer. `queueOnyxUpdates` appends and resolves immediately; `flushQueue` snapshots-and-clears (to avoid races) then applies via `Onyx.update`. The WRITE-vs-READ routing decision lives **upstream** in `OnyxUpdates.applyHTTPSOnyxUpdates`: WRITE-type responses (`onyxData`, then `successData`/`failureData`/`finallyData`) route through `queueOnyxUpdates`; READ-type responses apply via `Onyx.update` immediately. When not signed in, `flushQueue` filters the snapshot to a 14-key preserved allowlist (`SESSION`, `NETWORK`, `IS_LOADING_APP`, `HAS_LOADED_APP`, `CREDENTIALS`, `ACCOUNT`, `MODAL`, `PRESERVED_USER_SESSION`, theme/locale/try-new-dot NVPs, and RAM-only loading flags); the filter is skipped entirely under `CONFIG.IS_TEST_ENV`. The pause early-return lives in `SequentialQueue.flushOnyxUpdatesQueue` (returns early when `isQueuePaused`). It is invoked at the end of a full drain — and from two other entry points: `flush()`'s all-empty guard counts a non-empty buffer as pending work (a flush with zero requests proceeds purely to drain the buffer), and `unpause()` calls it directly when the queue is already empty. + +**Problem `queueFlushedData` solves.** Apply a small piece of data **only after a full drain** — specifically, mark the app as loaded only once the queue has actually emptied (not mid-drain). + +**How `queueFlushedData` works.** It is a **distinct, Onyx-persisted** buffer (`QUEUE_FLUSHED_DATA`), separate from the in-memory `QueuedOnyxUpdates`. `SequentialQueue.saveQueueFlushedData` appends a successfully-processed request's `queueFlushedData` field; the queue applies it via `Onyx.update` and clears it only when fully drained (after `flushOnyxUpdatesQueue`). Its sole producer is `App.getOnyxDataForOpenOrReconnect` (`OPEN_APP` / `ReconnectApp`), currently carrying exactly one entry: a merge of `HAS_LOADED_APP = true`. + +**Sharp edges.** +- Both apply **only** when the queue reaches fully-empty. Under sustained WRITE pressure neither applies, so `HAS_LOADED_APP` stays unflipped and the buffers accumulate. +- The application chain in the drain-end `process().finally` has no `.catch` — a failed `Onyx.update` silently skips the subsequent clear. + +## Public Contract (the API layer) + +**Problem it solves.** Give feature code one offline-first surface so it never has to think about connectivity, ordering, or retries. + +**How it works today.** + +| Method | Persisted? | Retried? | Ordering | Notes | +|---|---|---|---|---| +| `API.write` | Yes (via `push`) | Yes (throttle) | FIFO, leader-only | Applies `optimisticData` synchronously, strips it from the persisted request, then `push`es. Resolves to `void` — does **not** await the network round-trip. Maps to [OFFLINE.md](philosophies/OFFLINE.md) patterns A/B. | +| `API.read` | No | No | Blocks on `waitForIdle()` (`isReadyPromise`) before sending | Applies `optimisticData` but never persists. Fire-and-forget through the middleware. Two short-lived-auth sign-in commands bypass `waitForWrites`; `API.paginate`'s READ flavor sits behind the same gate. | +| `API.makeRequestWithSideEffects` | No | No | None — bypasses the queue entirely | For commands whose caller needs the response. Online-only in effect. | + +- **The conflict resolver runs twice.** Read-only in `prepareRequest` (only to set "apply optimistic data?"), then authoritatively in `push` (mutates the queue). See [Conflict Resolution](#conflict-resolution). +- **`API.read` blocks on writes** because a READ that returned before in-flight WRITEs landed could overwrite optimistic data on the same keys. + +**Sharp edges.** +- `makeRequestWithSideEffects` is **not** persisted or retried. Offline, the underlying `fetch` fails and the promise the caller received **rejects** (logged and re-thrown by the `Logging` middleware). It is not silently swallowed — but nothing retries it, so the caller must handle the rejection. + +## Inbound Consumers (who calls the queue) + +The blocks above describe what the queue does; this is the inbound surface — who drives it and the ordering guarantees they lean on. + +**Who triggers `flush()`:** + +- `push()` on the idle-online path. +- `onPersistedRequestsInitialization` — recovered work at startup (see [Restart Recovery](#restart-recovery)). +- `onCrossTabRequestsMerged` — another tab enqueued a genuinely-new request. +- The `PERSISTED_ONGOING_REQUESTS` connect callback — post-init, when a **new** ongoing request appears (e.g. observed from another tab), it re-fires the registered initialization callback, which is `flush`. +- `Reconnect` — on **two** edges: the offline→online `NetworkState` transition and the app-became-active listener. Both are deliberately decoupled from `reconnect()`'s data-sync (`openApp`/`reconnectApp`), which never flushes. +- `Network/index` startup — `ActiveClientManager.isReady().then(flush)`, the primary "drain whatever survived the last session, once we know who the leader is" trigger. +- The **native background-fetch wakeup** (`backgroundTask`), which flushes on a native scheduler tick. + +**Two ordering gates (don't conflate them):** + +- **`waitForIdle()`** resolves when the _whole_ queue drains (returns `isReadyPromise`). Beyond `API.read`, it is consumed by auth-token-swap and post-sign-in flows — `Delegate` connect/disconnect, `Session` merge-account, and `SignInModal` before `openApp` — to fully drain pending WRITEs under the **current** authToken before swapping tokens; sending a queued WRITE under a new token trips `Reauthentication` and forces a logout. +- **`getCurrentRequest()`** resolves when only the _single in-flight_ request's `process()` chain settles (or immediately if none is in flight). Its production consumer is `User.ts`'s Pusher multi-event handler, which sequences server-pushed Onyx updates **after** the in-flight WRITE so a push cannot apply ahead of the WRITE response that may carry related updates. + +**Direct `PersistedRequests` callers (bypassing `push`/`flush`):** + +- `App` clear-and-restore and `ImportOnyxState` — `rollbackOngoingRequest()` to snapshot/restore the queue around an `Onyx.clear()` or a state import. The `App` path also takes a `getAll()` snapshot and restores via direct `save()` calls — a raw enqueue that bypasses `push()` (no conflict resolution, no flush trigger). +- `Report.getGuidedSetupDataForOpenReport` — `getAll()` (read-only) to scan the queue for an already-enqueued `OPEN_REPORT` carrying `guidedSetupData`, preventing duplication. +- `Policy` and `HandleUnusedOptimisticID` — `getAll()` + `update()`/`updateOngoingRequest()` to rewrite a queued (or in-flight) request in place, e.g. optimistic-ID repair. +- `API.index` — `getLength()` (read-only) to decide read-blocking. +- `isPaused()` / `isRunning()` are read-only accessors — `RequestsQueuesState` diagnostics reads these plus `getAll()`/`getOngoingRequest()` directly; `MainQueue` consumes `isRunning()` **behaviorally**, holding its own processing while the sequential queue drains. + +--- + +# The State Machine + +`SequentialQueue` is driven by six module-level variables plus the shared throttle instance. This table is the densest reference for reasoning about invariants during a refactor. + +| Variable | Meaning | Set where | Cleared where | Invariant | +|---|---|---|---|---| +| `isSequentialQueueRunning` | Single-flight / re-entrancy guard | top of `flush()` after guards pass | `process().finally` | At most one drain in progress per tab | +| `currentRequestPromise` | The in-flight `process()` chain (for `getCurrentRequest`) | start of the process chain | the drain-end `process().finally` (set to `null`) | Non-null only while a request is in flight | +| `isQueuePaused` | **Overloaded:** offline pause **or** `shouldPauseQueue` data-gap sync | `pause()` / a `shouldPauseQueue` response | `unpause()` / `resetQueue()` | While true, nothing is processed | +| `isReadyPromise` / `resolveIsReadyPromise` | One-shot gate READs block on (READ-after-WRITE ordering) | resolved at module load; **reset** in `flush(true)` | resolved in `finally` (offline or empty) | A READ may proceed only when no WRITE it must follow is pending | +| `shouldFailAllRequests` | Sticky `NETWORK`-key flag → erroring requests are failed and dropped | `NETWORK` Onyx callback | `NETWORK` Onyx callback | Test/debug only | +| `queueFlushedDataToStore` | In-memory mirror of `QUEUE_FLUSHED_DATA` | the `QUEUE_FLUSHED_DATA` connect-callback echo of `saveQueueFlushedData`'s `Onyx.set` | `clearQueueFlushedData` | Applied only on full drain | +| `sequentialQueueRequestThrottle` | Shared backoff state (wait time, retry count, pending timeout) | `sleep()` on each generic-error retry | `clear()` on success and every non-retriable outcome | Backoff state never survives a settled request | + +### Why `isReadyPromise` resolves on offline, not on paused + +In the drain-end `process().finally`, `resolveIsReadyPromise` runs only when `isOfflineNetwork() || !hasRemainingRequests` — **deliberately not** when `isQueuePaused`. The reason is that `isQueuePaused` is overloaded: + +- **Offline pause** → resolve. The queue cannot process anyway, so READs should be allowed through (they'll show optimistic/stale data, which is acceptable offline). +- **`shouldPauseQueue` data-gap sync** → do **not** resolve. WRITEs are still pending behind the gap, so READs must keep waiting or they'd clobber in-flight optimistic data. + +`unpause()` calls `flush(false)` precisely to **preserve** the existing `isReadyPromise` — swapping in a fresh one would orphan READs already awaiting the old promise. + +**Sharp edges.** `isReadyPromise` is a single shared one-shot resolver. If a flush round resets it but never reaches the `finally` that resolves it (e.g. a persistent, unresolvable `shouldPauseQueue` data gap), subsequent READs blocked on `waitForIdle()` have no timeout. + +--- + +# Error Handling + +When a request errors, `process()`'s `.catch` walks an ordered ladder. **Which Onyx data gets applied diverges sharply by error class** — this table is the contract. + +| Error class | Matched by | Retries? | Onyx data applied | User-facing modal | Source / meaning | +|---|---|---|---|---|---| +| `REQUEST_CANCELLED` | `error.name` | No — drop | **None** | No | Sign-out cancels in-flight requests; optimistic state assumed correct | +| `DUPLICATE_RECORD` | `error.message` | No — drop | **None** | No | Record already exists server-side; optimistic state assumed correct | +| `shouldFailAllRequests` | `NETWORK` flag (same branch as above) | No — drop | `failureData` **+** `finallyData` | No | Test/debug fail-all | +| `ALREADY_CREATED` | `error.message` | No — drop | `successData` **+** `finallyData` | No | Resource already created server-side → treat as success | +| `THROTTLED` on `RESEND_VALIDATE_CODE` | `error.message` + command | No — drop | `failureData` | No | Rate-limit (429); do not retry to avoid spam | +| Generic (everything else) | — | **Yes** — `rollbackOngoingPersistedRequest()` then `throttle.sleep().then(process)` | (none until resolved/given-up) | No | Transient network/server errors | +| Give-up (generic, retries exhausted) | `throttle.sleep` rejects with no arg | No — drop | **`failureData` only** (not `finallyData`) | **`OPEN_APP` only** (`setIsOpenAppFailureModalOpen`) | Retry cap (10, or 2 for `OPEN_APP`) reached | + +Notes: +- **Success path** does not live here. `SaveResponseInOnyx` applies the server's `onyxData` + `successData` + `finallyData`; for WRITE-type responses that is routed through [`QueuedOnyxUpdates`](#queuedonyxupdates-and-queueflusheddata) (deferred until drain). +- **`shouldFailAllRequests` vs. give-up** differ: the former applies failure **and** finally data; the give-up branch applies **only** failure data. + +**Sharp edges.** For all non-`OPEN_APP` commands, a genuinely transient-but-long backend outage that exceeds the retry cap **silently drops** the user's optimistic write (applying `failureData`, no modal). Whether this is intended product behavior is an [open question](#open-questions-needs-maintainer-confirmation). + +--- + +# Runtime Dimensions + +Three concerns cut across every block. + +## Multi-Tab & Leader Election + +> **Web-only.** On native, `ActiveClientManager.isClientTheLeader()` always returns `true` (single client), so this entire dimension is a no-op there. + +**Problem it solves.** On web, `PERSISTED_REQUESTS` is a single Onyx key replicated to **every tab** of one browser. Without coordination, N tabs would each drain the shared queue and send every request N times. + +**How it works today.** `flush()` hard-gates on `isClientTheLeader()` (the last GUID in the shared active-clients list wins — with an `isPromotingNewLeader` carve-out: during the `beforeunload` handoff the departing leader keeps answering `true` even when its GUID is no longer last). Only the leader drains; followers may still **enqueue** (their genuinely-new requests merge cross-tab via `requestID` not in `knownRequestIDs`) but never send. Leadership is checked **once** per flush, before the running flag flips. + +**Sharp edges.** +- **Mid-flush leadership change.** Because leadership is checked once and the async `process()` recursion never re-checks, a leader tab closed/refreshed mid-request can hand leadership to a follower that flushes the same shared queue while the old request is still in flight. There is no cross-tab lock on the ongoing request — duplicate-send avoidance relies on server-side `DUPLICATE_RECORD` / `ALREADY_CREATED` reconciliation, not a lock. +- **`requestID` is per-tab, not globally unique.** It comes from a module-scoped counter initialized to a wall-clock timestamp at module load and incremented per request (stamped as `requestIndex`; `requestID` is the legacy fallback name); each tab has its own counter. Timestamp seeding *reduces* but does not *eliminate* cross-tab `requestID` collisions, which the `knownRequestIDs` cross-tab diff relies on. + +## Offline Behavior + +**Problem it solves.** Let the user keep working with no connection; deliver their writes when connectivity returns. + +**How it works today.** `getIsOffline()` (from `NetworkState`, read synchronously) gates `push`, `flush`, and `process`. Offline, `push` persists and returns without flushing. On reconnect, `Reconnect` calls `SequentialQueue.flush()` from two edges — the offline→online `NetworkState` transition and the app-became-active listener — deliberately separate from its `reconnect()` data-sync, which never flushes (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). The queue does **not** own offline *detection* — see [Network State Detection](NETWORK_STATE_DETECTION.md) for the hard-stop model and recovery probes. + +## Pausing & Data-Gap Sync + +**Problem it solves.** The server tags WRITE responses with update IDs. If the client has missed intermediate updates (a gap between `lastUpdateIDAppliedToClient` and the response's `previousUpdateID`), applying further buffered WRITE responses would put Onyx data out of causal order. The queue must pause until the gap is filled. + +**How it works today.** `SaveResponseInOnyx` is the **sole producer** of `shouldPauseQueue`: it sets the flag when the command is not in `requestsToIgnoreLastUpdateID` and `OnyxUpdates.doesClientNeedToBeUpdated` reports a `previousUpdateID` gap, after stashing the update info. `process()` then `pause()`s — the flag flips first, but the just-processed request is still removed and its `queueFlushedData` saved before the recursion early-returns on the pause. The actual gap fetch runs in a **decoupled** `OnyxUpdateManager` flow that fetches the missing range, applies the deferred updates in order, and then calls `unpause()`. Within this concern `pause()` has two inbound callers — `applyOnyxUpdatesReliably` (on detecting a missing-updates fetch) and `DeferredOnyxUpdates` — with `unpause()` called once the gap is filled. + +**Sharp edges.** +- `isQueuePaused` conflates this data-gap pause with the offline pause (see [the state machine](#why-isreadypromise-resolves-on-offline-not-on-paused)). +- `shouldPauseQueue` rides on the response object, so any middleware registered after `SaveResponseInOnyx` that returned a *new* response object would strip the flag. Today the only later middleware is `FraudMonitoring`, which returns the response unchanged — so the flag survives, but the in-code comment claiming `SaveResponseInOnyx` "must be last" is stale (see [Middleware](#the-middleware-chain-boundary)). + +--- + +# The Middleware Chain (boundary) + +**Problem it solves (for the queue).** `process()` hands a request to `processWithMiddleware` and gets back a single promise that either resolves (success, possibly with `shouldPauseQueue`) or rejects (an error the ladder can classify). The chain is where the XHR, auth, response persistence, and optimistic-ID repair happen. + +**How it works today.** +- **Registration order:** `Logging`, `LoadTest`, `FailureTracking`, `Reauthentication`, `handleDeletedAccount`, `SupportalPermission`, `HandleUnusedOptimisticID`, `Pagination`, `SentryServerTiming`, `SaveResponseInOnyx`, `FraudMonitoring`. +- **Execution model.** `processWithMiddleware` folds the chain with `reduce`, seeding from `makeXHR(request)`. Each middleware appends its own `.then` to the promise it receives. So the **first-registered** middleware (`Logging`) wraps the raw XHR and its response handler fires **first**; the **last-registered** (`FraudMonitoring`) fires **last**. **Response handlers fire in registration order** (not the reverse). +- **`isFromSequentialQueue = true`** changes exactly one middleware behavior: `Reauthentication` **re-throws** on failure so the queue's catch can retry (rather than resolving a callback). The `QueuedOnyxUpdates` routing in `SaveResponseInOnyx` is **not** keyed on this flag — it is keyed on `request.data.apiRequestType === WRITE` in `applyHTTPSOnyxUpdates` (see [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata)), which for queue traffic amounts to the same thing. +- **`SaveResponseInOnyx`** is the penultimate middleware and the sole producer of `shouldPauseQueue`. +- **`Reauthentication`** reacts to a 407 (`NOT_AUTHENTICATED`), which arrives as a *resolved* response (data), by reauthenticating. On the sequential-queue path it re-runs the chain; if reauth ultimately fails it throws `'Failed to reauthenticate'`. On reauth **success** the ongoing request stays promoted and is re-driven in place (no rollback, no dequeue); only reauth **failure** escapes to the generic ladder, which rolls the ongoing request back to the queue head and retries with backoff. `HandleUnusedOptimisticID` is the path that mutates the ongoing request in place (via `updateOngoingRequest`) during such a re-run. + +**Sharp edges.** +- The chain order is implicit import-time module state; there is no runtime assertion that `SaveResponseInOnyx` is penultimate or that `Reauthentication` precedes it. +- `SaveResponseInOnyx`'s early-return guard is effectively **dead code**: `onyxUpdates` defaults to `response.onyxData ?? []`, and an empty array is truthy, so `!onyxUpdates` is never true. It does not short-circuit empty responses. +- The in-code "must be last" comment on `SaveResponseInOnyx` is stale (`FraudMonitoring` follows it). + +--- + +# Conflict Resolution + +**Problem it solves.** When a new request would conflict with one already queued (e.g. two edits to the same field, or an add-then-delete), collapse or rewrite the queue so only the correct net effect is sent. Wrappers like `writeWithNoDuplicates*` build on this. + +**How it works today.** A request may carry `checkAndFixConflictingRequest`. It is evaluated **twice**: +1. **In `prepareRequest` (read-only)** against `getAll()`, solely to decide whether to suppress optimistic data (`conflictAction.type === 'noAction'`). The rest of the action is ignored here. +2. **In `push` (authoritative)** against the live persisted requests. The resolver function is deleted off the request (to keep it serializable), and the `conflictAction` is applied via `handleConflictActions`, which performs the queue mutations. The four action shapes: + - **push** — append the new request. + - **replace** — overwrite the request at a computed index. + - **delete** — remove requests at computed indices, optionally push a new request and/or recurse into a `nextAction` (`pushNewRequest` is a required boolean on the type; the optionality is behavioral). + - **noAction** — ignore the new request. + +**Sharp edges.** +- **Index fragility.** `replace`/`delete` carry **numeric indices** computed by the resolver at `push` time, but `handleConflictActions` applies them **asynchronously** (awaiting between positional operations). Meanwhile `processNextRequest` (`slice(1)`), `endRequestAndRemoveFromQueue` (splice), and cross-tab merges can shift indices. An index computed against one snapshot may point at a different request by the time it is applied. +- **Conflict checking sees only the queue, never the in-flight `ongoingRequest`** — a new write cannot dedup against a request that is already being sent. +- The resolver is expected to be pure, but production resolvers are not: `resolveCommentDeletionConflicts` performs an `Onyx.update`, and because `prepareRequest` invokes the **same** closure, the side effect fires on **both** evaluations (twice per write; an idempotent merge today). `resolveEditCommentWithNewAddCommentRequest` goes further and mutates the queued request's `data` in place during evaluation. + +--- + +# Test Coverage + +Tests are useful here as evidence of the **intended contract** — what the team asserts as a guarantee. (Stated as observation; gaps below are facts, not a to-do list.) + +| Suite | What it pins as contract | +|---|---| +| `tests/unit/APITest.ts` | Offline-persist, online-flush, persisted-until-backend-response, retry/backoff for 5xx + Auth-down (asserts ~3 fetches with throttle waits), reauthentication + ordering, WRITE-blocks-READ, pause/unpause, no-duplicates + enable-feature conflict collapsing, supportal-411 no-retry + `failureData` | +| `tests/unit/SequentialQueueTest.ts` | Push/persist counts, conflict resolution (replace/push/noAction, incl. replace-while-ongoing), move-to-ongoing persistence, startup hydration of the ongoing request, `queueFlushedData` lifecycle, `ALREADY_CREATED` treated as success without retry | +| `tests/unit/PersistedRequests.ts` | Save/remove/processNext/update/updateOngoing, File/Blob handling, cross-tab follower reconciliation; the `Issue 3a`/`3c` cases assert the **durable** ongoing-request behavior (`processNextRequest` always persists via `multiSet`) — a legacy test title references the vestigial `persistWhenOngoing` flag | +| `tests/unit/RequestConflictUtilsTest.ts` | Pure push/replace/delete/noAction resolver logic | +| `tests/actions/QueuedOnyxUpdatesTest.ts` | Account-scoped update filtering on flush | +| `tests/unit/MiddlewareTest.ts` | `SaveResponseInOnyx` and `HandleUnusedOptimisticID` | +| `tests/unit/ReconnectTest.ts` | The offline→online transition flushes the queue exactly once | +| `tests/actions/OnyxUpdateManagerTest.ts` | The queue is paused while a missing-updates gap is fetched, and resumed after | + +The suites rely on `resetQueue()` (resets the four core coordinator vars — running flag, `currentRequestPromise`, pause flag, `isReadyPromise` — but not `shouldFailAllRequests`, `queueFlushedDataToStore`, or the throttle) and `resetPendingWritesForTest()` (resets only the `pendingOnyxWrites` counter) — test-only reset primitives, the pendants to `clear()`. + +**Untested on this branch (factual):** +- The **give-up-after-max-retries** branch (apply `failureData`, dequeue, open the `OPEN_APP` failure modal) — all retry tests recover within 1–3 attempts. +- A **dedicated `RequestThrottle` unit** — its backoff math and reject threshold are exercised only indirectly via the shared instance's wait-time getter. +- The **non-leader `flush` guard** — only follower disk reconciliation is covered, not the early return. +- **True mid-flight-crash-then-restart recovery** — startup hydration is simulated by manually seeding `PERSISTED_ONGOING_REQUESTS`. +- **`waitForIdle` is used only as a drain helper** in tests (pagination, attachments, workspace-upgrade, network, middleware), never asserted as a READ-blocks-on-WRITE ordering guarantee. + +**Behavior pinned to today's ordering.** `APITest.ts` › `API.write() persistence guarantees` › `'Issue 1: should persist the request before applying optimistic data'` currently asserts `optimisticDataApplied === true` and `requestPersistedBeforeOptimistic === false` — i.e. it pins the **current** optimistic-before-persist ordering, with a comment to flip the final assertion once persist-before-fire lands. (Sibling `Issue 2`/`Issue 5` cases in the same block are written to the post-fix behavior, so the block is internally mixed.) + +--- + +# Configuration Constants + +Defined in `src/CONST/index.ts` under `CONST.NETWORK` (retry/backoff subset relevant to the queue; the sustained-failure constants are documented in [Network State Detection](NETWORK_STATE_DETECTION.md)): + +| Constant | Value | Description | +|---|---|---| +| `MAX_REQUEST_RETRIES` | `10` | Retry cap for normal commands before give-up | +| `MAX_OPEN_APP_REQUEST_RETRIES` | `2` | Retry cap for the `OPEN_APP` command | +| `MIN_RETRY_WAIT_TIME_MS` | `10` | Lower bound of the first-retry jitter | +| `MAX_RANDOM_RETRY_WAIT_TIME_MS` | `100` | Upper bound of the first-retry jitter | +| `MAX_RETRY_WAIT_TIME_MS` | `10000` (10 s) | Cap for the doubling backoff | + +> There is **no** constant named `MIN_RANDOM_RETRY_WAIT_TIME_MS`; the jitter lower bound is `MIN_RETRY_WAIT_TIME_MS`. `CONST.JSON_CODE.NOT_AUTHENTICATED = 407` (used by `Reauthentication`) is defined elsewhere in `CONST`, not under `NETWORK`. + +--- + +# Key Modules Reference + +| Module | Responsibility | +|---|---| +| `src/libs/Network/SequentialQueue.ts` | Leader-tab serial executor for persisted WRITE requests: persist-then-flush one at a time through the middleware with throttle/backoff and the error ladder; gates dependent READs via `isReadyPromise`. | +| `src/libs/actions/PersistedRequests.ts` | In-memory mirror (`persistedRequests` + `ongoingRequest`) of the offline write queue, reconciled with the `PERSISTED_REQUESTS` / `PERSISTED_ONGOING_REQUESTS` keys; in-memory authoritative after init. | +| `src/libs/RequestThrottle.ts` | Per-instance exponential-backoff timing: jittered first wait, doubling capped at the max, retry-count gate, argument-less reject as the give-up signal. | +| `src/libs/actions/QueuedOnyxUpdates.ts` | In-memory buffer of WRITE-response Onyx updates, held until the queue drains then flushed (with a not-signed-in preserved-keys filter) — anti-flicker. | +| `src/libs/actions/OnyxUpdates.ts` (`applyHTTPSOnyxUpdates`) | Routes WRITE-type responses through `queueOnyxUpdates` and READ-type through `Onyx.update` immediately; home of `doesClientNeedToBeUpdated`. | +| `src/libs/API/index.ts` | Public facade: `prepareRequest` applies optimistic data + stamps `requestID`; routes `write`→queue, `read`→`waitForWrites` then fire-and-forget, `makeRequestWithSideEffects`→immediate. | +| `src/libs/Request.ts` (`processWithMiddleware`) | Folds the middleware chain over `makeXHR(request)`; first-registered is innermost, response handlers fire in registration order. | +| `src/libs/Middleware/SaveResponseInOnyx.ts` | Penultimate middleware; sole producer of `shouldPauseQueue` (on a `previousUpdateID` gap); persists server Onyx updates. | +| `src/libs/Middleware/Reauthentication.ts` | Re-throws on `isFromSequentialQueue` so the queue retries; reacts to 407 by reauthenticating; may throw `'Failed to reauthenticate'`. | +| `src/libs/ActiveClientManager/` | Leader election across web tabs (`isClientTheLeader`); no-op (always leader) on native. | +| `QUEUE_FLUSHED_DATA` (Onyx key) + `SequentialQueue.saveQueueFlushedData` | Distinct Onyx-persisted buffer applied only on full drain; sole producer is `App.getOnyxDataForOpenOrReconnect` carrying `HAS_LOADED_APP = true`. | + +--- + +# Open Questions (needs maintainer confirmation) + +These are intent-ambiguous from the code alone — flagged for a maintainer to ratify or correct, not asserted as fact. + +1. **`persistWhenOngoing` appears vestigial.** It is read/logged (in `processNextRequest`, `SequentialQueue`, `RequestsQueuesState`) but **never assigned** by any production write-path function (only in test fixtures); production persistence branches on `shouldPersistOngoingRequest`. Is it safe to remove, or is it a reserved hook? +2. **`SaveResponseInOnyx`'s early-return guard is dead code** (`!onyxUpdates` is never true because `onyxUpdates` defaults to `[]`). Was empty-response short-circuiting intended (i.e. should it test array length), or is the guard genuinely obsolete? +3. **Silent give-up data loss for non-`OPEN_APP` commands.** After the retry cap, the request is dropped with `failureData` applied and **no modal**. Is this the intended product behavior, or should more commands surface a failure to the user? +4. **Unguarded paths in the `PERSISTED_ONGOING_REQUESTS` callback.** The callback has an own-write guard (`knownOngoingRequestIDs`) but never consults `pendingOnyxWrites`, and `null` echoes and values with an unknown/missing `requestIndex` are adopted unconditionally — a weaker shield than the array callback's. Are these knowingly-accepted gaps, or should the array callback's full protection be mirrored here? +5. **The stale "`SaveResponseInOnyx` must be last" comment** vs. `FraudMonitoring` being registered after it. Is `FraudMonitoring`-after-`SaveResponseInOnyx` intentional (relying on it returning the response unchanged), and should the comment + ordering be made enforceable? + +--- + +# Relationship to Other Docs + +- [Network State Detection](NETWORK_STATE_DETECTION.md) — **how** the app detects connectivity and decides it is offline (the hard-stop model the queue's `getIsOffline()` gate reads). +- [Offline UX Patterns](philosophies/OFFLINE.md) — **how features should respond** to being offline (optimistic patterns A/B/C/D), which the `API.write`/`read` contract implements. From 98675633f878c15e4f2e3dee0dc08a259c9b5c0f Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Thu, 11 Jun 2026 15:51:10 +0200 Subject: [PATCH 02/12] Fix spellcheck failures in SEQUENTIAL_QUEUE.md Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 22 +++++++++++----------- cspell.json | 2 ++ 2 files changed, 13 insertions(+), 11 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index e3674af34d46..67da15e7b756 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -48,7 +48,7 @@ This is an observational reference. Where current behavior diverges from apparen ┌────────────────────────────────────────────────────────────┐ │ API.index — prepareRequest │ │ • apply optimisticData to Onyx (UI updates NOW) │ -│ • evaluate conflictResolver #1 (read-only: suppress optim.?)│ +│ • evaluate conflictResolver #1 (read-only: skip optimistic?)│ │ • stamp requestIndex++ (per-tab counter; requestID legacy) │ │ • strip optimisticData from the persisted request │ └───────────────┬──────────────────────────────────────────────┘ @@ -128,7 +128,7 @@ This is the connective narrative — how the blocks interact at runtime for a si The persisted queue exists so an interrupted WRITE survives an app kill. The pieces are described across the blocks below; here is the one cold-start path that strings them together. 1. **Disk load.** On startup, the `PERSISTED_REQUESTS` and `PERSISTED_ONGOING_REQUESTS` connect callbacks hydrate the in-memory mirror — `persistedRequests` from the array, `ongoingRequest` from the single key. -2. **Head dedup.** If the app was killed _after_ a request was promoted to `ongoingRequest` but _before_ it was removed, the same request can sit both as `ongoingRequest` and as the head of the array. The init path strips the array head when it `deepEqual`s the rehydrated `ongoingRequest` — this is the reason that structural dedup exists. +2. **Head dedupe.** If the app was killed _after_ a request was promoted to `ongoingRequest` but _before_ it was removed, the same request can sit both as `ongoingRequest` and as the head of the array. The init path strips the array head when it `deepEqual`s the rehydrated `ongoingRequest` — this is the reason that structural dedupe exists. 3. **Flush.** `onPersistedRequestsInitialization` fires `flush()` once there is recovered work (and, separately, `Network/index` fires a startup flush gated on `ActiveClientManager.isReady()`, so the leader gate is meaningful on the first drain). The registered callback is not startup-only: the `PERSISTED_ONGOING_REQUESTS` connect callback re-fires it post-init whenever a *new* ongoing request appears (e.g. observed from another tab) — see [Inbound Consumers](#inbound-consumers-who-calls-the-queue). 4. **Re-drive.** `processNextRequest` sees a non-null `ongoingRequest` and returns it directly (rather than promoting a new head), so the **same** interrupted request is re-sent. @@ -150,7 +150,7 @@ Each block is described as **the problem it solves → how it works today → sh **Problem it solves.** Serialize every WRITE so requests reach the server one at a time, in submission order, exactly once — with automatic retry on transient failure and correct ordering of dependent READs. **How it works today.** -- **Single-flight.** `isSequentialQueueRunning` is the re-entrancy guard: set at the top of `flush()` after the guards pass, cleared in `process().finally`. While set, new pushes defer rather than start a parallel drain. +- **Single-flight.** `isSequentialQueueRunning` is the re-entry guard: set at the top of `flush()` after the guards pass, cleared in `process().finally`. While set, new pushes defer rather than start a parallel drain. - **`push()` dispatch.** As above: offline → persist only; running → defer behind `isReadyPromise.then(flush)`; idle → `flush(true)`. `push` returns the disk-persistence promise but **does not await it** — the queue fires off the synchronous in-memory state. - **`flush()` guards, in order.** `isQueuePaused` → `isOfflineNetwork()` → `isSequentialQueueRunning` → all-empty → `!isClientTheLeader()`. The all-empty guard is three-legged — no queued requests, no `ongoingRequest`, **and** an empty `QueuedOnyxUpdates` buffer — so `flush()` doubles as the drain path for buffered updates on a request-empty queue. Leadership is checked **exactly once**, before the running flag flips; the recursive `process()` chain re-checks only `isQueuePaused` and `isOfflineNetwork()`, never leadership. - **`process()` recursion.** Pull the next request, run middleware, on success recurse; the recursion terminates when the queue and ongoing request are both empty. @@ -169,7 +169,7 @@ Each block is described as **the problem it solves → how it works today → sh - **The ongoing-request model.** `processNextRequest` captures the head as a **local** reference, sets `ongoingRequest`, trims the queue (`slice(1)`), and writes **both keys in a single atomic `Onyx.multiSet`** — so a crash cannot leave the request both "in the queue" and "ongoing." `rollbackOngoingRequest` and `endRequestAndRemoveFromQueue` likewise update both keys together. - **Why `processNextRequest` returns the local reference.** Onyx's post-`multiSet` callback fires synchronously and would overwrite the module-level `ongoingRequest` with a JSON-serialized copy — which strips `File`/`Blob` payloads. Returning the captured local reference preserves the live objects for the request about to run. (The `knownOngoingRequestIDs` own-write guard ignores that echo for requests carrying a `requestIndex` — the local-ref return is load-bearing for those without one.) - **File/Blob _ongoing_ requests are not durably persisted.** `shouldPersistOngoingRequest` returns false when any value in `request.data` is a `File`/`Blob`, so `null` is written to `PERSISTED_ONGOING_REQUESTS` and the live object is kept only in the in-memory `ongoingRequest` mirror (a `File`/`Blob` cannot be structured-cloned to survive a restart anyway). Note this null-out guard applies **only to the ongoing key** — the `PERSISTED_REQUESTS` array writes the full request, File/Blob included; IndexedDB can store it on web, while SQLite reduces it to `{}` on native. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). -- **Structural removal.** `endRequestAndRemoveFromQueue` and the init-time dedup match by `deepEqual` (structural), **not** by `requestID`. `requestID` matching is used only for cross-tab merge and leader-deletion reconciliation (via `knownRequestIDs`). +- **Structural removal.** `endRequestAndRemoveFromQueue` and the init-time dedupe match by `deepEqual` (structural), **not** by `requestID`. `requestID` matching is used only for cross-tab merge and leader-deletion reconciliation (via `knownRequestIDs`). - **`getLength()` counts the ongoing request.** It returns the array length plus one when `ongoingRequest` is non-null, so `getLength() === 0` means the queue is truly idle — `API.index` reads this to decide whether a READ must wait for writes. ### Why in-memory is authoritative @@ -227,7 +227,7 @@ push() (online · idle · no conflict) |---|---|---|---| | Enqueue | `save()` | `PERSISTED_REQUESTS` | No — returned as `persistencePromise`, not awaited before `flush()` | | Promote head → ongoing | `processNextRequest()` | both (atomic `multiSet`) | No — fire-and-forget | -| Success / non-retriable drop | `endRequestAndRemoveFromQueue()` | both (atomic `multiSet`) | No — fire-and-forget | +| Success / non-retryable drop | `endRequestAndRemoveFromQueue()` | both (atomic `multiSet`) | No — fire-and-forget | | Retryable failure | `rollbackOngoingRequest()` | both (atomic `multiSet`) | No — fire-and-forget | | Conflict REPLACE | `update()` | `PERSISTED_REQUESTS` | **Yes** — `handleConflictActions` awaits it | | Conflict DELETE | `deleteRequestsByIndices()` | `PERSISTED_REQUESTS` | **Yes** — `handleConflictActions` awaits it | @@ -251,7 +251,7 @@ push() (online · idle · no conflict) **How it works today.** - **`getRequestWaitTime`** seeds the first wait with a random jitter in `[MIN_RETRY_WAIT_TIME_MS, MAX_RANDOM_RETRY_WAIT_TIME_MS]` = `[10, 100]` ms, then **doubles** the prior wait on each retry, capped at `MAX_RETRY_WAIT_TIME_MS` (10 s). - **`sleep`** increments the retry count and picks the cap — `MAX_OPEN_APP_REQUEST_RETRIES` (2) for the `OPEN_APP` command, else `MAX_REQUEST_RETRIES` (10). Within the cap it resolves after the wait; once exceeded it **rejects with no argument** — this argument-less rejection is the give-up signal that `process()`'s catch consumes. -- **`clear`** resets the wait, the retry count, and any pending timeout — called on success and on any non-retriable outcome. +- **`clear`** resets the wait, the retry count, and any pending timeout — called on success and on any non-retryable outcome. - The queue uses a single shared instance, `sequentialQueueRequestThrottle`. **Sharp edges.** @@ -271,7 +271,7 @@ These are **two distinct deferral mechanisms** that are easy to confuse. **How `queueFlushedData` works.** It is a **distinct, Onyx-persisted** buffer (`QUEUE_FLUSHED_DATA`), separate from the in-memory `QueuedOnyxUpdates`. `SequentialQueue.saveQueueFlushedData` appends a successfully-processed request's `queueFlushedData` field; the queue applies it via `Onyx.update` and clears it only when fully drained (after `flushOnyxUpdatesQueue`). Its sole producer is `App.getOnyxDataForOpenOrReconnect` (`OPEN_APP` / `ReconnectApp`), currently carrying exactly one entry: a merge of `HAS_LOADED_APP = true`. **Sharp edges.** -- Both apply **only** when the queue reaches fully-empty. Under sustained WRITE pressure neither applies, so `HAS_LOADED_APP` stays unflipped and the buffers accumulate. +- Both apply **only** when the queue reaches fully-empty. Under sustained WRITE pressure neither applies, so `HAS_LOADED_APP` never flips and the buffers accumulate. - The application chain in the drain-end `process().finally` has no `.catch` — a failed `Onyx.update` silently skips the subsequent clear. ## Public Contract (the API layer) @@ -304,7 +304,7 @@ The blocks above describe what the queue does; this is the inbound surface — w - The `PERSISTED_ONGOING_REQUESTS` connect callback — post-init, when a **new** ongoing request appears (e.g. observed from another tab), it re-fires the registered initialization callback, which is `flush`. - `Reconnect` — on **two** edges: the offline→online `NetworkState` transition and the app-became-active listener. Both are deliberately decoupled from `reconnect()`'s data-sync (`openApp`/`reconnectApp`), which never flushes. - `Network/index` startup — `ActiveClientManager.isReady().then(flush)`, the primary "drain whatever survived the last session, once we know who the leader is" trigger. -- The **native background-fetch wakeup** (`backgroundTask`), which flushes on a native scheduler tick. +- The **native background-fetch wake-up** (`backgroundTask`), which flushes on a native scheduler tick. **Two ordering gates (don't conflate them):** @@ -327,13 +327,13 @@ The blocks above describe what the queue does; this is the inbound surface — w | Variable | Meaning | Set where | Cleared where | Invariant | |---|---|---|---|---| -| `isSequentialQueueRunning` | Single-flight / re-entrancy guard | top of `flush()` after guards pass | `process().finally` | At most one drain in progress per tab | +| `isSequentialQueueRunning` | Single-flight / re-entry guard | top of `flush()` after guards pass | `process().finally` | At most one drain in progress per tab | | `currentRequestPromise` | The in-flight `process()` chain (for `getCurrentRequest`) | start of the process chain | the drain-end `process().finally` (set to `null`) | Non-null only while a request is in flight | | `isQueuePaused` | **Overloaded:** offline pause **or** `shouldPauseQueue` data-gap sync | `pause()` / a `shouldPauseQueue` response | `unpause()` / `resetQueue()` | While true, nothing is processed | | `isReadyPromise` / `resolveIsReadyPromise` | One-shot gate READs block on (READ-after-WRITE ordering) | resolved at module load; **reset** in `flush(true)` | resolved in `finally` (offline or empty) | A READ may proceed only when no WRITE it must follow is pending | | `shouldFailAllRequests` | Sticky `NETWORK`-key flag → erroring requests are failed and dropped | `NETWORK` Onyx callback | `NETWORK` Onyx callback | Test/debug only | | `queueFlushedDataToStore` | In-memory mirror of `QUEUE_FLUSHED_DATA` | the `QUEUE_FLUSHED_DATA` connect-callback echo of `saveQueueFlushedData`'s `Onyx.set` | `clearQueueFlushedData` | Applied only on full drain | -| `sequentialQueueRequestThrottle` | Shared backoff state (wait time, retry count, pending timeout) | `sleep()` on each generic-error retry | `clear()` on success and every non-retriable outcome | Backoff state never survives a settled request | +| `sequentialQueueRequestThrottle` | Shared backoff state (wait time, retry count, pending timeout) | `sleep()` on each generic-error retry | `clear()` on success and every non-retryable outcome | Backoff state never survives a settled request | ### Why `isReadyPromise` resolves on offline, not on paused @@ -436,7 +436,7 @@ Three concerns cut across every block. **Sharp edges.** - **Index fragility.** `replace`/`delete` carry **numeric indices** computed by the resolver at `push` time, but `handleConflictActions` applies them **asynchronously** (awaiting between positional operations). Meanwhile `processNextRequest` (`slice(1)`), `endRequestAndRemoveFromQueue` (splice), and cross-tab merges can shift indices. An index computed against one snapshot may point at a different request by the time it is applied. -- **Conflict checking sees only the queue, never the in-flight `ongoingRequest`** — a new write cannot dedup against a request that is already being sent. +- **Conflict checking sees only the queue, never the in-flight `ongoingRequest`** — a new write cannot dedupe against a request that is already being sent. - The resolver is expected to be pure, but production resolvers are not: `resolveCommentDeletionConflicts` performs an `Onyx.update`, and because `prepareRequest` invokes the **same** closure, the side effect fires on **both** evaluations (twice per write; an idempotent merge today). `resolveEditCommentWithNewAddCommentRequest` goes further and mutates the queued request's `data` in place during evaluation. --- diff --git a/cspell.json b/cspell.json index 8f866b8f6afb..e39c7212d269 100644 --- a/cspell.json +++ b/cspell.json @@ -468,6 +468,7 @@ "barwidth", "basehead", "baselined", + "beforeunload", "behaviour", "bigdecimal", "billpay", @@ -810,6 +811,7 @@ "reactnativehybridapp", "reactnativekeycommand", "reannounce", + "reauthenticating", "reauthentication", "rebooking", "recategorize", From bd9728f624349a3a16c8fa0e4af216b8f5f4599b Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Mon, 15 Jun 2026 14:28:29 +0200 Subject: [PATCH 03/12] Fix preserved-keys allowlist count in SEQUENTIAL_QUEUE.md The QueuedOnyxUpdates not-signed-in filter preserves 15 keys, not 14. Also name the focus-mode NVP and the three RAM-only flags explicitly. Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index 67da15e7b756..6ba3e2a34a13 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -264,7 +264,7 @@ These are **two distinct deferral mechanisms** that are easy to confuse. **Problem `QueuedOnyxUpdates` solves.** Prevent UI flicker: if each WRITE's response Onyx updates were applied immediately as the queue drained, the UI would replay intermediate states. Instead, WRITE responses are buffered and applied **after the whole queue drains**. -**How `QueuedOnyxUpdates` works.** It is a module-level **in-memory** buffer. `queueOnyxUpdates` appends and resolves immediately; `flushQueue` snapshots-and-clears (to avoid races) then applies via `Onyx.update`. The WRITE-vs-READ routing decision lives **upstream** in `OnyxUpdates.applyHTTPSOnyxUpdates`: WRITE-type responses (`onyxData`, then `successData`/`failureData`/`finallyData`) route through `queueOnyxUpdates`; READ-type responses apply via `Onyx.update` immediately. When not signed in, `flushQueue` filters the snapshot to a 14-key preserved allowlist (`SESSION`, `NETWORK`, `IS_LOADING_APP`, `HAS_LOADED_APP`, `CREDENTIALS`, `ACCOUNT`, `MODAL`, `PRESERVED_USER_SESSION`, theme/locale/try-new-dot NVPs, and RAM-only loading flags); the filter is skipped entirely under `CONFIG.IS_TEST_ENV`. The pause early-return lives in `SequentialQueue.flushOnyxUpdatesQueue` (returns early when `isQueuePaused`). It is invoked at the end of a full drain — and from two other entry points: `flush()`'s all-empty guard counts a non-empty buffer as pending work (a flush with zero requests proceeds purely to drain the buffer), and `unpause()` calls it directly when the queue is already empty. +**How `QueuedOnyxUpdates` works.** It is a module-level **in-memory** buffer. `queueOnyxUpdates` appends and resolves immediately; `flushQueue` snapshots-and-clears (to avoid races) then applies via `Onyx.update`. The WRITE-vs-READ routing decision lives **upstream** in `OnyxUpdates.applyHTTPSOnyxUpdates`: WRITE-type responses (`onyxData`, then `successData`/`failureData`/`finallyData`) route through `queueOnyxUpdates`; READ-type responses apply via `Onyx.update` immediately. When not signed in, `flushQueue` filters the snapshot to a 15-key preserved allowlist (`SESSION`, `NETWORK`, `IS_LOADING_APP`, `HAS_LOADED_APP`, `CREDENTIALS`, `ACCOUNT`, `MODAL`, `PRESERVED_USER_SESSION`, the theme/locale/try-new-dot/focus-mode NVPs, and the three RAM-only loading flags); the filter is skipped entirely under `CONFIG.IS_TEST_ENV`. The pause early-return lives in `SequentialQueue.flushOnyxUpdatesQueue` (returns early when `isQueuePaused`). It is invoked at the end of a full drain — and from two other entry points: `flush()`'s all-empty guard counts a non-empty buffer as pending work (a flush with zero requests proceeds purely to drain the buffer), and `unpause()` calls it directly when the queue is already empty. **Problem `queueFlushedData` solves.** Apply a small piece of data **only after a full drain** — specifically, mark the app as loaded only once the queue has actually emptied (not mid-drain). From 99ac5490e671f25d8a67ebdc7950848dbd5a4d3a Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Mon, 15 Jun 2026 14:40:19 +0200 Subject: [PATCH 04/12] Update SEQUENTIAL_QUEUE.md: persist-before-fire is now in effect PR #91734 made push() async and await the Onyx disk commit before flushing, closing the enqueue crash window the doc centred on. Rewrite the affected sections to current reality: - push() awaits persistencePromise before flush(false); enqueue write is now gated ahead of the network (was: fired off in-memory state) - redraw the disk/network ordering diagram (commit precedes network) - isReadyPromise mechanism: starts resolved, armed via setIsReadyPromisePending(); add isReadyPromisePending (7 state vars) - four resolveIsReadyPromise sites (finally + all-empty/not-leader guards + offline-during-persist in push), not just the finally - persist .catch handlers now Log.alert (storage emergency) - note new SequentialQueueTest coverage; clarify persist-before-fire vs persist-before-optimistic (the latter still open) Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 80 +++++++++++++++----------- 1 file changed, 47 insertions(+), 33 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index 6ba3e2a34a13..bfd4bcfcc122 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -55,13 +55,14 @@ This is an observational reference. Where current behavior diverges from apparen write │ read → waitForWrites() → SequentialQueue.waitForIdle() ▼ (blocks on isReadyPromise) ┌────────────────────────────────────────────────────────────┐ -│ SequentialQueue.push() │ +│ SequentialQueue.push() (async) │ │ • conflictResolver #2 (authoritative: mutate the queue) │ -│ • PersistedRequests.save → in-memory SYNC, disk ASYNC │ -│ • dispatch: │ -│ offline → persist only, return (reconnect flushes) │ -│ running → defer: isReadyPromise.then(flush) │ -│ idle → flush(true) NOW │ +│ • PersistedRequests.save → in-memory SYNC; disk AWAITED │ +│ • mark isReadyPromise pending (sync, before first await) │ +│ • await persistencePromise (disk commit) THEN dispatch: │ +│ offline → await persist, return (reconnect flushes) │ +│ running → defer: isReadyPromise.then(flush(false)) │ +│ idle → flush(false) │ └───────────────┬──────────────────────────────────────────────┘ ▼ ┌────────────────────────────────────────────────────────────┐ @@ -102,11 +103,12 @@ This is an observational reference. Where current behavior diverges from apparen disk writes — PERSISTED_* keys; cache + subscribers update SYNC first, the returned Onyx promise resolves on STORAGE COMMIT (IDB tx / SQLite WAL): - • #1 save() → PERSISTED_REQUESTS (in push; NOT awaited before flush) + • #1 save() → PERSISTED_REQUESTS (in push; AWAITED before flush) • #2 processNextRequest → PERSISTED_REQUESTS + ONGOING (promotion; fire-and-forget) • end / rollback / updateOngoing (fire-and-forget) • conflict update / delete (awaited by handleConflictActions) - ✗ makeXHR (network) gates on auth readiness only — never on a queue disk write + ✓ push awaits the #1 disk commit before flush → makeXHR; makeXHR itself still gates + only on auth readiness, so the network can no longer fire ahead of the enqueue write ``` ## Lifecycle of One Request @@ -115,7 +117,7 @@ This is the connective narrative — how the blocks interact at runtime for a si 1. **Prepare (synchronous, in `API.index.prepareRequest`).** `optimisticData` is applied to Onyx immediately, so the UI reflects the change at once. The conflict resolver is evaluated **read-only** here, solely to decide whether to *suppress* the optimistic data (`conflictAction.type === 'noAction'`); the rest of the conflict action is ignored at this site. A `requestID` is stamped from a per-tab counter (the field is literally `requestIndex`; `requestID` is read only as a legacy fallback in `getClientRequestIndex` — this doc keeps the conventional name), and `optimisticData` is stripped from the version that will be persisted (it has already been applied). -2. **Push (`SequentialQueue.push`).** The conflict resolver runs a **second** time, now **authoritatively**, against the live persisted requests — this is the evaluation that may mutate the queue (push / replace / delete). The resolver function is then deleted off the request (it cannot be serialized), and `PersistedRequests.save` mutates the in-memory array **synchronously** while issuing the disk write (captured as `persistencePromise`, resolved asynchronously on disk commit and **not** awaited before firing). `push` then dispatches on three conditions: **offline** → persist only and return (reconnect will flush later); **already running** → defer via `isReadyPromise.then(flush)`; **idle** → `flush(true)` now. +2. **Push (`async SequentialQueue.push`).** The conflict resolver runs a **second** time, now **authoritatively**, against the live persisted requests — this is the evaluation that may mutate the queue (push / replace / delete). The resolver function is then deleted off the request (it cannot be serialized), and `PersistedRequests.save` mutates the in-memory array **synchronously** while issuing the disk write (captured as `persistencePromise`). `push` then marks `isReadyPromise` pending synchronously (before its first `await`, so a READ on the next tick parks behind this write) and **`await`s `persistencePromise` — the disk commit — before flushing**. After the commit it dispatches on three conditions: **offline** → `await` persist and return (reconnect will flush later); **already running** → defer via `isReadyPromise.then(() => flush(false))`; **idle** → `flush(false)`. A network flip to offline *during* the awaited write is handled explicitly: `push` resolves `isReadyPromise` and returns without flushing. 3. **Flush (`flush`).** Five guards run in order — paused, offline, already-running, all-empty (three-legged: queue, `ongoingRequest`, **and** the `QueuedOnyxUpdates` buffer — see [the coordinator](#sequentialqueue-the-coordinator)), **not-leader**. If all pass, `isSequentialQueueRunning` is set, `isReadyPromise` is optionally reset, and a throwaway `PERSISTED_REQUESTS` connection guarantees disk is loaded before `process()` runs (it opts out of connection reuse, because `PersistedRequests` already holds a live `PERSISTED_REQUESTS` connection and reusing it would fire extra callbacks). @@ -135,7 +137,7 @@ The persisted queue exists so an interrupted WRITE survives an app kill. The pie **What does _not_ recover:** - A **File/Blob ongoing request** — `PERSISTED_ONGOING_REQUESTS` holds `null` on disk (the live object was memory-only), so there is nothing to re-drive; the upload is lost. -- Anything still inside the **persist-before-fire crash window** — fired (or about to fire) but not yet committed to disk; it is not reloaded and not re-driven, even though the server may already have acted. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). +- A request lost in one of the remaining **fire-and-forget persist windows** — the promotion (`processNextRequest`), removal (`endRequestAndRemoveFromQueue`), and rollback writes are not awaited, so a crash between an in-flight request acting on the server and its `multiSet` committing can leave the disk record momentarily stale. The *enqueue* window is no longer one of these: `push` now awaits the `save()` disk commit before flushing, so a freshly-pushed request cannot reach the server while still absent from disk. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). **On sign-out (a different reset):** `Session.cleanupSession()` aborts the in-flight cancellable XHR via `HttpUtils.cancelPendingRequests()` — surfacing as `REQUEST_CANCELLED` to `process()`'s catch (dropped, no data applied) — and clears the persisted queue via `PersistedRequests.clear()`. This is the only inbound path that _aborts_ an in-flight request rather than letting it complete. A bare `Onyx.clear()` (without the abort) resets the mirror to `[]`/`null` through the carve-out path, but because `isInitialized` is already true the init callback does **not** re-fire, so clearing on its own schedules no flush; an in-flight `process()` chain holds local references and is unaffected by the memory wipe. @@ -151,13 +153,13 @@ Each block is described as **the problem it solves → how it works today → sh **How it works today.** - **Single-flight.** `isSequentialQueueRunning` is the re-entry guard: set at the top of `flush()` after the guards pass, cleared in `process().finally`. While set, new pushes defer rather than start a parallel drain. -- **`push()` dispatch.** As above: offline → persist only; running → defer behind `isReadyPromise.then(flush)`; idle → `flush(true)`. `push` returns the disk-persistence promise but **does not await it** — the queue fires off the synchronous in-memory state. +- **`push()` dispatch.** `push` is `async`. It marks `isReadyPromise` pending synchronously, then **`await`s the disk-persistence promise before dispatching**: offline → `await` persist, return; running → defer behind `isReadyPromise.then(() => flush(false))`; idle → `flush(false)`. The queue no longer fires off the synchronous in-memory state ahead of the commit. - **`flush()` guards, in order.** `isQueuePaused` → `isOfflineNetwork()` → `isSequentialQueueRunning` → all-empty → `!isClientTheLeader()`. The all-empty guard is three-legged — no queued requests, no `ongoingRequest`, **and** an empty `QueuedOnyxUpdates` buffer — so `flush()` doubles as the drain path for buffered updates on a request-empty queue. Leadership is checked **exactly once**, before the running flag flips; the recursive `process()` chain re-checks only `isQueuePaused` and `isOfflineNetwork()`, never leadership. - **`process()` recursion.** Pull the next request, run middleware, on success recurse; the recursion terminates when the queue and ongoing request are both empty. - **READ-after-WRITE gate.** `waitForIdle()` returns `isReadyPromise`; READs block on it so a READ response cannot clobber optimistic data from in-flight WRITEs on the same keys. See [the state machine](#the-state-machine) for the resolution rule. **Sharp edges.** -- `push()` triggers `flush()` from the **synchronous in-memory** save and never awaits `persistencePromise`; the network dispatch (`makeXHR`) gates only on auth readiness, never on the queue's disk write. So a request can reach the server while still absent from disk — a crash in that window loses the queue record even though the server may already have acted. This is the named target of the persist-before-fire work; see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). +- `push()` `await`s `persistencePromise` (the enqueue disk commit) before it ever calls `flush()`, so a freshly-pushed request cannot reach the server while still absent from disk — the enqueue crash window is closed. `makeXHR` still gates only on auth readiness (never on a disk write); the ordering guarantee now comes from `push` awaiting the commit, not from `makeXHR`. The *promotion/removal/rollback* writes inside `process()` remain fire-and-forget (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). - Leadership being checked once per flush means a leadership change *during* an in-flight drain is not re-evaluated by the running cycle (see [Multi-Tab & Leader Election](#multi-tab--leader-election)). ## PersistedRequests (the store) @@ -193,10 +195,10 @@ This is the crux for any crash-safety or persist-before-fire reasoning, so it is **The Onyx promise is a real disk-commit handle, not a cache handle.** In `OnyxUtils` (`setWithRetry`/`multiSetWithRetry`), the cache is updated and subscribers are notified **synchronously** via `broadcastUpdate`/`keyChanged` — there is a source comment, _"This approach prioritizes fast UI changes without waiting for data to be stored in device storage"_ — and the function then returns the `storage.setItem`/`storage.multiSet` chain. That returned promise resolves only after the storage provider commits: an IndexedDB transaction `complete` event on web, or a SQLite WAL commit on native (`journal_mode=WAL`, `synchronous=NORMAL` — durable across a JS/app crash, weaker than an fsync against OS-level power loss). So `push()`'s `persistencePromise` **does** represent durable persistence. Three caveats bound that statement: a **`null` write** returns `Promise.resolve()` immediately (the storage delete runs fire-and-forget — see below), so for null/delete payloads the promise is *not* a commit handle, and a `multiSet` containing a `null` value covers only its non-null pairs (e.g. `endRequestAndRemoveFromQueue`'s "atomic" promise covers only the array write, not the ongoing-key delete); Onyx's internal `retryOperation` **swallows storage failures** after `MAX_STORAGE_OPERATION_RETRY_ATTEMPTS` (5), resolving anyway; and on web Onyx can silently **degrade to `MemoryOnlyProvider`** when the IndexedDB store cannot be created, after which every "commit" is memory-only. -**But the disk write does not gate the network.** `push()` assigns `persistencePromise`, then on the idle-online path calls `flush(true)` **synchronously on the same tick** before `return persistencePromise` — `flush()` is never chained off `persistencePromise.then(...)`. Because `save()` already updated the in-memory array synchronously, `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` sees the request immediately and dispatches it. `makeXHR` gates the actual `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) — never on the `PERSISTED_REQUESTS` disk write. The request can therefore leave the device while `persistencePromise` is still pending. +**And on the enqueue path, the disk write now gates the network.** `push` is `async`: it issues `save()` (synchronous in-memory update + the disk write captured as `persistencePromise`), marks `isReadyPromise` pending, then **`await`s `persistencePromise`** and only afterward calls `flush(false)`. So the enqueue commit lands before `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` ever runs. `makeXHR` itself still gates the `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) — it has no disk gate — but the request can no longer leave the device ahead of its enqueue commit, because `push` blocked on that commit first. (The in-flight `#2` promotion `multiSet` is still fire-and-forget; the gate closed is specifically the enqueue window.) ``` -push() (online · idle · no conflict) +async push() (online · idle · no conflict) │ ├─ save(req) │ ├─ persistedRequests.push(req) ── in-memory mirror: SYNC, authoritative now @@ -204,20 +206,24 @@ push() (online · idle · no conflict) │ ├─ cache + subscriber broadcast ── SYNC (same tick) │ └─ storage.setItem(...) ── async → resolves on DISK COMMIT ▒▒ #1 │ - ├─ flush(true) ── SYNC, same tick · NOT chained off persistencePromise.then - │ └─ process() - │ ├─ processNextRequest → Onyx.multiSet(PERSISTED_REQUESTS, ONGOING) ▒▒ #2 (not awaited) - │ └─ processWithMiddleware → makeXHR - │ └─ gate: hasReadRequiredDataFromStorage() (auth only — no disk gate) - │ └─ HttpUtils.xhr(...) ───────────────────────► NETWORK FIRES + ├─ setIsReadyPromisePending() ── SYNC, before first await (READs park behind us) │ - └─ return persistencePromise (write() awaits only this; resolves to void) + ├─ await persistencePromise ── ⏸ BLOCKS until #1 disk commit lands + │ (if network flipped offline meanwhile → resolveIsReadyPromise() + return, no flush) + │ + └─ flush(false) ── runs only AFTER #1 committed + └─ process() + ├─ processNextRequest → Onyx.multiSet(PERSISTED_REQUESTS, ONGOING) ▒▒ #2 (not awaited) + └─ processWithMiddleware → makeXHR + └─ gate: hasReadRequiredDataFromStorage() (auth only — no disk gate) + └─ HttpUtils.xhr(...) ───────────────────────► NETWORK FIRES ╔══════════════════════════════════════════════════════════════════╗ - ║ CRASH WINDOW: NETWORK FIRES ──► ... ──► #1 disk commit ║ - ║ the request can reach (and be acted on by) the server while still ║ - ║ absent from disk → a crash/reload here loses the queue record and ║ - ║ it is NOT recovered on restart. ║ + ║ ORDERING: #1 disk commit ──► NETWORK FIRES ║ + ║ the enqueue commit is awaited before the request can reach the ║ + ║ server, so a crash before flush cannot lose a not-yet-persisted ║ + ║ request. (Remaining fire-and-forget windows: #2 promotion, ║ + ║ removal, rollback.) ║ ╚══════════════════════════════════════════════════════════════════╝ ``` @@ -225,7 +231,7 @@ push() (online · idle · no conflict) | When | Function | Key(s) written | Awaited by the control flow? | |---|---|---|---| -| Enqueue | `save()` | `PERSISTED_REQUESTS` | No — returned as `persistencePromise`, not awaited before `flush()` | +| Enqueue | `save()` | `PERSISTED_REQUESTS` | **Yes** — `push` awaits `persistencePromise` before `flush()` | | Promote head → ongoing | `processNextRequest()` | both (atomic `multiSet`) | No — fire-and-forget | | Success / non-retryable drop | `endRequestAndRemoveFromQueue()` | both (atomic `multiSet`) | No — fire-and-forget | | Retryable failure | `rollbackOngoingRequest()` | both (atomic `multiSet`) | No — fire-and-forget | @@ -233,7 +239,7 @@ push() (online · idle · no conflict) | Conflict DELETE | `deleteRequestsByIndices()` | `PERSISTED_REQUESTS` | **Yes** — `handleConflictActions` awaits it | | In-flight update (e.g. reauth) | `updateOngoingRequest()` | `PERSISTED_ONGOING_REQUESTS` | No | -> `save()`'s disk-write `.catch` is log-only and does not re-throw, so `persistencePromise` resolves to `void` whether the storage commit **succeeded or failed** — even a consumer that awaits it (e.g. `API.write`) cannot detect a failed persist. (Mirrors the missing `.catch` on the `queueFlushedData` chain.) Onyx itself can also swallow a failed commit — the retry caveat above — so the silence has two layers. +> `save()`'s disk-write `.catch` (and those on `update()` / `deleteRequestsByIndices()`) `Log.alert`s a storage emergency but does **not** re-throw, so `persistencePromise` resolves to `void` whether the storage commit **succeeded or failed** — even a consumer that awaits it (e.g. `push` itself, or `API.write`) cannot detect a failed persist. `push` leans on this deliberately: its `await persistencePromise` is wrapped in a `try/catch` that flushes anyway on rejection, since the request is already in the in-memory queue. Onyx itself can also swallow a failed commit — the retry caveat above — so the silence has two layers. **Where a request does _not_ hit disk:** @@ -323,14 +329,15 @@ The blocks above describe what the queue does; this is the inbound surface — w # The State Machine -`SequentialQueue` is driven by six module-level variables plus the shared throttle instance. This table is the densest reference for reasoning about invariants during a refactor. +`SequentialQueue` is driven by seven module-level variables plus the shared throttle instance. This table is the densest reference for reasoning about invariants during a refactor. | Variable | Meaning | Set where | Cleared where | Invariant | |---|---|---|---|---| | `isSequentialQueueRunning` | Single-flight / re-entry guard | top of `flush()` after guards pass | `process().finally` | At most one drain in progress per tab | | `currentRequestPromise` | The in-flight `process()` chain (for `getCurrentRequest`) | start of the process chain | the drain-end `process().finally` (set to `null`) | Non-null only while a request is in flight | | `isQueuePaused` | **Overloaded:** offline pause **or** `shouldPauseQueue` data-gap sync | `pause()` / a `shouldPauseQueue` response | `unpause()` / `resetQueue()` | While true, nothing is processed | -| `isReadyPromise` / `resolveIsReadyPromise` | One-shot gate READs block on (READ-after-WRITE ordering) | resolved at module load; **reset** in `flush(true)` | resolved in `finally` (offline or empty) | A READ may proceed only when no WRITE it must follow is pending | +| `isReadyPromise` / `resolveIsReadyPromise` | One-shot gate READs block on (READ-after-WRITE ordering) | starts **already-resolved** (`Promise.resolve()`) at module load; armed pending via `setIsReadyPromisePending()` — in `push()`'s sync prelude and in `flush(true)` | `resolveIsReadyPromise()` — see the four sites below | A READ may proceed only when no WRITE it must follow is pending | +| `isReadyPromisePending` | Idempotency guard for `setIsReadyPromisePending()` (prevents orphaning READs parked on a prior pending promise) | `true` when a pending promise is armed | `false` inside `resolveIsReadyPromise` | At most one pending `isReadyPromise` is armed at a time | | `shouldFailAllRequests` | Sticky `NETWORK`-key flag → erroring requests are failed and dropped | `NETWORK` Onyx callback | `NETWORK` Onyx callback | Test/debug only | | `queueFlushedDataToStore` | In-memory mirror of `QUEUE_FLUSHED_DATA` | the `QUEUE_FLUSHED_DATA` connect-callback echo of `saveQueueFlushedData`'s `Onyx.set` | `clearQueueFlushedData` | Applied only on full drain | | `sequentialQueueRequestThrottle` | Shared backoff state (wait time, retry count, pending timeout) | `sleep()` on each generic-error retry | `clear()` on success and every non-retryable outcome | Backoff state never survives a settled request | @@ -342,9 +349,16 @@ In the drain-end `process().finally`, `resolveIsReadyPromise` runs only when `is - **Offline pause** → resolve. The queue cannot process anyway, so READs should be allowed through (they'll show optimistic/stale data, which is acceptable offline). - **`shouldPauseQueue` data-gap sync** → do **not** resolve. WRITEs are still pending behind the gap, so READs must keep waiting or they'd clobber in-flight optimistic data. -`unpause()` calls `flush(false)` precisely to **preserve** the existing `isReadyPromise` — swapping in a fresh one would orphan READs already awaiting the old promise. +**`resolveIsReadyPromise` is now called from four sites**, not just the `finally`. Because `push()` arms the pending promise *before* awaiting the disk write, every early-out path that would otherwise skip the drain has to release parked READs itself, or they would hang: + +1. The **drain-end `process().finally`** — when offline or the queue is empty (above). +2. The **all-empty guard in `flush()`** — e.g. a conflict resolver deleted the only request without pushing a replacement, so there is nothing to drain. +3. The **not-leader guard in `flush()`** — followers never process the queue, so they resolve immediately (otherwise a follower tab's READs would hang forever after any write). +4. **`push()` itself**, when the network flips offline *during* the awaited disk write — `flush()`'s offline early-return wouldn't resolve, so `push` resolves and returns without flushing. + +`unpause()` calls `flush(false)` precisely to **preserve** the existing `isReadyPromise` — swapping in a fresh one would orphan READs already awaiting the old promise. (`setIsReadyPromisePending()` is idempotent for the same reason: it no-ops when a pending promise is already armed.) -**Sharp edges.** `isReadyPromise` is a single shared one-shot resolver. If a flush round resets it but never reaches the `finally` that resolves it (e.g. a persistent, unresolvable `shouldPauseQueue` data gap), subsequent READs blocked on `waitForIdle()` have no timeout. +**Sharp edges.** `isReadyPromise` is a single shared one-shot resolver. If a flush round arms it but never reaches any of the four resolve sites (e.g. a persistent, unresolvable `shouldPauseQueue` data gap), subsequent READs blocked on `waitForIdle()` have no timeout. --- @@ -448,7 +462,7 @@ Tests are useful here as evidence of the **intended contract** — what the team | Suite | What it pins as contract | |---|---| | `tests/unit/APITest.ts` | Offline-persist, online-flush, persisted-until-backend-response, retry/backoff for 5xx + Auth-down (asserts ~3 fetches with throttle waits), reauthentication + ordering, WRITE-blocks-READ, pause/unpause, no-duplicates + enable-feature conflict collapsing, supportal-411 no-retry + `failureData` | -| `tests/unit/SequentialQueueTest.ts` | Push/persist counts, conflict resolution (replace/push/noAction, incl. replace-while-ongoing), move-to-ongoing persistence, startup hydration of the ongoing request, `queueFlushedData` lifecycle, `ALREADY_CREATED` treated as success without retry | +| `tests/unit/SequentialQueueTest.ts` | Push/persist counts, conflict resolution (replace/push/noAction, incl. replace-while-ongoing — now `async`/awaited since `push` is async), move-to-ongoing persistence, startup hydration of the ongoing request, `queueFlushedData` lifecycle, `ALREADY_CREATED` treated as success without retry, **and the persist-before-fire edges: `waitForIdle` resolves without flushing when the network goes offline during persist, and a disk-write failure during persist does not strand `isReadyPromise` or block the queue** | | `tests/unit/PersistedRequests.ts` | Save/remove/processNext/update/updateOngoing, File/Blob handling, cross-tab follower reconciliation; the `Issue 3a`/`3c` cases assert the **durable** ongoing-request behavior (`processNextRequest` always persists via `multiSet`) — a legacy test title references the vestigial `persistWhenOngoing` flag | | `tests/unit/RequestConflictUtilsTest.ts` | Pure push/replace/delete/noAction resolver logic | | `tests/actions/QueuedOnyxUpdatesTest.ts` | Account-scoped update filtering on flush | @@ -465,7 +479,7 @@ The suites rely on `resetQueue()` (resets the four core coordinator vars — run - **True mid-flight-crash-then-restart recovery** — startup hydration is simulated by manually seeding `PERSISTED_ONGOING_REQUESTS`. - **`waitForIdle` is used only as a drain helper** in tests (pagination, attachments, workspace-upgrade, network, middleware), never asserted as a READ-blocks-on-WRITE ordering guarantee. -**Behavior pinned to today's ordering.** `APITest.ts` › `API.write() persistence guarantees` › `'Issue 1: should persist the request before applying optimistic data'` currently asserts `optimisticDataApplied === true` and `requestPersistedBeforeOptimistic === false` — i.e. it pins the **current** optimistic-before-persist ordering, with a comment to flip the final assertion once persist-before-fire lands. (Sibling `Issue 2`/`Issue 5` cases in the same block are written to the post-fix behavior, so the block is internally mixed.) +**Two distinct orderings — don't conflate them.** Persist-before-**fire** (await the disk commit before the network XHR) is in effect today and is the behavior described throughout this doc. Persist-before-**optimistic** (persist the request *before* applying `optimisticData` to Onyx in `prepareRequest`) is a *separate*, still-open ordering: `optimisticData` is applied in `prepareRequest`, before `push` is ever called, so the persist-before-fire await does not affect it. `APITest.ts` › `API.write() persistence guarantees` › `'Issue 1: should persist the request before applying optimistic data'` asserts `optimisticDataApplied === true` and `requestPersistedBeforeOptimistic === false` — it pins that optimistic-before-persist ordering, and its "flip the final assertion" comment refers to the persist-before-optimistic fix (not yet shipped). The sibling `Issue 2` case stalls the write promise on an unresolved `PERSISTED_REQUESTS` `Onyx.set`, asserting `push` awaits persistence (the current persist-before-fire behavior), though its describe-block header comment still reads "BUG" — an upstream test inconsistency, not a behavior claim of this doc. --- From 31d1f510856603b1477bd4b3e00a72a98a49c856 Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Tue, 16 Jun 2026 10:11:18 +0200 Subject: [PATCH 05/12] Address review feedback on SEQUENTIAL_QUEUE.md - getLength() does not gate read-blocking: drop the claim; waitForWrites always awaits waitForSequentialQueueIdle() (isReadyPromise) regardless of count (MelvinBot) - Reauthentication throws new Error('Failed to reauthenticate') (MelvinBot) - Qualify persist-before-fire to post-init only; pre-init save() returns Promise.resolve() and the enqueue write is deferred + un-awaited (Codex P2) - isQueuePaused is a data-gap/deferred-update pause only, not offline; offline is a separate isOfflineNetwork() check. Add Open Question #6 on the misleading source comment (Codex P3) Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index bfd4bcfcc122..db92489c5959 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -159,7 +159,7 @@ Each block is described as **the problem it solves → how it works today → sh - **READ-after-WRITE gate.** `waitForIdle()` returns `isReadyPromise`; READs block on it so a READ response cannot clobber optimistic data from in-flight WRITEs on the same keys. See [the state machine](#the-state-machine) for the resolution rule. **Sharp edges.** -- `push()` `await`s `persistencePromise` (the enqueue disk commit) before it ever calls `flush()`, so a freshly-pushed request cannot reach the server while still absent from disk — the enqueue crash window is closed. `makeXHR` still gates only on auth readiness (never on a disk write); the ordering guarantee now comes from `push` awaiting the commit, not from `makeXHR`. The *promotion/removal/rollback* writes inside `process()` remain fire-and-forget (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). +- `push()` `await`s `persistencePromise` (the enqueue disk commit) before it ever calls `flush()`, so a freshly-pushed request cannot reach the server while still absent from disk — the enqueue crash window is closed — post-initialization only (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). `makeXHR` still gates only on auth readiness (never on a disk write); the ordering guarantee now comes from `push` awaiting the commit, not from `makeXHR`. The *promotion/removal/rollback* writes inside `process()` remain fire-and-forget (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). - Leadership being checked once per flush means a leadership change *during* an in-flight drain is not re-evaluated by the running cycle (see [Multi-Tab & Leader Election](#multi-tab--leader-election)). ## PersistedRequests (the store) @@ -172,7 +172,7 @@ Each block is described as **the problem it solves → how it works today → sh - **Why `processNextRequest` returns the local reference.** Onyx's post-`multiSet` callback fires synchronously and would overwrite the module-level `ongoingRequest` with a JSON-serialized copy — which strips `File`/`Blob` payloads. Returning the captured local reference preserves the live objects for the request about to run. (The `knownOngoingRequestIDs` own-write guard ignores that echo for requests carrying a `requestIndex` — the local-ref return is load-bearing for those without one.) - **File/Blob _ongoing_ requests are not durably persisted.** `shouldPersistOngoingRequest` returns false when any value in `request.data` is a `File`/`Blob`, so `null` is written to `PERSISTED_ONGOING_REQUESTS` and the live object is kept only in the in-memory `ongoingRequest` mirror (a `File`/`Blob` cannot be structured-cloned to survive a restart anyway). Note this null-out guard applies **only to the ongoing key** — the `PERSISTED_REQUESTS` array writes the full request, File/Blob included; IndexedDB can store it on web, while SQLite reduces it to `{}` on native. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). - **Structural removal.** `endRequestAndRemoveFromQueue` and the init-time dedupe match by `deepEqual` (structural), **not** by `requestID`. `requestID` matching is used only for cross-tab merge and leader-deletion reconciliation (via `knownRequestIDs`). -- **`getLength()` counts the ongoing request.** It returns the array length plus one when `ongoingRequest` is non-null, so `getLength() === 0` means the queue is truly idle — `API.index` reads this to decide whether a READ must wait for writes. +- **`getLength()` counts the ongoing request.** It returns the array length plus one when `ongoingRequest` is non-null, so `getLength() === 0` means the queue is truly idle. ### Why in-memory is authoritative @@ -195,7 +195,7 @@ This is the crux for any crash-safety or persist-before-fire reasoning, so it is **The Onyx promise is a real disk-commit handle, not a cache handle.** In `OnyxUtils` (`setWithRetry`/`multiSetWithRetry`), the cache is updated and subscribers are notified **synchronously** via `broadcastUpdate`/`keyChanged` — there is a source comment, _"This approach prioritizes fast UI changes without waiting for data to be stored in device storage"_ — and the function then returns the `storage.setItem`/`storage.multiSet` chain. That returned promise resolves only after the storage provider commits: an IndexedDB transaction `complete` event on web, or a SQLite WAL commit on native (`journal_mode=WAL`, `synchronous=NORMAL` — durable across a JS/app crash, weaker than an fsync against OS-level power loss). So `push()`'s `persistencePromise` **does** represent durable persistence. Three caveats bound that statement: a **`null` write** returns `Promise.resolve()` immediately (the storage delete runs fire-and-forget — see below), so for null/delete payloads the promise is *not* a commit handle, and a `multiSet` containing a `null` value covers only its non-null pairs (e.g. `endRequestAndRemoveFromQueue`'s "atomic" promise covers only the array write, not the ongoing-key delete); Onyx's internal `retryOperation` **swallows storage failures** after `MAX_STORAGE_OPERATION_RETRY_ATTEMPTS` (5), resolving anyway; and on web Onyx can silently **degrade to `MemoryOnlyProvider`** when the IndexedDB store cannot be created, after which every "commit" is memory-only. -**And on the enqueue path, the disk write now gates the network.** `push` is `async`: it issues `save()` (synchronous in-memory update + the disk write captured as `persistencePromise`), marks `isReadyPromise` pending, then **`await`s `persistencePromise`** and only afterward calls `flush(false)`. So the enqueue commit lands before `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` ever runs. `makeXHR` itself still gates the `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) — it has no disk gate — but the request can no longer leave the device ahead of its enqueue commit, because `push` blocked on that commit first. (The in-flight `#2` promotion `multiSet` is still fire-and-forget; the gate closed is specifically the enqueue window.) +**And on the enqueue path, the disk write now gates the network.** `push` is `async`: it issues `save()` (synchronous in-memory update + the disk write captured as `persistencePromise`), marks `isReadyPromise` pending, then **`await`s `persistencePromise`** and only afterward calls `flush(false)`. So the enqueue commit lands before `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` ever runs. `makeXHR` itself still gates the `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) — it has no disk gate — but the request can no longer leave the device ahead of its enqueue commit, because `push` blocked on that commit first. (The in-flight `#2` promotion `multiSet` is still fire-and-forget; the gate closed is specifically the enqueue window.) **This holds only once `PersistedRequests` has initialized.** Before initialization, `save()` parks the request in `pendingSaveOperations` and returns `Promise.resolve()` — *not* a commit handle — so `push`'s `await` is a no-op. The real enqueue write is issued later by the init connect-callback as an **un-awaited** `Onyx.set(PERSISTED_REQUESTS, …)`, which then calls `triggerInitializationCallback()` → `flush()`. So during that startup window the XHR can still fire ahead of the enqueue commit; the crash window is closed only after initialization. ``` async push() (online · idle · no conflict) @@ -322,7 +322,7 @@ The blocks above describe what the queue does; this is the inbound surface — w - `App` clear-and-restore and `ImportOnyxState` — `rollbackOngoingRequest()` to snapshot/restore the queue around an `Onyx.clear()` or a state import. The `App` path also takes a `getAll()` snapshot and restores via direct `save()` calls — a raw enqueue that bypasses `push()` (no conflict resolution, no flush trigger). - `Report.getGuidedSetupDataForOpenReport` — `getAll()` (read-only) to scan the queue for an already-enqueued `OPEN_REPORT` carrying `guidedSetupData`, preventing duplication. - `Policy` and `HandleUnusedOptimisticID` — `getAll()` + `update()`/`updateOngoingRequest()` to rewrite a queued (or in-flight) request in place, e.g. optimistic-ID repair. -- `API.index` — `getLength()` (read-only) to decide read-blocking. +- `API.index` — `getLength()` (read-only). This does **not** gate reads; `waitForWrites` always awaits `waitForSequentialQueueIdle()` (= `isReadyPromise`) regardless of the count. - `isPaused()` / `isRunning()` are read-only accessors — `RequestsQueuesState` diagnostics reads these plus `getAll()`/`getOngoingRequest()` directly; `MainQueue` consumes `isRunning()` **behaviorally**, holding its own processing while the sequential queue drains. --- @@ -335,7 +335,7 @@ The blocks above describe what the queue does; this is the inbound surface — w |---|---|---|---|---| | `isSequentialQueueRunning` | Single-flight / re-entry guard | top of `flush()` after guards pass | `process().finally` | At most one drain in progress per tab | | `currentRequestPromise` | The in-flight `process()` chain (for `getCurrentRequest`) | start of the process chain | the drain-end `process().finally` (set to `null`) | Non-null only while a request is in flight | -| `isQueuePaused` | **Overloaded:** offline pause **or** `shouldPauseQueue` data-gap sync | `pause()` / a `shouldPauseQueue` response | `unpause()` / `resetQueue()` | While true, nothing is processed | +| `isQueuePaused` | Data-gap / deferred-update pause **only** (never offline) | `pause()` — the `shouldPauseQueue` response, `DeferredOnyxUpdates`, `applyOnyxUpdatesReliably` | `unpause()` / `resetQueue()` | While true, nothing is processed; offline is a **separate** `isOfflineNetwork()` check in `push`/`flush`/`process` | | `isReadyPromise` / `resolveIsReadyPromise` | One-shot gate READs block on (READ-after-WRITE ordering) | starts **already-resolved** (`Promise.resolve()`) at module load; armed pending via `setIsReadyPromisePending()` — in `push()`'s sync prelude and in `flush(true)` | `resolveIsReadyPromise()` — see the four sites below | A READ may proceed only when no WRITE it must follow is pending | | `isReadyPromisePending` | Idempotency guard for `setIsReadyPromisePending()` (prevents orphaning READs parked on a prior pending promise) | `true` when a pending promise is armed | `false` inside `resolveIsReadyPromise` | At most one pending `isReadyPromise` is armed at a time | | `shouldFailAllRequests` | Sticky `NETWORK`-key flag → erroring requests are failed and dropped | `NETWORK` Onyx callback | `NETWORK` Onyx callback | Test/debug only | @@ -344,10 +344,10 @@ The blocks above describe what the queue does; this is the inbound surface — w ### Why `isReadyPromise` resolves on offline, not on paused -In the drain-end `process().finally`, `resolveIsReadyPromise` runs only when `isOfflineNetwork() || !hasRemainingRequests` — **deliberately not** when `isQueuePaused`. The reason is that `isQueuePaused` is overloaded: +In the drain-end `process().finally`, `resolveIsReadyPromise` runs only when `isOfflineNetwork() || !hasRemainingRequests` — keyed on the **direct offline check**, **deliberately not** on `isQueuePaused`. The two are distinct conditions, not one overloaded flag: -- **Offline pause** → resolve. The queue cannot process anyway, so READs should be allowed through (they'll show optimistic/stale data, which is acceptable offline). -- **`shouldPauseQueue` data-gap sync** → do **not** resolve. WRITEs are still pending behind the gap, so READs must keep waiting or they'd clobber in-flight optimistic data. +- **Offline** (`isOfflineNetwork()`) → resolve. The queue cannot process anyway, so READs should be allowed through (they'll show optimistic/stale data, which is acceptable offline). Going offline never sets `isQueuePaused`; it is checked directly. +- **`isQueuePaused`** (a `shouldPauseQueue` data-gap sync, or a `DeferredOnyxUpdates`/`applyOnyxUpdatesReliably` pause) → do **not** resolve. WRITEs are still pending behind the gap, so READs must keep waiting or they'd clobber in-flight optimistic data. This is exactly why the gate keys on `isOfflineNetwork()` rather than `!isQueuePaused`. **`resolveIsReadyPromise` is now called from four sites**, not just the `finally`. Because `push()` arms the pending promise *before* awaiting the disk write, every early-out path that would otherwise skip the drain has to release parked READs itself, or they would hang: @@ -427,7 +427,7 @@ Three concerns cut across every block. - **Execution model.** `processWithMiddleware` folds the chain with `reduce`, seeding from `makeXHR(request)`. Each middleware appends its own `.then` to the promise it receives. So the **first-registered** middleware (`Logging`) wraps the raw XHR and its response handler fires **first**; the **last-registered** (`FraudMonitoring`) fires **last**. **Response handlers fire in registration order** (not the reverse). - **`isFromSequentialQueue = true`** changes exactly one middleware behavior: `Reauthentication` **re-throws** on failure so the queue's catch can retry (rather than resolving a callback). The `QueuedOnyxUpdates` routing in `SaveResponseInOnyx` is **not** keyed on this flag — it is keyed on `request.data.apiRequestType === WRITE` in `applyHTTPSOnyxUpdates` (see [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata)), which for queue traffic amounts to the same thing. - **`SaveResponseInOnyx`** is the penultimate middleware and the sole producer of `shouldPauseQueue`. -- **`Reauthentication`** reacts to a 407 (`NOT_AUTHENTICATED`), which arrives as a *resolved* response (data), by reauthenticating. On the sequential-queue path it re-runs the chain; if reauth ultimately fails it throws `'Failed to reauthenticate'`. On reauth **success** the ongoing request stays promoted and is re-driven in place (no rollback, no dequeue); only reauth **failure** escapes to the generic ladder, which rolls the ongoing request back to the queue head and retries with backoff. `HandleUnusedOptimisticID` is the path that mutates the ongoing request in place (via `updateOngoingRequest`) during such a re-run. +- **`Reauthentication`** reacts to a 407 (`NOT_AUTHENTICATED`), which arrives as a *resolved* response (data), by reauthenticating. On the sequential-queue path it re-runs the chain; if reauth ultimately fails it throws `new Error('Failed to reauthenticate')`. On reauth **success** the ongoing request stays promoted and is re-driven in place (no rollback, no dequeue); only reauth **failure** escapes to the generic ladder, which rolls the ongoing request back to the queue head and retries with backoff. `HandleUnusedOptimisticID` is the path that mutates the ongoing request in place (via `updateOngoingRequest`) during such a re-run. **Sharp edges.** - The chain order is implicit import-time module state; there is no runtime assertion that `SaveResponseInOnyx` is penultimate or that `Reauthentication` precedes it. @@ -511,7 +511,7 @@ Defined in `src/CONST/index.ts` under `CONST.NETWORK` (retry/backoff subset rele | `src/libs/API/index.ts` | Public facade: `prepareRequest` applies optimistic data + stamps `requestID`; routes `write`→queue, `read`→`waitForWrites` then fire-and-forget, `makeRequestWithSideEffects`→immediate. | | `src/libs/Request.ts` (`processWithMiddleware`) | Folds the middleware chain over `makeXHR(request)`; first-registered is innermost, response handlers fire in registration order. | | `src/libs/Middleware/SaveResponseInOnyx.ts` | Penultimate middleware; sole producer of `shouldPauseQueue` (on a `previousUpdateID` gap); persists server Onyx updates. | -| `src/libs/Middleware/Reauthentication.ts` | Re-throws on `isFromSequentialQueue` so the queue retries; reacts to 407 by reauthenticating; may throw `'Failed to reauthenticate'`. | +| `src/libs/Middleware/Reauthentication.ts` | Re-throws on `isFromSequentialQueue` so the queue retries; reacts to 407 by reauthenticating; may throw `new Error('Failed to reauthenticate')`. | | `src/libs/ActiveClientManager/` | Leader election across web tabs (`isClientTheLeader`); no-op (always leader) on native. | | `QUEUE_FLUSHED_DATA` (Onyx key) + `SequentialQueue.saveQueueFlushedData` | Distinct Onyx-persisted buffer applied only on full drain; sole producer is `App.getOnyxDataForOpenOrReconnect` carrying `HAS_LOADED_APP = true`. | @@ -526,6 +526,7 @@ These are intent-ambiguous from the code alone — flagged for a maintainer to r 3. **Silent give-up data loss for non-`OPEN_APP` commands.** After the retry cap, the request is dropped with `failureData` applied and **no modal**. Is this the intended product behavior, or should more commands surface a failure to the user? 4. **Unguarded paths in the `PERSISTED_ONGOING_REQUESTS` callback.** The callback has an own-write guard (`knownOngoingRequestIDs`) but never consults `pendingOnyxWrites`, and `null` echoes and values with an unknown/missing `requestIndex` are adopted unconditionally — a weaker shield than the array callback's. Are these knowingly-accepted gaps, or should the array callback's full protection be mirrored here? 5. **The stale "`SaveResponseInOnyx` must be last" comment** vs. `FraudMonitoring` being registered after it. Is `FraudMonitoring`-after-`SaveResponseInOnyx` intentional (relying on it returning the response unchanged), and should the comment + ordering be made enforceable? +6. **The `isQueuePaused` offline source comment is inaccurate.** The comment at the `resolveIsReadyPromise` decision in `SequentialQueue` states *"isQueuePaused is true for both offline pauses AND shouldPauseQueue (data gap sync)"*, but no offline path ever calls `pause()` — the only callers are the `shouldPauseQueue` response, `DeferredOnyxUpdates`, and `applyOnyxUpdatesReliably`. Offline is handled by the separate `isOfflineNetwork()` gate. Should the comment be corrected to drop the offline claim? --- From 2b71d0c849270f5ed96f82e85f7cd96cfb1eb78d Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Tue, 16 Jun 2026 13:35:25 +0200 Subject: [PATCH 06/12] Resolve four open questions in SEQUENTIAL_QUEUE.md into documented facts Fold the persistWhenOngoing, dead !onyxUpdates guard term, stale "must be last" comment, and inaccurate isQueuePaused offline comment into their relevant blocks as present-tense facts. Open Questions now holds only the two items that genuinely need a maintainer to ratify: silent give-up data loss, and knownOngoingRequestIDs sufficiency. Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index db92489c5959..5f77d1a09c13 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -183,9 +183,10 @@ Onyx (since 3.0.46) fires `set`/`multiSet` subscription callbacks **synchronousl - **Two carve-outs** still adopt disk: a `null` value (e.g. from `Onyx.clear()`) falls through to the disk-load/re-init path (the guard is gated on `val != null`); and genuinely-new cross-tab requests identified by an unknown `requestID`. **Sharp edges.** -- The `PERSISTED_ONGOING_REQUESTS` callback **is guarded against own-write echoes, but differently** from the array: it early-returns when the value's `requestIndex` is in `knownOngoingRequestIDs` (a set fed by `processNextRequest` and `updateOngoingRequest`). It never consults `pendingOnyxWrites`, and two paths are unguarded — a `null` echo and a value with an unknown (or missing) `requestIndex` are adopted unconditionally (see [Open Questions](#open-questions-needs-maintainer-confirmation)). The same callback also fires `triggerInitializationCallback()` (= `flush`) post-init when a new ongoing request appears — a mid-session flush trigger (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). +- The `PERSISTED_ONGOING_REQUESTS` callback **is guarded against own-write echoes, but differently** from the array: it early-returns when the value's `requestIndex` is in `knownOngoingRequestIDs` (a set fed by `processNextRequest` and `updateOngoingRequest`). It never consults `pendingOnyxWrites`; `knownOngoingRequestIDs` is its only own-write guard (the set also catches stale serialized copies a write-counter cannot). Two paths remain adopted unconditionally — a `null` echo and a value with an unknown (or missing) `requestIndex`; whether the set alone is sufficient for those is an [open question](#open-questions-needs-maintainer-confirmation). The same callback also fires `triggerInitializationCallback()` (= `flush`) post-init when a new ongoing request appears — a mid-session flush trigger (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). - The `pendingOnyxWrites` counter is a hand-rolled ordering mechanism reimplementing a guarantee the data layer doesn't provide. It is correct only as long as every own write is wrapped and increments/decrements stay balanced (`clear()`'s bare ongoing-key write is the one exception — see above). - Structural `deepEqual` removal is keyed on object shape, not identity. If a request object is mutated mid-flight by middleware (e.g. optimistic-ID rewriting, a `shouldRetry` flag, `File` stripping on serialization, an added rollback marker), a later `deepEqual` can fail to match the original. +- `persistWhenOngoing` is a **vestigial field**. It is read only for logging (in `processNextRequest`, `SequentialQueue`, and `RequestsQueuesState`) and is never assigned by any production write path — ongoing-request persistence is unconditional, gated only by `shouldPersistOngoingRequest` (serializability). Every production read resolves to `undefined`; it is safe to remove. ## Where the request actually hits disk (and where it doesn't) @@ -413,7 +414,7 @@ Three concerns cut across every block. **How it works today.** `SaveResponseInOnyx` is the **sole producer** of `shouldPauseQueue`: it sets the flag when the command is not in `requestsToIgnoreLastUpdateID` and `OnyxUpdates.doesClientNeedToBeUpdated` reports a `previousUpdateID` gap, after stashing the update info. `process()` then `pause()`s — the flag flips first, but the just-processed request is still removed and its `queueFlushedData` saved before the recursion early-returns on the pause. The actual gap fetch runs in a **decoupled** `OnyxUpdateManager` flow that fetches the missing range, applies the deferred updates in order, and then calls `unpause()`. Within this concern `pause()` has two inbound callers — `applyOnyxUpdatesReliably` (on detecting a missing-updates fetch) and `DeferredOnyxUpdates` — with `unpause()` called once the gap is filled. **Sharp edges.** -- `isQueuePaused` conflates this data-gap pause with the offline pause (see [the state machine](#why-isreadypromise-resolves-on-offline-not-on-paused)). +- The in-code source comment at the `resolveIsReadyPromise` decision claims `isQueuePaused` is set for offline pauses as well as `shouldPauseQueue` data-gap syncs. That is **inaccurate**: no offline path ever calls `pause()` (its only callers are the `shouldPauseQueue` response, `DeferredOnyxUpdates`, and `applyOnyxUpdatesReliably`); offline is handled entirely by the separate `isOfflineNetwork()` gate. The decision logic itself is correct — it keys on `isOfflineNetwork()` — only the explanatory comment is wrong (see [the state machine](#why-isreadypromise-resolves-on-offline-not-on-paused)). - `shouldPauseQueue` rides on the response object, so any middleware registered after `SaveResponseInOnyx` that returned a *new* response object would strip the flag. Today the only later middleware is `FraudMonitoring`, which returns the response unchanged — so the flag survives, but the in-code comment claiming `SaveResponseInOnyx` "must be last" is stale (see [Middleware](#the-middleware-chain-boundary)). --- @@ -431,8 +432,8 @@ Three concerns cut across every block. **Sharp edges.** - The chain order is implicit import-time module state; there is no runtime assertion that `SaveResponseInOnyx` is penultimate or that `Reauthentication` precedes it. -- `SaveResponseInOnyx`'s early-return guard is effectively **dead code**: `onyxUpdates` defaults to `response.onyxData ?? []`, and an empty array is truthy, so `!onyxUpdates` is never true. It does not short-circuit empty responses. -- The in-code "must be last" comment on `SaveResponseInOnyx` is stale (`FraudMonitoring` follows it). +- `SaveResponseInOnyx`'s early-return guard has a **dead term**: `onyxUpdates` defaults to `response.onyxData ?? []`, and an empty array is truthy, so the `!onyxUpdates` leg of the guard is never true — a response's empty `onyxData` cannot short-circuit the middleware. The sibling `successData`/`failureData`/`finallyData` legs still fire, so the `if` as a whole is not dead — only its empty-`onyxData` short-circuit is. +- The in-code "must be last" comment on `SaveResponseInOnyx` is **inaccurate**: `FraudMonitoring` is registered after it. The current order is safe only **incidentally** — `FraudMonitoring` returns the response object unchanged, so the `shouldPauseQueue` flag riding on it survives. The load-bearing invariant is "no middleware after `SaveResponseInOnyx` may mutate or replace the response," not literal lastness; nothing enforces either form. --- @@ -519,14 +520,10 @@ Defined in `src/CONST/index.ts` under `CONST.NETWORK` (retry/backoff subset rele # Open Questions (needs maintainer confirmation) -These are intent-ambiguous from the code alone — flagged for a maintainer to ratify or correct, not asserted as fact. +Two items remain genuinely intent-ambiguous and need a maintainer to ratify, not assert as fact. -1. **`persistWhenOngoing` appears vestigial.** It is read/logged (in `processNextRequest`, `SequentialQueue`, `RequestsQueuesState`) but **never assigned** by any production write-path function (only in test fixtures); production persistence branches on `shouldPersistOngoingRequest`. Is it safe to remove, or is it a reserved hook? -2. **`SaveResponseInOnyx`'s early-return guard is dead code** (`!onyxUpdates` is never true because `onyxUpdates` defaults to `[]`). Was empty-response short-circuiting intended (i.e. should it test array length), or is the guard genuinely obsolete? -3. **Silent give-up data loss for non-`OPEN_APP` commands.** After the retry cap, the request is dropped with `failureData` applied and **no modal**. Is this the intended product behavior, or should more commands surface a failure to the user? -4. **Unguarded paths in the `PERSISTED_ONGOING_REQUESTS` callback.** The callback has an own-write guard (`knownOngoingRequestIDs`) but never consults `pendingOnyxWrites`, and `null` echoes and values with an unknown/missing `requestIndex` are adopted unconditionally — a weaker shield than the array callback's. Are these knowingly-accepted gaps, or should the array callback's full protection be mirrored here? -5. **The stale "`SaveResponseInOnyx` must be last" comment** vs. `FraudMonitoring` being registered after it. Is `FraudMonitoring`-after-`SaveResponseInOnyx` intentional (relying on it returning the response unchanged), and should the comment + ordering be made enforceable? -6. **The `isQueuePaused` offline source comment is inaccurate.** The comment at the `resolveIsReadyPromise` decision in `SequentialQueue` states *"isQueuePaused is true for both offline pauses AND shouldPauseQueue (data gap sync)"*, but no offline path ever calls `pause()` — the only callers are the `shouldPauseQueue` response, `DeferredOnyxUpdates`, and `applyOnyxUpdatesReliably`. Offline is handled by the separate `isOfflineNetwork()` gate. Should the comment be corrected to drop the offline claim? +1. **Silent give-up data loss for non-`OPEN_APP` commands.** After the retry cap, the request is dropped with `failureData` applied and **no modal**. Is this the intended product behavior, or should more commands surface a failure to the user? +2. **Sufficiency of `knownOngoingRequestIDs` as the sole guard on the `PERSISTED_ONGOING_REQUESTS` callback.** Unlike the array callback, it does not consult `pendingOnyxWrites` — `knownOngoingRequestIDs` is its only own-write guard (see [PersistedRequests](#persistedrequests-the-store)). Two paths are still adopted unconditionally: a `null` echo and a value with an unknown/missing `requestIndex`. Is the set genuinely sufficient for those two paths, or should the array callback's fuller protection be mirrored here? --- From 9b6e1e6701f2a1f7468da016c7ec912a48622167 Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Tue, 16 Jun 2026 14:31:32 +0200 Subject: [PATCH 07/12] Remove Open Questions section from SEQUENTIAL_QUEUE.md Both remaining items were "the code clearly shows what happens; only whether it's intended was unclear." Under document-AS-IS, that intent speculation comes out: the behaviors now stand as plain observations in the Error Handling and PersistedRequests sharp edges. Drop the section and its inbound references (intro, Contents, two inline links). Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 16 +++------------- 1 file changed, 3 insertions(+), 13 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index 5f77d1a09c13..8cb92b6339e9 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -19,7 +19,7 @@ This document covers **how the queue works today**. For sibling concerns: - **How the app decides it is offline** (the hard-stop model, failure tracking, reachability) → [Network State Detection](NETWORK_STATE_DETECTION.md). - **How features should behave when offline** (optimistic UX patterns A/B/C/D) → [Offline UX Patterns](philosophies/OFFLINE.md). -This is an observational reference. Where current behavior diverges from apparent intent, that is noted neutrally in a **Sharp edges** subsection on the relevant block, and genuinely intent-ambiguous items are collected in [Open Questions](#open-questions-needs-maintainer-confirmation). All references use module and function **names**, not line numbers, so the doc survives refactors — read it in terms of blocks and their relationships. +This is an observational reference. Where current behavior diverges from apparent intent, that is noted neutrally in a **Sharp edges** subsection on the relevant block. All references use module and function **names**, not line numbers, so the doc survives refactors — read it in terms of blocks and their relationships. ## Contents @@ -36,7 +36,6 @@ This is an observational reference. Where current behavior diverges from apparen - [Test Coverage](#test-coverage) - [Configuration Constants](#configuration-constants) - [Key Modules Reference](#key-modules-reference) -- [Open Questions](#open-questions-needs-maintainer-confirmation) - [Relationship to Other Docs](#relationship-to-other-docs) ## Architecture Diagram @@ -183,7 +182,7 @@ Onyx (since 3.0.46) fires `set`/`multiSet` subscription callbacks **synchronousl - **Two carve-outs** still adopt disk: a `null` value (e.g. from `Onyx.clear()`) falls through to the disk-load/re-init path (the guard is gated on `val != null`); and genuinely-new cross-tab requests identified by an unknown `requestID`. **Sharp edges.** -- The `PERSISTED_ONGOING_REQUESTS` callback **is guarded against own-write echoes, but differently** from the array: it early-returns when the value's `requestIndex` is in `knownOngoingRequestIDs` (a set fed by `processNextRequest` and `updateOngoingRequest`). It never consults `pendingOnyxWrites`; `knownOngoingRequestIDs` is its only own-write guard (the set also catches stale serialized copies a write-counter cannot). Two paths remain adopted unconditionally — a `null` echo and a value with an unknown (or missing) `requestIndex`; whether the set alone is sufficient for those is an [open question](#open-questions-needs-maintainer-confirmation). The same callback also fires `triggerInitializationCallback()` (= `flush`) post-init when a new ongoing request appears — a mid-session flush trigger (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). +- The `PERSISTED_ONGOING_REQUESTS` callback **is guarded against own-write echoes, but differently** from the array: it early-returns when the value's `requestIndex` is in `knownOngoingRequestIDs` (a set fed by `processNextRequest` and `updateOngoingRequest`). It never consults `pendingOnyxWrites`; `knownOngoingRequestIDs` is its only own-write guard (the set also catches stale serialized copies a write-counter cannot). Two paths are adopted unconditionally — a `null` echo and a value with an unknown (or missing) `requestIndex`. The same callback also fires `triggerInitializationCallback()` (= `flush`) post-init when a new ongoing request appears — a mid-session flush trigger (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). - The `pendingOnyxWrites` counter is a hand-rolled ordering mechanism reimplementing a guarantee the data layer doesn't provide. It is correct only as long as every own write is wrapped and increments/decrements stay balanced (`clear()`'s bare ongoing-key write is the one exception — see above). - Structural `deepEqual` removal is keyed on object shape, not identity. If a request object is mutated mid-flight by middleware (e.g. optimistic-ID rewriting, a `shouldRetry` flag, `File` stripping on serialization, an added rollback marker), a later `deepEqual` can fail to match the original. - `persistWhenOngoing` is a **vestigial field**. It is read only for logging (in `processNextRequest`, `SequentialQueue`, and `RequestsQueuesState`) and is never assigned by any production write path — ongoing-request persistence is unconditional, gated only by `shouldPersistOngoingRequest` (serializability). Every production read resolves to `undefined`; it is safe to remove. @@ -381,7 +380,7 @@ Notes: - **Success path** does not live here. `SaveResponseInOnyx` applies the server's `onyxData` + `successData` + `finallyData`; for WRITE-type responses that is routed through [`QueuedOnyxUpdates`](#queuedonyxupdates-and-queueflusheddata) (deferred until drain). - **`shouldFailAllRequests` vs. give-up** differ: the former applies failure **and** finally data; the give-up branch applies **only** failure data. -**Sharp edges.** For all non-`OPEN_APP` commands, a genuinely transient-but-long backend outage that exceeds the retry cap **silently drops** the user's optimistic write (applying `failureData`, no modal). Whether this is intended product behavior is an [open question](#open-questions-needs-maintainer-confirmation). +**Sharp edges.** For all non-`OPEN_APP` commands, a transient-but-long backend outage that exceeds the retry cap **drops** the user's optimistic write: `failureData` is applied (not `finallyData`), the request is removed from the queue, and **no modal** is shown. The failure modal (`setIsOpenAppFailureModalOpen`) fires for `OPEN_APP` only. --- @@ -518,15 +517,6 @@ Defined in `src/CONST/index.ts` under `CONST.NETWORK` (retry/backoff subset rele --- -# Open Questions (needs maintainer confirmation) - -Two items remain genuinely intent-ambiguous and need a maintainer to ratify, not assert as fact. - -1. **Silent give-up data loss for non-`OPEN_APP` commands.** After the retry cap, the request is dropped with `failureData` applied and **no modal**. Is this the intended product behavior, or should more commands surface a failure to the user? -2. **Sufficiency of `knownOngoingRequestIDs` as the sole guard on the `PERSISTED_ONGOING_REQUESTS` callback.** Unlike the array callback, it does not consult `pendingOnyxWrites` — `knownOngoingRequestIDs` is its only own-write guard (see [PersistedRequests](#persistedrequests-the-store)). Two paths are still adopted unconditionally: a `null` echo and a value with an unknown/missing `requestIndex`. Is the set genuinely sufficient for those two paths, or should the array callback's fuller protection be mirrored here? - ---- - # Relationship to Other Docs - [Network State Detection](NETWORK_STATE_DETECTION.md) — **how** the app detects connectivity and decides it is offline (the hard-stop model the queue's `getIsOffline()` gate reads). From d8f97fd81a64fc105574de3146d664c187c64653 Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Tue, 16 Jun 2026 14:55:44 +0200 Subject: [PATCH 08/12] Restate fire-and-forget persist windows as current-state facts Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index 8cb92b6339e9..cf3c3b5eda36 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -136,7 +136,11 @@ The persisted queue exists so an interrupted WRITE survives an app kill. The pie **What does _not_ recover:** - A **File/Blob ongoing request** — `PERSISTED_ONGOING_REQUESTS` holds `null` on disk (the live object was memory-only), so there is nothing to re-drive; the upload is lost. -- A request lost in one of the remaining **fire-and-forget persist windows** — the promotion (`processNextRequest`), removal (`endRequestAndRemoveFromQueue`), and rollback writes are not awaited, so a crash between an in-flight request acting on the server and its `multiSet` committing can leave the disk record momentarily stale. The *enqueue* window is no longer one of these: `push` now awaits the `save()` disk commit before flushing, so a freshly-pushed request cannot reach the server while still absent from disk. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). +- A request caught in a **fire-and-forget persist window** — a write that touches the server but whose `multiSet` to disk is not awaited, so the disk record is momentarily stale and a crash inside the window loses the request. Three such windows remain; the enqueue window does not. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). + - **Promotion** (`processNextRequest`) — not awaited. + - **Removal** (`endRequestAndRemoveFromQueue`) — not awaited. + - **Rollback** writes — not awaited. + - **Enqueue** (`push`) — _closed_: `push` awaits the `save()` disk commit before flushing, so a freshly-pushed request cannot reach the server while still absent from disk. **On sign-out (a different reset):** `Session.cleanupSession()` aborts the in-flight cancellable XHR via `HttpUtils.cancelPendingRequests()` — surfacing as `REQUEST_CANCELLED` to `process()`'s catch (dropped, no data applied) — and clears the persisted queue via `PersistedRequests.clear()`. This is the only inbound path that _aborts_ an in-flight request rather than letting it complete. A bare `Onyx.clear()` (without the abort) resets the mirror to `[]`/`null` through the carve-out path, but because `isInitialized` is already true the init callback does **not** re-fire, so clearing on its own schedules no flush; an in-flight `process()` chain holds local references and is unaffected by the memory wipe. From 5b354f46ff72e8e2aeaf5b4cba42f76ae2b4249a Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Tue, 16 Jun 2026 14:57:27 +0200 Subject: [PATCH 09/12] Replace uncommon words with plainer language in SEQUENTIAL_QUEUE.md Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index cf3c3b5eda36..7ce1020a10c4 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -189,7 +189,7 @@ Onyx (since 3.0.46) fires `set`/`multiSet` subscription callbacks **synchronousl - The `PERSISTED_ONGOING_REQUESTS` callback **is guarded against own-write echoes, but differently** from the array: it early-returns when the value's `requestIndex` is in `knownOngoingRequestIDs` (a set fed by `processNextRequest` and `updateOngoingRequest`). It never consults `pendingOnyxWrites`; `knownOngoingRequestIDs` is its only own-write guard (the set also catches stale serialized copies a write-counter cannot). Two paths are adopted unconditionally — a `null` echo and a value with an unknown (or missing) `requestIndex`. The same callback also fires `triggerInitializationCallback()` (= `flush`) post-init when a new ongoing request appears — a mid-session flush trigger (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). - The `pendingOnyxWrites` counter is a hand-rolled ordering mechanism reimplementing a guarantee the data layer doesn't provide. It is correct only as long as every own write is wrapped and increments/decrements stay balanced (`clear()`'s bare ongoing-key write is the one exception — see above). - Structural `deepEqual` removal is keyed on object shape, not identity. If a request object is mutated mid-flight by middleware (e.g. optimistic-ID rewriting, a `shouldRetry` flag, `File` stripping on serialization, an added rollback marker), a later `deepEqual` can fail to match the original. -- `persistWhenOngoing` is a **vestigial field**. It is read only for logging (in `processNextRequest`, `SequentialQueue`, and `RequestsQueuesState`) and is never assigned by any production write path — ongoing-request persistence is unconditional, gated only by `shouldPersistOngoingRequest` (serializability). Every production read resolves to `undefined`; it is safe to remove. +- `persistWhenOngoing` is a **vestigial field**. It is read only for logging (in `processNextRequest`, `SequentialQueue`, and `RequestsQueuesState`) and is never assigned by any production write path — ongoing-request persistence is unconditional, gated only by `shouldPersistOngoingRequest` (whether the request can be serialized). Every production read resolves to `undefined`; it is safe to remove. ## Where the request actually hits disk (and where it doesn't) @@ -436,7 +436,7 @@ Three concerns cut across every block. **Sharp edges.** - The chain order is implicit import-time module state; there is no runtime assertion that `SaveResponseInOnyx` is penultimate or that `Reauthentication` precedes it. - `SaveResponseInOnyx`'s early-return guard has a **dead term**: `onyxUpdates` defaults to `response.onyxData ?? []`, and an empty array is truthy, so the `!onyxUpdates` leg of the guard is never true — a response's empty `onyxData` cannot short-circuit the middleware. The sibling `successData`/`failureData`/`finallyData` legs still fire, so the `if` as a whole is not dead — only its empty-`onyxData` short-circuit is. -- The in-code "must be last" comment on `SaveResponseInOnyx` is **inaccurate**: `FraudMonitoring` is registered after it. The current order is safe only **incidentally** — `FraudMonitoring` returns the response object unchanged, so the `shouldPauseQueue` flag riding on it survives. The load-bearing invariant is "no middleware after `SaveResponseInOnyx` may mutate or replace the response," not literal lastness; nothing enforces either form. +- The in-code "must be last" comment on `SaveResponseInOnyx` is **inaccurate**: `FraudMonitoring` is registered after it. The current order is safe only **incidentally** — `FraudMonitoring` returns the response object unchanged, so the `shouldPauseQueue` flag riding on it survives. The load-bearing invariant is "no middleware after `SaveResponseInOnyx` may mutate or replace the response," not literally being registered last; nothing enforces either form. --- From aa722e823219b3fbdc3c410a9ace8a1fd626a953 Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Tue, 16 Jun 2026 15:14:33 +0200 Subject: [PATCH 10/12] Trim redundancy in SEQUENTIAL_QUEUE.md Remove the "Relationship to Other Docs" section (the Overview already links the two sibling docs), slim "Key Modules Reference" to a terse code-location index with section links, and simplify the architecture diagram to a high-level map (drop middleware names, guard-ordering lists, and the disk-write legend that the disk section already covers). Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 119 +++++++++---------------- 1 file changed, 43 insertions(+), 76 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index 7ce1020a10c4..20eff5b8031c 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -36,7 +36,6 @@ This is an observational reference. Where current behavior diverges from apparen - [Test Coverage](#test-coverage) - [Configuration Constants](#configuration-constants) - [Key Modules Reference](#key-modules-reference) -- [Relationship to Other Docs](#relationship-to-other-docs) ## Architecture Diagram @@ -44,70 +43,44 @@ This is an observational reference. Where current behavior diverges from apparen feature code │ API.write(command, params, onyxData, conflictResolver) ▼ -┌────────────────────────────────────────────────────────────┐ -│ API.index — prepareRequest │ -│ • apply optimisticData to Onyx (UI updates NOW) │ -│ • evaluate conflictResolver #1 (read-only: skip optimistic?)│ -│ • stamp requestIndex++ (per-tab counter; requestID legacy) │ -│ • strip optimisticData from the persisted request │ -└───────────────┬──────────────────────────────────────────────┘ - write │ read → waitForWrites() → SequentialQueue.waitForIdle() - ▼ (blocks on isReadyPromise) -┌────────────────────────────────────────────────────────────┐ -│ SequentialQueue.push() (async) │ -│ • conflictResolver #2 (authoritative: mutate the queue) │ -│ • PersistedRequests.save → in-memory SYNC; disk AWAITED │ -│ • mark isReadyPromise pending (sync, before first await) │ -│ • await persistencePromise (disk commit) THEN dispatch: │ -│ offline → await persist, return (reconnect flushes) │ -│ running → defer: isReadyPromise.then(flush(false)) │ -│ idle → flush(false) │ -└───────────────┬──────────────────────────────────────────────┘ +┌──────────────────────────────────────────────┐ +│ API.index — prepareRequest │ +│ apply optimisticData · conflictResolver #1 │ +│ (read-only) · stamp requestID · strip │ +│ optimisticData from the persisted request │ +└───────────────┬──────────────────────────────┘ + write │ read → waitForIdle() (blocks on isReadyPromise) ▼ -┌────────────────────────────────────────────────────────────┐ -│ flush(shouldResetPromise) │ -│ guards (in order): paused? · offline? · already running? · │ -│ all empty? (queue · ongoing · buffered │ -│ onyx updates) · NOT the leader? │ -│ → isSequentialQueueRunning = true │ -│ → (maybe) reset isReadyPromise │ -│ → load PERSISTED_REQUESTS, then process() │ -└───────────────┬──────────────────────────────────────────────┘ +┌──────────────────────────────────────────────┐ +│ SequentialQueue.push() (async) │ +│ conflictResolver #2 (authoritative) · persist │ +│ (PersistedRequests.save) · await disk commit, │ +│ THEN dispatch: offline → return (reconnect │ +│ flushes) · running → defer · idle → flush() │ +└───────────────┬──────────────────────────────┘ ▼ -┌────────────────────────────────────────────────────────────┐ -│ process() (recurses until queue empty) │ -│ • processNextRequest(): head → ongoingRequest │ -│ (atomic multiSet of both keys; returns LOCAL ref) │ -│ • processWithMiddleware(request, isFromSequentialQueue=true) │ -│ makeXHR → Logging → … → SaveResponseInOnyx → │ -│ FraudMonitoring │ -│ • .then → (pause if shouldPauseQueue) → remove request → │ -│ save queueFlushedData → throttle.clear → RECURSE │ -│ • .catch → error ladder (drop / treat-as-success / │ -│ retry+backoff / give-up) │ -└───────────────┬──────────────────────────────────────────────┘ +┌──────────────────────────────────────────────┐ +│ flush() — leader-only, single-flight │ +│ guards pass → load queue → process() │ +└───────────────┬──────────────────────────────┘ ▼ -┌────────────────────────────────────────────────────────────┐ -│ process().finally (the drain-end block, inside flush) │ -│ • isSequentialQueueRunning = false │ -│ • resolve isReadyPromise IFF offline OR queue empty │ -│ • if queue empty: flush QueuedOnyxUpdates, │ -│ then apply + clear queueFlushedData │ -└────────────────────────────────────────────────────────────┘ +┌──────────────────────────────────────────────┐ +│ process() (drains FIFO, recurses) │ +│ promote head → ongoing → middleware chain │ +│ success → remove + recurse │ +│ error → ladder (drop / retry / give up) │ +└───────────────┬──────────────────────────────┘ + ▼ +┌──────────────────────────────────────────────┐ +│ drain-end (process().finally) │ +│ clear running flag · resolve isReadyPromise · │ +│ on empty queue → flush buffered Onyx updates │ +└──────────────────────────────────────────────┘ boundary collaborators (contract only — linked, not owned here): • ActiveClientManager → isClientTheLeader() (web multi-tab) • NetworkState → getIsOffline() (see NETWORK_STATE_DETECTION.md) • OnyxUpdateManager → resolves shouldPauseQueue data gaps, then unpause() - - disk writes — PERSISTED_* keys; cache + subscribers update SYNC first, - the returned Onyx promise resolves on STORAGE COMMIT (IDB tx / SQLite WAL): - • #1 save() → PERSISTED_REQUESTS (in push; AWAITED before flush) - • #2 processNextRequest → PERSISTED_REQUESTS + ONGOING (promotion; fire-and-forget) - • end / rollback / updateOngoing (fire-and-forget) - • conflict update / delete (awaited by handleConflictActions) - ✓ push awaits the #1 disk commit before flush → makeXHR; makeXHR itself still gates - only on auth readiness, so the network can no longer fire ahead of the enqueue write ``` ## Lifecycle of One Request @@ -505,23 +478,17 @@ Defined in `src/CONST/index.ts` under `CONST.NETWORK` (retry/backoff subset rele # Key Modules Reference -| Module | Responsibility | -|---|---| -| `src/libs/Network/SequentialQueue.ts` | Leader-tab serial executor for persisted WRITE requests: persist-then-flush one at a time through the middleware with throttle/backoff and the error ladder; gates dependent READs via `isReadyPromise`. | -| `src/libs/actions/PersistedRequests.ts` | In-memory mirror (`persistedRequests` + `ongoingRequest`) of the offline write queue, reconciled with the `PERSISTED_REQUESTS` / `PERSISTED_ONGOING_REQUESTS` keys; in-memory authoritative after init. | -| `src/libs/RequestThrottle.ts` | Per-instance exponential-backoff timing: jittered first wait, doubling capped at the max, retry-count gate, argument-less reject as the give-up signal. | -| `src/libs/actions/QueuedOnyxUpdates.ts` | In-memory buffer of WRITE-response Onyx updates, held until the queue drains then flushed (with a not-signed-in preserved-keys filter) — anti-flicker. | -| `src/libs/actions/OnyxUpdates.ts` (`applyHTTPSOnyxUpdates`) | Routes WRITE-type responses through `queueOnyxUpdates` and READ-type through `Onyx.update` immediately; home of `doesClientNeedToBeUpdated`. | -| `src/libs/API/index.ts` | Public facade: `prepareRequest` applies optimistic data + stamps `requestID`; routes `write`→queue, `read`→`waitForWrites` then fire-and-forget, `makeRequestWithSideEffects`→immediate. | -| `src/libs/Request.ts` (`processWithMiddleware`) | Folds the middleware chain over `makeXHR(request)`; first-registered is innermost, response handlers fire in registration order. | -| `src/libs/Middleware/SaveResponseInOnyx.ts` | Penultimate middleware; sole producer of `shouldPauseQueue` (on a `previousUpdateID` gap); persists server Onyx updates. | -| `src/libs/Middleware/Reauthentication.ts` | Re-throws on `isFromSequentialQueue` so the queue retries; reacts to 407 by reauthenticating; may throw `new Error('Failed to reauthenticate')`. | -| `src/libs/ActiveClientManager/` | Leader election across web tabs (`isClientTheLeader`); no-op (always leader) on native. | -| `QUEUE_FLUSHED_DATA` (Onyx key) + `SequentialQueue.saveQueueFlushedData` | Distinct Onyx-persisted buffer applied only on full drain; sole producer is `App.getOnyxDataForOpenOrReconnect` carrying `HAS_LOADED_APP = true`. | +A code-location index — concept → file. The behavior lives in the sections above; this is just where to find it. ---- - -# Relationship to Other Docs - -- [Network State Detection](NETWORK_STATE_DETECTION.md) — **how** the app detects connectivity and decides it is offline (the hard-stop model the queue's `getIsOffline()` gate reads). -- [Offline UX Patterns](philosophies/OFFLINE.md) — **how features should respond** to being offline (optimistic patterns A/B/C/D), which the `API.write`/`read` contract implements. +| Module | Responsibility | Section | +|---|---|---| +| `src/libs/Network/SequentialQueue.ts` | The coordinator: persist-then-flush, drain loop, error ladder, `isReadyPromise` READ gate. | [SequentialQueue](#sequentialqueue-the-coordinator) | +| `src/libs/actions/PersistedRequests.ts` | In-memory-authoritative mirror of the `PERSISTED_REQUESTS` / `PERSISTED_ONGOING_REQUESTS` keys. | [PersistedRequests](#persistedrequests-the-store) | +| `src/libs/RequestThrottle.ts` | Jittered exponential backoff + the give-up signal. | [RequestThrottle](#requestthrottle-backoff) | +| `src/libs/actions/QueuedOnyxUpdates.ts` | In-memory buffer of WRITE-response Onyx updates, flushed on drain (anti-flicker). | [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata) | +| `src/libs/actions/OnyxUpdates.ts` | `applyHTTPSOnyxUpdates` WRITE/READ routing; `doesClientNeedToBeUpdated`. | [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata) | +| `src/libs/API/index.ts` | Public facade: `prepareRequest`; `write`/`read`/`makeRequestWithSideEffects` routing. | [Public Contract](#public-contract-the-api-layer) | +| `src/libs/Request.ts` | `processWithMiddleware` — folds the middleware chain over `makeXHR`. | [Middleware](#the-middleware-chain-boundary) | +| `src/libs/Middleware/SaveResponseInOnyx.ts` | Penultimate middleware; sole producer of `shouldPauseQueue`. | [Middleware](#the-middleware-chain-boundary) | +| `src/libs/Middleware/Reauthentication.ts` | Re-throws on `isFromSequentialQueue`; handles 407 reauth. | [Middleware](#the-middleware-chain-boundary) | +| `src/libs/ActiveClientManager/` | Web-tab leader election; always-leader on native. | [Multi-Tab](#multi-tab--leader-election) | From 72dfc76766d03b813bb95ac3c8701dee6b51d99c Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Wed, 17 Jun 2026 12:49:24 +0200 Subject: [PATCH 11/12] Address Codex review on SEQUENTIAL_QUEUE.md - Drop "exactly once" overstatement: writes are sent serially and deduplicated server-side on the rare resend, not via a client-side lock. - Distinguish push()'s offline-at-entry path (returns without arming the READ gate) from going offline during the awaited persist. - Reframe restart head-dedupe as a defensive backstop, consistent with the atomic multiSet promotion. - Restate remaining time-based phrasing as current-state facts. Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index 20eff5b8031c..49fe396a3465 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -2,7 +2,7 @@ ## Overview -The SequentialQueue is how the app keeps its **offline-first promise**: a user can "do their thing" regardless of connectivity, and every change they make is delivered to the server **once, in order, at the right time** — even across going offline, refreshing the tab, or restarting the app. +The SequentialQueue is how the app keeps its **offline-first promise**: a user can "do their thing" regardless of connectivity, and every change they make is sent to the server **serially, in order, at the right time** (deduplicated server-side on the rare resend from a retry or leadership change) — even across going offline, refreshing the tab, or restarting the app. Concretely, `SequentialQueue` is a **single-flight, FIFO, leader-only engine** that drains every `API.write()` request one at a time, with exponential backoff on failure, surviving offline periods and (for most requests) app restarts. It is a coordinator over four collaborators: @@ -89,7 +89,7 @@ This is the connective narrative — how the blocks interact at runtime for a si 1. **Prepare (synchronous, in `API.index.prepareRequest`).** `optimisticData` is applied to Onyx immediately, so the UI reflects the change at once. The conflict resolver is evaluated **read-only** here, solely to decide whether to *suppress* the optimistic data (`conflictAction.type === 'noAction'`); the rest of the conflict action is ignored at this site. A `requestID` is stamped from a per-tab counter (the field is literally `requestIndex`; `requestID` is read only as a legacy fallback in `getClientRequestIndex` — this doc keeps the conventional name), and `optimisticData` is stripped from the version that will be persisted (it has already been applied). -2. **Push (`async SequentialQueue.push`).** The conflict resolver runs a **second** time, now **authoritatively**, against the live persisted requests — this is the evaluation that may mutate the queue (push / replace / delete). The resolver function is then deleted off the request (it cannot be serialized), and `PersistedRequests.save` mutates the in-memory array **synchronously** while issuing the disk write (captured as `persistencePromise`). `push` then marks `isReadyPromise` pending synchronously (before its first `await`, so a READ on the next tick parks behind this write) and **`await`s `persistencePromise` — the disk commit — before flushing**. After the commit it dispatches on three conditions: **offline** → `await` persist and return (reconnect will flush later); **already running** → defer via `isReadyPromise.then(() => flush(false))`; **idle** → `flush(false)`. A network flip to offline *during* the awaited write is handled explicitly: `push` resolves `isReadyPromise` and returns without flushing. +2. **Push (`async SequentialQueue.push`).** The conflict resolver runs a **second** time, now **authoritatively**, against the live persisted requests — this is the evaluation that may mutate the queue (push / replace / delete). The resolver function is then deleted off the request (it cannot be serialized), and `PersistedRequests.save` mutates the in-memory array **synchronously** while issuing the disk write (captured as `persistencePromise`). **If the app is already offline at entry**, `push` `await`s `persistencePromise` and returns *without arming* `isReadyPromise` — the READ gate is left untouched (reconnect will flush later). Otherwise `push` marks `isReadyPromise` pending synchronously (before its first `await`, so a READ on the next tick parks behind this write) and **`await`s `persistencePromise` — the disk commit — before flushing**. After the commit it dispatches on three conditions: **went offline *during* the awaited persist** → resolve `isReadyPromise` and return without flushing; **already running** → defer via `isReadyPromise.then(() => flush(false))`; **idle** → `flush(false)`. 3. **Flush (`flush`).** Five guards run in order — paused, offline, already-running, all-empty (three-legged: queue, `ongoingRequest`, **and** the `QueuedOnyxUpdates` buffer — see [the coordinator](#sequentialqueue-the-coordinator)), **not-leader**. If all pass, `isSequentialQueueRunning` is set, `isReadyPromise` is optionally reset, and a throwaway `PERSISTED_REQUESTS` connection guarantees disk is loaded before `process()` runs (it opts out of connection reuse, because `PersistedRequests` already holds a live `PERSISTED_REQUESTS` connection and reusing it would fire extra callbacks). @@ -102,7 +102,7 @@ This is the connective narrative — how the blocks interact at runtime for a si The persisted queue exists so an interrupted WRITE survives an app kill. The pieces are described across the blocks below; here is the one cold-start path that strings them together. 1. **Disk load.** On startup, the `PERSISTED_REQUESTS` and `PERSISTED_ONGOING_REQUESTS` connect callbacks hydrate the in-memory mirror — `persistedRequests` from the array, `ongoingRequest` from the single key. -2. **Head dedupe.** If the app was killed _after_ a request was promoted to `ongoingRequest` but _before_ it was removed, the same request can sit both as `ongoingRequest` and as the head of the array. The init path strips the array head when it `deepEqual`s the rehydrated `ongoingRequest` — this is the reason that structural dedupe exists. +2. **Head dedupe.** The init path strips the array head when it `deepEqual`s the rehydrated `ongoingRequest`. Promotion writes both keys in one atomic `multiSet` (see [PersistedRequests](#persistedrequests-the-store)), so it cannot leave the same request both as `ongoingRequest` and as the array head — this dedupe is a **defensive backstop** for any persisted state where the two keys are out of sync on disk (e.g. a queue whose two key writes did not land together). 3. **Flush.** `onPersistedRequestsInitialization` fires `flush()` once there is recovered work (and, separately, `Network/index` fires a startup flush gated on `ActiveClientManager.isReady()`, so the leader gate is meaningful on the first drain). The registered callback is not startup-only: the `PERSISTED_ONGOING_REQUESTS` connect callback re-fires it post-init whenever a *new* ongoing request appears (e.g. observed from another tab) — see [Inbound Consumers](#inbound-consumers-who-calls-the-queue). 4. **Re-drive.** `processNextRequest` sees a non-null `ongoingRequest` and returns it directly (rather than promoting a new head), so the **same** interrupted request is re-sent. @@ -125,17 +125,17 @@ Each block is described as **the problem it solves → how it works today → sh ## SequentialQueue (the coordinator) -**Problem it solves.** Serialize every WRITE so requests reach the server one at a time, in submission order, exactly once — with automatic retry on transient failure and correct ordering of dependent READs. +**Problem it solves.** Serialize every WRITE so requests reach the server one at a time, in submission order — deduplicated server-side on the rare resend, not by a client-side exactly-once lock — with automatic retry on transient failure and correct ordering of dependent READs. **How it works today.** - **Single-flight.** `isSequentialQueueRunning` is the re-entry guard: set at the top of `flush()` after the guards pass, cleared in `process().finally`. While set, new pushes defer rather than start a parallel drain. -- **`push()` dispatch.** `push` is `async`. It marks `isReadyPromise` pending synchronously, then **`await`s the disk-persistence promise before dispatching**: offline → `await` persist, return; running → defer behind `isReadyPromise.then(() => flush(false))`; idle → `flush(false)`. The queue no longer fires off the synchronous in-memory state ahead of the commit. +- **`push()` dispatch.** `push` is `async`. **Offline at entry** → `await` persist and return *without arming* `isReadyPromise` (the READ gate stays untouched). Otherwise it marks `isReadyPromise` pending synchronously, then **`await`s the disk-persistence promise before dispatching**: went offline *during* persist → resolve `isReadyPromise`, return; running → defer behind `isReadyPromise.then(() => flush(false))`; idle → `flush(false)`. The disk commit gates the flush, so the queue never fires off the synchronous in-memory state ahead of it. - **`flush()` guards, in order.** `isQueuePaused` → `isOfflineNetwork()` → `isSequentialQueueRunning` → all-empty → `!isClientTheLeader()`. The all-empty guard is three-legged — no queued requests, no `ongoingRequest`, **and** an empty `QueuedOnyxUpdates` buffer — so `flush()` doubles as the drain path for buffered updates on a request-empty queue. Leadership is checked **exactly once**, before the running flag flips; the recursive `process()` chain re-checks only `isQueuePaused` and `isOfflineNetwork()`, never leadership. - **`process()` recursion.** Pull the next request, run middleware, on success recurse; the recursion terminates when the queue and ongoing request are both empty. - **READ-after-WRITE gate.** `waitForIdle()` returns `isReadyPromise`; READs block on it so a READ response cannot clobber optimistic data from in-flight WRITEs on the same keys. See [the state machine](#the-state-machine) for the resolution rule. **Sharp edges.** -- `push()` `await`s `persistencePromise` (the enqueue disk commit) before it ever calls `flush()`, so a freshly-pushed request cannot reach the server while still absent from disk — the enqueue crash window is closed — post-initialization only (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). `makeXHR` still gates only on auth readiness (never on a disk write); the ordering guarantee now comes from `push` awaiting the commit, not from `makeXHR`. The *promotion/removal/rollback* writes inside `process()` remain fire-and-forget (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). +- `push()` `await`s `persistencePromise` (the enqueue disk commit) before it ever calls `flush()`, so a freshly-pushed request cannot reach the server while still absent from disk — there is no enqueue crash window, post-initialization (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). `makeXHR` gates only on auth readiness (never on a disk write); the ordering guarantee comes from `push` awaiting the commit, not from `makeXHR`. The *promotion/removal/rollback* writes inside `process()` are fire-and-forget (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). - Leadership being checked once per flush means a leadership change *during* an in-flight drain is not re-evaluated by the running cycle (see [Multi-Tab & Leader Election](#multi-tab--leader-election)). ## PersistedRequests (the store) @@ -172,7 +172,7 @@ This is the crux for any crash-safety or persist-before-fire reasoning, so it is **The Onyx promise is a real disk-commit handle, not a cache handle.** In `OnyxUtils` (`setWithRetry`/`multiSetWithRetry`), the cache is updated and subscribers are notified **synchronously** via `broadcastUpdate`/`keyChanged` — there is a source comment, _"This approach prioritizes fast UI changes without waiting for data to be stored in device storage"_ — and the function then returns the `storage.setItem`/`storage.multiSet` chain. That returned promise resolves only after the storage provider commits: an IndexedDB transaction `complete` event on web, or a SQLite WAL commit on native (`journal_mode=WAL`, `synchronous=NORMAL` — durable across a JS/app crash, weaker than an fsync against OS-level power loss). So `push()`'s `persistencePromise` **does** represent durable persistence. Three caveats bound that statement: a **`null` write** returns `Promise.resolve()` immediately (the storage delete runs fire-and-forget — see below), so for null/delete payloads the promise is *not* a commit handle, and a `multiSet` containing a `null` value covers only its non-null pairs (e.g. `endRequestAndRemoveFromQueue`'s "atomic" promise covers only the array write, not the ongoing-key delete); Onyx's internal `retryOperation` **swallows storage failures** after `MAX_STORAGE_OPERATION_RETRY_ATTEMPTS` (5), resolving anyway; and on web Onyx can silently **degrade to `MemoryOnlyProvider`** when the IndexedDB store cannot be created, after which every "commit" is memory-only. -**And on the enqueue path, the disk write now gates the network.** `push` is `async`: it issues `save()` (synchronous in-memory update + the disk write captured as `persistencePromise`), marks `isReadyPromise` pending, then **`await`s `persistencePromise`** and only afterward calls `flush(false)`. So the enqueue commit lands before `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` ever runs. `makeXHR` itself still gates the `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) — it has no disk gate — but the request can no longer leave the device ahead of its enqueue commit, because `push` blocked on that commit first. (The in-flight `#2` promotion `multiSet` is still fire-and-forget; the gate closed is specifically the enqueue window.) **This holds only once `PersistedRequests` has initialized.** Before initialization, `save()` parks the request in `pendingSaveOperations` and returns `Promise.resolve()` — *not* a commit handle — so `push`'s `await` is a no-op. The real enqueue write is issued later by the init connect-callback as an **un-awaited** `Onyx.set(PERSISTED_REQUESTS, …)`, which then calls `triggerInitializationCallback()` → `flush()`. So during that startup window the XHR can still fire ahead of the enqueue commit; the crash window is closed only after initialization. +**And on the enqueue path, the disk write gates the network.** `push` is `async`: it issues `save()` (synchronous in-memory update + the disk write captured as `persistencePromise`), marks `isReadyPromise` pending, then **`await`s `persistencePromise`** and only afterward calls `flush(false)`. So the enqueue commit lands before `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` ever runs. `makeXHR` itself still gates the `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) — it has no disk gate — but the request cannot leave the device ahead of its enqueue commit, because `push` blocks on that commit first. (The in-flight `#2` promotion `multiSet` is fire-and-forget; the gated window is specifically the enqueue path.) **This holds only once `PersistedRequests` has initialized.** Before initialization, `save()` parks the request in `pendingSaveOperations` and returns `Promise.resolve()` — *not* a commit handle — so `push`'s `await` is a no-op. The real enqueue write is issued later by the init connect-callback as an **un-awaited** `Onyx.set(PERSISTED_REQUESTS, …)`, which then calls `triggerInitializationCallback()` → `flush()`. So during that startup window the XHR can fire ahead of the enqueue commit; the crash window stays open until initialization completes. ``` async push() (online · idle · no conflict) @@ -326,7 +326,7 @@ In the drain-end `process().finally`, `resolveIsReadyPromise` runs only when `is - **Offline** (`isOfflineNetwork()`) → resolve. The queue cannot process anyway, so READs should be allowed through (they'll show optimistic/stale data, which is acceptable offline). Going offline never sets `isQueuePaused`; it is checked directly. - **`isQueuePaused`** (a `shouldPauseQueue` data-gap sync, or a `DeferredOnyxUpdates`/`applyOnyxUpdatesReliably` pause) → do **not** resolve. WRITEs are still pending behind the gap, so READs must keep waiting or they'd clobber in-flight optimistic data. This is exactly why the gate keys on `isOfflineNetwork()` rather than `!isQueuePaused`. -**`resolveIsReadyPromise` is now called from four sites**, not just the `finally`. Because `push()` arms the pending promise *before* awaiting the disk write, every early-out path that would otherwise skip the drain has to release parked READs itself, or they would hang: +**`resolveIsReadyPromise` is called from four sites**, not just the `finally`. Because `push()` arms the pending promise *before* awaiting the disk write, every early-out path *that armed it* has to release parked READs itself, or they would hang. (The offline-at-entry early-out is the exception: it returns before arming, so it has nothing to release.) 1. The **drain-end `process().finally`** — when offline or the queue is empty (above). 2. The **all-empty guard in `flush()`** — e.g. a conflict resolver deleted the only request without pushing a replacement, so there is nothing to drain. @@ -439,7 +439,7 @@ Tests are useful here as evidence of the **intended contract** — what the team | Suite | What it pins as contract | |---|---| | `tests/unit/APITest.ts` | Offline-persist, online-flush, persisted-until-backend-response, retry/backoff for 5xx + Auth-down (asserts ~3 fetches with throttle waits), reauthentication + ordering, WRITE-blocks-READ, pause/unpause, no-duplicates + enable-feature conflict collapsing, supportal-411 no-retry + `failureData` | -| `tests/unit/SequentialQueueTest.ts` | Push/persist counts, conflict resolution (replace/push/noAction, incl. replace-while-ongoing — now `async`/awaited since `push` is async), move-to-ongoing persistence, startup hydration of the ongoing request, `queueFlushedData` lifecycle, `ALREADY_CREATED` treated as success without retry, **and the persist-before-fire edges: `waitForIdle` resolves without flushing when the network goes offline during persist, and a disk-write failure during persist does not strand `isReadyPromise` or block the queue** | +| `tests/unit/SequentialQueueTest.ts` | Push/persist counts, conflict resolution (replace/push/noAction, incl. replace-while-ongoing, `async`/awaited to match `push`'s async signature), move-to-ongoing persistence, startup hydration of the ongoing request, `queueFlushedData` lifecycle, `ALREADY_CREATED` treated as success without retry, **and the persist-before-fire edges: `waitForIdle` resolves without flushing when the network goes offline during persist, and a disk-write failure during persist does not strand `isReadyPromise` or block the queue** | | `tests/unit/PersistedRequests.ts` | Save/remove/processNext/update/updateOngoing, File/Blob handling, cross-tab follower reconciliation; the `Issue 3a`/`3c` cases assert the **durable** ongoing-request behavior (`processNextRequest` always persists via `multiSet`) — a legacy test title references the vestigial `persistWhenOngoing` flag | | `tests/unit/RequestConflictUtilsTest.ts` | Pure push/replace/delete/noAction resolver logic | | `tests/actions/QueuedOnyxUpdatesTest.ts` | Account-scoped update filtering on flush | From 2d31fb299d103d0ec6f3ff4bffcbbde92cbbb66a Mon Sep 17 00:00:00 2001 From: Adam Horodyski Date: Wed, 17 Jun 2026 18:47:15 +0200 Subject: [PATCH 12/12] Apply language rules to SEQUENTIAL_QUEUE.md: drop em dashes, expand acronyms, cut throat-clearing Co-Authored-By: Claude Opus 4.8 (1M context) --- contributingGuides/SEQUENTIAL_QUEUE.md | 224 ++++++++++++------------- 1 file changed, 112 insertions(+), 112 deletions(-) diff --git a/contributingGuides/SEQUENTIAL_QUEUE.md b/contributingGuides/SEQUENTIAL_QUEUE.md index 49fe396a3465..f98a6293b31d 100644 --- a/contributingGuides/SEQUENTIAL_QUEUE.md +++ b/contributingGuides/SEQUENTIAL_QUEUE.md @@ -2,24 +2,24 @@ ## Overview -The SequentialQueue is how the app keeps its **offline-first promise**: a user can "do their thing" regardless of connectivity, and every change they make is sent to the server **serially, in order, at the right time** (deduplicated server-side on the rare resend from a retry or leadership change) — even across going offline, refreshing the tab, or restarting the app. +The SequentialQueue is how the app keeps its **offline-first promise**: a user can "do their thing" regardless of connectivity, and every change they make is sent to the server **serially, in order, at the right time** (deduplicated server-side on the rare resend from a retry or leadership change), even across going offline, refreshing the tab, or restarting the app. -Concretely, `SequentialQueue` is a **single-flight, FIFO, leader-only engine** that drains every `API.write()` request one at a time, with exponential backoff on failure, surviving offline periods and (for most requests) app restarts. It is a coordinator over four collaborators: +`SequentialQueue` is a **single-flight, first-in-first-out (FIFO), leader-only engine** that drains every `API.write()` request one at a time, with exponential backoff on failure, surviving offline periods and (for most requests) app restarts. It is a coordinator over four collaborators: | Collaborator | Role | |---|---| -| **PersistedRequests** | The request store — an in-memory mirror of the on-disk queue (`persistedRequests` array + a single `ongoingRequest`). | +| **PersistedRequests** | The request store: an in-memory mirror of the on-disk queue (`persistedRequests` array plus a single `ongoingRequest`). | | **RequestThrottle** | Per-instance exponential backoff with jitter; produces the "give up" signal. | | **QueuedOnyxUpdates** | A separate in-memory buffer holding WRITE responses' Onyx updates until the whole queue drains (anti-flicker). | -| **The middleware chain** (`Request.processWithMiddleware`) | Where a single request actually runs end-to-end (XHR + auth + response persistence). | +| **The middleware chain** (`Request.processWithMiddleware`) | Where a single request actually runs end to end: the XMLHttpRequest (XHR), auth, and response persistence. | -> **The one inversion to internalize before reading further:** within this subsystem, **in-memory state is authoritative and disk is a lagging backup** — the opposite of the usual Onyx contract. `PersistedRequests` deliberately ignores most of its own Onyx subscription echoes. Roughly half of that module is defensive code protecting this invariant. A reader who misses this will misread much of it as redundant. See [Why in-memory is authoritative](#why-in-memory-is-authoritative). +> **One inversion to keep in mind:** within this subsystem, **in-memory state is authoritative and disk is a lagging backup**, the opposite of the usual Onyx contract. `PersistedRequests` deliberately ignores most of its own Onyx subscription echoes. Roughly half of that module is defensive code protecting this invariant. A reader who misses this will misread much of it as redundant. See [Why in-memory is authoritative](#why-in-memory-is-authoritative). This document covers **how the queue works today**. For sibling concerns: - **How the app decides it is offline** (the hard-stop model, failure tracking, reachability) → [Network State Detection](NETWORK_STATE_DETECTION.md). - **How features should behave when offline** (optimistic UX patterns A/B/C/D) → [Offline UX Patterns](philosophies/OFFLINE.md). -This is an observational reference. Where current behavior diverges from apparent intent, that is noted neutrally in a **Sharp edges** subsection on the relevant block. All references use module and function **names**, not line numbers, so the doc survives refactors — read it in terms of blocks and their relationships. +This is an observational reference. Where current behavior diverges from apparent intent, that is noted neutrally in a **Sharp edges** subsection on the relevant block. All references use module and function **names**, not line numbers, so the doc survives refactors. Read it in terms of blocks and their relationships. ## Contents @@ -77,7 +77,7 @@ This is an observational reference. Where current behavior diverges from apparen │ on empty queue → flush buffered Onyx updates │ └──────────────────────────────────────────────┘ - boundary collaborators (contract only — linked, not owned here): + boundary collaborators (contract only; linked, not owned here): • ActiveClientManager → isClientTheLeader() (web multi-tab) • NetworkState → getIsOffline() (see NETWORK_STATE_DETECTION.md) • OnyxUpdateManager → resolves shouldPauseQueue data gaps, then unpause() @@ -85,37 +85,37 @@ This is an observational reference. Where current behavior diverges from apparen ## Lifecycle of One Request -This is the connective narrative — how the blocks interact at runtime for a single `API.write()`. +This is the connective narrative: how the blocks interact at runtime for a single `API.write()`. -1. **Prepare (synchronous, in `API.index.prepareRequest`).** `optimisticData` is applied to Onyx immediately, so the UI reflects the change at once. The conflict resolver is evaluated **read-only** here, solely to decide whether to *suppress* the optimistic data (`conflictAction.type === 'noAction'`); the rest of the conflict action is ignored at this site. A `requestID` is stamped from a per-tab counter (the field is literally `requestIndex`; `requestID` is read only as a legacy fallback in `getClientRequestIndex` — this doc keeps the conventional name), and `optimisticData` is stripped from the version that will be persisted (it has already been applied). +1. **Prepare (synchronous, in `API.index.prepareRequest`).** `optimisticData` is applied to Onyx immediately, so the UI reflects the change at once. The conflict resolver is evaluated **read-only** here, solely to decide whether to *suppress* the optimistic data (`conflictAction.type === 'noAction'`); the rest of the conflict action is ignored at this site. A `requestID` is stamped from a per-tab counter (the field is literally `requestIndex`; `requestID` is read only as a legacy fallback in `getClientRequestIndex`; this doc keeps the conventional name), and `optimisticData` is stripped from the version that will be persisted (it has already been applied). -2. **Push (`async SequentialQueue.push`).** The conflict resolver runs a **second** time, now **authoritatively**, against the live persisted requests — this is the evaluation that may mutate the queue (push / replace / delete). The resolver function is then deleted off the request (it cannot be serialized), and `PersistedRequests.save` mutates the in-memory array **synchronously** while issuing the disk write (captured as `persistencePromise`). **If the app is already offline at entry**, `push` `await`s `persistencePromise` and returns *without arming* `isReadyPromise` — the READ gate is left untouched (reconnect will flush later). Otherwise `push` marks `isReadyPromise` pending synchronously (before its first `await`, so a READ on the next tick parks behind this write) and **`await`s `persistencePromise` — the disk commit — before flushing**. After the commit it dispatches on three conditions: **went offline *during* the awaited persist** → resolve `isReadyPromise` and return without flushing; **already running** → defer via `isReadyPromise.then(() => flush(false))`; **idle** → `flush(false)`. +2. **Push (`async SequentialQueue.push`).** The conflict resolver runs a **second** time, now **authoritatively**, against the live persisted requests. This is the evaluation that may mutate the queue (push, replace, or delete). The resolver function is then deleted off the request (it cannot be serialized), and `PersistedRequests.save` mutates the in-memory array **synchronously** while issuing the disk write (captured as `persistencePromise`). **If the app is already offline at entry**, `push` `await`s `persistencePromise` and returns *without arming* `isReadyPromise`. The READ gate is left untouched (reconnect will flush later). Otherwise `push` marks `isReadyPromise` pending synchronously (before its first `await`, so a READ on the next tick parks behind this write) and **`await`s `persistencePromise`, the disk commit, before flushing**. After the commit it dispatches on three conditions: **went offline *during* the awaited persist** → resolve `isReadyPromise` and return without flushing; **already running** → defer via `isReadyPromise.then(() => flush(false))`; **idle** → `flush(false)`. -3. **Flush (`flush`).** Five guards run in order — paused, offline, already-running, all-empty (three-legged: queue, `ongoingRequest`, **and** the `QueuedOnyxUpdates` buffer — see [the coordinator](#sequentialqueue-the-coordinator)), **not-leader**. If all pass, `isSequentialQueueRunning` is set, `isReadyPromise` is optionally reset, and a throwaway `PERSISTED_REQUESTS` connection guarantees disk is loaded before `process()` runs (it opts out of connection reuse, because `PersistedRequests` already holds a live `PERSISTED_REQUESTS` connection and reusing it would fire extra callbacks). +3. **Flush (`flush`).** Five guards run in order: paused, offline, already-running, all-empty (three-legged: queue, `ongoingRequest`, **and** the `QueuedOnyxUpdates` buffer, see [the coordinator](#sequentialqueue-the-coordinator)), **not-leader**. If all pass, `isSequentialQueueRunning` is set, `isReadyPromise` is optionally reset, and a throwaway `PERSISTED_REQUESTS` connection guarantees disk is loaded before `process()` runs (it opts out of connection reuse, because `PersistedRequests` already holds a live `PERSISTED_REQUESTS` connection and reusing it would fire extra callbacks). 4. **Process (`process`).** Guards again (paused / offline / empty), then `processNextRequest` promotes the head to `ongoingRequest` and runs the middleware chain. On success it removes the request and **recurses**; on error it walks the [error ladder](#error-handling). -5. **Finally (the drain-end `process().finally`, chained inside `flush`).** The running flag clears, `isReadyPromise` resolves **iff offline or the queue is empty**, and — only when the queue is fully drained — `QueuedOnyxUpdates` is flushed and `queueFlushedData` is applied and cleared. +5. **Finally (the drain-end `process().finally`, chained inside `flush`).** The running flag clears, `isReadyPromise` resolves **only when offline or the queue is empty**, and, only when the queue is fully drained, `QueuedOnyxUpdates` is flushed and `queueFlushedData` is applied and cleared. ## Restart Recovery The persisted queue exists so an interrupted WRITE survives an app kill. The pieces are described across the blocks below; here is the one cold-start path that strings them together. -1. **Disk load.** On startup, the `PERSISTED_REQUESTS` and `PERSISTED_ONGOING_REQUESTS` connect callbacks hydrate the in-memory mirror — `persistedRequests` from the array, `ongoingRequest` from the single key. -2. **Head dedupe.** The init path strips the array head when it `deepEqual`s the rehydrated `ongoingRequest`. Promotion writes both keys in one atomic `multiSet` (see [PersistedRequests](#persistedrequests-the-store)), so it cannot leave the same request both as `ongoingRequest` and as the array head — this dedupe is a **defensive backstop** for any persisted state where the two keys are out of sync on disk (e.g. a queue whose two key writes did not land together). -3. **Flush.** `onPersistedRequestsInitialization` fires `flush()` once there is recovered work (and, separately, `Network/index` fires a startup flush gated on `ActiveClientManager.isReady()`, so the leader gate is meaningful on the first drain). The registered callback is not startup-only: the `PERSISTED_ONGOING_REQUESTS` connect callback re-fires it post-init whenever a *new* ongoing request appears (e.g. observed from another tab) — see [Inbound Consumers](#inbound-consumers-who-calls-the-queue). +1. **Disk load.** On startup, the `PERSISTED_REQUESTS` and `PERSISTED_ONGOING_REQUESTS` connect callbacks hydrate the in-memory mirror: `persistedRequests` from the array, `ongoingRequest` from the single key. +2. **Head dedupe.** The init path strips the array head when it `deepEqual`s the rehydrated `ongoingRequest`. Promotion writes both keys in one atomic `multiSet` (see [PersistedRequests](#persistedrequests-the-store)), so it cannot leave the same request both as `ongoingRequest` and as the array head. This dedupe is a **defensive backstop** for any persisted state where the two keys are out of sync on disk (e.g. a queue whose two key writes did not land together). +3. **Flush.** `onPersistedRequestsInitialization` fires `flush()` once there is recovered work (and, separately, `Network/index` fires a startup flush gated on `ActiveClientManager.isReady()`, so the leader gate is meaningful on the first drain). The registered callback is not startup-only: the `PERSISTED_ONGOING_REQUESTS` connect callback re-fires it post-init whenever a *new* ongoing request appears (e.g. observed from another tab). See [Inbound Consumers](#inbound-consumers-who-calls-the-queue). 4. **Re-drive.** `processNextRequest` sees a non-null `ongoingRequest` and returns it directly (rather than promoting a new head), so the **same** interrupted request is re-sent. **What does _not_ recover:** -- A **File/Blob ongoing request** — `PERSISTED_ONGOING_REQUESTS` holds `null` on disk (the live object was memory-only), so there is nothing to re-drive; the upload is lost. -- A request caught in a **fire-and-forget persist window** — a write that touches the server but whose `multiSet` to disk is not awaited, so the disk record is momentarily stale and a crash inside the window loses the request. Three such windows remain; the enqueue window does not. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). - - **Promotion** (`processNextRequest`) — not awaited. - - **Removal** (`endRequestAndRemoveFromQueue`) — not awaited. - - **Rollback** writes — not awaited. - - **Enqueue** (`push`) — _closed_: `push` awaits the `save()` disk commit before flushing, so a freshly-pushed request cannot reach the server while still absent from disk. +- A **File/Blob ongoing request**. `PERSISTED_ONGOING_REQUESTS` holds `null` on disk (the live object was memory-only), so there is nothing to re-drive; the upload is lost. +- A request caught in a **fire-and-forget persist window**: a write that touches the server but whose `multiSet` to disk is not awaited, so the disk record is momentarily stale and a crash inside the window loses the request. Three such windows remain; the enqueue window does not. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). + - **Promotion** (`processNextRequest`): not awaited. + - **Removal** (`endRequestAndRemoveFromQueue`): not awaited. + - **Rollback** writes: not awaited. + - **Enqueue** (`push`), now _closed_: `push` awaits the `save()` disk commit before flushing, so a freshly-pushed request cannot reach the server while still absent from disk. -**On sign-out (a different reset):** `Session.cleanupSession()` aborts the in-flight cancellable XHR via `HttpUtils.cancelPendingRequests()` — surfacing as `REQUEST_CANCELLED` to `process()`'s catch (dropped, no data applied) — and clears the persisted queue via `PersistedRequests.clear()`. This is the only inbound path that _aborts_ an in-flight request rather than letting it complete. A bare `Onyx.clear()` (without the abort) resets the mirror to `[]`/`null` through the carve-out path, but because `isInitialized` is already true the init callback does **not** re-fire, so clearing on its own schedules no flush; an in-flight `process()` chain holds local references and is unaffected by the memory wipe. +**On sign-out (a different reset):** `Session.cleanupSession()` aborts the in-flight cancellable XHR via `HttpUtils.cancelPendingRequests()` (this surfaces as `REQUEST_CANCELLED` to `process()`'s catch, dropped, no data applied) and clears the persisted queue via `PersistedRequests.clear()`. This is the only inbound path that _aborts_ an in-flight request rather than letting it complete. A bare `Onyx.clear()` (without the abort) resets the mirror to `[]`/`null` through the carve-out path, but because `isInitialized` is already true the init callback does **not** re-fire, so clearing on its own schedules no flush; an in-flight `process()` chain holds local references and is unaffected by the memory wipe. --- @@ -125,17 +125,17 @@ Each block is described as **the problem it solves → how it works today → sh ## SequentialQueue (the coordinator) -**Problem it solves.** Serialize every WRITE so requests reach the server one at a time, in submission order — deduplicated server-side on the rare resend, not by a client-side exactly-once lock — with automatic retry on transient failure and correct ordering of dependent READs. +**Problem it solves.** Serialize every WRITE so requests reach the server one at a time, in submission order, with automatic retry on transient failure and correct ordering of dependent READs. On the rare resend, duplicates are reconciled server-side, not by a client-side exactly-once lock. **How it works today.** - **Single-flight.** `isSequentialQueueRunning` is the re-entry guard: set at the top of `flush()` after the guards pass, cleared in `process().finally`. While set, new pushes defer rather than start a parallel drain. - **`push()` dispatch.** `push` is `async`. **Offline at entry** → `await` persist and return *without arming* `isReadyPromise` (the READ gate stays untouched). Otherwise it marks `isReadyPromise` pending synchronously, then **`await`s the disk-persistence promise before dispatching**: went offline *during* persist → resolve `isReadyPromise`, return; running → defer behind `isReadyPromise.then(() => flush(false))`; idle → `flush(false)`. The disk commit gates the flush, so the queue never fires off the synchronous in-memory state ahead of it. -- **`flush()` guards, in order.** `isQueuePaused` → `isOfflineNetwork()` → `isSequentialQueueRunning` → all-empty → `!isClientTheLeader()`. The all-empty guard is three-legged — no queued requests, no `ongoingRequest`, **and** an empty `QueuedOnyxUpdates` buffer — so `flush()` doubles as the drain path for buffered updates on a request-empty queue. Leadership is checked **exactly once**, before the running flag flips; the recursive `process()` chain re-checks only `isQueuePaused` and `isOfflineNetwork()`, never leadership. +- **`flush()` guards, in order.** `isQueuePaused` → `isOfflineNetwork()` → `isSequentialQueueRunning` → all-empty → `!isClientTheLeader()`. The all-empty guard is three-legged: no queued requests, no `ongoingRequest`, **and** an empty `QueuedOnyxUpdates` buffer. So `flush()` doubles as the drain path for buffered updates on a request-empty queue. Leadership is checked **exactly once**, before the running flag flips; the recursive `process()` chain re-checks only `isQueuePaused` and `isOfflineNetwork()`, never leadership. - **`process()` recursion.** Pull the next request, run middleware, on success recurse; the recursion terminates when the queue and ongoing request are both empty. - **READ-after-WRITE gate.** `waitForIdle()` returns `isReadyPromise`; READs block on it so a READ response cannot clobber optimistic data from in-flight WRITEs on the same keys. See [the state machine](#the-state-machine) for the resolution rule. **Sharp edges.** -- `push()` `await`s `persistencePromise` (the enqueue disk commit) before it ever calls `flush()`, so a freshly-pushed request cannot reach the server while still absent from disk — there is no enqueue crash window, post-initialization (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). `makeXHR` gates only on auth readiness (never on a disk write); the ordering guarantee comes from `push` awaiting the commit, not from `makeXHR`. The *promotion/removal/rollback* writes inside `process()` are fire-and-forget (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). +- `push()` `await`s `persistencePromise` (the enqueue disk commit) before it ever calls `flush()`, so a freshly-pushed request cannot reach the server while still absent from disk. There is no enqueue crash window once initialization completes (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). `makeXHR` gates only on auth readiness (never on a disk write); the ordering guarantee comes from `push` awaiting the commit, not from `makeXHR`. The *promotion/removal/rollback* writes inside `process()` are fire-and-forget (see [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt)). - Leadership being checked once per flush means a leadership change *during* an in-flight drain is not re-evaluated by the running cycle (see [Multi-Tab & Leader Election](#multi-tab--leader-election)). ## PersistedRequests (the store) @@ -144,35 +144,35 @@ Each block is described as **the problem it solves → how it works today → sh **How it works today.** - **Two Onyx keys, one mirror.** `PERSISTED_REQUESTS` (the array) and `PERSISTED_ONGOING_REQUESTS` (the single in-flight request) back an in-memory mirror (`persistedRequests`, `ongoingRequest`). After initialization the in-memory copies are authoritative. -- **The ongoing-request model.** `processNextRequest` captures the head as a **local** reference, sets `ongoingRequest`, trims the queue (`slice(1)`), and writes **both keys in a single atomic `Onyx.multiSet`** — so a crash cannot leave the request both "in the queue" and "ongoing." `rollbackOngoingRequest` and `endRequestAndRemoveFromQueue` likewise update both keys together. -- **Why `processNextRequest` returns the local reference.** Onyx's post-`multiSet` callback fires synchronously and would overwrite the module-level `ongoingRequest` with a JSON-serialized copy — which strips `File`/`Blob` payloads. Returning the captured local reference preserves the live objects for the request about to run. (The `knownOngoingRequestIDs` own-write guard ignores that echo for requests carrying a `requestIndex` — the local-ref return is load-bearing for those without one.) -- **File/Blob _ongoing_ requests are not durably persisted.** `shouldPersistOngoingRequest` returns false when any value in `request.data` is a `File`/`Blob`, so `null` is written to `PERSISTED_ONGOING_REQUESTS` and the live object is kept only in the in-memory `ongoingRequest` mirror (a `File`/`Blob` cannot be structured-cloned to survive a restart anyway). Note this null-out guard applies **only to the ongoing key** — the `PERSISTED_REQUESTS` array writes the full request, File/Blob included; IndexedDB can store it on web, while SQLite reduces it to `{}` on native. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). +- **The ongoing-request model.** `processNextRequest` captures the head as a **local** reference, sets `ongoingRequest`, trims the queue (`slice(1)`), and writes **both keys in a single atomic `Onyx.multiSet`**, so a crash cannot leave the request both "in the queue" and "ongoing." `rollbackOngoingRequest` and `endRequestAndRemoveFromQueue` likewise update both keys together. +- **Why `processNextRequest` returns the local reference.** Onyx's post-`multiSet` callback fires synchronously and would overwrite the module-level `ongoingRequest` with a JSON-serialized copy, which strips `File`/`Blob` payloads. Returning the captured local reference preserves the live objects for the request about to run. (The `knownOngoingRequestIDs` own-write guard ignores that echo for requests carrying a `requestIndex`; the local-ref return is load-bearing for those without one.) +- **File/Blob _ongoing_ requests are not durably persisted.** `shouldPersistOngoingRequest` returns false when any value in `request.data` is a `File`/`Blob`, so `null` is written to `PERSISTED_ONGOING_REQUESTS` and the live object is kept only in the in-memory `ongoingRequest` mirror (a `File`/`Blob` cannot be structured-cloned to survive a restart anyway). This null-out guard applies **only to the ongoing key**. The `PERSISTED_REQUESTS` array writes the full request, File/Blob included; IndexedDB can store it on web, while SQLite reduces it to `{}` on native. See [Where the request actually hits disk](#where-the-request-actually-hits-disk-and-where-it-doesnt). - **Structural removal.** `endRequestAndRemoveFromQueue` and the init-time dedupe match by `deepEqual` (structural), **not** by `requestID`. `requestID` matching is used only for cross-tab merge and leader-deletion reconciliation (via `knownRequestIDs`). - **`getLength()` counts the ongoing request.** It returns the array length plus one when `ongoingRequest` is non-null, so `getLength() === 0` means the queue is truly idle. ### Why in-memory is authoritative -Onyx (since 3.0.46) fires `set`/`multiSet` subscription callbacks **synchronously and sometimes out of order**. A naive subscriber that adopted every echoed value could let a stale, older echo overwrite correct in-memory state — losing or duplicating queued requests (the failure mode behind that era's lost/duplicated-request bug). So `PersistedRequests` ignores its own write echoes: +Onyx (since 3.0.46) fires `set`/`multiSet` subscription callbacks **synchronously and sometimes out of order**. A subscriber that adopted every echoed value could let a stale, older echo overwrite correct in-memory state, losing or duplicating queued requests (the failure mode behind that era's lost/duplicated-request bug). So `PersistedRequests` ignores its own write echoes: -- The `PERSISTED_REQUESTS` callback, once initialized, does **not** blindly adopt disk. It absorbs only requests whose `requestID` is **not** in `knownRequestIDs` (genuine cross-tab writes — which it re-persists and reports via the cross-tab callback), and it reconciles leader-tab deletions only while `pendingOnyxWrites === 0`. +- The `PERSISTED_REQUESTS` callback, once initialized, does **not** blindly adopt disk. It absorbs only requests whose `requestID` is **not** in `knownRequestIDs` (genuine cross-tab writes, which it re-persists and reports via the cross-tab callback), and it reconciles leader-tab deletions only while `pendingOnyxWrites === 0`. - `pendingOnyxWrites` is a hand-maintained counter: `trackOnyxWrite` wraps the `Onyx.set`/`multiSet` calls the module issues (across both keys) so the callback can tell "this echo is my own in-flight write" from "this is a real external change." One write escapes the wrapper: `clear()` issues its `Onyx.set(PERSISTED_ONGOING_REQUESTS, null)` bare (only the companion array write is tracked). - **Two carve-outs** still adopt disk: a `null` value (e.g. from `Onyx.clear()`) falls through to the disk-load/re-init path (the guard is gated on `val != null`); and genuinely-new cross-tab requests identified by an unknown `requestID`. **Sharp edges.** -- The `PERSISTED_ONGOING_REQUESTS` callback **is guarded against own-write echoes, but differently** from the array: it early-returns when the value's `requestIndex` is in `knownOngoingRequestIDs` (a set fed by `processNextRequest` and `updateOngoingRequest`). It never consults `pendingOnyxWrites`; `knownOngoingRequestIDs` is its only own-write guard (the set also catches stale serialized copies a write-counter cannot). Two paths are adopted unconditionally — a `null` echo and a value with an unknown (or missing) `requestIndex`. The same callback also fires `triggerInitializationCallback()` (= `flush`) post-init when a new ongoing request appears — a mid-session flush trigger (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). -- The `pendingOnyxWrites` counter is a hand-rolled ordering mechanism reimplementing a guarantee the data layer doesn't provide. It is correct only as long as every own write is wrapped and increments/decrements stay balanced (`clear()`'s bare ongoing-key write is the one exception — see above). +- The `PERSISTED_ONGOING_REQUESTS` callback **is guarded against own-write echoes, but differently** from the array: it early-returns when the value's `requestIndex` is in `knownOngoingRequestIDs` (a set fed by `processNextRequest` and `updateOngoingRequest`). It never consults `pendingOnyxWrites`; `knownOngoingRequestIDs` is its only own-write guard (the set also catches stale serialized copies a write-counter cannot). Two paths are adopted unconditionally: a `null` echo and a value with an unknown (or missing) `requestIndex`. The same callback also fires `triggerInitializationCallback()` (= `flush`) post-init when a new ongoing request appears, a mid-session flush trigger (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). +- The `pendingOnyxWrites` counter is a hand-rolled ordering mechanism reimplementing a guarantee the data layer doesn't provide. It is correct only as long as every own write is wrapped and increments/decrements stay balanced (`clear()`'s bare ongoing-key write is the one exception; see above). - Structural `deepEqual` removal is keyed on object shape, not identity. If a request object is mutated mid-flight by middleware (e.g. optimistic-ID rewriting, a `shouldRetry` flag, `File` stripping on serialization, an added rollback marker), a later `deepEqual` can fail to match the original. -- `persistWhenOngoing` is a **vestigial field**. It is read only for logging (in `processNextRequest`, `SequentialQueue`, and `RequestsQueuesState`) and is never assigned by any production write path — ongoing-request persistence is unconditional, gated only by `shouldPersistOngoingRequest` (whether the request can be serialized). Every production read resolves to `undefined`; it is safe to remove. +- `persistWhenOngoing` is a **vestigial field**. It is read only for logging (in `processNextRequest`, `SequentialQueue`, and `RequestsQueuesState`) and is never assigned by any production write path. Ongoing-request persistence is unconditional, gated only by `shouldPersistOngoingRequest` (whether the request can be serialized). Every production read resolves to `undefined`; it is safe to remove. ## Where the request actually hits disk (and where it doesn't) -This is the crux for any crash-safety or persist-before-fire reasoning, so it is worth pinning exactly. +This is the crux for any crash-safety or persist-before-fire reasoning, so here is exactly where a write hits disk. -**A WRITE persists through one function.** `PersistedRequests.save()` — invoked by `SequentialQueue.push()` directly, or via `handleConflictActions` for the conflict paths. Every mutator in `PersistedRequests` shares one shape: the in-memory module state (`persistedRequests`, `ongoingRequest`, `knownRequestIDs`) is updated **synchronously first**, then the Onyx write is issued with the already-updated value, wrapped in `trackOnyxWrite`. +**A WRITE persists through one function.** `PersistedRequests.save()`, invoked by `SequentialQueue.push()` directly or via `handleConflictActions` for the conflict paths. Every mutator in `PersistedRequests` shares one shape: the in-memory module state (`persistedRequests`, `ongoingRequest`, `knownRequestIDs`) is updated **synchronously first**, then the Onyx write is issued with the already-updated value, wrapped in `trackOnyxWrite`. -**The Onyx promise is a real disk-commit handle, not a cache handle.** In `OnyxUtils` (`setWithRetry`/`multiSetWithRetry`), the cache is updated and subscribers are notified **synchronously** via `broadcastUpdate`/`keyChanged` — there is a source comment, _"This approach prioritizes fast UI changes without waiting for data to be stored in device storage"_ — and the function then returns the `storage.setItem`/`storage.multiSet` chain. That returned promise resolves only after the storage provider commits: an IndexedDB transaction `complete` event on web, or a SQLite WAL commit on native (`journal_mode=WAL`, `synchronous=NORMAL` — durable across a JS/app crash, weaker than an fsync against OS-level power loss). So `push()`'s `persistencePromise` **does** represent durable persistence. Three caveats bound that statement: a **`null` write** returns `Promise.resolve()` immediately (the storage delete runs fire-and-forget — see below), so for null/delete payloads the promise is *not* a commit handle, and a `multiSet` containing a `null` value covers only its non-null pairs (e.g. `endRequestAndRemoveFromQueue`'s "atomic" promise covers only the array write, not the ongoing-key delete); Onyx's internal `retryOperation` **swallows storage failures** after `MAX_STORAGE_OPERATION_RETRY_ATTEMPTS` (5), resolving anyway; and on web Onyx can silently **degrade to `MemoryOnlyProvider`** when the IndexedDB store cannot be created, after which every "commit" is memory-only. +**The Onyx promise is a real disk-commit handle, not a cache handle.** In `OnyxUtils` (`setWithRetry`/`multiSetWithRetry`), the cache is updated and subscribers are notified **synchronously** via `broadcastUpdate`/`keyChanged` (a source comment says _"This approach prioritizes fast UI changes without waiting for data to be stored in device storage"_), and the function then returns the `storage.setItem`/`storage.multiSet` chain. That returned promise resolves only after the storage provider commits: an IndexedDB transaction `complete` event on web, or a SQLite write-ahead logging (WAL) commit on native (`journal_mode=WAL`, `synchronous=NORMAL`, durable across a JS/app crash, weaker than an fsync against OS-level power loss). So `push()`'s `persistencePromise` **does** represent durable persistence. Three caveats bound that statement: a **`null` write** returns `Promise.resolve()` immediately (the storage delete runs fire-and-forget, see below), so for null/delete payloads the promise is *not* a commit handle, and a `multiSet` containing a `null` value covers only its non-null pairs (e.g. `endRequestAndRemoveFromQueue`'s "atomic" promise covers only the array write, not the ongoing-key delete); Onyx's internal `retryOperation` **swallows storage failures** after `MAX_STORAGE_OPERATION_RETRY_ATTEMPTS` (5), resolving anyway; and on web Onyx can silently **degrade to `MemoryOnlyProvider`** when the IndexedDB store cannot be created, after which every "commit" is memory-only. -**And on the enqueue path, the disk write gates the network.** `push` is `async`: it issues `save()` (synchronous in-memory update + the disk write captured as `persistencePromise`), marks `isReadyPromise` pending, then **`await`s `persistencePromise`** and only afterward calls `flush(false)`. So the enqueue commit lands before `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` ever runs. `makeXHR` itself still gates the `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) — it has no disk gate — but the request cannot leave the device ahead of its enqueue commit, because `push` blocks on that commit first. (The in-flight `#2` promotion `multiSet` is fire-and-forget; the gated window is specifically the enqueue path.) **This holds only once `PersistedRequests` has initialized.** Before initialization, `save()` parks the request in `pendingSaveOperations` and returns `Promise.resolve()` — *not* a commit handle — so `push`'s `await` is a no-op. The real enqueue write is issued later by the init connect-callback as an **un-awaited** `Onyx.set(PERSISTED_REQUESTS, …)`, which then calls `triggerInitializationCallback()` → `flush()`. So during that startup window the XHR can fire ahead of the enqueue commit; the crash window stays open until initialization completes. +**On the enqueue path, the disk write gates the network.** `push` is `async`: it issues `save()` (synchronous in-memory update plus the disk write captured as `persistencePromise`), marks `isReadyPromise` pending, then **`await`s `persistencePromise`** and only afterward calls `flush(false)`. So the enqueue commit lands before `flush()` → `process()` → `processWithMiddleware()` → `makeXHR()` ever runs. `makeXHR` itself still gates the `HttpUtils.xhr` call only on `hasReadRequiredDataFromStorage()` (auth/NetworkStore readiness) and has no disk gate, but the request cannot leave the device ahead of its enqueue commit, because `push` blocks on that commit first. (The in-flight `#2` promotion `multiSet` is fire-and-forget; the gated window is specifically the enqueue path.) **This holds only once `PersistedRequests` has initialized.** Before initialization, `save()` parks the request in `pendingSaveOperations` and returns `Promise.resolve()` (*not* a commit handle), so `push`'s `await` is a no-op. The real enqueue write is issued later by the init connect-callback as an **un-awaited** `Onyx.set(PERSISTED_REQUESTS, …)`, which then calls `triggerInitializationCallback()` → `flush()`. So during that startup window the XHR can fire ahead of the enqueue commit; the crash window stays open until initialization completes. ``` async push() (online · idle · no conflict) @@ -192,7 +192,7 @@ async push() (online · idle · no conflict) └─ process() ├─ processNextRequest → Onyx.multiSet(PERSISTED_REQUESTS, ONGOING) ▒▒ #2 (not awaited) └─ processWithMiddleware → makeXHR - └─ gate: hasReadRequiredDataFromStorage() (auth only — no disk gate) + └─ gate: hasReadRequiredDataFromStorage() (auth only, no disk gate) └─ HttpUtils.xhr(...) ───────────────────────► NETWORK FIRES ╔══════════════════════════════════════════════════════════════════╗ @@ -208,37 +208,37 @@ async push() (online · idle · no conflict) | When | Function | Key(s) written | Awaited by the control flow? | |---|---|---|---| -| Enqueue | `save()` | `PERSISTED_REQUESTS` | **Yes** — `push` awaits `persistencePromise` before `flush()` | -| Promote head → ongoing | `processNextRequest()` | both (atomic `multiSet`) | No — fire-and-forget | -| Success / non-retryable drop | `endRequestAndRemoveFromQueue()` | both (atomic `multiSet`) | No — fire-and-forget | -| Retryable failure | `rollbackOngoingRequest()` | both (atomic `multiSet`) | No — fire-and-forget | -| Conflict REPLACE | `update()` | `PERSISTED_REQUESTS` | **Yes** — `handleConflictActions` awaits it | -| Conflict DELETE | `deleteRequestsByIndices()` | `PERSISTED_REQUESTS` | **Yes** — `handleConflictActions` awaits it | +| Enqueue | `save()` | `PERSISTED_REQUESTS` | **Yes**, `push` awaits `persistencePromise` before `flush()` | +| Promote head → ongoing | `processNextRequest()` | both (atomic `multiSet`) | No, fire-and-forget | +| Success / non-retryable drop | `endRequestAndRemoveFromQueue()` | both (atomic `multiSet`) | No, fire-and-forget | +| Retryable failure | `rollbackOngoingRequest()` | both (atomic `multiSet`) | No, fire-and-forget | +| Conflict REPLACE | `update()` | `PERSISTED_REQUESTS` | **Yes**, `handleConflictActions` awaits it | +| Conflict DELETE | `deleteRequestsByIndices()` | `PERSISTED_REQUESTS` | **Yes**, `handleConflictActions` awaits it | | In-flight update (e.g. reauth) | `updateOngoingRequest()` | `PERSISTED_ONGOING_REQUESTS` | No | -> `save()`'s disk-write `.catch` (and those on `update()` / `deleteRequestsByIndices()`) `Log.alert`s a storage emergency but does **not** re-throw, so `persistencePromise` resolves to `void` whether the storage commit **succeeded or failed** — even a consumer that awaits it (e.g. `push` itself, or `API.write`) cannot detect a failed persist. `push` leans on this deliberately: its `await persistencePromise` is wrapped in a `try/catch` that flushes anyway on rejection, since the request is already in the in-memory queue. Onyx itself can also swallow a failed commit — the retry caveat above — so the silence has two layers. +> `save()`'s disk-write `.catch` (and those on `update()` / `deleteRequestsByIndices()`) `Log.alert`s a storage emergency but does **not** re-throw, so `persistencePromise` resolves to `void` whether the storage commit **succeeded or failed**. Even a consumer that awaits it (e.g. `push` itself, or `API.write`) cannot detect a failed persist. `push` leans on this deliberately: its `await persistencePromise` is wrapped in a `try/catch` that flushes anyway on rejection, since the request is already in the in-memory queue. Onyx itself can also swallow a failed commit (the retry caveat above), so the silence has two layers. **Where a request does _not_ hit disk:** -- **READ commands and `makeRequestWithSideEffects`** never route through `save()` — nothing is persisted or recovered for them. +- **READ commands and `makeRequestWithSideEffects`** never route through `save()`. Nothing is persisted or recovered for them. - **`save()` before initialization** parks the request in `pendingSaveOperations` and returns `Promise.resolve()`; the real write is deferred to the init connect callback. -- **A File/Blob _ongoing_ request** writes `null` to `PERSISTED_ONGOING_REQUESTS` (kept in memory only) and is unrecoverable on restart — but see the asymmetry below: the queue _array_ does write it. -- **`Onyx.set(key, null)`** (e.g. clearing the ongoing key) is routed to a storage **delete** (`remove()`), not a value write — and its returned promise is `Promise.resolve()`, resolved before the delete commits. -- **Subscriber broadcasts** are the synchronous cache/callback loop; they run before — and independently of — the storage write. -- **Asymmetry:** the File/Blob null-out guard applies **only to the ongoing key**. `PERSISTED_REQUESTS` array writes carry no such guard and pass the File/Blob through to the storage provider intact — IndexedDB structured-clone stores a `File` on web, while SQLite's `JSON.stringify` reduces it to `{}` (data lost) on native. File/Blob is therefore **not** uniformly kept off disk. +- **A File/Blob _ongoing_ request** writes `null` to `PERSISTED_ONGOING_REQUESTS` (kept in memory only) and is unrecoverable on restart. But see the asymmetry below: the queue _array_ does write it. +- **`Onyx.set(key, null)`** (e.g. clearing the ongoing key) is routed to a storage **delete** (`remove()`), not a value write. Its returned promise is `Promise.resolve()`, resolved before the delete commits. +- **Subscriber broadcasts** are the synchronous cache/callback loop; they run before the storage write and independently of it. +- **Asymmetry:** the File/Blob null-out guard applies **only to the ongoing key**. `PERSISTED_REQUESTS` array writes carry no such guard and pass the File/Blob through to the storage provider intact. IndexedDB structured-clone stores a `File` on web, while SQLite's `JSON.stringify` reduces it to `{}` (data lost) on native. File/Blob is therefore **not** uniformly kept off disk. ## RequestThrottle (backoff) -**Problem it solves.** Keep retrying a transiently-failing request without hammering a degraded backend — spread load with jittered exponential backoff — and provide a definite "give up" signal when retries are exhausted. +**Problem it solves.** Keep retrying a transiently-failing request without hammering a degraded backend. Spread load with jittered exponential backoff, and provide a definite "give up" signal when retries are exhausted. **How it works today.** - **`getRequestWaitTime`** seeds the first wait with a random jitter in `[MIN_RETRY_WAIT_TIME_MS, MAX_RANDOM_RETRY_WAIT_TIME_MS]` = `[10, 100]` ms, then **doubles** the prior wait on each retry, capped at `MAX_RETRY_WAIT_TIME_MS` (10 s). -- **`sleep`** increments the retry count and picks the cap — `MAX_OPEN_APP_REQUEST_RETRIES` (2) for the `OPEN_APP` command, else `MAX_REQUEST_RETRIES` (10). Within the cap it resolves after the wait; once exceeded it **rejects with no argument** — this argument-less rejection is the give-up signal that `process()`'s catch consumes. -- **`clear`** resets the wait, the retry count, and any pending timeout — called on success and on any non-retryable outcome. +- **`sleep`** increments the retry count and picks the cap: `MAX_OPEN_APP_REQUEST_RETRIES` (2) for the `OPEN_APP` command, else `MAX_REQUEST_RETRIES` (10). Within the cap it resolves after the wait; once exceeded it **rejects with no argument**. This argument-less rejection is the give-up signal that `process()`'s catch consumes. +- **`clear`** resets the wait, the retry count, and any pending timeout. It is called on success and on any non-retryable outcome. - The queue uses a single shared instance, `sequentialQueueRequestThrottle`. **Sharp edges.** -- After the retry cap, a request is **permanently dropped** (see the [give-up row](#error-handling)) — for non-`OPEN_APP` commands with no user-facing modal. +- After the retry cap, a request is **permanently dropped** (see the [give-up row](#error-handling)), for non-`OPEN_APP` commands with no user-facing modal. - `clear()` fires on every success, so a burst that interleaves successes with failures keeps resetting backoff to the floor, weakening the intended exponential spacing against a degraded backend. ## QueuedOnyxUpdates and queueFlushedData @@ -247,15 +247,15 @@ These are **two distinct deferral mechanisms** that are easy to confuse. **Problem `QueuedOnyxUpdates` solves.** Prevent UI flicker: if each WRITE's response Onyx updates were applied immediately as the queue drained, the UI would replay intermediate states. Instead, WRITE responses are buffered and applied **after the whole queue drains**. -**How `QueuedOnyxUpdates` works.** It is a module-level **in-memory** buffer. `queueOnyxUpdates` appends and resolves immediately; `flushQueue` snapshots-and-clears (to avoid races) then applies via `Onyx.update`. The WRITE-vs-READ routing decision lives **upstream** in `OnyxUpdates.applyHTTPSOnyxUpdates`: WRITE-type responses (`onyxData`, then `successData`/`failureData`/`finallyData`) route through `queueOnyxUpdates`; READ-type responses apply via `Onyx.update` immediately. When not signed in, `flushQueue` filters the snapshot to a 15-key preserved allowlist (`SESSION`, `NETWORK`, `IS_LOADING_APP`, `HAS_LOADED_APP`, `CREDENTIALS`, `ACCOUNT`, `MODAL`, `PRESERVED_USER_SESSION`, the theme/locale/try-new-dot/focus-mode NVPs, and the three RAM-only loading flags); the filter is skipped entirely under `CONFIG.IS_TEST_ENV`. The pause early-return lives in `SequentialQueue.flushOnyxUpdatesQueue` (returns early when `isQueuePaused`). It is invoked at the end of a full drain — and from two other entry points: `flush()`'s all-empty guard counts a non-empty buffer as pending work (a flush with zero requests proceeds purely to drain the buffer), and `unpause()` calls it directly when the queue is already empty. +**How `QueuedOnyxUpdates` works.** It is a module-level **in-memory** buffer. `queueOnyxUpdates` appends and resolves immediately; `flushQueue` snapshots-and-clears (to avoid races) then applies via `Onyx.update`. The WRITE-vs-READ routing decision lives **upstream** in `OnyxUpdates.applyHTTPSOnyxUpdates`: WRITE-type responses (`onyxData`, then `successData`/`failureData`/`finallyData`) route through `queueOnyxUpdates`; READ-type responses apply via `Onyx.update` immediately. When not signed in, `flushQueue` filters the snapshot to a 15-key preserved allowlist (`SESSION`, `NETWORK`, `IS_LOADING_APP`, `HAS_LOADED_APP`, `CREDENTIALS`, `ACCOUNT`, `MODAL`, `PRESERVED_USER_SESSION`, the theme/locale/try-new-dot/focus-mode name-value pairs (NVPs), and the three RAM-only loading flags); the filter is skipped entirely under `CONFIG.IS_TEST_ENV`. The pause early-return lives in `SequentialQueue.flushOnyxUpdatesQueue` (returns early when `isQueuePaused`). It is invoked at the end of a full drain. It also runs from two other entry points: `flush()`'s all-empty guard counts a non-empty buffer as pending work (a flush with zero requests proceeds purely to drain the buffer), and `unpause()` calls it directly when the queue is already empty. -**Problem `queueFlushedData` solves.** Apply a small piece of data **only after a full drain** — specifically, mark the app as loaded only once the queue has actually emptied (not mid-drain). +**Problem `queueFlushedData` solves.** Apply a small piece of data **only after a full drain**: mark the app as loaded only once the queue has actually emptied, not mid-drain. **How `queueFlushedData` works.** It is a **distinct, Onyx-persisted** buffer (`QUEUE_FLUSHED_DATA`), separate from the in-memory `QueuedOnyxUpdates`. `SequentialQueue.saveQueueFlushedData` appends a successfully-processed request's `queueFlushedData` field; the queue applies it via `Onyx.update` and clears it only when fully drained (after `flushOnyxUpdatesQueue`). Its sole producer is `App.getOnyxDataForOpenOrReconnect` (`OPEN_APP` / `ReconnectApp`), currently carrying exactly one entry: a merge of `HAS_LOADED_APP = true`. **Sharp edges.** - Both apply **only** when the queue reaches fully-empty. Under sustained WRITE pressure neither applies, so `HAS_LOADED_APP` never flips and the buffers accumulate. -- The application chain in the drain-end `process().finally` has no `.catch` — a failed `Onyx.update` silently skips the subsequent clear. +- The application chain in the drain-end `process().finally` has no `.catch`. A failed `Onyx.update` silently skips the subsequent clear. ## Public Contract (the API layer) @@ -265,42 +265,42 @@ These are **two distinct deferral mechanisms** that are easy to confuse. | Method | Persisted? | Retried? | Ordering | Notes | |---|---|---|---|---| -| `API.write` | Yes (via `push`) | Yes (throttle) | FIFO, leader-only | Applies `optimisticData` synchronously, strips it from the persisted request, then `push`es. Resolves to `void` — does **not** await the network round-trip. Maps to [OFFLINE.md](philosophies/OFFLINE.md) patterns A/B. | +| `API.write` | Yes (via `push`) | Yes (throttle) | FIFO, leader-only | Applies `optimisticData` synchronously, strips it from the persisted request, then `push`es. Resolves to `void`; does **not** await the network round-trip. Maps to [OFFLINE.md](philosophies/OFFLINE.md) patterns A/B. | | `API.read` | No | No | Blocks on `waitForIdle()` (`isReadyPromise`) before sending | Applies `optimisticData` but never persists. Fire-and-forget through the middleware. Two short-lived-auth sign-in commands bypass `waitForWrites`; `API.paginate`'s READ flavor sits behind the same gate. | -| `API.makeRequestWithSideEffects` | No | No | None — bypasses the queue entirely | For commands whose caller needs the response. Online-only in effect. | +| `API.makeRequestWithSideEffects` | No | No | None; bypasses the queue entirely | For commands whose caller needs the response. Online-only in effect. | - **The conflict resolver runs twice.** Read-only in `prepareRequest` (only to set "apply optimistic data?"), then authoritatively in `push` (mutates the queue). See [Conflict Resolution](#conflict-resolution). - **`API.read` blocks on writes** because a READ that returned before in-flight WRITEs landed could overwrite optimistic data on the same keys. **Sharp edges.** -- `makeRequestWithSideEffects` is **not** persisted or retried. Offline, the underlying `fetch` fails and the promise the caller received **rejects** (logged and re-thrown by the `Logging` middleware). It is not silently swallowed — but nothing retries it, so the caller must handle the rejection. +- `makeRequestWithSideEffects` is **not** persisted or retried. Offline, the underlying `fetch` fails and the promise the caller received **rejects** (logged and re-thrown by the `Logging` middleware). It is not silently swallowed, but nothing retries it, so the caller must handle the rejection. ## Inbound Consumers (who calls the queue) -The blocks above describe what the queue does; this is the inbound surface — who drives it and the ordering guarantees they lean on. +The blocks above describe what the queue does; this is the inbound surface: who drives it and the ordering guarantees they lean on. **Who triggers `flush()`:** - `push()` on the idle-online path. -- `onPersistedRequestsInitialization` — recovered work at startup (see [Restart Recovery](#restart-recovery)). -- `onCrossTabRequestsMerged` — another tab enqueued a genuinely-new request. -- The `PERSISTED_ONGOING_REQUESTS` connect callback — post-init, when a **new** ongoing request appears (e.g. observed from another tab), it re-fires the registered initialization callback, which is `flush`. -- `Reconnect` — on **two** edges: the offline→online `NetworkState` transition and the app-became-active listener. Both are deliberately decoupled from `reconnect()`'s data-sync (`openApp`/`reconnectApp`), which never flushes. -- `Network/index` startup — `ActiveClientManager.isReady().then(flush)`, the primary "drain whatever survived the last session, once we know who the leader is" trigger. +- `onPersistedRequestsInitialization`: recovered work at startup (see [Restart Recovery](#restart-recovery)). +- `onCrossTabRequestsMerged`: another tab enqueued a genuinely-new request. +- The `PERSISTED_ONGOING_REQUESTS` connect callback: post-init, when a **new** ongoing request appears (e.g. observed from another tab), it re-fires the registered initialization callback, which is `flush`. +- `Reconnect`, on **two** edges: the offline→online `NetworkState` transition and the app-became-active listener. Both are deliberately decoupled from `reconnect()`'s data-sync (`openApp`/`reconnectApp`), which never flushes. +- `Network/index` startup: `ActiveClientManager.isReady().then(flush)`, the primary "drain whatever survived the last session, once we know who the leader is" trigger. - The **native background-fetch wake-up** (`backgroundTask`), which flushes on a native scheduler tick. **Two ordering gates (don't conflate them):** -- **`waitForIdle()`** resolves when the _whole_ queue drains (returns `isReadyPromise`). Beyond `API.read`, it is consumed by auth-token-swap and post-sign-in flows — `Delegate` connect/disconnect, `Session` merge-account, and `SignInModal` before `openApp` — to fully drain pending WRITEs under the **current** authToken before swapping tokens; sending a queued WRITE under a new token trips `Reauthentication` and forces a logout. +- **`waitForIdle()`** resolves when the _whole_ queue drains (returns `isReadyPromise`). Beyond `API.read`, it is consumed by auth-token-swap and post-sign-in flows (`Delegate` connect/disconnect, `Session` merge-account, and `SignInModal` before `openApp`) to fully drain pending WRITEs under the **current** authToken before swapping tokens. Sending a queued WRITE under a new token trips `Reauthentication` and forces a logout. - **`getCurrentRequest()`** resolves when only the _single in-flight_ request's `process()` chain settles (or immediately if none is in flight). Its production consumer is `User.ts`'s Pusher multi-event handler, which sequences server-pushed Onyx updates **after** the in-flight WRITE so a push cannot apply ahead of the WRITE response that may carry related updates. **Direct `PersistedRequests` callers (bypassing `push`/`flush`):** -- `App` clear-and-restore and `ImportOnyxState` — `rollbackOngoingRequest()` to snapshot/restore the queue around an `Onyx.clear()` or a state import. The `App` path also takes a `getAll()` snapshot and restores via direct `save()` calls — a raw enqueue that bypasses `push()` (no conflict resolution, no flush trigger). -- `Report.getGuidedSetupDataForOpenReport` — `getAll()` (read-only) to scan the queue for an already-enqueued `OPEN_REPORT` carrying `guidedSetupData`, preventing duplication. -- `Policy` and `HandleUnusedOptimisticID` — `getAll()` + `update()`/`updateOngoingRequest()` to rewrite a queued (or in-flight) request in place, e.g. optimistic-ID repair. -- `API.index` — `getLength()` (read-only). This does **not** gate reads; `waitForWrites` always awaits `waitForSequentialQueueIdle()` (= `isReadyPromise`) regardless of the count. -- `isPaused()` / `isRunning()` are read-only accessors — `RequestsQueuesState` diagnostics reads these plus `getAll()`/`getOngoingRequest()` directly; `MainQueue` consumes `isRunning()` **behaviorally**, holding its own processing while the sequential queue drains. +- `App` clear-and-restore and `ImportOnyxState`: `rollbackOngoingRequest()` to snapshot/restore the queue around an `Onyx.clear()` or a state import. The `App` path also takes a `getAll()` snapshot and restores via direct `save()` calls, a raw enqueue that bypasses `push()` (no conflict resolution, no flush trigger). +- `Report.getGuidedSetupDataForOpenReport`: `getAll()` (read-only) to scan the queue for an already-enqueued `OPEN_REPORT` carrying `guidedSetupData`, preventing duplication. +- `Policy` and `HandleUnusedOptimisticID`: `getAll()` + `update()`/`updateOngoingRequest()` to rewrite a queued (or in-flight) request in place, e.g. optimistic-ID repair. +- `API.index`: `getLength()` (read-only). This does **not** gate reads; `waitForWrites` always awaits `waitForSequentialQueueIdle()` (= `isReadyPromise`) regardless of the count. +- `isPaused()` / `isRunning()` are read-only accessors. `RequestsQueuesState` diagnostics reads these plus `getAll()`/`getOngoingRequest()` directly; `MainQueue` consumes `isRunning()` **behaviorally**, holding its own processing while the sequential queue drains. --- @@ -312,8 +312,8 @@ The blocks above describe what the queue does; this is the inbound surface — w |---|---|---|---|---| | `isSequentialQueueRunning` | Single-flight / re-entry guard | top of `flush()` after guards pass | `process().finally` | At most one drain in progress per tab | | `currentRequestPromise` | The in-flight `process()` chain (for `getCurrentRequest`) | start of the process chain | the drain-end `process().finally` (set to `null`) | Non-null only while a request is in flight | -| `isQueuePaused` | Data-gap / deferred-update pause **only** (never offline) | `pause()` — the `shouldPauseQueue` response, `DeferredOnyxUpdates`, `applyOnyxUpdatesReliably` | `unpause()` / `resetQueue()` | While true, nothing is processed; offline is a **separate** `isOfflineNetwork()` check in `push`/`flush`/`process` | -| `isReadyPromise` / `resolveIsReadyPromise` | One-shot gate READs block on (READ-after-WRITE ordering) | starts **already-resolved** (`Promise.resolve()`) at module load; armed pending via `setIsReadyPromisePending()` — in `push()`'s sync prelude and in `flush(true)` | `resolveIsReadyPromise()` — see the four sites below | A READ may proceed only when no WRITE it must follow is pending | +| `isQueuePaused` | Data-gap / deferred-update pause **only** (never offline) | `pause()`: the `shouldPauseQueue` response, `DeferredOnyxUpdates`, `applyOnyxUpdatesReliably` | `unpause()` / `resetQueue()` | While true, nothing is processed; offline is a **separate** `isOfflineNetwork()` check in `push`/`flush`/`process` | +| `isReadyPromise` / `resolveIsReadyPromise` | One-shot gate READs block on (READ-after-WRITE ordering) | starts **already-resolved** (`Promise.resolve()`) at module load; armed pending via `setIsReadyPromisePending()`, in `push()`'s sync prelude and in `flush(true)` | `resolveIsReadyPromise()`; see the four sites below | A READ may proceed only when no WRITE it must follow is pending | | `isReadyPromisePending` | Idempotency guard for `setIsReadyPromisePending()` (prevents orphaning READs parked on a prior pending promise) | `true` when a pending promise is armed | `false` inside `resolveIsReadyPromise` | At most one pending `isReadyPromise` is armed at a time | | `shouldFailAllRequests` | Sticky `NETWORK`-key flag → erroring requests are failed and dropped | `NETWORK` Onyx callback | `NETWORK` Onyx callback | Test/debug only | | `queueFlushedDataToStore` | In-memory mirror of `QUEUE_FLUSHED_DATA` | the `QUEUE_FLUSHED_DATA` connect-callback echo of `saveQueueFlushedData`'s `Onyx.set` | `clearQueueFlushedData` | Applied only on full drain | @@ -321,19 +321,19 @@ The blocks above describe what the queue does; this is the inbound surface — w ### Why `isReadyPromise` resolves on offline, not on paused -In the drain-end `process().finally`, `resolveIsReadyPromise` runs only when `isOfflineNetwork() || !hasRemainingRequests` — keyed on the **direct offline check**, **deliberately not** on `isQueuePaused`. The two are distinct conditions, not one overloaded flag: +In the drain-end `process().finally`, `resolveIsReadyPromise` runs only when `isOfflineNetwork() || !hasRemainingRequests`, keyed on the **direct offline check**, **deliberately not** on `isQueuePaused`. The two are distinct conditions, not one overloaded flag: - **Offline** (`isOfflineNetwork()`) → resolve. The queue cannot process anyway, so READs should be allowed through (they'll show optimistic/stale data, which is acceptable offline). Going offline never sets `isQueuePaused`; it is checked directly. - **`isQueuePaused`** (a `shouldPauseQueue` data-gap sync, or a `DeferredOnyxUpdates`/`applyOnyxUpdatesReliably` pause) → do **not** resolve. WRITEs are still pending behind the gap, so READs must keep waiting or they'd clobber in-flight optimistic data. This is exactly why the gate keys on `isOfflineNetwork()` rather than `!isQueuePaused`. **`resolveIsReadyPromise` is called from four sites**, not just the `finally`. Because `push()` arms the pending promise *before* awaiting the disk write, every early-out path *that armed it* has to release parked READs itself, or they would hang. (The offline-at-entry early-out is the exception: it returns before arming, so it has nothing to release.) -1. The **drain-end `process().finally`** — when offline or the queue is empty (above). -2. The **all-empty guard in `flush()`** — e.g. a conflict resolver deleted the only request without pushing a replacement, so there is nothing to drain. -3. The **not-leader guard in `flush()`** — followers never process the queue, so they resolve immediately (otherwise a follower tab's READs would hang forever after any write). -4. **`push()` itself**, when the network flips offline *during* the awaited disk write — `flush()`'s offline early-return wouldn't resolve, so `push` resolves and returns without flushing. +1. The **drain-end `process().finally`**: when offline or the queue is empty (above). +2. The **all-empty guard in `flush()`**: e.g. a conflict resolver deleted the only request without pushing a replacement, so there is nothing to drain. +3. The **not-leader guard in `flush()`**: followers never process the queue, so they resolve immediately (otherwise a follower tab's READs would hang forever after any write). +4. **`push()` itself**, when the network flips offline *during* the awaited disk write. `flush()`'s offline early-return wouldn't resolve, so `push` resolves and returns without flushing. -`unpause()` calls `flush(false)` precisely to **preserve** the existing `isReadyPromise` — swapping in a fresh one would orphan READs already awaiting the old promise. (`setIsReadyPromisePending()` is idempotent for the same reason: it no-ops when a pending promise is already armed.) +`unpause()` calls `flush(false)` precisely to **preserve** the existing `isReadyPromise`. Swapping in a fresh one would orphan READs already awaiting the old promise. (`setIsReadyPromisePending()` is idempotent for the same reason: it no-ops when a pending promise is already armed.) **Sharp edges.** `isReadyPromise` is a single shared one-shot resolver. If a flush round arms it but never reaches any of the four resolve sites (e.g. a persistent, unresolvable `shouldPauseQueue` data gap), subsequent READs blocked on `waitForIdle()` have no timeout. @@ -341,17 +341,17 @@ In the drain-end `process().finally`, `resolveIsReadyPromise` runs only when `is # Error Handling -When a request errors, `process()`'s `.catch` walks an ordered ladder. **Which Onyx data gets applied diverges sharply by error class** — this table is the contract. +When a request errors, `process()`'s `.catch` walks an ordered ladder. **Which Onyx data gets applied diverges sharply by error class.** This table is the contract. | Error class | Matched by | Retries? | Onyx data applied | User-facing modal | Source / meaning | |---|---|---|---|---|---| -| `REQUEST_CANCELLED` | `error.name` | No — drop | **None** | No | Sign-out cancels in-flight requests; optimistic state assumed correct | -| `DUPLICATE_RECORD` | `error.message` | No — drop | **None** | No | Record already exists server-side; optimistic state assumed correct | -| `shouldFailAllRequests` | `NETWORK` flag (same branch as above) | No — drop | `failureData` **+** `finallyData` | No | Test/debug fail-all | -| `ALREADY_CREATED` | `error.message` | No — drop | `successData` **+** `finallyData` | No | Resource already created server-side → treat as success | -| `THROTTLED` on `RESEND_VALIDATE_CODE` | `error.message` + command | No — drop | `failureData` | No | Rate-limit (429); do not retry to avoid spam | -| Generic (everything else) | — | **Yes** — `rollbackOngoingPersistedRequest()` then `throttle.sleep().then(process)` | (none until resolved/given-up) | No | Transient network/server errors | -| Give-up (generic, retries exhausted) | `throttle.sleep` rejects with no arg | No — drop | **`failureData` only** (not `finallyData`) | **`OPEN_APP` only** (`setIsOpenAppFailureModalOpen`) | Retry cap (10, or 2 for `OPEN_APP`) reached | +| `REQUEST_CANCELLED` | `error.name` | No, drop | **None** | No | Sign-out cancels in-flight requests; optimistic state assumed correct | +| `DUPLICATE_RECORD` | `error.message` | No, drop | **None** | No | Record already exists server-side; optimistic state assumed correct | +| `shouldFailAllRequests` | `NETWORK` flag (same branch as above) | No, drop | `failureData` **+** `finallyData` | No | Test/debug fail-all | +| `ALREADY_CREATED` | `error.message` | No, drop | `successData` **+** `finallyData` | No | Resource already created server-side → treat as success | +| `THROTTLED` on `RESEND_VALIDATE_CODE` | `error.message` + command | No, drop | `failureData` | No | Rate-limit (429); do not retry to avoid spam | +| Generic (everything else) | (fallthrough) | **Yes**: `rollbackOngoingPersistedRequest()` then `throttle.sleep().then(process)` | (none until resolved/given-up) | No | Transient network/server errors | +| Give-up (generic, retries exhausted) | `throttle.sleep` rejects with no arg | No, drop | **`failureData` only** (not `finallyData`) | **`OPEN_APP` only** (`setIsOpenAppFailureModalOpen`) | Retry cap (10, or 2 for `OPEN_APP`) reached | Notes: - **Success path** does not live here. `SaveResponseInOnyx` applies the server's `onyxData` + `successData` + `finallyData`; for WRITE-type responses that is routed through [`QueuedOnyxUpdates`](#queuedonyxupdates-and-queueflusheddata) (deferred until drain). @@ -371,27 +371,27 @@ Three concerns cut across every block. **Problem it solves.** On web, `PERSISTED_REQUESTS` is a single Onyx key replicated to **every tab** of one browser. Without coordination, N tabs would each drain the shared queue and send every request N times. -**How it works today.** `flush()` hard-gates on `isClientTheLeader()` (the last GUID in the shared active-clients list wins — with an `isPromotingNewLeader` carve-out: during the `beforeunload` handoff the departing leader keeps answering `true` even when its GUID is no longer last). Only the leader drains; followers may still **enqueue** (their genuinely-new requests merge cross-tab via `requestID` not in `knownRequestIDs`) but never send. Leadership is checked **once** per flush, before the running flag flips. +**How it works today.** `flush()` hard-gates on `isClientTheLeader()` (the last client GUID (globally unique identifier) in the shared active-clients list wins, with an `isPromotingNewLeader` carve-out: during the `beforeunload` handoff the departing leader keeps answering `true` even when its GUID is no longer last). Only the leader drains; followers may still **enqueue** (their genuinely-new requests merge cross-tab via `requestID` not in `knownRequestIDs`) but never send. Leadership is checked **once** per flush, before the running flag flips. **Sharp edges.** -- **Mid-flush leadership change.** Because leadership is checked once and the async `process()` recursion never re-checks, a leader tab closed/refreshed mid-request can hand leadership to a follower that flushes the same shared queue while the old request is still in flight. There is no cross-tab lock on the ongoing request — duplicate-send avoidance relies on server-side `DUPLICATE_RECORD` / `ALREADY_CREATED` reconciliation, not a lock. +- **Mid-flush leadership change.** Because leadership is checked once and the async `process()` recursion never re-checks, a leader tab closed/refreshed mid-request can hand leadership to a follower that flushes the same shared queue while the old request is still in flight. There is no cross-tab lock on the ongoing request. Duplicate-send avoidance relies on server-side `DUPLICATE_RECORD` / `ALREADY_CREATED` reconciliation, not a lock. - **`requestID` is per-tab, not globally unique.** It comes from a module-scoped counter initialized to a wall-clock timestamp at module load and incremented per request (stamped as `requestIndex`; `requestID` is the legacy fallback name); each tab has its own counter. Timestamp seeding *reduces* but does not *eliminate* cross-tab `requestID` collisions, which the `knownRequestIDs` cross-tab diff relies on. ## Offline Behavior **Problem it solves.** Let the user keep working with no connection; deliver their writes when connectivity returns. -**How it works today.** `getIsOffline()` (from `NetworkState`, read synchronously) gates `push`, `flush`, and `process`. Offline, `push` persists and returns without flushing. On reconnect, `Reconnect` calls `SequentialQueue.flush()` from two edges — the offline→online `NetworkState` transition and the app-became-active listener — deliberately separate from its `reconnect()` data-sync, which never flushes (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). The queue does **not** own offline *detection* — see [Network State Detection](NETWORK_STATE_DETECTION.md) for the hard-stop model and recovery probes. +**How it works today.** `getIsOffline()` (from `NetworkState`, read synchronously) gates `push`, `flush`, and `process`. Offline, `push` persists and returns without flushing. On reconnect, `Reconnect` calls `SequentialQueue.flush()` from two edges: the offline→online `NetworkState` transition and the app-became-active listener. Both are deliberately separate from its `reconnect()` data-sync, which never flushes (see [Inbound Consumers](#inbound-consumers-who-calls-the-queue)). The queue does **not** own offline *detection*. See [Network State Detection](NETWORK_STATE_DETECTION.md) for the hard-stop model and recovery probes. ## Pausing & Data-Gap Sync **Problem it solves.** The server tags WRITE responses with update IDs. If the client has missed intermediate updates (a gap between `lastUpdateIDAppliedToClient` and the response's `previousUpdateID`), applying further buffered WRITE responses would put Onyx data out of causal order. The queue must pause until the gap is filled. -**How it works today.** `SaveResponseInOnyx` is the **sole producer** of `shouldPauseQueue`: it sets the flag when the command is not in `requestsToIgnoreLastUpdateID` and `OnyxUpdates.doesClientNeedToBeUpdated` reports a `previousUpdateID` gap, after stashing the update info. `process()` then `pause()`s — the flag flips first, but the just-processed request is still removed and its `queueFlushedData` saved before the recursion early-returns on the pause. The actual gap fetch runs in a **decoupled** `OnyxUpdateManager` flow that fetches the missing range, applies the deferred updates in order, and then calls `unpause()`. Within this concern `pause()` has two inbound callers — `applyOnyxUpdatesReliably` (on detecting a missing-updates fetch) and `DeferredOnyxUpdates` — with `unpause()` called once the gap is filled. +**How it works today.** `SaveResponseInOnyx` is the **sole producer** of `shouldPauseQueue`: it sets the flag when the command is not in `requestsToIgnoreLastUpdateID` and `OnyxUpdates.doesClientNeedToBeUpdated` reports a `previousUpdateID` gap, after stashing the update info. `process()` then `pause()`s. The flag flips first, but the just-processed request is still removed and its `queueFlushedData` saved before the recursion early-returns on the pause. The actual gap fetch runs in a **decoupled** `OnyxUpdateManager` flow that fetches the missing range, applies the deferred updates in order, and then calls `unpause()`. Within this concern `pause()` has two inbound callers: `applyOnyxUpdatesReliably` (on detecting a missing-updates fetch) and `DeferredOnyxUpdates`, with `unpause()` called once the gap is filled. **Sharp edges.** -- The in-code source comment at the `resolveIsReadyPromise` decision claims `isQueuePaused` is set for offline pauses as well as `shouldPauseQueue` data-gap syncs. That is **inaccurate**: no offline path ever calls `pause()` (its only callers are the `shouldPauseQueue` response, `DeferredOnyxUpdates`, and `applyOnyxUpdatesReliably`); offline is handled entirely by the separate `isOfflineNetwork()` gate. The decision logic itself is correct — it keys on `isOfflineNetwork()` — only the explanatory comment is wrong (see [the state machine](#why-isreadypromise-resolves-on-offline-not-on-paused)). -- `shouldPauseQueue` rides on the response object, so any middleware registered after `SaveResponseInOnyx` that returned a *new* response object would strip the flag. Today the only later middleware is `FraudMonitoring`, which returns the response unchanged — so the flag survives, but the in-code comment claiming `SaveResponseInOnyx` "must be last" is stale (see [Middleware](#the-middleware-chain-boundary)). +- The in-code source comment at the `resolveIsReadyPromise` decision claims `isQueuePaused` is set for offline pauses as well as `shouldPauseQueue` data-gap syncs. That is **inaccurate**: no offline path ever calls `pause()` (its only callers are the `shouldPauseQueue` response, `DeferredOnyxUpdates`, and `applyOnyxUpdatesReliably`); offline is handled entirely by the separate `isOfflineNetwork()` gate. The decision logic itself is correct (it keys on `isOfflineNetwork()`); only the explanatory comment is wrong (see [the state machine](#why-isreadypromise-resolves-on-offline-not-on-paused)). +- `shouldPauseQueue` rides on the response object, so any middleware registered after `SaveResponseInOnyx` that returned a *new* response object would strip the flag. Today the only later middleware is `FraudMonitoring`, which returns the response unchanged, so the flag survives. But the in-code comment claiming `SaveResponseInOnyx` "must be last" is stale (see [Middleware](#the-middleware-chain-boundary)). --- @@ -402,14 +402,14 @@ Three concerns cut across every block. **How it works today.** - **Registration order:** `Logging`, `LoadTest`, `FailureTracking`, `Reauthentication`, `handleDeletedAccount`, `SupportalPermission`, `HandleUnusedOptimisticID`, `Pagination`, `SentryServerTiming`, `SaveResponseInOnyx`, `FraudMonitoring`. - **Execution model.** `processWithMiddleware` folds the chain with `reduce`, seeding from `makeXHR(request)`. Each middleware appends its own `.then` to the promise it receives. So the **first-registered** middleware (`Logging`) wraps the raw XHR and its response handler fires **first**; the **last-registered** (`FraudMonitoring`) fires **last**. **Response handlers fire in registration order** (not the reverse). -- **`isFromSequentialQueue = true`** changes exactly one middleware behavior: `Reauthentication` **re-throws** on failure so the queue's catch can retry (rather than resolving a callback). The `QueuedOnyxUpdates` routing in `SaveResponseInOnyx` is **not** keyed on this flag — it is keyed on `request.data.apiRequestType === WRITE` in `applyHTTPSOnyxUpdates` (see [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata)), which for queue traffic amounts to the same thing. +- **`isFromSequentialQueue = true`** changes exactly one middleware behavior: `Reauthentication` **re-throws** on failure so the queue's catch can retry (rather than resolving a callback). The `QueuedOnyxUpdates` routing in `SaveResponseInOnyx` is **not** keyed on this flag. It is keyed on `request.data.apiRequestType === WRITE` in `applyHTTPSOnyxUpdates` (see [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata)), which for queue traffic amounts to the same thing. - **`SaveResponseInOnyx`** is the penultimate middleware and the sole producer of `shouldPauseQueue`. - **`Reauthentication`** reacts to a 407 (`NOT_AUTHENTICATED`), which arrives as a *resolved* response (data), by reauthenticating. On the sequential-queue path it re-runs the chain; if reauth ultimately fails it throws `new Error('Failed to reauthenticate')`. On reauth **success** the ongoing request stays promoted and is re-driven in place (no rollback, no dequeue); only reauth **failure** escapes to the generic ladder, which rolls the ongoing request back to the queue head and retries with backoff. `HandleUnusedOptimisticID` is the path that mutates the ongoing request in place (via `updateOngoingRequest`) during such a re-run. **Sharp edges.** - The chain order is implicit import-time module state; there is no runtime assertion that `SaveResponseInOnyx` is penultimate or that `Reauthentication` precedes it. -- `SaveResponseInOnyx`'s early-return guard has a **dead term**: `onyxUpdates` defaults to `response.onyxData ?? []`, and an empty array is truthy, so the `!onyxUpdates` leg of the guard is never true — a response's empty `onyxData` cannot short-circuit the middleware. The sibling `successData`/`failureData`/`finallyData` legs still fire, so the `if` as a whole is not dead — only its empty-`onyxData` short-circuit is. -- The in-code "must be last" comment on `SaveResponseInOnyx` is **inaccurate**: `FraudMonitoring` is registered after it. The current order is safe only **incidentally** — `FraudMonitoring` returns the response object unchanged, so the `shouldPauseQueue` flag riding on it survives. The load-bearing invariant is "no middleware after `SaveResponseInOnyx` may mutate or replace the response," not literally being registered last; nothing enforces either form. +- `SaveResponseInOnyx`'s early-return guard has a **dead term**: `onyxUpdates` defaults to `response.onyxData ?? []`, and an empty array is truthy, so the `!onyxUpdates` leg of the guard is never true. A response's empty `onyxData` cannot short-circuit the middleware. The sibling `successData`/`failureData`/`finallyData` legs still fire, so the `if` as a whole is not dead; only its empty-`onyxData` short-circuit is. +- The in-code "must be last" comment on `SaveResponseInOnyx` is **inaccurate**: `FraudMonitoring` is registered after it. The current order is safe only **incidentally**. `FraudMonitoring` returns the response object unchanged, so the `shouldPauseQueue` flag riding on it survives. The load-bearing invariant is "no middleware after `SaveResponseInOnyx` may mutate or replace the response," not literally being registered last; nothing enforces either form. --- @@ -420,43 +420,43 @@ Three concerns cut across every block. **How it works today.** A request may carry `checkAndFixConflictingRequest`. It is evaluated **twice**: 1. **In `prepareRequest` (read-only)** against `getAll()`, solely to decide whether to suppress optimistic data (`conflictAction.type === 'noAction'`). The rest of the action is ignored here. 2. **In `push` (authoritative)** against the live persisted requests. The resolver function is deleted off the request (to keep it serializable), and the `conflictAction` is applied via `handleConflictActions`, which performs the queue mutations. The four action shapes: - - **push** — append the new request. - - **replace** — overwrite the request at a computed index. - - **delete** — remove requests at computed indices, optionally push a new request and/or recurse into a `nextAction` (`pushNewRequest` is a required boolean on the type; the optionality is behavioral). - - **noAction** — ignore the new request. + - **push**: append the new request. + - **replace**: overwrite the request at a computed index. + - **delete**: remove requests at computed indices, optionally push a new request and/or recurse into a `nextAction` (`pushNewRequest` is a required boolean on the type; the optionality is behavioral). + - **noAction**: ignore the new request. **Sharp edges.** - **Index fragility.** `replace`/`delete` carry **numeric indices** computed by the resolver at `push` time, but `handleConflictActions` applies them **asynchronously** (awaiting between positional operations). Meanwhile `processNextRequest` (`slice(1)`), `endRequestAndRemoveFromQueue` (splice), and cross-tab merges can shift indices. An index computed against one snapshot may point at a different request by the time it is applied. -- **Conflict checking sees only the queue, never the in-flight `ongoingRequest`** — a new write cannot dedupe against a request that is already being sent. +- **Conflict checking sees only the queue, never the in-flight `ongoingRequest`.** A new write cannot dedupe against a request that is already being sent. - The resolver is expected to be pure, but production resolvers are not: `resolveCommentDeletionConflicts` performs an `Onyx.update`, and because `prepareRequest` invokes the **same** closure, the side effect fires on **both** evaluations (twice per write; an idempotent merge today). `resolveEditCommentWithNewAddCommentRequest` goes further and mutates the queued request's `data` in place during evaluation. --- # Test Coverage -Tests are useful here as evidence of the **intended contract** — what the team asserts as a guarantee. (Stated as observation; gaps below are facts, not a to-do list.) +Tests are useful here as evidence of the **intended contract**: what the team asserts as a guarantee. (Stated as observation; gaps below are facts, not a to-do list.) | Suite | What it pins as contract | |---|---| | `tests/unit/APITest.ts` | Offline-persist, online-flush, persisted-until-backend-response, retry/backoff for 5xx + Auth-down (asserts ~3 fetches with throttle waits), reauthentication + ordering, WRITE-blocks-READ, pause/unpause, no-duplicates + enable-feature conflict collapsing, supportal-411 no-retry + `failureData` | | `tests/unit/SequentialQueueTest.ts` | Push/persist counts, conflict resolution (replace/push/noAction, incl. replace-while-ongoing, `async`/awaited to match `push`'s async signature), move-to-ongoing persistence, startup hydration of the ongoing request, `queueFlushedData` lifecycle, `ALREADY_CREATED` treated as success without retry, **and the persist-before-fire edges: `waitForIdle` resolves without flushing when the network goes offline during persist, and a disk-write failure during persist does not strand `isReadyPromise` or block the queue** | -| `tests/unit/PersistedRequests.ts` | Save/remove/processNext/update/updateOngoing, File/Blob handling, cross-tab follower reconciliation; the `Issue 3a`/`3c` cases assert the **durable** ongoing-request behavior (`processNextRequest` always persists via `multiSet`) — a legacy test title references the vestigial `persistWhenOngoing` flag | +| `tests/unit/PersistedRequests.ts` | Save/remove/processNext/update/updateOngoing, File/Blob handling, cross-tab follower reconciliation; the `Issue 3a`/`3c` cases assert the **durable** ongoing-request behavior (`processNextRequest` always persists via `multiSet`); a legacy test title references the vestigial `persistWhenOngoing` flag | | `tests/unit/RequestConflictUtilsTest.ts` | Pure push/replace/delete/noAction resolver logic | | `tests/actions/QueuedOnyxUpdatesTest.ts` | Account-scoped update filtering on flush | | `tests/unit/MiddlewareTest.ts` | `SaveResponseInOnyx` and `HandleUnusedOptimisticID` | | `tests/unit/ReconnectTest.ts` | The offline→online transition flushes the queue exactly once | | `tests/actions/OnyxUpdateManagerTest.ts` | The queue is paused while a missing-updates gap is fetched, and resumed after | -The suites rely on `resetQueue()` (resets the four core coordinator vars — running flag, `currentRequestPromise`, pause flag, `isReadyPromise` — but not `shouldFailAllRequests`, `queueFlushedDataToStore`, or the throttle) and `resetPendingWritesForTest()` (resets only the `pendingOnyxWrites` counter) — test-only reset primitives, the pendants to `clear()`. +The suites rely on `resetQueue()` (resets the four core coordinator vars: running flag, `currentRequestPromise`, pause flag, and `isReadyPromise`, but not `shouldFailAllRequests`, `queueFlushedDataToStore`, or the throttle) and `resetPendingWritesForTest()` (resets only the `pendingOnyxWrites` counter). These are test-only reset primitives, the companions to `clear()`. **Untested on this branch (factual):** -- The **give-up-after-max-retries** branch (apply `failureData`, dequeue, open the `OPEN_APP` failure modal) — all retry tests recover within 1–3 attempts. -- A **dedicated `RequestThrottle` unit** — its backoff math and reject threshold are exercised only indirectly via the shared instance's wait-time getter. -- The **non-leader `flush` guard** — only follower disk reconciliation is covered, not the early return. -- **True mid-flight-crash-then-restart recovery** — startup hydration is simulated by manually seeding `PERSISTED_ONGOING_REQUESTS`. +- The **give-up-after-max-retries** branch (apply `failureData`, dequeue, open the `OPEN_APP` failure modal). All retry tests recover within 1–3 attempts. +- A **dedicated `RequestThrottle` unit**. Its backoff math and reject threshold are exercised only indirectly via the shared instance's wait-time getter. +- The **non-leader `flush` guard**. Only follower disk reconciliation is covered, not the early return. +- **True mid-flight-crash-then-restart recovery**. Startup hydration is simulated by manually seeding `PERSISTED_ONGOING_REQUESTS`. - **`waitForIdle` is used only as a drain helper** in tests (pagination, attachments, workspace-upgrade, network, middleware), never asserted as a READ-blocks-on-WRITE ordering guarantee. -**Two distinct orderings — don't conflate them.** Persist-before-**fire** (await the disk commit before the network XHR) is in effect today and is the behavior described throughout this doc. Persist-before-**optimistic** (persist the request *before* applying `optimisticData` to Onyx in `prepareRequest`) is a *separate*, still-open ordering: `optimisticData` is applied in `prepareRequest`, before `push` is ever called, so the persist-before-fire await does not affect it. `APITest.ts` › `API.write() persistence guarantees` › `'Issue 1: should persist the request before applying optimistic data'` asserts `optimisticDataApplied === true` and `requestPersistedBeforeOptimistic === false` — it pins that optimistic-before-persist ordering, and its "flip the final assertion" comment refers to the persist-before-optimistic fix (not yet shipped). The sibling `Issue 2` case stalls the write promise on an unresolved `PERSISTED_REQUESTS` `Onyx.set`, asserting `push` awaits persistence (the current persist-before-fire behavior), though its describe-block header comment still reads "BUG" — an upstream test inconsistency, not a behavior claim of this doc. +**Two distinct orderings, don't conflate them.** Persist-before-**fire** (await the disk commit before the network XHR) is in effect today and is the behavior described throughout this doc. Persist-before-**optimistic** (persist the request *before* applying `optimisticData` to Onyx in `prepareRequest`) is a *separate*, still-open ordering: `optimisticData` is applied in `prepareRequest`, before `push` is ever called, so the persist-before-fire await does not affect it. `APITest.ts` › `API.write() persistence guarantees` › `'Issue 1: should persist the request before applying optimistic data'` asserts `optimisticDataApplied === true` and `requestPersistedBeforeOptimistic === false`. It pins that optimistic-before-persist ordering, and its "flip the final assertion" comment refers to the persist-before-optimistic fix (not yet shipped). The sibling `Issue 2` case stalls the write promise on an unresolved `PERSISTED_REQUESTS` `Onyx.set`, asserting `push` awaits persistence (the current persist-before-fire behavior), though its describe-block header comment still reads "BUG", an upstream test inconsistency, not a behavior claim of this doc. --- @@ -478,7 +478,7 @@ Defined in `src/CONST/index.ts` under `CONST.NETWORK` (retry/backoff subset rele # Key Modules Reference -A code-location index — concept → file. The behavior lives in the sections above; this is just where to find it. +A code-location index from concept to file. The behavior lives in the sections above; this is just where to find it. | Module | Responsibility | Section | |---|---|---| @@ -488,7 +488,7 @@ A code-location index — concept → file. The behavior lives in the sections a | `src/libs/actions/QueuedOnyxUpdates.ts` | In-memory buffer of WRITE-response Onyx updates, flushed on drain (anti-flicker). | [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata) | | `src/libs/actions/OnyxUpdates.ts` | `applyHTTPSOnyxUpdates` WRITE/READ routing; `doesClientNeedToBeUpdated`. | [QueuedOnyxUpdates](#queuedonyxupdates-and-queueflusheddata) | | `src/libs/API/index.ts` | Public facade: `prepareRequest`; `write`/`read`/`makeRequestWithSideEffects` routing. | [Public Contract](#public-contract-the-api-layer) | -| `src/libs/Request.ts` | `processWithMiddleware` — folds the middleware chain over `makeXHR`. | [Middleware](#the-middleware-chain-boundary) | +| `src/libs/Request.ts` | `processWithMiddleware` folds the middleware chain over `makeXHR`. | [Middleware](#the-middleware-chain-boundary) | | `src/libs/Middleware/SaveResponseInOnyx.ts` | Penultimate middleware; sole producer of `shouldPauseQueue`. | [Middleware](#the-middleware-chain-boundary) | | `src/libs/Middleware/Reauthentication.ts` | Re-throws on `isFromSequentialQueue`; handles 407 reauth. | [Middleware](#the-middleware-chain-boundary) | | `src/libs/ActiveClientManager/` | Web-tab leader election; always-leader on native. | [Multi-Tab](#multi-tab--leader-election) |