Skip to content

[Due for payment 2026-06-11] [Onyx] Skip retries in retryOperation for non-retriable storage errors #90633

@fabioh8010

Description

@fabioh8010

Coming from #87862 (comment).

Issue

OnyxUtils.retryOperation currently retries every non-capacity storage error up to 5 times with no delay between attempts, then silently swallows the failure with Promise.resolve(). For the UnknownError: Internal error opening backing store for indexedDB.open error class analyzed in #87862, retries are provably futile — the underlying LevelDB backing store is broken at the filesystem level and cannot recover from immediate retries — but each occurrence still produces 5 retry log lines plus a "5 retries exhausted" alert, multiplying log/Sentry volume by ~6× over the underlying event rate.

Current classification (lib/OnyxUtils.ts:53-62)

const IDB_STORAGE_ERRORS = ['quotaexceedederror'];
const SQLITE_STORAGE_ERRORS = ['database or disk is full'];
const STORAGE_ERRORS = [...IDB_STORAGE_ERRORS, ...SQLITE_STORAGE_ERRORS];

For the error in question, error.name === 'UnknownError' and error.message starts with Internal error opening backing store…. Neither matches any token in STORAGE_ERRORS, so isStorageCapacityError === false, eviction is skipped, and the operation falls into the generic retry loop.

Current retry loop (lib/OnyxUtils.ts:806-814)

if (nextRetryAttempt > MAX_STORAGE_OPERATION_RETRY_ATTEMPTS) {
    Logger.logAlert(`Storage operation failed after 5 retries. Error: ${error}. onyxMethod: ${onyxMethod.name}.`);
    return Promise.resolve();
}

if (!isStorageCapacityError) {
    // @ts-expect-error No overload matches this call.
    return onyxMethod(defaultParams, nextRetryAttempt);
}

For 884,955 occurrences of this error in #87862 (26.3% of all storage failures), this produces approximately 6× that volume in log lines — pure noise, because no retry succeeds.

Solution

Introduce a NON_RETRIABLE_ERRORS classification alongside the existing STORAGE_ERRORS, and short-circuit retryOperation for matching errors:

const IDB_NON_RETRIABLE_ERRORS = [
    'internal error opening backing store',
] as const;

const NON_RETRIABLE_ERRORS = [...IDB_NON_RETRIABLE_ERRORS];

In retryOperation, check this list before the existing retry logic:

const isNonRetriableError = NON_RETRIABLE_ERRORS.some(
    (token) => errorName?.includes(token) || errorMessage?.includes(token),
);

if (isNonRetriableError) {
    Logger.logAlert(`Storage operation skipped retry for non-retriable error. Error: ${error}. onyxMethod: ${onyxMethod.name}.`);
    return Promise.resolve();
}

The healing path (separate action item #90636) is responsible for trying to recover the IDB connection for this error class — retryOperation should defer to it, not compete with it.

Why this is safe

  • The cache layer has already absorbed the write before retryOperation is reached (see parent comment §5). Skipping retries doesn't change observable behavior for the session.
  • The 5 retries are already silently dropped via Promise.resolve() after exhaustion today, so callers cannot detect retry success vs. failure.
  • The only observable change is log/Sentry volume, which drops by ~5–6× for this error class.

Scope

This issue is scoped to Internal error opening backing store. Other connection-state errors (e.g. The database connection is closing already handled in #84192, Safari connection drops) may also fit the NON_RETRIABLE_ERRORS pattern depending on whether retryOperation is the right tool for them vs. dedicated mitigation. Out of scope for this issue, but the classification structure introduced here makes adding them later trivial.

Test plan

  • Unit test: mock Storage.setItem to reject with new Error('Internal error opening backing store for indexedDB.open.') (name: UnknownError), assert retryOperation is called once (not 5×), and the "skipped retry for non-retriable error" log fires exactly once.
  • Unit test: confirm existing capacity-error eviction path still works (regression).
  • Confirm in VictoriaLogs post-deploy that the volume of Failed to save to storage. Error: ...Internal error opening backing store... drops by ~5–6×.
Issue OwnerCurrent Issue Owner: @elirangoshen

Metadata

Metadata

Labels

Type

No type
No fields configured for issues without a type.

Projects

Status
Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions