This repository was archived by the owner on Mar 31, 2026. It is now read-only.

feat(dbapi): add retry_aborts_internally option to Connection #1538

Closed
waiho-gumloop wants to merge 1 commit into googleapis:main from waiho-gumloop:feat/dbapi-retry-aborts-internally

Conversation

@waiho-gumloop
Contributor

@waiho-gumloop waiho-gumloop commented Mar 31, 2026

Summary

Add a retry_aborts_internally flag to the DBAPI Connection class and the connect() function. When set to False, aborted transactions raise RetryAborted directly from commit() instead of entering the internal statement-replay retry loop.

Changes

  • Connection.__init__: Accept retry_aborts_internally parameter (default True)
  • Connection.retry_aborts_internally: Property getter/setter with guard against mid-transaction changes
  • Connection.commit(): Check _retry_aborts_internally before entering the replay loop; when disabled, wrap Aborted in RetryAborted for PEP 249 compliance
  • connect(): Pass-through retry_aborts_internally to Connection
  • Tests: 8 new unit tests covering the default behavior, constructor override, setter, setter-during-transaction guard, commit with retry enabled, commit with retry disabled, and the connect() default and pass-through
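
The behavior in the list above can be sketched in isolation. This is a minimal illustration, not the code from this PR: SketchConnection, commit_fn, replay_fn, and the transaction_active attribute are hypothetical names, the exception classes are local stand-ins for the real ones, and raising ValueError from the setter guard is an assumption.

```python
class Aborted(Exception):
    """Stand-in for the Aborted error Spanner raises on transaction abort."""


class OperationalError(Exception):
    """Stand-in for the PEP 249 OperationalError."""


class RetryAborted(OperationalError):
    """Stand-in for the DBAPI RetryAborted exception."""


class SketchConnection:
    """Hypothetical connection showing only the flag-related behavior."""

    def __init__(self, commit_fn, replay_fn, retry_aborts_internally=True):
        self._commit_fn = commit_fn    # hypothetical: performs the commit
        self._replay_fn = replay_fn    # hypothetical: internal statement replay
        self._retry_aborts_internally = retry_aborts_internally
        self.transaction_active = False

    @property
    def retry_aborts_internally(self):
        return self._retry_aborts_internally

    @retry_aborts_internally.setter
    def retry_aborts_internally(self, value):
        # Guard: the flag may only change between transactions.
        # (ValueError is an assumption; the PR does not name the exception.)
        if self.transaction_active:
            raise ValueError(
                "retry_aborts_internally cannot be changed while a "
                "transaction is active"
            )
        self._retry_aborts_internally = bool(value)

    def commit(self):
        try:
            self._commit_fn()
        except Aborted as exc:
            if not self._retry_aborts_internally:
                # Surface the abort as a PEP 249 exception for the caller.
                raise RetryAborted(str(exc)) from exc
            self._replay_fn()  # default path: enter the internal replay
        finally:
            self.transaction_active = False
```

With the flag left at its default, an aborted commit goes to the replay path; with the flag set to False, the same abort reaches the caller as RetryAborted.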

Rationale

Why the internal retry was added

The DBAPI's statement-replay retry was introduced to support Django and other PEP 249 ORMs (original issue googleapis/python-spanner#34). These frameworks build transactions incrementally through individual cursor.execute() calls — the DBAPI layer sees a sequence of statements but has no callable representing the full transaction. When Spanner aborts a transaction, the only option is to replay all recorded statements and verify checksums to ensure read consistency.
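
To make the replay-and-verify idea concrete, here is a minimal sketch of recording statements with a checksum of their results and checking those checksums on replay. It is not the library's implementation: checksum, execute_and_record, replay, ChecksumMismatch, and the execute callable are all hypothetical names chosen for illustration.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when a replayed statement returns different rows."""


def checksum(rows):
    """Hash the rows in order so any change in the data changes the digest."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def execute_and_record(log, execute, sql):
    """Run a statement and remember it together with its result checksum."""
    rows = execute(sql)
    log.append((sql, checksum(rows)))
    return rows


def replay(log, execute):
    """Re-run every recorded statement and verify it saw the same data."""
    for sql, expected in log:
        if checksum(execute(sql)) != expected:
            raise ChecksumMismatch(sql)
```

If another transaction changes a row between the original read and the replay, the digests differ and the replay cannot succeed.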

Why applications may not want the internal retry

Applications that implement their own transaction retry logic — wrapping the entire transaction in a callable and re-invoking it with a fresh session on abort (similar to Session.run_in_transaction) — do not need transparent statement replay. The application already re-reads data and makes fresh decisions on each retry, making checksum validation unnecessary.

When both layers retry simultaneously, the result is nested retry loops that cause severe problems under contention:

  1. Contention amplification (thundering herd): The internal replay acquires locks on the same rows that caused the original abort. Under concurrent writes, this triggers cascading aborts across threads — each replay attempt can abort another thread's replay, leading to exponential retry growth.
  2. Wasted work: The internal retry replays statements up to 50 times with its own backoff, accumulating 13–19 seconds of lock wait time before finally raising RetryAborted. The outer application retry then starts fresh, having wasted all that time.
  3. Checksum mismatches on contended rows: For read-modify-write patterns, replayed reads almost always return different data than the original (because another transaction committed in between), causing checksums to fail. The internal retry is structurally unable to succeed in this scenario.
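
The cost of nesting the two loops can be estimated with a small worst-case count. This is only an arithmetic sketch: the limit of 50 internal replays comes from the description above, while the 5 application-level attempts and the function name are assumptions for illustration.

```python
def worst_case_executions(outer_attempts, inner_replays, internal_retry):
    """Count transaction executions when every attempt is aborted."""
    executions = 0
    for _ in range(outer_attempts):            # application retry loop
        executions += 1                        # first execution, aborted
        if internal_retry:
            for _ in range(inner_replays):     # driver's replay loop
                executions += 1                # each replay is aborted too
    return executions


print(worst_case_executions(5, 50, internal_retry=True))   # 255
print(worst_case_executions(5, 50, internal_retry=False))  # 5
```

Under these assumptions, the nested loops execute the transaction 255 times before giving up, compared with 5 when only the application retries.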

In our production workload with 10 concurrent writers, disabling the internal retry reduced abort-to-recovery time from ~18 seconds to ~0.05 seconds (using Spanner's server-suggested retry delay) and improved success rates from ~55% to 100%.

Precedent in other Spanner client libraries

This change aligns the Python DBAPI with existing functionality in other Spanner clients:

| Library | Mechanism | Default |
| --- | --- | --- |
| JDBC (com.google.cloud.spanner.jdbc) | RETRY_ABORTS_INTERNALLY connection property | true |
| Go (cloud.google.com/go/spanner) | NewReadWriteStmtBasedTransaction vs ReadWriteTransaction | Separate API |
| Python DBAPI (this PR) | retry_aborts_internally constructor/connect parameter | True |

The JDBC driver's RETRY_ABORTS_INTERNALLY was added specifically for the same use case: applications with their own retry wrappers that need to opt out of the driver's internal retry to avoid interference.

Usage

from google.cloud.spanner_dbapi import connect

# Default behavior (unchanged) — internal retry enabled
conn = connect("instance", "database", project="project", credentials=creds)

# Disable internal retry for application-managed retries
conn = connect("instance", "database", project="project", credentials=creds,
               retry_aborts_internally=False)

# Can also be toggled between transactions
conn.retry_aborts_internally = False

When retry_aborts_internally=False, commit() raises RetryAborted (a subclass of OperationalError) on transaction abort, allowing the application's retry logic to handle it directly.
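
An application-managed retry built on this behavior might look like the sketch below. It assumes connections are created with retry_aborts_internally=False as proposed in this PR. run_with_retry, connect_fn, and work are hypothetical names, and jittered exponential backoff is used here as a simple substitute for Spanner's server-suggested retry delay. The fallback class exists only so the sketch runs without the library installed.

```python
import random
import time

try:
    from google.cloud.spanner_dbapi.exceptions import RetryAborted
except ImportError:
    class RetryAborted(Exception):
        """Stand-in used when google-cloud-spanner is not installed."""


def run_with_retry(connect_fn, work, max_attempts=5, base_delay=0.05,
                   sleep=time.sleep):
    """Re-run the whole transaction on a fresh connection after an abort.

    connect_fn: hypothetical factory returning a new DBAPI connection,
        e.g. lambda: connect(..., retry_aborts_internally=False).
    work: callable taking a cursor; it re-reads data on every attempt.
    """
    for attempt in range(1, max_attempts + 1):
        conn = connect_fn()
        try:
            result = work(conn.cursor())
            conn.commit()
            return result
        except RetryAborted:
            if attempt == max_attempts:
                raise
            # Assumption: jittered exponential backoff instead of the
            # server-suggested delay.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.5))
        finally:
            conn.close()
```

Because work is re-invoked from scratch, each attempt makes its decisions on freshly read data, so no checksum validation is needed.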

Test plan

  • test_retry_aborts_internally_defaults_true — constructor defaults to True
  • test_retry_aborts_internally_set_false — constructor accepts False
  • test_retry_aborts_internally_setter — property setter works
  • test_retry_aborts_internally_setter_while_transaction_active — setter rejects changes during active transaction
  • test_commit_retries_internally_when_enabled — commit calls retry_transaction when flag is True
  • test_commit_raises_retry_aborted_when_internal_retry_disabled — commit raises RetryAborted when flag is False
  • test_connect_retry_aborts_internally_default — connect() defaults to True
  • test_connect_retry_aborts_internally_false — connect() passes False through to Connection
  • Full test suite: 68/68 tests pass

Related Issue

Closes googleapis/google-cloud-python#16491

Add a `retry_aborts_internally` flag (default True) to the DBAPI
Connection class and the `connect()` function. When set to False,
aborted transactions raise `RetryAborted` directly from `commit()`
instead of entering the internal statement-replay retry loop.

This allows applications that implement their own transaction retry
logic (e.g. re-invoking a callable with a fresh session) to avoid
nested retry loops and contention amplification under concurrent
writes.

Equivalent to `RETRY_ABORTS_INTERNALLY` in the Spanner JDBC driver
and `ReadWriteStmtBasedTransaction` in the Go client.
@waiho-gumloop waiho-gumloop requested review from a team as code owners March 31, 2026 19:39
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: spanner Issues related to the googleapis/python-spanner API. labels Mar 31, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the retry_aborts_internally flag to the Spanner DB-API connection, allowing users to disable the automatic internal retry of aborted transactions. This is useful for applications that implement their own retry logic. The changes include updates to the Connection class, the connect factory function, and corresponding unit tests to verify the new behavior. I have no feedback to provide.

@parthea
Contributor

parthea commented Mar 31, 2026

Hi @waiho-gumloop, the code in this repository has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-spanner. Could you please open a new PR in google-cloud-python?

@parthea parthea closed this Mar 31, 2026

Labels

api: spanner Issues related to the googleapis/python-spanner API. size: m Pull request size is medium.

Development

Successfully merging this pull request may close these issues.

feat(dbapi): add retry_aborts_internally option to disable internal statement-replay retry

3 participants