
Pipeline execute_many and wrap in a transaction #171

Merged

chandr-andr merged 1 commit into psqlpy-python:main from Dev-iL:2605/exec_many_perf on May 15, 2026

Conversation

@Dev-iL (Contributor) commented May 14, 2026

Description

Connection.execute_many / Transaction.execute_many no longer issue one round-trip per row. The implementation in src/connection/impls.rs now:

  1. Pipelines all Bind/Execute messages on the same connection via FuturesOrdered (tokio-postgres dispatches them back-to-back instead of stalling on each reply).
  2. Wraps the batch in a single transaction, which is what actually delivers the order-of-magnitude win — Postgres fsyncs the WAL on every implicit auto-commit, so collapsing N auto-commits into one transaction collapses N fsyncs into one (see the sketch after this list).
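
A condensed, illustrative sketch of the combined shape (not the actual code in src/connection/impls.rs; the function name, the i64-only parameters, and the collapsed error handling are simplifications for the example):

```rust
use futures_util::stream::{FuturesOrdered, StreamExt};
use tokio_postgres::{types::ToSql, Client, Error};

async fn execute_many_pipelined(
    client: &Client,
    query: &str,
    rows: &[Vec<i64>],
) -> Result<(), Error> {
    let stmt = client.prepare(query).await?;
    client.batch_execute("BEGIN").await?;

    // Push every Execute future before awaiting any; FuturesOrdered polls
    // them together, so tokio-postgres flushes the Bind/Execute messages
    // back-to-back instead of stalling on each row's reply.
    let mut pending = FuturesOrdered::new();
    for row in rows {
        let params = row.iter().map(|v| v as &(dyn ToSql + Sync));
        pending.push_back(client.execute_raw(&stmt, params));
    }

    // Drain replies in order, keeping the first failure while still
    // consuming every response the server sends back.
    let mut first_err: Option<Error> = None;
    while let Some(result) = pending.next().await {
        if let Err(e) = result {
            first_err.get_or_insert(e);
        }
    }
    drop(pending);

    match first_err {
        // One COMMIT means one WAL fsync for the entire batch.
        None => client.batch_execute("COMMIT").await,
        Some(e) => {
            client.batch_execute("ROLLBACK").await?;
            Err(e)
        }
    }
}
```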

When invoked from Connection.execute_many, the wrap is BEGIN/COMMIT (with ROLLBACK on failure). When invoked from Transaction.execute_many, the wrap is a SAVEPOINT psqlpy_execute_many (RELEASE on success; ROLLBACK TO + RELEASE on failure) so a failed batch can never poison the caller's surrounding transaction. Internal docs on the method body explain the rationale, the asyncpg comparison, and the deliberate divergence on savepoint behaviour.
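
The savepoint bracketing for the transaction path looks roughly like this (again a sketch: with_execute_many_savepoint and its run_batch closure are illustrative stand-ins; only the SQL command sequence comes from the description above):

```rust
use std::future::Future;
use tokio_postgres::{Client, Error};

async fn with_execute_many_savepoint<F, Fut>(
    client: &Client,
    run_batch: F,
) -> Result<(), Error>
where
    F: FnOnce() -> Fut,
    Fut: Future<Output = Result<(), Error>>,
{
    client.batch_execute("SAVEPOINT psqlpy_execute_many").await?;
    match run_batch().await {
        // Success: fold the batch into the caller's transaction.
        Ok(()) => client.batch_execute("RELEASE psqlpy_execute_many").await,
        Err(e) => {
            // Failure: rewind only the batch, then discard the savepoint,
            // leaving the caller's surrounding transaction usable.
            client
                .batch_execute(
                    "ROLLBACK TO psqlpy_execute_many; RELEASE psqlpy_execute_many",
                )
                .await?;
            Err(e)
        }
    }
}
```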

Motivation and Context

Fixes #167. Reported behaviour: execute_many was ~93× slower than asyncpg.executemany for the same workload because it issued one full round-trip per row and never amortized fsync cost. The bottleneck was visible in src/connection/impls.rs as a sequential for ... await over self.query(&stmt, &params).
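
For contrast, the replaced hot loop had roughly this shape (a paraphrase of the pattern described above, not the literal impls.rs code):

```rust
use tokio_postgres::{types::ToSql, Client, Error};

async fn execute_many_sequential(
    client: &Client,
    query: &str,
    rows: &[Vec<i64>],
) -> Result<(), Error> {
    let stmt = client.prepare(query).await?;
    for row in rows {
        let params: Vec<&(dyn ToSql + Sync)> =
            row.iter().map(|v| v as &(dyn ToSql + Sync)).collect();
        // Each await stalls on this row's reply, and outside a transaction
        // each statement is also an implicit auto-commit, i.e. one WAL fsync.
        client.query(&stmt, &params).await?;
    }
    Ok(())
}
```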

The change also introduces a behavioural shift worth flagging in release notes: a mid-batch failure now rolls back earlier rows in the batch (previously each row auto-committed independently). This matches asyncpg / psycopg executemany semantics and the way bulk APIs are generally expected to behave.

How has this been tested?

Environment: PostgreSQL 14 in Docker on localhost (sub-millisecond RTT), CPython 3.13, Linux.

  • Rust-level microbenchmark against the forked tokio-postgres (1000-row INSERT batch): ~1326 ms sequential → ~32 ms pipelined-in-transaction (41× speedup, ~31k rows/s). Pipelining without the transaction wrap only reached ~1024 ms, which confirms the fsync floor is the real bottleneck.
  • End-to-end through pyo3 with the same 1000-row INSERT batch from #167: ~128 ms / ~7,800 rows/s, versus the ~3 batches/s the issue reports.
  • Savepoint isolation: inside a user transaction, a tx.execute_many that fails on a PK violation no longer aborts the surrounding transaction; subsequent statements in the same tx continue to succeed.
  • Connection atomicity: outside any transaction, a failed batch rolls back cleanly — no partial rows visible.
  • Existing test suite (python/tests/test_connection.py, python/tests/test_transaction.py): 37 passed, no failures attributable to this change.
  • cargo build --release, cargo clippy --release, and the project's pre-commit chain (rustfmt, clippy, cargo check, ruff, mypy) all clean.

Screenshots (if appropriate):

N/A.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected) — mid-batch failure now rolls back earlier rows; previously each row auto-committed independently.

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation. (release notes should mention the atomicity semantics shift, but no user-facing API doc changes)
  • I have updated the documentation accordingly.

Replaces the per-row sequential await loop in execute_many with
concurrent futures driven via FuturesOrdered, brackets the batch in
BEGIN/COMMIT when not already in a transaction, and uses a SAVEPOINT
when invoked from Transaction.execute_many so a failed batch can never
poison the caller's surrounding transaction. The order-of-magnitude
speedup comes from collapsing N implicit auto-commits into one
WAL fsync; pipelining alone is insufficient.

Locally measured against the forked tokio-postgres: 1000-row INSERT
batch ~1326 ms sequential -> ~32 ms pipelined-in-transaction. End-to-end
through pyo3: ~128 ms for 1000 rows (~7,800 rows/s), versus the
~3 batches/sec reported in psqlpy-python#167.

Fixes psqlpy-python#167

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@chandr-andr (Member) left a comment

lgtm, thanks

@chandr-andr chandr-andr merged commit 1e56b25 into psqlpy-python:main May 15, 2026
44 of 45 checks passed