Add copy_records_to_table for COPY FROM STDIN bulk-load #169

Open
Dev-iL wants to merge 1 commit into psqlpy-python:main from Dev-iL:2605/copy_from_stdin

Contributor

@Dev-iL Dev-iL commented May 14, 2026

Description

Adds copy_records_to_table on Connection and Transaction, mirroring asyncpg.Connection.copy_records_to_table. The new method accepts an iterable of records, introspects column types from the target table, and streams rows over COPY FROM STDIN (FORMAT binary) via tokio-postgres' BinaryCopyInWriter.

Signature:

async def copy_records_to_table(
    self,
    table_name: str,
    records: Iterable[Sequence[Any]],
    columns: Sequence[str] | None = None,
    schema_name: str | None = None,
) -> int: ...  # returns the number of inserted rows
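
A call might look like the following minimal sketch. The `events` table, its columns, and the `load_events` helper are hypothetical; `conn` is assumed to be a psqlpy Connection (or Transaction) obtained elsewhere, and the keyword names follow the signature above.

```python
from datetime import datetime, timezone
from typing import Any


async def load_events(conn: Any) -> int:
    """Bulk-load two rows into a hypothetical `events` table."""
    # Each record is a sequence whose length matches `columns`.
    # The second record carries None, which the PR's tests round-trip as NULL.
    records = [
        (1, "signup", 0.5, datetime(2026, 5, 14, tzinfo=timezone.utc)),
        (2, None, 1.25, datetime(2026, 5, 15, tzinfo=timezone.utc)),
    ]
    # The awaited result is the number of inserted rows.
    return await conn.copy_records_to_table(
        "events",
        records=records,
        columns=["id", "kind", "score", "created_at"],
        schema_name="public",
    )
```

Passing `columns` explicitly keeps the record layout independent of the table's physical column order.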

Internals:

  • A SELECT <cols|*> FROM <qualified_table> WHERE false statement is prepared (non-cached) to obtain the columns' tokio_postgres::types::Type list, so no pg_catalog query is needed.
  • Each Python cell is converted to PythonDTO with the existing from_python_typed, so the type coverage matches execute() (including JSON/JSONB, arrays, geometry, decimals, etc.).
  • Identifiers (table_name, schema_name, columns) are routed through the existing quote_ident helper.
  • The introspection prepare and the COPY share a single read lock on the connection so they run on the same backend.
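
To make the query-building bullets concrete, the following Python sketch shows how the introspection query and the COPY statement could be assembled from quoted identifiers. It is an illustration only: the PR's code is Rust, and `quote_ident` here is a stand-in that follows PostgreSQL's standard quoting rule (wrap in double quotes, double any embedded quote) rather than the project's helper. The function names and the exact statement text are assumptions.

```python
from typing import Optional, Sequence


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier: wrap in double quotes, double embedded ones."""
    return '"' + name.replace('"', '""') + '"'


def qualified_table(table_name: str, schema_name: Optional[str] = None) -> str:
    """Return the quoted table name, prefixed with the quoted schema if given."""
    if schema_name is None:
        return quote_ident(table_name)
    return f"{quote_ident(schema_name)}.{quote_ident(table_name)}"


def introspection_query(
    table_name: str,
    columns: Optional[Sequence[str]] = None,
    schema_name: Optional[str] = None,
) -> str:
    """Build a query that returns no rows but exposes the column types when prepared."""
    cols = ", ".join(quote_ident(c) for c in columns) if columns else "*"
    return f"SELECT {cols} FROM {qualified_table(table_name, schema_name)} WHERE false"


def copy_statement(
    table_name: str,
    columns: Optional[Sequence[str]] = None,
    schema_name: Optional[str] = None,
) -> str:
    """Build the COPY ... FROM STDIN statement for the same target and columns."""
    target = qualified_table(table_name, schema_name)
    if columns:
        target += " (" + ", ".join(quote_ident(c) for c in columns) + ")"
    return f"COPY {target} FROM STDIN (FORMAT binary)"
```

Using the same column list for both statements means the types returned by the prepare line up positionally with the fields of each record.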

Motivation and Context

Closes #166.

Prior to this change, the only COPY API was binary_copy_to_table, which required callers to pre-encode PostgreSQL's binary COPY wire format (e.g. via pgpq/Arrow). There was no record-list API and no streaming-writer API — so no ergonomic bulk-load path comparable to asyncpg.copy_records_to_table or psycopg3's cursor.copy(...). Applications falling back to execute_many paid a large throughput penalty.

This PR adds the asyncpg-style record-list variant (Option A from the issue). The psycopg3-style async with conn.copy(...) as writer streaming variant (Option B) is intentionally left as a follow-up — it's a separate, larger API surface (async context manager, partial-write semantics) and worth its own issue.

How has this been tested?

  • cargo build — passes, no warnings.
  • cargo clippy — passes, no warnings.
  • New integration tests in python/tests/test_copy_records.py, run against PostgreSQL 14:
    • test_copy_records_to_table_on_connection — round-trips mixed INT/TEXT/FLOAT8/TIMESTAMPTZ, including a NULL.
    • test_copy_records_to_table_with_columns_subset — explicit columns= leaves untouched columns NULL.
    • test_copy_records_to_table_in_transaction — same API on Transaction.
    • test_copy_records_to_table_rejects_record_arity_mismatch — clean error when a record's field count doesn't match the resolved columns.
    • test_copy_records_to_table_uses_schema_qualifier — schema_name= qualifies both introspection and the COPY statement.
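
The arity check exercised by test_copy_records_to_table_rejects_record_arity_mismatch can be pictured with a small Python sketch. This is not the PR's code (which is Rust); the `check_arity` name, the use of ValueError, and the message wording are assumptions chosen for illustration.

```python
from typing import Any, Iterable, Iterator, Sequence


def check_arity(
    records: Iterable[Sequence[Any]], n_columns: int
) -> Iterator[Sequence[Any]]:
    """Yield records unchanged, raising if one has the wrong number of fields."""
    for index, record in enumerate(records):
        if len(record) != n_columns:
            raise ValueError(
                f"record {index} has {len(record)} fields, expected {n_columns}"
            )
        yield record
```

Because this sketch is a generator, records before the offending one are yielded first. Whether the real implementation validates up front or while streaming is not stated in the description.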

Test run:

python/tests/test_copy_records.py::test_copy_records_to_table_on_connection PASSED
python/tests/test_copy_records.py::test_copy_records_to_table_with_columns_subset PASSED
python/tests/test_copy_records.py::test_copy_records_to_table_in_transaction PASSED
python/tests/test_copy_records.py::test_copy_records_to_table_rejects_record_arity_mismatch PASSED
python/tests/test_copy_records.py::test_copy_records_to_table_uses_schema_qualifier PASSED
============================== 5 passed in 0.56s ===============================

Existing test_binary_copy.py is unchanged and the binary_copy_to_table code path is untouched.

Screenshots (if appropriate):

N/A — backend driver change, no UI.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

Notes on the checklist:

  • python/psqlpy/_internal/__init__.pyi stubs have been added on both Transaction and Connection so users get full type/IDE support for the new method.
  • docs/components/connection.md and docs/components/transaction.md do not currently document binary_copy_to_table either, so no existing section needed updating. Happy to add a COPY FROM STDIN doc page in this PR or a follow-up if maintainers prefer.

Closes psqlpy-python#166. The existing binary_copy_to_table required callers to
pre-encode PostgreSQL's binary COPY wire format, leaving no
ergonomic bulk-load path comparable to asyncpg's
copy_records_to_table or psycopg3's cursor.copy(...).

The new method on Connection and Transaction accepts an iterable
of records, introspects column types from the target table, and
streams rows via tokio-postgres' BinaryCopyInWriter using the same
PythonDTO conversions used by execute().
@Dev-iL Dev-iL force-pushed the 2605/copy_from_stdin branch from 7794063 to 65b9d12 Compare May 14, 2026 14:06
Development

Successfully merging this pull request may close these issues.

No public API for COPY FROM STDIN (text/binary stream ingestion)
