From v5_STABLE part 2 by mason-sharp · Pull Request #382 · pgEdge/spock

mason-sharp · 2026-03-10T01:43:44Z

This is part 2 on top of #372.

Making it a draft for now. It currently has part 1 as a base. Once that is merged, this can use main as its target.

coderabbitai · 2026-03-10T01:44:02Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 434d2715-a9a7-4b08-9b0f-0d1f1d79177c

📥 Commits

Reviewing files that changed from the base of the PR and between 506481f and b220182.

📒 Files selected for processing (1)

src/spock_functions.c

🚧 Files skipped from review as they are similar to previous changes (1)

src/spock_functions.c

📝 Walkthrough

Walkthrough

Adds a configurable GUC spock.feedback_frequency to control apply-worker feedback frequency, enforces subscriber read-only guards that block resync/write operations (allowing COPY TO, blocking COPY FROM), and adds TAP tests for resync/read-only behavior plus a log-suppression tweak for a performance test.

Changes

Cohort / File(s)	Summary
Feedback Frequency Feature `docs/configuring.md`, `include/spock.h`, `src/spock.c`, `src/spock_apply.c`	Introduce `spock.feedback_frequency` (int GUC, default 200). Add global declaration, register the GUC in `_PG_init`, and replace the hardcoded feedback threshold in `maybe_send_feedback()` with the new variable.
Read-Only Mode Enhancements `src/spock_functions.c`, `src/spock_readonly.c`	Add `check_readonly_for_resync()` to block subscriber-side resync/truncate when `spock.readonly` is enabled. Update utility parsing to treat `COPY TO` as read-only and `COPY FROM` as a write (blocked in read-only mode).
Test Coverage & Test Tweaks `tests/tap/t/015_resync_readonly_origin.pl`, `tests/tap/t/100_progress_period.pl`	Add an end-to-end TAP test exercising resync behavior under read-only provider/subscriber states. Adjust a performance test to reduce log noise by setting `log_min_messages` to `warning` before test actions.

Poem

🐇 I counted WALs by rhythm and beat,
Feedback tuned so timing's neat,
COPY TO nibble, COPY FROM must wait,
Resync pauses at the subscriber gate,
A tiny hop to keep replicas straight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title 'From v5_STABLE part 2' is vague and generic, using non-descriptive terms that don't convey meaningful information about the changeset's primary purpose.	Use a more descriptive title that summarizes the main changes, such as 'Add configurable feedback_frequency and read-only resync guards' or similar.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description check	✅ Passed	The description explains this is part 2 of a multi-part change based on PR `#372`, which provides adequate context despite minimal detail about the actual changes.
Docstring Coverage	✅ Passed	Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch from-v5_STABLE-part2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

mason-sharp · 2026-03-10T03:14:04Z

@coderabbitai review

coderabbitai · 2026-03-10T03:14:10Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_functions.c`:
- Around line 1004-1019: The subscription-wide synchronize path in
spock_alter_subscription_synchronize currently skips the readonly precheck that
check_readonly_for_resync enforces (used by
spock_alter_subscription_resynchronize_table), so add a call to
check_readonly_for_resync(nspname, relname) (or a variant that checks the whole
subscription if necessary) early in spock_alter_subscription_synchronize before
any sync-state creation or table truncation; ensure you use the same error
semantics and message by invoking the existing check_readonly_for_resync
function (or refactor that function into a subscription-level check and call it
from both spock_alter_subscription_resynchronize_table and
spock_alter_subscription_synchronize) so the read-only guard is applied
consistently.

In `@tests/tap/t/015_resync_readonly_origin.pl`:
- Around line 52-53: Replace the fixed wall-clock sleeps (system_or_bail
'sleep', '5' and the other similar sleep calls) with state-based waits: remove
system_or_bail 'sleep' invocations and instead poll the subscription/table
status using the test-suite's existing sync helpers (e.g.,
wait_for_subscription_sync, wait_for_table_sync) or, if those helpers are
absent, implement a short polling loop that queries the subscription/table state
and exits when the desired state is reached (with a reasonable timeout and
failure message). Do this for the occurrences at the other noted sleep sites so
the test waits on observable cluster readiness rather than fixed delays.
- Around line 91-92: After each backtick psql invocation (e.g. the
$resync_origin_readonly capture) add an explicit exit-status assertion using
is($?, 0, ...) for expected-success calls and isnt($?, 0, ...) for
expected-failure calls, placed immediately after the command capture and before
the like(...) output checks; apply this change consistently to all four resync
invocations in the test so you first assert the shell exit code ($?) matches the
expected success/failure, then verify error or success text with like(...).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7aa9f8b7-14ca-4264-b87e-7d7955a530fd

📥 Commits

Reviewing files that changed from the base of the PR and between 2d2b568 and 55434b3.

📒 Files selected for processing (8)

docs/configuring.md
include/spock.h
src/spock.c
src/spock_apply.c
src/spock_functions.c
src/spock_readonly.c
tests/tap/t/015_resync_readonly_origin.pl
tests/tap/t/100_progress_period.pl

src/spock_functions.c

tests/tap/t/015_resync_readonly_origin.pl

Sends feedback to WAL sender every n messages, where n is the configured value. (cherry picked from commit 07b840f)

When sub_resync_table() is called with truncate=true while the subscriber has spock.readonly enabled, the table gets truncated but COPY FROM fails, causing permanent data loss. Fix: Check subscriber readonly status before truncating. Also fix spock.readonly to allow COPY TO (read operation) while blocking COPY FROM (write operation). This allows resync to work when the origin is readonly, since syncing only reads from origin. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> (cherry picked from commit c9482a8)

(cherry picked from commit 2582fe3)

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tests/tap/t/015_resync_readonly_origin.pl (1)

52-53: ⚠️ Potential issue | 🟠 Major

Replace the wall-clock sleeps with state-based waits.

These checkpoints are waiting on async replication, config reload, and resync completion, none of which are guaranteed to settle in 2/3/5/10 seconds. That makes the TAP flaky on slow CI and unnecessarily slow on fast CI; poll the observable state instead.

Representative pattern

+sub wait_until {
+    my ($desc, $timeout, $predicate) = `@_`;
+    for (1 .. $timeout) {
+        return if $predicate->();
+        sleep 1;
+    }
+    BAIL_OUT("$desc not reached before timeout");
+}
...
-system_or_bail 'sleep', '2';
+wait_until('subscriber readonly applied', 15, sub {
+    scalar_query(2, "SHOW spock.readonly") eq 'all';
+});

Also applies to: 60-61, 83-84, 94-95, 104-104, 129-129, 161-161, 175-176

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/tap/t/015_resync_readonly_origin.pl` around lines 52 - 53, The test
currently uses a hard sleep via system_or_bail 'sleep', '5' to wait for DDL
replication/table sync, which makes the TAP flaky; replace this wall-clock sleep
with a state-based polling wait that repeatedly checks the actual observable
condition (e.g., a helper like wait_for_replication_completion,
wait_for_resync_complete, or a loop calling
check_table_synced/check_config_reload) until it returns success or a timeout
occurs; update the call site that uses system_or_bail 'sleep' to call the
polling helper (or inline a short loop with a small sleep and a deadline) so the
test proceeds as soon as replication/config/resync is done rather than waiting a
fixed 5 seconds.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_apply.c`:
- Around line 3941-3945: The timeout check is using last_receive_timestamp
(updated on every PQgetCopyData) which prevents the wal_sender_timeout branch
from firing; change the condition to compare now against last_feedback_timestamp
instead (i.e. use TimestampDifferenceExceeds(*last_feedback_timestamp, now,
wal_sender_timeout / 2) || w_message_count >= spock_feedback_frequency), and
ensure last_feedback_timestamp is updated whenever feedback is actually sent (in
the helper that sends feedback) so the time-based branch reflects time since
last feedback rather than last receive.

---

Duplicate comments:
In `@tests/tap/t/015_resync_readonly_origin.pl`:
- Around line 52-53: The test currently uses a hard sleep via system_or_bail
'sleep', '5' to wait for DDL replication/table sync, which makes the TAP flaky;
replace this wall-clock sleep with a state-based polling wait that repeatedly
checks the actual observable condition (e.g., a helper like
wait_for_replication_completion, wait_for_resync_complete, or a loop calling
check_table_synced/check_config_reload) until it returns success or a timeout
occurs; update the call site that uses system_or_bail 'sleep' to call the
polling helper (or inline a short loop with a small sleep and a deadline) so the
test proceeds as soon as replication/config/resync is done rather than waiting a
fixed 5 seconds.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 77f1b899-9cdf-4911-a610-35f557e6bcea

📥 Commits

Reviewing files that changed from the base of the PR and between 55434b3 and 506481f.

📒 Files selected for processing (8)

docs/configuring.md
include/spock.h
src/spock.c
src/spock_apply.c
src/spock_functions.c
src/spock_readonly.c
tests/tap/t/015_resync_readonly_origin.pl
tests/tap/t/100_progress_period.pl

🚧 Files skipped from review as they are similar to previous changes (3)

docs/configuring.md
src/spock_functions.c
include/spock.h

coderabbitai · 2026-03-11T03:07:16Z

src/spock_apply.c

+	 * Send feedback if wal_sender_timeout/2 has passed or after
+	 * spock.feedback_frequency 'w' messages.
 	 */
 	if (TimestampDifferenceExceeds(*last_receive_timestamp, now, wal_sender_timeout / 2) ||
-		w_message_count >= 10)
+		w_message_count >= spock_feedback_frequency)


⚠️ Potential issue | 🟠 Major

Track time since the last feedback, not since the last receive.

Lines 2882-2883 refresh last_receive_timestamp immediately after every PQgetCopyData(), and Line 2957 calls this helper afterwards. That makes the timeout branch here effectively dead during normal streaming, so increasing the threshold to spock_feedback_frequency changes the behavior to “count-based feedback plus separate keepalive traffic” rather than the documented “count or wal_sender_timeout / 2” behavior.

Representative fix

- TimestampTz last_receive_timestamp = GetCurrentTimestamp(); + TimestampTz last_receive_timestamp = GetCurrentTimestamp(); + TimestampTz last_feedback_timestamp = last_receive_timestamp; ... - maybe_send_feedback(applyconn, last_received, - &last_receive_timestamp); + maybe_send_feedback(applyconn, last_received, + &last_feedback_timestamp); ... -static void -maybe_send_feedback(PGconn *applyconn, XLogRecPtr lsn_to_send, - TimestampTz *last_receive_timestamp) +static void +maybe_send_feedback(PGconn *applyconn, XLogRecPtr lsn_to_send, + TimestampTz *last_feedback_timestamp) { static int w_message_count = 0; TimestampTz now = GetCurrentTimestamp(); ... - if (TimestampDifferenceExceeds(*last_receive_timestamp, now, wal_sender_timeout / 2) || + if (TimestampDifferenceExceeds(*last_feedback_timestamp, now, wal_sender_timeout / 2) || w_message_count >= spock_feedback_frequency) { send_feedback(applyconn, lsn_to_send, now, true); - *last_receive_timestamp = now; + *last_feedback_timestamp = now; w_message_count = 0; } }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/spock_apply.c` around lines 3941 - 3945, The timeout check is using last_receive_timestamp (updated on every PQgetCopyData) which prevents the wal_sender_timeout branch from firing; change the condition to compare now against last_feedback_timestamp instead (i.e. use TimestampDifferenceExceeds(*last_feedback_timestamp, now, wal_sender_timeout / 2) || w_message_count >= spock_feedback_frequency), and ensure last_feedback_timestamp is updated whenever feedback is actually sent (in the helper that sends feedback) so the time-based branch reflects time since last feedback rather than last receive.

spock_alter_subscription_synchronize() iterates remote tables and calls truncate_table() when truncate=true, but unlike spock_resync_table() it was missing the check_readonly_for_resync() guard introduced in SPOC-440. A readonly subscriber would have its tables truncated with no way to refill them. Add the guard and fix the missing forward declaration for check_readonly_for_resync() (static function defined after its first call site). Co-Authored-By: Asif Rehman <asifr@pgedge.com> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

mason-sharp requested a review from rasifr March 10, 2026 01:43

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

src/spock_functions.c Show resolved Hide resolved

tests/tap/t/015_resync_readonly_origin.pl Show resolved Hide resolved

tests/tap/t/015_resync_readonly_origin.pl Show resolved Hide resolved

Base automatically changed from from-v5_STABLE-part1 to main March 10, 2026 15:39

mason-sharp and others added 4 commits March 10, 2026 18:50

Introduce GUC spock.feedback_frequency

d984512

Sends feedback to WAL sender every n messages, where n is the configured value. (cherry picked from commit 07b840f)

Add default in docs for feedback_frequency

ba68d53

Update comment about read-only mode

2a79b6b

(cherry picked from commit 2582fe3)

mason-sharp force-pushed the from-v5_STABLE-part2 branch from 55434b3 to 506481f Compare March 11, 2026 01:52

mason-sharp marked this pull request as ready for review March 11, 2026 02:57

coderabbitai bot reviewed Mar 11, 2026

View reviewed changes

rasifr force-pushed the from-v5_STABLE-part2 branch from 506481f to b220182 Compare March 11, 2026 10:53

rasifr approved these changes Mar 11, 2026

View reviewed changes

mason-sharp merged commit 33a26be into main Mar 11, 2026
10 checks passed

mason-sharp deleted the from-v5_STABLE-part2 branch March 11, 2026 13:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

From v5_STABLE part 2#382

From v5_STABLE part 2#382
mason-sharp merged 5 commits intomainfrom
from-v5_STABLE-part2

mason-sharp commented Mar 10, 2026

Uh oh!

coderabbitai bot commented Mar 10, 2026 •

edited

Loading

Walkthrough

Changes

Poem

❌ Failed checks (1 inconclusive)

Uh oh!

mason-sharp commented Mar 10, 2026

Uh oh!

coderabbitai bot commented Mar 10, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Mar 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mason-sharp commented Mar 10, 2026

Uh oh!

coderabbitai bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Poem

❌ Failed checks (1 inconclusive)

Uh oh!

mason-sharp commented Mar 10, 2026

Uh oh!

coderabbitai bot commented Mar 10, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Mar 10, 2026 •

edited

Loading