Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a configurable GUC Changes
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/spock_functions.c`:
- Around line 1004-1019: The subscription-wide synchronize path in
spock_alter_subscription_synchronize currently skips the readonly precheck that
check_readonly_for_resync enforces (used by
spock_alter_subscription_resynchronize_table), so add a call to
check_readonly_for_resync(nspname, relname) (or a variant that checks the whole
subscription if necessary) early in spock_alter_subscription_synchronize before
any sync-state creation or table truncation; ensure you use the same error
semantics and message by invoking the existing check_readonly_for_resync
function (or refactor that function into a subscription-level check and call it
from both spock_alter_subscription_resynchronize_table and
spock_alter_subscription_synchronize) so the read-only guard is applied
consistently.
In `@tests/tap/t/015_resync_readonly_origin.pl`:
- Around line 52-53: Replace the fixed wall-clock sleeps (system_or_bail
'sleep', '5' and the other similar sleep calls) with state-based waits: remove
system_or_bail 'sleep' invocations and instead poll the subscription/table
status using the test-suite's existing sync helpers (e.g.,
wait_for_subscription_sync, wait_for_table_sync) or, if those helpers are
absent, implement a short polling loop that queries the subscription/table state
and exits when the desired state is reached (with a reasonable timeout and
failure message). Do this for the occurrences at the other noted sleep sites so
the test waits on observable cluster readiness rather than fixed delays.
- Around line 91-92: After each backtick psql invocation (e.g. the
$resync_origin_readonly capture) add an explicit exit-status assertion using
is($?, 0, ...) for expected-success calls and isnt($?, 0, ...) for
expected-failure calls, placed immediately after the command capture and before
the like(...) output checks; apply this change consistently to all four resync
invocations in the test so you first assert the shell exit code ($?) matches the
expected success/failure, then verify error or success text with like(...).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7aa9f8b7-14ca-4264-b87e-7d7955a530fd
📒 Files selected for processing (8)
docs/configuring.mdinclude/spock.hsrc/spock.csrc/spock_apply.csrc/spock_functions.csrc/spock_readonly.ctests/tap/t/015_resync_readonly_origin.pltests/tap/t/100_progress_period.pl
Sends feedback to WAL sender every n messages, where n is the configured value. (cherry picked from commit 07b840f)
When sub_resync_table() is called with truncate=true while the subscriber has spock.readonly enabled, the table gets truncated but COPY FROM fails, causing permanent data loss. Fix: Check subscriber readonly status before truncating. Also fix spock.readonly to allow COPY TO (read operation) while blocking COPY FROM (write operation). This allows resync to work when the origin is readonly, since syncing only reads from origin. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> (cherry picked from commit c9482a8)
(cherry picked from commit 2582fe3)
55434b3 to
506481f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/tap/t/015_resync_readonly_origin.pl (1)
52-53:⚠️ Potential issue | 🟠 MajorReplace the wall-clock sleeps with state-based waits.
These checkpoints are waiting on async replication, config reload, and resync completion, none of which are guaranteed to settle in
2/3/5/10seconds. That makes the TAP flaky on slow CI and unnecessarily slow on fast CI; poll the observable state instead.Representative pattern
+sub wait_until { + my ($desc, $timeout, $predicate) = `@_`; + for (1 .. $timeout) { + return if $predicate->(); + sleep 1; + } + BAIL_OUT("$desc not reached before timeout"); +} ... -system_or_bail 'sleep', '2'; +wait_until('subscriber readonly applied', 15, sub { + scalar_query(2, "SHOW spock.readonly") eq 'all'; +});Also applies to: 60-61, 83-84, 94-95, 104-104, 129-129, 161-161, 175-176
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/tap/t/015_resync_readonly_origin.pl` around lines 52 - 53, The test currently uses a hard sleep via system_or_bail 'sleep', '5' to wait for DDL replication/table sync, which makes the TAP flaky; replace this wall-clock sleep with a state-based polling wait that repeatedly checks the actual observable condition (e.g., a helper like wait_for_replication_completion, wait_for_resync_complete, or a loop calling check_table_synced/check_config_reload) until it returns success or a timeout occurs; update the call site that uses system_or_bail 'sleep' to call the polling helper (or inline a short loop with a small sleep and a deadline) so the test proceeds as soon as replication/config/resync is done rather than waiting a fixed 5 seconds.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/spock_apply.c`:
- Around line 3941-3945: The timeout check is using last_receive_timestamp
(updated on every PQgetCopyData) which prevents the wal_sender_timeout branch
from firing; change the condition to compare now against last_feedback_timestamp
instead (i.e. use TimestampDifferenceExceeds(*last_feedback_timestamp, now,
wal_sender_timeout / 2) || w_message_count >= spock_feedback_frequency), and
ensure last_feedback_timestamp is updated whenever feedback is actually sent (in
the helper that sends feedback) so the time-based branch reflects time since
last feedback rather than last receive.
---
Duplicate comments:
In `@tests/tap/t/015_resync_readonly_origin.pl`:
- Around line 52-53: The test currently uses a hard sleep via system_or_bail
'sleep', '5' to wait for DDL replication/table sync, which makes the TAP flaky;
replace this wall-clock sleep with a state-based polling wait that repeatedly
checks the actual observable condition (e.g., a helper like
wait_for_replication_completion, wait_for_resync_complete, or a loop calling
check_table_synced/check_config_reload) until it returns success or a timeout
occurs; update the call site that uses system_or_bail 'sleep' to call the
polling helper (or inline a short loop with a small sleep and a deadline) so the
test proceeds as soon as replication/config/resync is done rather than waiting a
fixed 5 seconds.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 77f1b899-9cdf-4911-a610-35f557e6bcea
📒 Files selected for processing (8)
docs/configuring.mdinclude/spock.hsrc/spock.csrc/spock_apply.csrc/spock_functions.csrc/spock_readonly.ctests/tap/t/015_resync_readonly_origin.pltests/tap/t/100_progress_period.pl
🚧 Files skipped from review as they are similar to previous changes (3)
- docs/configuring.md
- src/spock_functions.c
- include/spock.h
| * Send feedback if wal_sender_timeout/2 has passed or after | ||
| * spock.feedback_frequency 'w' messages. | ||
| */ | ||
| if (TimestampDifferenceExceeds(*last_receive_timestamp, now, wal_sender_timeout / 2) || | ||
| w_message_count >= 10) | ||
| w_message_count >= spock_feedback_frequency) |
There was a problem hiding this comment.
Track time since the last feedback, not since the last receive.
Lines 2882-2883 refresh last_receive_timestamp immediately after every PQgetCopyData(), and Line 2957 calls this helper afterwards. That makes the timeout branch here effectively dead during normal streaming, so increasing the threshold to spock_feedback_frequency changes the behavior to “count-based feedback plus separate keepalive traffic” rather than the documented “count or wal_sender_timeout / 2” behavior.
Representative fix
- TimestampTz last_receive_timestamp = GetCurrentTimestamp();
+ TimestampTz last_receive_timestamp = GetCurrentTimestamp();
+ TimestampTz last_feedback_timestamp = last_receive_timestamp;
...
- maybe_send_feedback(applyconn, last_received,
- &last_receive_timestamp);
+ maybe_send_feedback(applyconn, last_received,
+ &last_feedback_timestamp);
...
-static void
-maybe_send_feedback(PGconn *applyconn, XLogRecPtr lsn_to_send,
- TimestampTz *last_receive_timestamp)
+static void
+maybe_send_feedback(PGconn *applyconn, XLogRecPtr lsn_to_send,
+ TimestampTz *last_feedback_timestamp)
{
static int w_message_count = 0;
TimestampTz now = GetCurrentTimestamp();
...
- if (TimestampDifferenceExceeds(*last_receive_timestamp, now, wal_sender_timeout / 2) ||
+ if (TimestampDifferenceExceeds(*last_feedback_timestamp, now, wal_sender_timeout / 2) ||
w_message_count >= spock_feedback_frequency)
{
send_feedback(applyconn, lsn_to_send, now, true);
- *last_receive_timestamp = now;
+ *last_feedback_timestamp = now;
w_message_count = 0;
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/spock_apply.c` around lines 3941 - 3945, The timeout check is using
last_receive_timestamp (updated on every PQgetCopyData) which prevents the
wal_sender_timeout branch from firing; change the condition to compare now
against last_feedback_timestamp instead (i.e. use
TimestampDifferenceExceeds(*last_feedback_timestamp, now, wal_sender_timeout /
2) || w_message_count >= spock_feedback_frequency), and ensure
last_feedback_timestamp is updated whenever feedback is actually sent (in the
helper that sends feedback) so the time-based branch reflects time since last
feedback rather than last receive.
spock_alter_subscription_synchronize() iterates remote tables and calls truncate_table() when truncate=true, but unlike spock_resync_table() it was missing the check_readonly_for_resync() guard introduced in SPOC-440. A readonly subscriber would have its tables truncated with no way to refill them. Add the guard and fix the missing forward declaration for check_readonly_for_resync() (static function defined after its first call site). Co-Authored-By: Asif Rehman <asifr@pgedge.com> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
506481f to
b220182
Compare
This is part 2 on top of #372.
Making it a draft for now. It currently has part 1 as a base. Once that is merged, this can use main as its target.