Skip to content

MDEV-39660 Replica Crashes on Malformed Partial_rows_log_event#5093

Open
bnestere wants to merge 1 commit into
12.3from
12.3-MDEV-39660
Open

MDEV-39660 Replica Crashes on Malformed Partial_rows_log_event#5093
bnestere wants to merge 1 commit into
12.3from
12.3-MDEV-39660

Conversation

@bnestere
Copy link
Copy Markdown
Contributor

If the first Partial_rows_log_event in a group provides an original_event_size that is smaller than the actual size of the underlying Rows_log_event, the slave will not allocate enough memory to hold the actual Rows_log_event content, but still try to copy that much memory, resulting in the slave crashing.

This patch adds validation checking for this condition, and will error with code ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM when detected.

Additionally, the error message for ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM is generalized to better support various types of broken streams. The previous message was hard-coded to be about detecting out-of-sequence fragments. The new message is updated to allow general "found %s" and "expected %s" conditions.

If the first Partial_rows_log_event in a group provides an
original_event_size that is smaller than the actual size of the
underlying Rows_log_event, the slave will not allocate enough memory
to hold the actual Rows_log_event content, but still try to copy
that much memory, resulting in the slave crashing.

This patch adds validation checking for this condition, and will error
with code ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM when detected.

Additionally, the error message for ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM
is generalized to better support various types of broken streams. The
previous message was hard-coded to be about detecting out-of-sequence
fragments. The new message is updated to allow general "found %s" and
"expected %s" conditions.

Reviewed-by: Claude Code
Signed-off-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
@bnestere bnestere self-assigned this May 18, 2026
@bnestere bnestere added MariaDB Corporation Replication Patches involved in replication labels May 18, 2026
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the error reporting for broken Partial_rows_log_event streams by making the messages more descriptive and adding a new test case for corrupt event sizes. It also introduces a bounds check to prevent buffer overflows when reassembling fragments. Feedback highlights potential integer truncation and overflow issues on 32-bit systems during memory allocation and bounds checking, as well as a suggestion to increase the maximum length for event type strings to prevent truncation in logs.

Comment thread sql/log_event_server.cc
Comment on lines 6236 to 6237
: (char *) my_malloc(PSI_INSTRUMENT_ME,
partial_ev->original_event_size, MYF(MY_WME));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

On 32-bit systems, partial_ev->original_event_size (a uint64_t) could be larger than SIZE_MAX. Passing it directly to my_malloc (which takes size_t) will result in truncation. For example, a size of 4GB + 100 bytes would be truncated to 100 bytes. This would lead to a buffer overflow during memcpy even with the new bounds checks, because rows_ev_buf_len would still be the large 64-bit value. Consider adding a check to ensure original_event_size fits in size_t on the current platform.

Comment thread sql/log_event_server.cc
Comment on lines +6255 to +6268
if (this->ev_len + partial_ev->get_rows_size() > this->rows_ev_buf_len)
{
char found_buf[PARTIAL_ROWS_EVENT_BAD_ORIG_SIZE_ERRSTR_LEN];
char expect_buf[PARTIAL_ROWS_EVENT_BAD_ORIG_SIZE_ERRSTR_LEN];
my_snprintf(found_buf, sizeof(found_buf), "row data size %" PRIu64,
this->ev_len + partial_ev->get_rows_size());
my_snprintf(expect_buf, sizeof(expect_buf), "row data size %" PRIu64,
this->rows_ev_buf_len);
rgi->rli->report(ERROR_LEVEL, ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM,
rgi->gtid_info(),
ER_THD(rgi->thd, ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM),
found_buf, expect_buf);
return ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The check this->ev_len + partial_ev->get_rows_size() > this->rows_ev_buf_len can suffer from integer overflow on 32-bit systems if ev_len and the fragment size are 32-bit types (like ulong and size_t), as the addition is performed in 32-bit arithmetic before being compared to the 64-bit rows_ev_buf_len. Using subtraction (partial_ev->get_rows_size() > this->rows_ev_buf_len - this->ev_len) is safer. Also, ensure the sum in the error message is calculated using 64-bit arithmetic to avoid reporting a truncated value.

  if (partial_ev->get_rows_size() > this->rows_ev_buf_len - this->ev_len)
  {
    char found_buf[PARTIAL_ROWS_EVENT_BAD_ORIG_SIZE_ERRSTR_LEN];
    char expect_buf[PARTIAL_ROWS_EVENT_BAD_ORIG_SIZE_ERRSTR_LEN];
    my_snprintf(found_buf, sizeof(found_buf), "row data size %" PRIu64,
                (uint64_t)this->ev_len + partial_ev->get_rows_size());
    my_snprintf(expect_buf, sizeof(expect_buf), "row data size %" PRIu64,
                this->rows_ev_buf_len);
    rgi->rli->report(ERROR_LEVEL, ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM,
                     rgi->gtid_info(),
                     ER_THD(rgi->thd, ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM),
                     found_buf, expect_buf);
    return ER_PARTIAL_ROWS_LOG_EVENT_BAD_STREAM;
  }

Comment thread sql/log_event.h
(sizeof("row data size ") + 20 + 1)

// Len of "type %s" where %s is Log_event::get_type_str()
constexpr size_t MAX_LOG_EVENT_TYPE_STR_LEN= 25; // Log_event::get_type_str()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The constant MAX_LOG_EVENT_TYPE_STR_LEN is set to 25, which is exactly the length of some MariaDB event type strings (e.g., Update_rows_compressed_v1). Increasing this to 32 would provide better headroom and avoid potential truncation in error messages if longer event names are added in the future.

constexpr size_t MAX_LOG_EVENT_TYPE_STR_LEN= 32; // Log_event::get_type_str()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

MariaDB Corporation Replication Patches involved in replication

Development

Successfully merging this pull request may close these issues.

1 participant