Commit 2d141ad

removed explanations of RBR vs SBR, we now support RBR
Signed-off-by: jvaidya <jitendra.vaidya@gmail.com>
1 parent 17f3ced commit 2d141ad

1 file changed: +11 -63 lines changed

doc/VitessReplication.md

Lines changed: 11 additions & 63 deletions
@@ -3,70 +3,18 @@
 ## Statement vs Row Based Replication
 
 MySQL supports two primary modes of replication in its binary logs: statement or
-row based.
-
-**Statement Based Replication**:
-
-* The statements executed on the master are copied almost as-is in the master
-logs.
-* The slaves replay these statements as is.
-* If the statements are expensive (especially an update with a complicated WHERE
-clause), they will be expensive on the slaves too.
-* For current timestamp and auto-increment values, the master also puts
-additional SET statements in the logs to make the statement have the same
-effect, so the slaves end up with the same values.
-
-**Row Based Replication**:
-
-* The statements executed on the master result in updated rows. The new full
-values for these rows are copied to the master logs.
-* The slaves change their records for the rows they receive. The update is by
-primary key, and contains the new values for each column, so usually it’s very
-fast.
-* Each updated row contains the entire row, not just the columns that were
-updated (unless the flag --binlog\_row\_image=minimal is used).
-* The replication stream is harder to read, as it contains almost binary data,
-that don’t easily map to the original statements.
-* There is a configurable limit on how many rows can be affected by one
-binlog event, so the master logs are not flooded.
-* The format of the logs depends on the master schema: each row has a list of
-values, one value for each column. So if the master schema is different from
-the slave schema, updates will misbehave (exception being if slave has extra
-columns at the end).
-* It is possible to revert to statement based replication for some commands to
-avoid these drawbacks (for instance for DELETE statements that affect a large
-number of rows).
-* Schema changes always use statement based replication.
-* If comments are added to a statement, they are stripped from the
-replication stream (as only rows are transmitted). There is a flag
---binlog\_rows\_query\_log\_events to add the original statement to each row
-update, but it is costly in terms of binlog size.
-
-For the longest time, MySQL replication has been single-threaded: only one
-statement is applied by the slaves at a time. Since the master applies more
-statements in parallel, replication can fall behind on the slaves fairly easily,
-under higher load. Even though the situation has improved (parallel slave
-apply), the slave replication speed is still a limiting factor for a lot of
-applications. Since row based replication achieves higher update rates on the
-slaves in most cases, it has been the only viable option for most performance
-sensitive applications.
-
-Schema changes however are not easy to achieve with row based
-replication. Adding columns can be done offline, but removing or changing
-columns cannot easily be done (there are multiple ways to achieve this, but they
-all have limitations or performance implications, and are not that easy to
-setup).
-
-Vitess helps by using statement based replication (therefore allowing complex
-schema changes), while at the same time simplifying the replication stream (so
-slaves can be fast), by rewriting Update statements.
-
-Then, with statement based replication, it becomes easier to perform offline
-advanced schema changes, or large data updates. Vitess’s solution is called
-schema swap.
+row based. Vitess supports both these modes.
+
+For schema changes, if the number of affected rows is greater > 100k (configurable), we don't allow direct application
+of DDLs the recommended tool in such cases is gh-ost
 
-We plan to also support row based replication in the future, and adapt our tools
-to provide the same features when possible. See Appendix for our plan.
+When using statement based replication, Vitess helps by rewriting Update statements,
+therefore allowing complex schema changes, while at the same time simplifying the replication stream (so
+slaves can be fast). This is described in detail below.
+
+Thus, with statement based replication, it becomes easier to perform offline
+advanced schema changes, or large data updates. Vitess’s solution is called
+schema swap (described below).
 
 ## Rewriting Update Statements
 
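
The added text says that a DDL affecting more than 100k rows (configurable) is not applied directly and that gh-ost is the recommended tool in that case. The sketch below only illustrates that decision rule. It is not Vitess code: `plan_schema_change`, `DdlPlan`, and `DEFAULT_MAX_ROWS_FOR_DIRECT_DDL` are hypothetical names, and how the affected-row count is obtained is left out as an assumption.

```python
from dataclasses import dataclass

# Assumption: the default mirrors the "> 100k (configurable)" figure in the
# added text. The real Vitess setting name is not shown in this commit.
DEFAULT_MAX_ROWS_FOR_DIRECT_DDL = 100_000


@dataclass(frozen=True)
class DdlPlan:
    """Hypothetical result type: which path a schema change should take."""
    method: str  # "direct" or "gh-ost"
    reason: str


def plan_schema_change(
    affected_rows: int,
    max_rows: int = DEFAULT_MAX_ROWS_FOR_DIRECT_DDL,
) -> DdlPlan:
    """Pick direct DDL or gh-ost based on the number of affected rows.

    Strictly more than `max_rows` rules out direct application, matching
    the "greater than 100k" wording of the added text.
    """
    if affected_rows < 0:
        raise ValueError("affected_rows must be non-negative")
    if affected_rows > max_rows:
        return DdlPlan(
            method="gh-ost",
            reason=f"{affected_rows} rows exceeds the limit of {max_rows}",
        )
    return DdlPlan(
        method="direct",
        reason=f"{affected_rows} rows is within the limit of {max_rows}",
    )


if __name__ == "__main__":
    for rows in (5_000, 100_000, 250_000):
        plan = plan_schema_change(rows)
        print(f"{rows:>7} rows -> {plan.method} ({plan.reason})")
```

The threshold is a parameter rather than a constant inside the function because the text describes it as configurable.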
