Commit f63fbd3

BUG#27652526: REJOIN OLD PRIMARY NODE MAY DUPLICATE KEY WHEN RECOVERY
Group Replication implements conflict detection in multi-primary mode to avoid write errors on parallel operations. Conflict detection is also engaged in single-primary mode in the particular case of a primary change where the new primary still has a backlog to apply. Until the backlog is flushed, conflict detection is enabled to prevent write errors between the backlog and incoming transactions.

The conflict detection data, which we name certification info, is also used to detect dependencies between accepted transactions, dependencies which rule the transaction schedule on the parallel applier.

To prevent the certification info from growing forever, all members periodically exchange their GTID_EXECUTED sets, whose full intersection provides the set of transactions that are applied on all members. Future transactions cannot conflict with this set since all members are operating on top of it, so we can safely remove from the certification info all write-sets that belong to those transactions. More details at WL#6833: Group Replication: Read-set free Certification Module (DBSM Snapshot Isolation).

However, a corner case was found in which the garbage collection was purging more data than it should. The scenario is:

1) Group with 2 members;
2) member1 executes:
     CREATE TABLE t1(a INT, b INT, PRIMARY KEY(a));
     INSERT INTO t1 VALUE(1, 1);
   Both members have GTID_EXECUTED= UUID:1-4
   Both members' certification info has:
     Hash of item in Writeset    snapshot version (Gtid_set)
     #1                          UUID:1-4
3) member1 executes TA
     UPDATE t1 SET b=10 WHERE a=1;
   and blocks immediately before sending the transaction to the group.
   This transaction has snapshot_version: UUID:1-4
4) member2 executes TB
     UPDATE t1 SET b=10 WHERE a=1;
   This transaction has snapshot_version: UUID:1-4
   It goes through the complete path and is committed.
   This transaction has GTID: UUID:1000002
   Both members have GTID_EXECUTED= UUID:1-4:1000002
   Both members' certification info has:
     Hash of item in Writeset    snapshot version (Gtid_set)
     #1                          UUID:1-4:1000002
5) member2 becomes extremely slow in processing transactions; we simulate that by holding the transaction queue to the GR pipeline. Transaction delivery is still working, but the transaction will be blocked before certification.
6) member1 is able to send its TA transaction; recall that this transaction has snapshot_version: UUID:1-4. During conflict detection on member1, TA conflicts with #1, since its snapshot_version does not contain the snapshot_version of #1; that is, TA was executed on an earlier version than TB. On member2 the transaction is delivered and put on hold before conflict detection.
7) Meanwhile the certification info garbage collection kicks in.
   Both members have GTID_EXECUTED= UUID:1-4:1000002
   Their intersection is UUID:1-4:1000002
   Both members' certification info has:
     Hash of item in Writeset    snapshot version (Gtid_set)
     #1                          UUID:1-4:1000002
   The condition to purge write-sets is:
     snapshot_version.is_subset(intersection)
   We have
     "UUID:1-4:1000002".is_subset("UUID:1-4:1000002")
   which is true, so we remove #1.
   Both members' certification info has:
     Hash of item in Writeset    snapshot version (Gtid_set)
     <empty>
8) member2 gets back to normal and we release transaction TA; recall that this transaction has snapshot_version: UUID:1-4. On conflict detection, since the certification info is empty, the transaction is allowed to proceed, which is incorrect: it must roll back (like on member1) since it conflicts with TB.

The problem is in the certification garbage collection, more precisely in the condition used to purge data: we cannot leave the certification info empty, otherwise this situation can happen. The condition must be changed to
  snapshot_version.is_subset_not_equals(intersection)
which will always leave a placeholder to detect delayed conflicting transactions.

So a trace of the solution is (starting on step 7):

7) Meanwhile the certification info garbage collection kicks in.
   Both members have GTID_EXECUTED= UUID:1-4:1000002
   Their intersection is UUID:1-4:1000002
   Both members' certification info has:
     Hash of item in Writeset    snapshot version (Gtid_set)
     #1                          UUID:1-4:1000002
   The condition to purge write-sets is:
     snapshot_version.is_subset_not_equals(intersection)
   We have
     "UUID:1-4:1000002".is_subset_not_equals("UUID:1-4:1000002")
   which is false, so we do not remove #1.
   Both members' certification info has:
     Hash of item in Writeset    snapshot version (Gtid_set)
     #1                          UUID:1-4:1000002
8) member2 gets back to normal and we release transaction TA; recall that this transaction has snapshot_version: UUID:1-4. During conflict detection on member2, TA conflicts with #1, since its snapshot_version does not contain the snapshot_version of #1; that is, TA was executed on an earlier version than TB.

This is the same scenario that we see in this bug, though here the pipeline is blocked by the distributed recovery procedure; that is, while the joining member is applying the missing data through the recovery channel, the incoming data is queued. Meanwhile the certification info garbage collection kicks in and purges more data than it should; the result is that conflicts are not detected.
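Steps 7 and 8 of the trace can be replayed with a small model. This is only a sketch under stated assumptions, not the plugin's implementation: a GTID set is modelled as a std::set of sequence numbers for a single UUID, the certification info as a map from write-set hash to snapshot version, and garbage_collect, conflicts and delayed_ta_is_rejected are hypothetical helpers introduced for illustration. Only the names is_subset and is_subset_not_equals mirror the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy GTID set: sequence numbers of one UUID (illustrative assumption).
using GtidSet = std::set<long>;

// True when every GTID in `sub` is also in `super`.
bool is_subset(const GtidSet &sub, const GtidSet &super) {
  return std::includes(super.begin(), super.end(), sub.begin(), sub.end());
}

// True when `sub` is a subset of `super` but not equal to it.
bool is_subset_not_equals(const GtidSet &sub, const GtidSet &super) {
  return is_subset(sub, super) && sub != super;
}

// Toy certification info: write-set item hash -> snapshot version.
using CertificationInfo = std::map<std::string, GtidSet>;

// Hypothetical helper: drops every entry for which `purge` holds
// against the stable set (the intersection of GTID_EXECUTED).
template <typename Predicate>
void garbage_collect(CertificationInfo &info, const GtidSet &stable,
                     Predicate purge) {
  for (auto it = info.begin(); it != info.end();) {
    if (purge(it->second, stable))
      it = info.erase(it);
    else
      ++it;
  }
}

// Hypothetical helper: a transaction conflicts on `item` when the stored
// snapshot version is not contained in the transaction's snapshot version.
bool conflicts(const CertificationInfo &info, const std::string &item,
               const GtidSet &snapshot_version) {
  auto it = info.find(item);
  return it != info.end() && !is_subset(it->second, snapshot_version);
}

// Replays steps 7 and 8 on member2 and reports whether the delayed
// transaction TA is rejected once the pipeline resumes.
bool delayed_ta_is_rejected(bool use_fixed_condition) {
  const GtidSet stable = {1, 2, 3, 4, 1000002};  // UUID:1-4:1000002
  CertificationInfo info = {{"#1", stable}};     // entry left by TB
  if (use_fixed_condition)
    garbage_collect(info, stable, is_subset_not_equals);
  else
    garbage_collect(info, stable, is_subset);
  const GtidSet ta_snapshot = {1, 2, 3, 4};      // UUID:1-4
  return conflicts(info, "#1", ta_snapshot);
}
```

In this model, delayed_ta_is_rejected(false) reports that TA slips through because entry #1 was purged, while delayed_ta_is_rejected(true) reports that TA is rejected, which matches the rollback on member1.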
1 parent 6e40ff2 commit f63fbd3

8 files changed, 239 additions and 12 deletions

rapid/plugin/group_replication/src/applier.cc

Lines changed: 6 additions & 1 deletion
@@ -1,4 +1,4 @@
-/* Copyright (c) 2014, 2017, Oracle and/or its affiliates. All rights reserved.
+/* Copyright (c) 2014, 2018, Oracle and/or its affiliates. All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -299,6 +299,11 @@ int Applier_module::apply_data_packet(Data_packet *data_packet,
   uchar* payload= data_packet->payload;
   uchar* payload_end= data_packet->payload + data_packet->len;

+  DBUG_EXECUTE_IF("group_replication_before_apply_data_packet", {
+    const char act[] = "now wait_for continue_apply";
+    DBUG_ASSERT(!debug_sync_set_action(current_thd, STRING_WITH_LEN(act)));
+  });
+
   if (check_single_primary_queue_status())
     return 1; /* purecov: inspected */

rapid/plugin/group_replication/src/certifier.cc

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-/* Copyright (c) 2014, 2017, Oracle and/or its affiliates. All rights reserved.
+/* Copyright (c) 2014, 2018, Oracle and/or its affiliates. All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -1230,7 +1230,7 @@ void Certifier::garbage_collect()
   stable_gtid_set_lock->wrlock();
   while (it != certification_info.end())
   {
-    if (it->second->is_subset(stable_gtid_set))
+    if (it->second->is_subset_not_equals(stable_gtid_set))
     {
       if (it->second->unlink() == 0)
         delete it->second;
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+include/group_replication.inc
+Warnings:
+Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
+Note #### Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
+[connection server1]
+
+############################################################
+# 1. Create a table on server1.
+CREATE TABLE t1(a INT, b INT, PRIMARY KEY(a));
+INSERT INTO t1 VALUE(1, 1);
+include/rpl_sync.inc
+
+############################################################
+# 2. Set a debug sync before broadcast message to group on
+# connection server_1.
+# Commit a transaction that will be block before broadcast.
+[connection server_1]
+SET @@GLOBAL.DEBUG='+d,group_replication_before_message_broadcast';
+BEGIN;
+UPDATE t1 SET b=10 WHERE a=1;
+COMMIT;
+
+############################################################
+# 3. Wait until server_1 connection reaches the
+# group_replication_before_message_broadcast debug sync point.
+[connection server1]
+
+############################################################
+# 4. Execute a transaction on server2, that will reach first
+# certification, since server_1 is blocked before broadcast.
+[connection server2]
+UPDATE t1 SET b=20 WHERE a=1;
+
+############################################################
+# 5. Suspend pipeline on server2.
+SET @@GLOBAL.DEBUG='+d,group_replication_before_apply_data_packet';
+
+############################################################
+# 6. Resume the transaction on server_1
+[connection server1]
+SET DEBUG_SYNC='now SIGNAL waiting';
+SET @@GLOBAL.DEBUG='-d,group_replication_before_message_broadcast';
+[connection server_1]
+ERROR HY000: Plugin instructed the server to rollback the current transaction.
+
+############################################################
+# 7. Make sure the pipeline is suspended on server2.
+[connection server2]
+
+############################################################
+# 8. Wait until certification info garbage collector does
+# its work.
+
+############################################################
+# 9. Resume the pipeline on server2.
+SET DEBUG_SYNC='now SIGNAL continue_apply';
+SET @@GLOBAL.DEBUG='-d,group_replication_before_apply_data_packet';
+
+############################################################
+# 10. Execute a new transaction in order to have a sync point
+# to make the test deterministic,
+# Validate that data and GTIDs are correct.
+[connection server1]
+INSERT INTO t1 VALUE(2, 2);
+include/rpl_sync.inc
+include/assert.inc [GTID_EXECUTED must contain 6 transactions]
+[connection server2]
+include/assert.inc [GTID_EXECUTED must contain 6 transactions]
+include/diff_tables.inc [server1:t1, server2:t1]
+
+############################################################
+# 11. Clean up.
+DROP TABLE t1;
+include/group_replication_end.inc

rapid/plugin/group_replication/tests/mtr/r/gr_perfschema_group_member_stats.result

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ server1
 include/assert.inc [The value of member_id should be equal to server UUID after starting group replication]
 include/assert.inc [The value of Count_Transactions_checked should be 6 after starting group replication]
 include/assert.inc [The value of Count_conflicts_detected should be 0 after starting group replication]
-include/assert.inc [The value of Count_Transactions_rows_validating should be 4 after starting group replication]
+include/assert.inc [The value of Count_Transactions_rows_validating should be 6 after starting group replication]
 include/assert.inc [The value of Transactions_committed_all_members should have server 1 GTIDs before server2 start]
 include/assert.inc [The value of Last_Conflict_free_transaction should be the gtid of the last applied transaction.]
 SET SESSION sql_log_bin= 0;

rapid/plugin/group_replication/tests/mtr/r/gr_set_gtid_next.result

Lines changed: 2 additions & 2 deletions
@@ -72,14 +72,14 @@ include/assert.inc ['There is a value 3 in table t2']
 # 6. Check that stable set and certification info size are
 # properly updated after stable set propagation and
 # certification info garbage collection on server 1.
-include/assert.inc ['Count_transactions_rows_validating must be 0']
+include/assert.inc ['Count_transactions_rows_validating must be 2']
 include/assert.inc ['Transactions_committed_all_members must be equal to GTID_EXECUTED']

 ############################################################
 # 7. Check that stable set and certification info size are
 # properly updated after stable set propagation and
 # certification info garbage collection on server 2.
-include/assert.inc ['Count_transactions_rows_validating must be 0']
+include/assert.inc ['Count_transactions_rows_validating must be 2']
 include/assert.inc ['Transactions_committed_all_members must be equal to GTID_EXECUTED']

 ############################################################
Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
+################################################################################
+# Validate that certification info garbage collection do not purge more data
+# than it should.
+#
+# Test:
+# 0. The test requires two servers: M1 and M2.
+# 1. Create a table on server1.
+# 2. Set a debug sync before broadcast message to group on
+# connection server_1.
+# Commit a transaction that will be block before broadcast.
+# 3. Wait until server_1 connection reaches the
+# group_replication_before_message_broadcast debug sync point.
+# 4. Execute a transaction on server2, that will reach first
+# certification, since server_1 is blocked before broadcast.
+# 5. Suspend pipeline on server2.
+# 6. Resume the transaction on server_1
+# 7. Make sure the pipeline is suspended on server2.
+# 8. Wait until certification info garbage collector does
+# its work.
+# 9. Resume the pipeline on server2.
+# 10. Execute a new transaction in order to have a sync point
+# to make the test deterministic,
+# Validate that data and GTIDs are correct.
+# 11. Clean up.
+################################################################################
+--source include/have_debug_sync.inc
+--source include/big_test.inc
+--source ../inc/have_group_replication_plugin.inc
+--source ../inc/group_replication.inc
+
+--echo
+--echo ############################################################
+--echo # 1. Create a table on server1.
+CREATE TABLE t1(a INT, b INT, PRIMARY KEY(a));
+INSERT INTO t1 VALUE(1, 1);
+--source include/rpl_sync.inc
+
+--echo
+--echo ############################################################
+--echo # 2. Set a debug sync before broadcast message to group on
+--echo # connection server_1.
+--echo # Commit a transaction that will be block before broadcast.
+--let $rpl_connection_name= server_1
+--source include/rpl_connection.inc
+SET @@GLOBAL.DEBUG='+d,group_replication_before_message_broadcast';
+BEGIN;
+UPDATE t1 SET b=10 WHERE a=1;
+--send COMMIT
+
+--echo
+--echo ############################################################
+--echo # 3. Wait until server_1 connection reaches the
+--echo # group_replication_before_message_broadcast debug sync point.
+--let $rpl_connection_name= server1
+--source include/rpl_connection.inc
+--let $wait_condition=SELECT COUNT(*)=1 FROM INFORMATION_SCHEMA.PROCESSLIST WHERE State = 'debug sync point: now'
+--source include/wait_condition.inc
+
+--echo
+--echo ############################################################
+--echo # 4. Execute a transaction on server2, that will reach first
+--echo # certification, since server_1 is blocked before broadcast.
+--let $rpl_connection_name= server2
+--source include/rpl_connection.inc
+UPDATE t1 SET b=20 WHERE a=1;
+
+--echo
+--echo ############################################################
+--echo # 5. Suspend pipeline on server2.
+SET @@GLOBAL.DEBUG='+d,group_replication_before_apply_data_packet';
+
+--echo
+--echo ############################################################
+--echo # 6. Resume the transaction on server_1
+--let $rpl_connection_name= server1
+--source include/rpl_connection.inc
+SET DEBUG_SYNC='now SIGNAL waiting';
+SET @@GLOBAL.DEBUG='-d,group_replication_before_message_broadcast';
+
+--let $rpl_connection_name= server_1
+--source include/rpl_connection.inc
+--error ER_TRANSACTION_ROLLBACK_DURING_COMMIT
+--reap
+
+--echo
+--echo ############################################################
+--echo # 7. Make sure the pipeline is suspended on server2.
+--let $rpl_connection_name= server2
+--source include/rpl_connection.inc
+--let $wait_condition=SELECT COUNT(*)=1 FROM INFORMATION_SCHEMA.PROCESSLIST WHERE State = 'debug sync point: now'
+--source include/wait_condition.inc
+
+--echo
+--echo ############################################################
+--echo # 8. Wait until certification info garbage collector does
+--echo # its work.
+--let $gtid_assignment_block_size= `SELECT @@GLOBAL.group_replication_gtid_assignment_block_size;`
+--let $expected_gtid_set= $group_replication_group_name:1-4:1000002
+if ($gtid_assignment_block_size == 1)
+{
+--let $expected_gtid_set= $group_replication_group_name:1-5
+}
+--let $wait_condition= SELECT transactions_committed_all_members = "$expected_gtid_set" from performance_schema.replication_group_member_stats;
+--let $wait_timeout= 150
+--source include/wait_condition.inc
+
+--echo
+--echo ############################################################
+--echo # 9. Resume the pipeline on server2.
+SET DEBUG_SYNC='now SIGNAL continue_apply';
+SET @@GLOBAL.DEBUG='-d,group_replication_before_apply_data_packet';
+
+--echo
+--echo ############################################################
+--echo # 10. Execute a new transaction in order to have a sync point
+--echo # to make the test deterministic,
+--echo # Validate that data and GTIDs are correct.
+--let $rpl_connection_name= server1
+--source include/rpl_connection.inc
+INSERT INTO t1 VALUE(2, 2);
+--source include/rpl_sync.inc
+
+--let $expected_gtid_set= $group_replication_group_name:1-5:1000002
+if ($gtid_assignment_block_size == 1)
+{
+--let $expected_gtid_set= $group_replication_group_name:1-6
+}
+
+--let $assert_text= GTID_EXECUTED must contain 6 transactions
+--let $assert_cond= "[SELECT @@GLOBAL.GTID_EXECUTED]" = "$expected_gtid_set";
+--source include/assert.inc
+
+--let $rpl_connection_name= server2
+--source include/rpl_connection.inc
+--let $assert_text= GTID_EXECUTED must contain 6 transactions
+--let $assert_cond= "[SELECT @@GLOBAL.GTID_EXECUTED]" = "$expected_gtid_set";
+--source include/assert.inc
+
+--let $diff_tables=server1:t1, server2:t1
+--source include/diff_tables.inc
+
+
+--echo
+--echo ############################################################
+--echo # 11. Clean up.
+DROP TABLE t1;
+
+--source ../inc/group_replication_end.inc

rapid/plugin/group_replication/tests/mtr/t/gr_perfschema_group_member_stats.test

Lines changed: 2 additions & 2 deletions
@@ -182,8 +182,8 @@ START SLAVE SQL_THREAD FOR CHANNEL "group_replication_applier";
 --source include/assert.inc

 --let $certification_db_size= query_get_value(SELECT Count_Transactions_rows_validating from performance_schema.replication_group_member_stats, Count_Transactions_rows_validating, 1)
---let $assert_text= The value of Count_Transactions_rows_validating should be 4 after starting group replication
---let $assert_cond= "$certification_db_size" = 4
+--let $assert_text= The value of Count_Transactions_rows_validating should be 6 after starting group replication
+--let $assert_cond= "$certification_db_size" = 6
 --source include/assert.inc

 --let $stable_set= query_get_value(SELECT Transactions_committed_all_members from performance_schema.replication_group_member_stats, Transactions_committed_all_members, 1)

rapid/plugin/group_replication/tests/mtr/t/gr_set_gtid_next.test

Lines changed: 4 additions & 4 deletions
@@ -168,8 +168,8 @@ INSERT INTO t2 VALUES (3);
 --connection server1

 --let $count_transactions_validating= query_get_value(SELECT Count_transactions_rows_validating from performance_schema.replication_group_member_stats, Count_transactions_rows_validating, 1)
---let $assert_text= 'Count_transactions_rows_validating must be 0'
---let $assert_cond= $count_transactions_validating = 0
+--let $assert_text= 'Count_transactions_rows_validating must be 2'
+--let $assert_cond= $count_transactions_validating = 2
 --source include/assert.inc

 --let $transactions_committed_all_members= query_get_value(SELECT Transactions_committed_all_members from performance_schema.replication_group_member_stats, Transactions_committed_all_members, 1)
@@ -186,8 +186,8 @@ INSERT INTO t2 VALUES (3);
 --connection server2

 --let $count_transactions_validating= query_get_value(SELECT Count_transactions_rows_validating from performance_schema.replication_group_member_stats, Count_transactions_rows_validating, 1)
---let $assert_text= 'Count_transactions_rows_validating must be 0'
---let $assert_cond= $count_transactions_validating = 0
+--let $assert_text= 'Count_transactions_rows_validating must be 2'
+--let $assert_cond= $count_transactions_validating = 2
 --source include/assert.inc

 --let $transactions_committed_all_members= query_get_value(SELECT Transactions_committed_all_members from performance_schema.replication_group_member_stats, Transactions_committed_all_members, 1)
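
The updated expectations for Count_transactions_rows_validating (0 becoming 2, 4 becoming 6) are consistent with the new purge condition keeping the entries whose snapshot version equals the stable set. The following is a rough sketch of that effect, not the plugin's code: GTID sets are modelled as std::set of sequence numbers for one UUID, and range_up_to, rows_left_after_gc and example_rows are hypothetical names with made-up sample data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy GTID set: sequence numbers of one UUID (illustrative assumption).
using GtidSet = std::set<long>;

// Hypothetical helper: builds the toy set {1, ..., last}.
GtidSet range_up_to(long last) {
  GtidSet s;
  for (long n = 1; n <= last; ++n) s.insert(n);
  return s;
}

// Hypothetical helper: counts the certification info rows that survive
// one garbage collection round. `strict` selects the new condition
// (subset and not equal) instead of the old one (subset).
std::size_t rows_left_after_gc(const std::vector<GtidSet> &snapshot_versions,
                               const GtidSet &stable, bool strict) {
  const auto kept = std::count_if(
      snapshot_versions.begin(), snapshot_versions.end(),
      [&](const GtidSet &v) {
        const bool subset =
            std::includes(stable.begin(), stable.end(), v.begin(), v.end());
        const bool purged = strict ? (subset && v != stable) : subset;
        return !purged;
      });
  return static_cast<std::size_t>(kept);
}

// Made-up sample: two older rows plus two rows written by the most
// recent transaction, whose snapshot version equals the stable set 1-5.
std::vector<GtidSet> example_rows() {
  return {range_up_to(3), range_up_to(4), range_up_to(5), range_up_to(5)};
}
```

With the stable set at 1-5, the old condition leaves no rows in this sample while the new one leaves the two rows equal to the stable set. Once the stable set advances past them (for example to 1-6), they become strict subsets and are purged as before.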
