Commit b9b10c1

Author: Luis Soares
BUG#20236305: MSR: CRASH ON 'START/STOP SLAVE' CMD I.E. ER1794 -> ER1201 -> CRASH
BUG#20191813: MSR + MTS: IF WE HAVE ANY INACTIVE CHANNEL, POST RESTART START SLAVE HITS ER1794

(Patch and commit message adapted from Rith's cset draft.)

Observations:

1. Till 5.6, an active_mi always existed unless the server was started in bootstrap mode.
2. Multisource introduces a new mi<->rli pair for each channel.
3. Combining the above two points, the MSR code was written so that the following holds for backward compatibility:
   a) Always create a default channel (except in bootstrap mode).
   b) Create all other channels.
4. Bullet #3 above was accomplished like this:
   a) Read the repositories and create channels.
   b) If the default channel was not created, create one.
5. Initialization of the default channel failed when 4.a) failed.
6. When 4.a) failed, we deleted the master_info for that channel.

Notes about bug#19021091:
- If proper positions are not given, --relay-log-recovery fails and hence initialization fails, after the behavior introduced in bug#19021091.
- Hence 4.a) above failed.

The bug exists because the default channel was never created; START SLAVE then crashes because it expects a default channel and none is found.

The solution for this bug is two-fold:
(i) make sure that the default channel is always created in rpl_info_factory.cc;
(ii) when the init of repositories into master_info fails, don't delete the master info for that channel.

Note:
- There seems to be room for refactoring this part of the code, but that should be done outside the scope of this bug.
- Fixed minor things in a couple of tests.
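
The two-part fix described in the commit message can be pictured with a small, self-contained C++ sketch. It is not the code from rpl_info_factory.cc: `MasterInfo`, `ChannelMap`, and `create_channels` are hypothetical names invented for illustration, and repository initialization is reduced to a callback that returns false on failure. The sketch only shows the intended invariant: a channel whose initialization fails stays in the map, and the default channel (empty name) always exists afterwards.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the server's master info object.
struct MasterInfo {
    std::string channel;
    bool initialized = false;  // false when repository init failed
};

using ChannelMap = std::map<std::string, std::unique_ptr<MasterInfo>>;

// Builds the channel map from the channel names found in the repositories.
// `init` models repository initialization (assumption: returns false on
// failure, e.g. when --relay-log-recovery cannot find valid positions).
// Returns true if at least one channel failed to initialize.
bool create_channels(const std::vector<std::string>& stored,
                     const std::function<bool(const std::string&)>& init,
                     ChannelMap& channels) {
    bool any_failed = false;
    auto add = [&](const std::string& name) {
        auto mi = std::make_unique<MasterInfo>();
        mi->channel = name;
        mi->initialized = init(name);
        any_failed = any_failed || !mi->initialized;
        // (ii) keep the object even when initialization failed
        channels[name] = std::move(mi);
    };
    for (const auto& name : stored) add(name);
    // (i) the default channel (empty name) must always exist
    if (channels.find("") == channels.end()) add("");
    return any_failed;
}
```

Under these assumptions, calling `create_channels({"ch1"}, init, channels)` with an `init` that fails for `ch1` reports the failure through the return value but still leaves both `ch1` and the default channel in the map, so a later lookup of the default channel does not come back empty.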
1 parent 24cfe04 commit b9b10c1

6 files changed: +358 -93 lines changed

mysql-test/suite/rpl/r/rpl_multi_source_init_failure.result

Lines changed: 1 addition & 1 deletion
@@ -22,6 +22,6 @@ Note 1759 Sending passwords in plain text without SSL/TLS is extremely insecure.
 Note 1760 Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
 START SLAVE;
 STOP SLAVE FOR CHANNEL 'ch_a';
-include/rpl_restart_server.inc [server_number=2 parameters: --relay-log-recovery --skip-slave-start=off --master-info-repository=TABLE --relay-log-info-repository=TABLE]
+include/rpl_restart_server.inc [server_number=2 parameters: --relay-log-recovery --skip-slave-start --master-info-repository=TABLE --relay-log-info-repository=TABLE]
 [connection slave]
 RESET SLAVE ALL;

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+include/master-slave.inc
+Warnings:
+Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
+Note #### Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
+[connection master]
+[connection slave]
+call mtr.add_suppression("Slave failed to initialize relay log info structure from the repository");
+call mtr.add_suppression("Slave: Could not start slave for channel");
+call mtr.add_suppression("Error during --relay-log-recovery: Could not locate rotate event from the master");
+call mtr.add_suppression("slave with the same server_uuid as this slave has connected to the master");
+call mtr.add_suppression("Failed to create or recover replication info repositories");
+include/stop_slave.inc
+RESET SLAVE ALL;
+SET GLOBAL master_info_repository='TABLE';
+SET GLOBAL relay_log_info_repository='TABLE';
+CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=MASTER_MYPORT FOR CHANNEL 'ch1';
+#
+# RESTART SLAVE SERVER
+#
+include/rpl_restart_server.inc [server_number=2 parameters: --relay-log-recovery --skip-slave-start --master-info-repository=TABLE --relay-log-info-repository=TABLE --slave-parallel-workers=4 --relay-log-purge=0]
+START SLAVE;
+ERROR HY000: Slave failed to initialize relay log info structure from the repository
+CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=MASTER_MYPORT FOR CHANNEL 'ch1';
+START SLAVE;
+ERROR HY000: Slave failed to initialize relay log info structure from the repository
+RESET SLAVE ALL FOR CHANNEL 'ch1';
+CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=MASTER_MYPORT FOR CHANNEL 'ch1';
+CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=MASTER_MYPORT FOR CHANNEL '';
+START SLAVE;
+include/stop_slave.inc
+RESET SLAVE ALL;
+SET @@global.master_info_repository='SAVE_MI_REPO_TYPE';
+SET @@global.relay_log_info_repository='SAVE_RLI_REPO_TYPE';
+SET @@global.slave_parallel_workers=SAVE_PARALLEL_WORKERS;
+CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_USER='root', MASTER_PORT=MASTER_MYPORT;
+include/start_slave.inc
+include/rpl_restart_server.inc [server_number=2]
+[connection master]
+[connection slave]
+call mtr.add_suppression("Slave: Failed to initialize the master info structure for channel");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped");
+call mtr.add_suppression("Recovery from master pos");
+call mtr.add_suppression("It is not possible to change the type of the relay log repository because there are workers repositories with possible");
+include/stop_slave.inc
+Warnings:
+Note 3084 Replication thread(s) for channel '' are already stopped.
+RESET SLAVE ALL;
+SET @@global.master_info_repository="TABLE";
+SET @@global.relay_log_info_repository="TABLE";
+SET @@global.slave_parallel_workers=5;
+CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=MASTER_MYPORT FOR CHANNEL 'ch_trunk';
+START SLAVE;
+=== RESTART SLAVE SERVER ===
+include/rpl_restart_server.inc [server_number=2 parameters: --relay-log-recovery --skip-slave-start --master-info-repository=TABLE --relay-log-info-repository=TABLE --slave-parallel-workers=5]
+[connection slave]
+START SLAVE;
+include/stop_slave.inc
+RESET SLAVE ALL;
+SET @@global.master_info_repository='SAVE_MI_REPO_TYPE';
+SET @@global.relay_log_info_repository='SAVE_RLI_REPO_TYPE';
+SET @@global.slave_parallel_workers=SAVE_PARALLEL_WORKERS;
+CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_USER='root', MASTER_PORT=MASTER_MYPORT;
+include/start_slave.inc
+include/rpl_restart_server.inc [server_number=2]
+include/start_slave.inc
+[connection master]
+include/rpl_end.inc

mysql-test/suite/rpl/t/rpl_multi_source_init_failure.test

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ STOP SLAVE FOR CHANNEL 'ch_a';
 
 # --relay-log-recovery is a source of the following mi initialization failure
 --let $rpl_server_number= 2
---let $rpl_server_parameters= --relay-log-recovery --skip-slave-start=off --master-info-repository=TABLE --relay-log-info-repository=TABLE
+--let $rpl_server_parameters= --relay-log-recovery --skip-slave-start --master-info-repository=TABLE --relay-log-info-repository=TABLE
 --source include/rpl_restart_server.inc
 
 --source include/rpl_connection_slave.inc

Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
+--source include/have_binlog_format_mixed.inc
+--source include/master-slave.inc
+
+--let $save_mi_repo_type=`SELECT @@GLOBAL.master_info_repository`
+--let $save_rli_repo_type=`SELECT @@GLOBAL.relay_log_info_repository`
+--let $save_slave_parallel_workers=`SELECT @@global.slave_parallel_workers`
+
+#
+# BUG#20236305: MSR: CRASH ON 'START/STOP SLAVE' CMD I.E. ER1794 -> ER1201 -> CRASH
+#
+
+# Tests that default channel is created even if creation of
+# other channels fails in multisource replication.
+# Test that default channel is always created to preserve
+# backward compatibility.
+
+--source include/rpl_connection_slave.inc
+call mtr.add_suppression("Slave failed to initialize relay log info structure from the repository");
+call mtr.add_suppression("Slave: Could not start slave for channel");
+call mtr.add_suppression("Error during --relay-log-recovery: Could not locate rotate event from the master");
+call mtr.add_suppression("slave with the same server_uuid as this slave has connected to the master");
+call mtr.add_suppression("Failed to create or recover replication info repositories");
+--source include/stop_slave.inc
+
+# On the slave
+RESET SLAVE ALL;
+SET GLOBAL master_info_repository='TABLE';
+SET GLOBAL relay_log_info_repository='TABLE';
+
+# create a new channel
+--disable_warnings
+--replace_result $MASTER_MYPORT MASTER_MYPORT
+--eval CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=$MASTER_MYPORT FOR CHANNEL 'ch1'
+--enable_warnings
+
+--echo #
+--echo # RESTART SLAVE SERVER
+--echo #
+--let $rpl_server_number= 2
+--let $rpl_server_parameters= --relay-log-recovery --skip-slave-start --master-info-repository=TABLE --relay-log-info-repository=TABLE --slave-parallel-workers=4 --relay-log-purge=0
+--source include/rpl_restart_server.inc
+
+# slave fails to initialize due to BUG#19021091 (default
+# channel cannot recover valid positions from the SQL
+# applier thread).
+--error ER_SLAVE_RLI_INIT_REPOSITORY
+START SLAVE;
+
+# This command would fail with an error, which would
+# fail with error ER_MASTER_INFO . Later when start
+# slave was issued, the server would crash.
+#
+# Now, CHANGE MASTER succeeds (and later START SLAVE
+# fails).
+--disable_warnings
+--replace_result $MASTER_MYPORT MASTER_MYPORT
+--eval CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=$MASTER_MYPORT FOR CHANNEL 'ch1'
+--enable_warnings
+
+# This would have crashed, but it does not anymore.
+--error ER_SLAVE_RLI_INIT_REPOSITORY
+START SLAVE;
+
+# Lets clear the offending channel and recreate it.
+RESET SLAVE ALL FOR CHANNEL 'ch1';
+--disable_warnings
+--replace_result $MASTER_MYPORT MASTER_MYPORT
+--eval CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=$MASTER_MYPORT FOR CHANNEL 'ch1'
+--enable_warnings
+
+# Lets configure the default channel as well.
+--disable_warnings
+--replace_result $MASTER_MYPORT MASTER_MYPORT
+--eval CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=$MASTER_MYPORT FOR CHANNEL ''
+--enable_warnings
+
+# Lets start the slave (and as such, assert that the
+# START SLAVE command is not failing any more).
+#
+# (There are two channels connected to the same server
+# though, which may render the slave unable to connect,
+# thence not using --source include/start_slave.inc )
+START SLAVE;
+
+# clean up
+--source include/stop_slave.inc
+RESET SLAVE ALL;
+--replace_result $save_mi_repo_type SAVE_MI_REPO_TYPE
+--eval SET @@global.master_info_repository='$save_mi_repo_type'
+--replace_result $save_rli_repo_type SAVE_RLI_REPO_TYPE
+--eval SET @@global.relay_log_info_repository='$save_rli_repo_type'
+--replace_result $save_slave_parallel_workers SAVE_PARALLEL_WORKERS
+--eval SET @@global.slave_parallel_workers=$save_slave_parallel_workers
+
+--disable_warnings
+--replace_result $MASTER_MYPORT MASTER_MYPORT
+--eval CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_USER='root', MASTER_PORT=$MASTER_MYPORT
+--enable_warnings
+--source include/start_slave.inc
+
+--let $rpl_server_number= 2
+--let $rpl_server_parameters=
+--source include/rpl_restart_server.inc
+--source include/rpl_connection_master.inc
+
+#
+# BUG#20191813: MSR + MTS: IF WE HAVE ANY INACTIVE CHANNEL, POST RESTART START SLAVE HITS ER1794
+#
+
+#
+# Added test case of BUG#20191813 for sanity check
+#
+# Test validates that even if the default IO channel
+# is not initialized, the existing channel will be
+# able to start and not throw an error.
+#
+
+--source include/rpl_connection_slave.inc
+call mtr.add_suppression("Slave: Failed to initialize the master info structure for channel");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped");
+call mtr.add_suppression("Recovery from master pos");
+call mtr.add_suppression("It is not possible to change the type of the relay log repository because there are workers repositories with possible");
+--source include/stop_slave.inc
+RESET SLAVE ALL;
+SET @@global.master_info_repository="TABLE";
+SET @@global.relay_log_info_repository="TABLE";
+SET @@global.slave_parallel_workers=5;
+--disable_warnings
+--replace_result $MASTER_MYPORT MASTER_MYPORT
+--eval CHANGE MASTER TO MASTER_HOST='localhost', MASTER_USER='root', MASTER_PORT=$MASTER_MYPORT FOR CHANNEL 'ch_trunk'
+--enable_warnings
+START SLAVE;
+
+--echo === RESTART SLAVE SERVER ===
+--let $rpl_server_number= 2
+--let $rpl_server_parameters= --relay-log-recovery --skip-slave-start --master-info-repository=TABLE --relay-log-info-repository=TABLE --slave-parallel-workers=5
+--source include/rpl_restart_server.inc
+--source include/rpl_connection_slave.inc
+START SLAVE;
+
+# clean up
+--source include/stop_slave.inc
+RESET SLAVE ALL;
+--replace_result $save_mi_repo_type SAVE_MI_REPO_TYPE
+--eval SET @@global.master_info_repository='$save_mi_repo_type'
+--replace_result $save_rli_repo_type SAVE_RLI_REPO_TYPE
+--eval SET @@global.relay_log_info_repository='$save_rli_repo_type'
+--replace_result $save_slave_parallel_workers SAVE_PARALLEL_WORKERS
+--eval SET @@global.slave_parallel_workers=$save_slave_parallel_workers
+
+--disable_warnings
+--replace_result $MASTER_MYPORT MASTER_MYPORT
+--eval CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_USER='root', MASTER_PORT=$MASTER_MYPORT
+--enable_warnings
+--source include/start_slave.inc
+
+--let $rpl_server_number= 2
+--let $rpl_server_parameters=
+--source include/rpl_restart_server.inc
+--source include/start_slave.inc
+--source include/rpl_connection_master.inc
+
+--source include/rpl_end.inc
