Commit 94148af

BUG#33934013: MYSQL CRASHES IF YOU STOP REPLICATION WHILE IT IS REPLICATING A CTAS.

DESCRIPTION:
============
If a CREATE TABLE ... AS SELECT (CTAS) is being applied on a Replica and is interrupted by stopping replication, it could sometimes cause a server exit.

ANALYSIS:
=========
CTAS is implemented as an atomic operation for Storage Engines that support Atomic DDLs. Applying the binlog for CTAS on the Replica is also atomic: CTAS (CREATE + INSERT) is now executed as a single binlog transaction on the Replica. Thus, if the server crashes while executing it on the Replica, the transaction is rolled back.

The class Transactional_ddl_context is used to keep the context of transactional DDL statements. Currently, only CREATE TABLE with START TRANSACTION uses this context.

When a CTAS query is executed, it is written to the binary log in the following format:

  BEGIN
  CREATE TABLE t1 (c1 INT) START TRANSACTION
  Write_rows events on t1
  COMMIT

Consider a case of CTAS where the CREATE has succeeded and the Replica is interrupted by manually stopping replication (STOP REPLICA) just after the applier thread has executed at least one WRITE_ROWS_EVENT. As expected, the SQL thread is killed and the transaction is forced to roll back.

The rollback process involves removing all the instances of TABLE and the TABLE_SHARE instance from the Table cache and Table definition cache respectively. To remove the TABLE object from the Table cache, the expectation is that the TABLE object must be unused. In this scenario, before applying the WRITE_ROWS_EVENT, the table was opened, the TABLE object was added to the used list in the Table cache, and the appropriate locks were taken. Since the SQL thread was stopped abruptly, the TABLE object was not marked as free. Hence, as part of the cleanup, when Relay_log_info::cleanup_context() attempts to close the thread tables and thereby free the TABLE object from the Table cache, an assert is triggered in Table_cache_manager::free_table().

Freeing the TABLE object from the Table cache also involves closing the table handler. It expects the handler to have released its lock (i.e., handler::m_lock_type == F_UNLCK). In this scenario, handler::m_lock_type was set to F_WRLCK while opening and locking the table earlier, which triggers an assert while closing the handler.

FIX:
====
The fix is to unlock and move the TABLE object into the free list in the Table cache just before freeing the TABLE and TABLE_SHARE objects from their respective caches.

Change-Id: I40ed95efc28db4fb6231a38724787d3e3b8d202d
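
The used-list invariant described in the analysis can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyTable`, `ToyTableCache` and all of their members are hypothetical names invented for illustration, not the server's `TABLE`, `Table_cache` or `Table_cache_manager` types, and `remove()` returns `false` at the point where the real server would hit an assert.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for a TABLE instance (hypothetical, not the MySQL struct).
struct ToyTable {
  std::string name;
};

// Toy stand-in for one table cache with a used list and a free list.
class ToyTableCache {
 public:
  // Opening a table for a statement puts it on the used list.
  void open_for_use(ToyTable *t) {
    m_free.erase(t);
    m_used.insert(t);
  }

  // Releasing moves the table from the used list to the free list.
  void release(ToyTable *t) {
    if (m_used.erase(t) > 0) m_free.insert(t);
  }

  // Removing a table from the cache is only legal for unused tables.
  // Returns false where the real server would trigger an assert.
  bool remove(ToyTable *t) {
    if (m_used.count(t) > 0) return false;
    m_free.erase(t);
    return true;
  }

  bool is_used(ToyTable *t) const { return m_used.count(t) > 0; }

 private:
  std::set<ToyTable *> m_used;
  std::set<ToyTable *> m_free;
};
```

In this model, calling `remove()` straight after `open_for_use()` fails, which corresponds to the interrupted applier leaving the table on the used list. Calling `release()` first, as the fix does conceptually, lets `remove()` succeed.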
1 parent 2335998 commit 94148af

2 files changed: 23 additions & 0 deletions

sql/rpl_replica.cc

Lines changed: 10 additions & 0 deletions
@@ -5020,6 +5020,16 @@ static int exec_relay_log_event(THD *thd, Relay_log_info *rli,
       return 1;
     }
 
+    DBUG_EXECUTE_IF("wait_on_exec_relay_log_event", {
+      if (ev->get_type_code() == binary_log::WRITE_ROWS_EVENT) {
+        const char act[] =
+            "now SIGNAL signal.waiting_on_event_execution "
+            "WAIT_FOR signal.can_continue_execution";
+        assert(opt_debug_sync_timeout > 0);
+        assert(!debug_sync_set_action(current_thd, STRING_WITH_LEN(act)));
+      }
+    };);
+
     /* ptr_ev can change to NULL indicating MTS coorinator passed to a Worker */
     exec_res = apply_event_and_update_pos(ptr_ev, thd, rli);
     /*
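
The debug hook in this hunk makes the applier emit one signal and then wait for another before it applies a WRITE_ROWS_EVENT, so a test can act while the applier is paused. The handshake can be sketched with standard C++ threading primitives. This is a toy model, not the server's DEBUG_SYNC facility: `ToySignals`, `run_handshake` and the logged step names are hypothetical, and only the two signal names are taken from the diff.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Toy model of named signals (hypothetical, not the real DEBUG_SYNC code).
class ToySignals {
 public:
  // Emit a named signal and wake any waiter.
  void signal(const std::string &name) {
    assert(!name.empty());
    {
      std::lock_guard<std::mutex> guard(m_mutex);
      m_emitted.insert(name);
    }
    m_cond.notify_all();
  }

  // Block until the named signal has been emitted.
  void wait_for(const std::string &name) {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cond.wait(lock, [&] { return m_emitted.count(name) > 0; });
  }

 private:
  std::mutex m_mutex;
  std::condition_variable m_cond;
  std::set<std::string> m_emitted;
};

// Runs the handshake once and returns the observed steps in order.
std::vector<std::string> run_handshake() {
  ToySignals signals;
  std::mutex log_mutex;
  std::vector<std::string> log;
  auto record = [&](const std::string &step) {
    std::lock_guard<std::mutex> guard(log_mutex);
    log.push_back(step);
  };

  // Hypothetical applier: pauses just before applying a row event.
  std::thread applier([&] {
    signals.signal("waiting_on_event_execution");
    signals.wait_for("can_continue_execution");
    record("applier resumed");
  });

  // Hypothetical test driver: acts while the applier is paused.
  signals.wait_for("waiting_on_event_execution");
  record("driver acted while applier paused");
  signals.signal("can_continue_execution");

  applier.join();
  return log;
}
```

Because the driver records its step before it emits `can_continue_execution`, the driver's entry always precedes the applier's entry in the returned log. That ordering guarantee is what a pause point of this kind gives a test.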

sql/sql_class.cc

Lines changed: 13 additions & 0 deletions
@@ -3230,6 +3230,11 @@ void Transactional_ddl_context::init(dd::String_type db,
                                      dd::String_type tablename,
                                      const handlerton *hton) {
   assert(m_hton == nullptr);
+  /*
+    Currently, Transactional_ddl_context is used only for CREATE TABLE ... START
+    TRANSACTION statement.
+  */
+  assert(m_thd->lex->sql_command == SQLCOM_CREATE_TABLE);
   m_db = db;
   m_tablename = tablename;
   m_hton = hton;
@@ -3241,7 +3246,15 @@ void Transactional_ddl_context::init(dd::String_type db,
 */
 void Transactional_ddl_context::rollback() {
   if (!inited()) return;
+  /*
+    Since the transaction is being rolledback, We need to unlock and close the
+    table belonging to this transaction.
+  */
+  if (m_thd->lock) mysql_unlock_tables(m_thd, m_thd->lock);
+  m_thd->lock = nullptr;
+  if (m_thd->open_tables) close_thread_table(m_thd, &m_thd->open_tables);
   table_cache_manager.lock_all_and_tdc();
+
   TABLE_SHARE *share =
       get_cached_table_share(m_db.c_str(), m_tablename.c_str());
   if (share) {
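
The lines added to `rollback()` run in a fixed order (unlock, clear the lock pointer, close) and each step is guarded by a null check, so the cleanup is also safe when nothing was opened or locked. A minimal sketch of that ordering follows. All names are hypothetical stand-ins: `ToyHandler` loosely plays the role of a handler with `m_lock_type`, and `ToySession` the role of the `lock` and `open_tables` members of `THD`. `close()` returns `false` where the real server would assert.

```cpp
#include <cassert>

// Toy lock states, loosely mirroring F_UNLCK and F_WRLCK.
enum class ToyLockType { Unlocked, WriteLocked };

// Toy handler: closing is only legal once the lock has been released.
struct ToyHandler {
  ToyLockType lock_type = ToyLockType::Unlocked;
  bool open = true;

  // Returns false where the real server would trigger an assert.
  bool close() {
    if (lock_type != ToyLockType::Unlocked) return false;
    open = false;
    return true;
  }
};

// Toy session state; a null pointer means there is nothing to clean up.
struct ToySession {
  ToyHandler *locked = nullptr;
  ToyHandler *opened = nullptr;
};

// Mirrors the order of the added lines: unlock, clear the pointer, close.
bool toy_rollback_cleanup(ToySession &s) {
  if (s.locked) s.locked->lock_type = ToyLockType::Unlocked;
  s.locked = nullptr;
  bool ok = true;
  if (s.opened) {
    ok = s.opened->close();
    if (ok) s.opened = nullptr;
  }
  return ok;
}
```

Closing a write-locked `ToyHandler` directly fails, while `toy_rollback_cleanup()` succeeds because it unlocks first. A second call, or a call on an empty session, is a no-op.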
