Skip to content

Commit 3d6f534

Browse files
author
Venkatesh Prasad
committed
Bug #113727: GTID Replica stalls completely when log_slave_updates=0 and a local
DDL executed https://bugs.mysql.com/bug.php?id=113727 Problem ------- In high concurrency scenarios, MySQL replica can enter into a deadlock due to a race condition between the replica applier thread and the client thread performing a binlog group commit. Analysis -------- It needs at least 3 threads for this deadlock to happen 1. One client thread 2. Two replica applier threads How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group(). 11. At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- mysql#1 __GI___lll_lock_wait mysql#2 ___pthread_mutex_lock mysql#3 native_mutex_lock <= waits for commit lock while holding sidno lock mysql#4 Commit_stage_manager::enroll_for mysql#5 MYSQL_BIN_LOG::change_stage mysql#6 MYSQL_BIN_LOG::ordered_commit mysql#7 MYSQL_BIN_LOG::commit mysql#8 ha_commit_trans mysql#9 trans_commit_implicit mysql#10 mysql_create_like_table mysql#11 Sql_cmd_create_table::execute mysql#12 mysql_execute_command mysql#13 dispatch_sql_command Applier thread -------------- mysql#1 ___pthread_mutex_lock mysql#2 native_mutex_lock mysql#3 safe_mutex_lock mysql#4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock mysql#5 Gtid_state::update_commit_group mysql#6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here mysql#7 Commit_order_manager::finish mysql#8 Commit_order_manager::wait_and_finish mysql#9 ha_commit_low mysql#10 trx_coordinator::commit_in_engines mysql#11 MYSQL_BIN_LOG::commit mysql#12 ha_commit_trans mysql#13 trans_commit mysql#14 Xid_log_event::do_commit mysql#15 Xid_apply_log_event::do_apply_event_worker mysql#16 Slave_worker::slave_worker_exec_event mysql#17 slave_worker_exec_job_group mysql#18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). In debug builds the above case hits the assertion `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the replica applier thread when it tries to re-acquire the lock. Solution -------- In the above problematic example, when seen from each thread individually, we can conclude that there is no problem in the order of lock acquisition, thus there is no need to change the lock order. However, the root cause for this problem is that multiple threads can concurrently access to the array `Gtid_state::commit_group_sidnos`. In its initial implementation, it was expected that threads should hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired when the client thread (binlog flush leader) when it tries to perform GTID update on behalf of threads waiting in "Commit Order" queue, thus providing a guarantee that `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.
1 parent 49ef33f commit 3d6f534

File tree

2 files changed

+10
-0
lines changed

2 files changed

+10
-0
lines changed

sql/rpl_commit_stage_manager.cc

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -476,7 +476,14 @@ THD *Commit_stage_manager::fetch_queue_skip_acquire_lock(StageID stage) {
476476
void Commit_stage_manager::process_final_stage_for_ordered_commit_group(
477477
THD *first) {
478478
if (first != nullptr) {
479+
/*
480+
The below call to update_commit_group() function accesses the array
481+
`commit_group_sidnos` and needs to be protected with
482+
MYSQL_BIN_LOG::LOCK_commit.
483+
*/
484+
mysql_mutex_lock(mysql_bin_log.get_commit_lock());
479485
gtid_state->update_commit_group(first);
486+
mysql_mutex_unlock(mysql_bin_log.get_commit_lock());
480487
signal_done(first, Commit_stage_manager::COMMIT_ORDER_FLUSH_STAGE);
481488
}
482489
}

sql/rpl_gtid_state.cc

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -154,6 +154,9 @@ void Gtid_state::broadcast_owned_sidnos(const THD *thd) {
154154
void Gtid_state::update_commit_group(THD *first_thd) {
155155
DBUG_TRACE;
156156

157+
// Assert that we already hold MYSQL_BIN_LOG::LOCK_commit here
158+
mysql_mutex_assert_owner(mysql_bin_log.get_commit_lock());
159+
157160
bool gtid_threshold_breach = false;
158161
/*
159162
We are going to loop in all sessions of the group commit in order to avoid

0 commit comments

Comments
 (0)