Commit d18ea48

Bug #28966455 APPLIER LOG MISSES A TRANSACTION IN GR
Problem
==============================================================================
The primary contains a transaction in its binary log that is missing from the
secondaries.

Analysis
==============================================================================
Our hypothesis is that the XCom of the primary delivered the transaction,
while the XCom of the secondaries delivered a no_op. We have identified a
message flow in which this happens; it is shown in the diagram below.

In a nutshell, pax_msg.proposal is initialized to (0,_). But ballot (0,_) is
the ballot reserved for the 2-phase Paxos used by the leader of a consensus
round.

In the diagram below, S0 is the leader and proposes transaction T on its
reserved 2-phase ballot (0,0). S1 is a follower that wants to propose no_op
on ballot (1,1). S1 should have inherited the transaction T as its proposal
value, as per Phase 2 (a) in Paxos Made Simple [1]. But S1 proposes no_op
instead, because its pax_msg.proposal is initialized to (0,1), which is
greater than transaction T's ballot (0,0).
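The ballot comparison behind this hypothesis can be sketched in a few lines of C. This is only an illustration: demo_ballot, demo_gt_ballot and demo_adopts_accepted_value are hypothetical stand-ins, and the lexicographic ordering (counter first, then node number) is an assumption about how gt_ballot behaves, consistent with the statement that (0,1) is greater than (0,0).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an XCom ballot: (cnt, node). */
typedef struct {
  int cnt;  /* round counter */
  int node; /* node number of the proposer */
} demo_ballot;

/* Assumed ordering: compare cnt first, then node (lexicographic). */
bool demo_gt_ballot(demo_ballot x, demo_ballot y) {
  return x.cnt > y.cnt || (x.cnt == y.cnt && x.node > y.node);
}

/*
  A proposer adopts a previously accepted value only if the ballot it was
  accepted on is greater than the ballot its own message currently holds.
*/
bool demo_adopts_accepted_value(demo_ballot accepted, demo_ballot held) {
  return demo_gt_ballot(accepted, held);
}

/* Checks the two cases discussed in the analysis. */
void demo_self_check(void) {
  demo_ballot t_ballot = {0, 0};  /* ballot on which T was accepted */
  demo_ballot init_zero = {0, 1}; /* S1's message initialized to (0,1) */
  demo_ballot init_neg = {-1, 1}; /* S1's message initialized to (-1,1) */

  /* (0,0) > (0,1) is false, so T is not inherited. */
  assert(!demo_adopts_accepted_value(t_ballot, init_zero));
  /* (0,0) > (-1,1) is true, so T is inherited. */
  assert(demo_adopts_accepted_value(t_ballot, init_neg));
}
```

Under these assumptions, any initial counter below 0 lets a value accepted on the leader's reserved ballot (0,_) win the comparison.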
Here is the diagram that exposes the problem:

Legend
------
SX:  Server X
PX:  Proposer of SX
ALX: Acceptor/learner of SX
O:   Event on the respective server
X->: Message sent from the server on X to the server on ->
%%%: Comment/observation
E:   The part where P1 is deviating from the Paxos protocol

Diagram
-------
  S0           S1          S2
P0   AL0     P1   AL1      AL2
|    O       |    O        O  AL{0,1,2}.promise = (0,0)
|    |       |    |        |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  P0 starts trying consensus for T
|    |       |    |        |
O    |       |    |        |  P0.ballot = (0,0); P0.value = T
X--->|-------|--->|------->|  accept_op[ballot=(0,0),
|    |       |    |        |            value=T (P0.value)]
|    |       |    |        |
|    O       |    O        O  AL{0,1,2}.value = T
|<---X-------|----X--------X  ack_accept_op[ballot=(0,0)]
|    |       |    |        |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  P0 got majority of accepts for (0,0) T
|    |       |    |        |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  P1 starts trying consensus for no_op
|    |       |    |        |
|    |       O    |        |  P1.ballot = (1,1); P1.value = no_op
|    |<------X--->|------->|  prepare_op[ballot=(1,1)]
|    |       |    |        |
|    O       |    O        O  AL{0,1,2}.promise = (1,1)
|    X------>|<---X--------X  ack_prepare_op[ballot=(1,1),
|    |       |    |        |                 accepted={(0,0) T}]
|    |       |    |        |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  P1 got a majority of prepares for (1,1)
|    |       |    |        |
|    |       E    |        |  P1.value should be set to T here.
|    |       E    |        |  According to the Paxos protocol, if any
|    |       E    |        |  acceptor replies with a previously
|    |       E    |        |  accepted value, one must use it. But
|    |       E    |        |  handle_ack_prepare will not do it because
|    |       E    |        |  handle_ack_prepare has the following code:
|    |       E    |        |
|    |       E    |        |    if (gt_ballot(m->proposal,
|    |       E    |        |                  p->proposer.msg->proposal))
|    |       E    |        |    {
|    |       E    |        |      replace_pax_msg(&p->proposer.msg, m);
|    |       E    |        |      ...
|    |       E    |        |    }
|    |       E    |        |
|    |       E    |        |  However, p->proposer.msg->proposal is
|    |       E    |        |  initialized to (0,1) on P1, meaning that:
|    |       E    |        |
|    |       E    |        |    if (0,0) > (0,1): P1.value = T
|    |       E    |        |
|    |       E    |        |  Therefore, P1.value = no_op.
|    |       E    |        |  (see handle_ack_prepare)
|    |       |    |        |
|    |  ...--X--->|------->|  accept_op[ballot=(1,1),
|    |       |    |        |            value=no_op (P1.value)]
|    |       |    |        |
|    |       |    O        O  AL{1,2}.value = no_op
|    |       |<---X--------X  ack_accept_op[ballot=(1,1)]
|    |       |    |        |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  P1 got majority of accepts for (1,1) no_op
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Values accepted for P{0,1} don't agree
|    |       |    |        |
|    |  ...--X--->|------->|  tiny_learn_op[ballot=(1,1), no_op]
|    |       |    |        |
|    |       |    O        O  AL{1,2} learn no_op
|    |       |    O        O  Executor task of S{1,2} delivers no_op
|    |       |    |        |
X--->|--...  |    |        |  tiny_learn_op[ballot=(0,0)]
|    |       |    |        |
|    O       |    |        |  AL0 learns T
|    O       |    |        |  Executor task of S0 delivers T
|    |       |    |        |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  S0 delivered T, S{1,2} delivered no_op
|    |       |    |        |

Solution
==============================================================================
Initialize pax_msg.proposal to (-1,_) so that it is always less than any
ballot used by any proposer of a consensus round. This way a proposer will
inherit previously accepted values, because (-1,_) is less than any ballot
used by any proposer.

References
==============================================================================
[1] Lamport, L. (2001). Paxos made simple. ACM Sigact News, 32(4), 18-25.

Reviewed-by: Tiago Jorge <[email protected]>
RB: 21168
1 parent a144a60 commit d18ea48

rapid/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/pax_msg.c

Lines changed: 9 additions & 2 deletions
@@ -1,4 +1,4 @@
-/* Copyright (c) 2015, 2016, Oracle and/or its affiliates. All rights reserved.
+/* Copyright (c) 2015, 2018, Oracle and/or its affiliates. All rights reserved.
 
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -52,7 +52,14 @@ static pax_msg *init_pax_msg(pax_msg *p, int refcnt, synode_no synode, site_def
   p->to = VOID_NODE_NO;
   p->op = initial_op;
   init_ballot(&p->reply_to, 0, nodeno);
-  init_ballot(&p->proposal, 0, nodeno);
+  /*
+   -1 ensures ballot (-1,nodeno) is less than any ballot used by any
+   proposer.
+   Leader will use reserved ballot (0,_) for its initial 2-phase Paxos
+   round.
+   Remaining rounds will use ballot (1+,_) and the vanilla 3-phase Paxos.
+  */
+  init_ballot(&p->proposal, -1, nodeno);
   p->synode = synode;
   p->msg_type = normal;
   p->receivers = NULL;
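
The effect of this one-line change can be sketched with a small simulation of P1's prepare phase from the commit message. Everything named sim_* is hypothetical and written for illustration only. The adoption check mirrors the handle_ack_prepare snippet quoted in the commit message, and the lexicographic ballot ordering is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an XCom ballot: (cnt, node). */
typedef struct {
  int cnt;
  int node;
} sim_ballot;

/* Hypothetical stand-in for the message a proposer holds for a round. */
typedef struct {
  sim_ballot proposal; /* ballot associated with the held value */
  const char *value;   /* value that will be sent in accept_op */
} sim_proposer_msg;

/* Assumed ordering: compare cnt first, then node. */
bool sim_gt_ballot(sim_ballot x, sim_ballot y) {
  return x.cnt > y.cnt || (x.cnt == y.cnt && x.node > y.node);
}

/* Adopt the acceptor's value only if its ballot beats the held one. */
void sim_handle_ack_prepare(sim_proposer_msg *held, sim_ballot accepted_on,
                            const char *accepted_value) {
  if (sim_gt_ballot(accepted_on, held->proposal)) {
    held->proposal = accepted_on;
    held->value = accepted_value;
  }
}

/*
  Replays P1 receiving ack_prepare_op with accepted={(0,0) T}.
  initial_cnt is the counter pax_msg.proposal starts with on node 1.
  Returns the value P1 would go on to propose.
*/
const char *sim_p1_value(int initial_cnt) {
  sim_proposer_msg held = {{initial_cnt, 1}, "no_op"};
  sim_ballot t_ballot = {0, 0};
  sim_handle_ack_prepare(&held, t_ballot, "T");
  return held.value;
}

/* Before the change P1 keeps no_op; after it P1 inherits T. */
void sim_check(void) {
  assert(strcmp(sim_p1_value(0), "no_op") == 0);
  assert(strcmp(sim_p1_value(-1), "T") == 0);
}
```

In this sketch, an initial counter of 0 reproduces the divergence described in the analysis, while -1 makes P1 inherit T as Phase 2 (a) requires.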
