-
Notifications
You must be signed in to change notification settings - Fork 266
PHPLIB-467: Connections survive primary step-down #667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,7 +13,6 @@ | |
"bind_ip_all": true | ||
}, | ||
"rsParams": { | ||
"priority": 99, | ||
"tags": { | ||
"ordinal": "one", | ||
"dc": "pa" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,281 @@ | ||
<?php | ||
|
||
namespace MongoDB\Tests\SpecTests; | ||
|
||
use IteratorIterator; | ||
use MongoDB\Client; | ||
use MongoDB\Collection; | ||
use MongoDB\Driver\Command; | ||
use MongoDB\Driver\Exception\BulkWriteException; | ||
use MongoDB\Driver\Exception\Exception as DriverException; | ||
use MongoDB\Driver\ReadPreference; | ||
use MongoDB\Driver\Server; | ||
use MongoDB\Driver\WriteConcern; | ||
use MongoDB\Operation\BulkWrite; | ||
use MongoDB\Tests\CommandObserver; | ||
use Symfony\Bridge\PhpUnit\SetUpTearDownTrait; | ||
use UnexpectedValueException; | ||
use function current; | ||
|
||
/** | ||
* @see https://github.com/mongodb/specifications/tree/master/source/connections-survive-step-down/tests | ||
*/ | ||
class PrimaryStepDownSpecTest extends FunctionalTestCase | ||
alcaeus marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{ | ||
use SetUpTearDownTrait; | ||
|
||
const INTERRUPTED_AT_SHUTDOWN = 11600; | ||
const NOT_MASTER = 10107; | ||
const SHUTDOWN_IN_PROGRESS = 91; | ||
|
||
/** @var Client */ | ||
private $client; | ||
|
||
/** @var Collection */ | ||
private $collection; | ||
|
||
private function doSetUp() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was going to suggest dropping the collection under test on tearDown, since the tests write to it; however, I realized that we don't consistently do that across the test suite. For example: I don't think this is worth doing here, but I did want to share the thought. Long term, I'd be more interested in auditing/profiling the test suite to see where we can improve execution time by reducing things like drops. I know database drops can be fairly expensive, so those would be worth looking at to see if they serve a useful purpose. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll leave it for now. As you mentioned, leaving data behind to check out after test failures can make sense, and the spec tests drop collections during setup anyways. At a previous job I've played around with only dropping collections that were written to during tests (using APM to figure out which collections were written to), but the performance gain from that was negligible IIRC. |
||
{ | ||
parent::setUp(); | ||
|
||
$this->client = new Client(static::getUri(), ['retryWrites' => false, 'heartbeatFrequencyMS' => 500, 'serverSelectionTimeoutMS' => 20000, 'serverSelectionTryOnce' => false]); | ||
|
||
$this->dropAndRecreateCollection(); | ||
$this->collection = $this->client->selectCollection($this->getDatabaseName(), $this->getCollectionName()); | ||
} | ||
|
||
/** | ||
* @see https://github.com/mongodb/specifications/tree/master/source/connections-survive-step-down/tests#id10 | ||
*/ | ||
public function testNotMasterKeepsConnectionPool() | ||
{ | ||
$runOn = [(object) ['minServerVersion' => '4.1.11', 'topology' => [self::TOPOLOGY_REPLICASET]]]; | ||
$this->checkServerRequirements($runOn); | ||
|
||
// Set a fail point | ||
$this->configureFailPoint([ | ||
'configureFailPoint' => 'failCommand', | ||
'mode' => ['times' => 1], | ||
'data' => [ | ||
'failCommands' => ['insert'], | ||
'errorCode' => self::NOT_MASTER, | ||
], | ||
]); | ||
|
||
$totalConnectionsCreated = $this->getTotalConnectionsCreated(); | ||
|
||
// Execute an insert into the test collection of a {test: 1} document. | ||
try { | ||
$this->insertDocuments(1); | ||
} catch (BulkWriteException $e) { | ||
// Verify that the insert failed with an operation failure with 10107 code. | ||
$this->assertSame(self::NOT_MASTER, $e->getCode()); | ||
} | ||
|
||
// Execute an insert into the test collection of a {test: 1} document and verify that it succeeds. | ||
$result = $this->insertDocuments(1); | ||
$this->assertSame(1, $result->getInsertedCount()); | ||
|
||
// Verify that the connection pool has not been cleared | ||
$this->assertSame($totalConnectionsCreated, $this->getTotalConnectionsCreated()); | ||
} | ||
|
||
/** | ||
* @see https://github.com/mongodb/specifications/tree/master/source/connections-survive-step-down/tests#id11 | ||
*/ | ||
public function testNotMasterResetConnectionPool() | ||
{ | ||
$runOn = [(object) ['minServerVersion' => '4.0.0', 'maxServerVersion' => '4.0.999', 'topology' => [self::TOPOLOGY_REPLICASET]]]; | ||
$this->checkServerRequirements($runOn); | ||
|
||
// Set a fail point | ||
$this->configureFailPoint([ | ||
'configureFailPoint' => 'failCommand', | ||
'mode' => ['times' => 1], | ||
'data' => [ | ||
'failCommands' => ['insert'], | ||
'errorCode' => self::NOT_MASTER, | ||
], | ||
]); | ||
|
||
$totalConnectionsCreated = $this->getTotalConnectionsCreated(); | ||
|
||
// Execute an insert into the test collection of a {test: 1} document. | ||
try { | ||
$this->insertDocuments(1); | ||
} catch (BulkWriteException $e) { | ||
// Verify that the insert failed with an operation failure with 10107 code. | ||
$this->assertSame(self::NOT_MASTER, $e->getCode()); | ||
} | ||
|
||
// Verify that the connection pool has been cleared | ||
$this->assertSame($totalConnectionsCreated + 1, $this->getTotalConnectionsCreated()); | ||
} | ||
|
||
/** | ||
* @see https://github.com/mongodb/specifications/tree/master/source/connections-survive-step-down/tests#id12 | ||
*/ | ||
public function testShutdownResetConnectionPool() | ||
{ | ||
$runOn = [(object) ['minServerVersion' => '4.0.0']]; | ||
$this->checkServerRequirements($runOn); | ||
|
||
// Set a fail point | ||
$this->configureFailPoint([ | ||
'configureFailPoint' => 'failCommand', | ||
'mode' => ['times' => 1], | ||
'data' => [ | ||
'failCommands' => ['insert'], | ||
'errorCode' => self::SHUTDOWN_IN_PROGRESS, | ||
], | ||
]); | ||
|
||
$totalConnectionsCreated = $this->getTotalConnectionsCreated(); | ||
|
||
// Execute an insert into the test collection of a {test: 1} document. | ||
try { | ||
$this->insertDocuments(1); | ||
} catch (BulkWriteException $e) { | ||
// Verify that the insert failed with an operation failure with 91 code. | ||
$this->assertSame(self::SHUTDOWN_IN_PROGRESS, $e->getCode()); | ||
} | ||
|
||
// Verify that the connection pool has been cleared | ||
$this->assertSame($totalConnectionsCreated + 1, $this->getTotalConnectionsCreated()); | ||
} | ||
|
||
/** | ||
* @see https://github.com/mongodb/specifications/tree/master/source/connections-survive-step-down/tests#id13 | ||
*/ | ||
public function testInterruptedAtShutdownResetConnectionPool() | ||
{ | ||
$runOn = [(object) ['minServerVersion' => '4.0.0']]; | ||
$this->checkServerRequirements($runOn); | ||
|
||
// Set a fail point | ||
$this->configureFailPoint([ | ||
'configureFailPoint' => 'failCommand', | ||
'mode' => ['times' => 1], | ||
'data' => [ | ||
'failCommands' => ['insert'], | ||
'errorCode' => self::INTERRUPTED_AT_SHUTDOWN, | ||
], | ||
]); | ||
|
||
$totalConnectionsCreated = $this->getTotalConnectionsCreated(); | ||
|
||
// Execute an insert into the test collection of a {test: 1} document. | ||
try { | ||
$this->insertDocuments(1); | ||
} catch (BulkWriteException $e) { | ||
// Verify that the insert failed with an operation failure with 11600 code. | ||
$this->assertSame(self::INTERRUPTED_AT_SHUTDOWN, $e->getCode()); | ||
} | ||
|
||
// Verify that the connection pool has been cleared | ||
$this->assertSame($totalConnectionsCreated + 1, $this->getTotalConnectionsCreated()); | ||
} | ||
|
||
/** | ||
* @see https://github.com/mongodb/specifications/tree/master/source/connections-survive-step-down/tests#id9 | ||
*/ | ||
public function testGetMoreIteration() | ||
{ | ||
$runOn = [(object) ['minServerVersion' => '4.1.11', 'topology' => [self::TOPOLOGY_REPLICASET]]]; | ||
$this->checkServerRequirements($runOn); | ||
|
||
// Insert 5 documents into a collection with a majority write concern. | ||
$this->insertDocuments(5); | ||
|
||
// Start a find operation on the collection with a batch size of 2, and retrieve the first batch of results. | ||
$cursor = $this->collection->find([], ['batchSize' => 2]); | ||
|
||
$iterator = new IteratorIterator($cursor); | ||
$iterator->rewind(); | ||
$this->assertTrue($iterator->valid()); | ||
|
||
$iterator->next(); | ||
$this->assertTrue($iterator->valid()); | ||
|
||
$totalConnectionsCreated = $this->getTotalConnectionsCreated(); | ||
|
||
// Send a {replSetStepDown: 5, force: true} command to the current primary and verify that the command succeeded | ||
$primary = $this->client->getManager()->selectServer(new ReadPreference(ReadPreference::RP_PRIMARY)); | ||
$primary->executeCommand('admin', new Command(['replSetStepDown' => 5, 'force' => true])); | ||
|
||
// Retrieve the next batch of results from the cursor obtained in the find operation, and verify that this operation succeeded. | ||
$events = []; | ||
$observer = new CommandObserver(); | ||
$observer->observe( | ||
function () use ($iterator) { | ||
$iterator->next(); | ||
}, | ||
function ($event) use (&$events) { | ||
$events[] = $event; | ||
} | ||
); | ||
$this->assertTrue($iterator->valid()); | ||
$this->assertCount(1, $events); | ||
$this->assertSame('getMore', $events[0]['started']->getCommandName()); | ||
|
||
// Verify that no new connections have been created | ||
$this->assertSame($totalConnectionsCreated, $this->getTotalConnectionsCreated($cursor->getServer())); | ||
|
||
// Wait to allow primary election to complete and prevent subsequent test failures | ||
$this->waitForMasterReelection(); | ||
} | ||
|
||
private function insertDocuments($count) | ||
{ | ||
$operations = []; | ||
|
||
for ($i = 1; $i <= $count; $i++) { | ||
$operations[] = [ | ||
BulkWrite::INSERT_ONE => [['test' => $i]], | ||
]; | ||
} | ||
|
||
return $this->collection->bulkWrite($operations, ['writeConcern' => new WriteConcern('majority')]); | ||
} | ||
|
||
private function dropAndRecreateCollection() | ||
{ | ||
$this->client->selectCollection($this->getDatabaseName(), $this->getCollectionName())->drop(); | ||
$this->client->selectDatabase($this->getDatabaseName())->command(['create' => $this->getCollectionName()]); | ||
} | ||
|
||
private function getTotalConnectionsCreated(Server $server = null) | ||
{ | ||
$server = $server ?: $this->client->getManager()->selectServer(new ReadPreference('primary')); | ||
|
||
$cursor = $server->executeCommand( | ||
$this->getDatabaseName(), | ||
new Command(['serverStatus' => 1]), | ||
new ReadPreference(ReadPreference::RP_PRIMARY) | ||
); | ||
|
||
$cursor->setTypeMap(['root' => 'array', 'document' => 'array']); | ||
$document = current($cursor->toArray()); | ||
|
||
if (isset($document['connections'], $document['connections']['totalCreated'])) { | ||
return (int) $document['connections']['totalCreated']; | ||
} | ||
|
||
throw new UnexpectedValueException('Could not determine number of total connections'); | ||
} | ||
|
||
private function waitForMasterReelection() | ||
{ | ||
try { | ||
$this->insertDocuments(1); | ||
|
||
return; | ||
} catch (DriverException $e) { | ||
$this->client->getManager()->selectServer(new ReadPreference('primary')); | ||
|
||
return; | ||
} | ||
|
||
$this->fail('Expected primary to be re-elected within 20 seconds.'); | ||
alcaeus marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did these removals actually break some PHPLIB tests, or was that resolved? IIRC, you mentioned something about it on Monday.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They pass on travis-ci, but sometimes fail on my machine. I'm currently investigating the cause of these sporadic failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't able to figure out build failures, but at least they're not confined to my machine anymore. Re-adding the priority fixes the current failures but reintroduces others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken a look at the libmongoc implementation of these tests, and they work exactly the same: grab the connection count from the current primary, perform an operation that fails, check connection count again. This works fine, unless the first test that triggers a stepdown ran. If it ran, connection counts are sometimes off, without knowing exactly why. Moving the stepdown test to the end solves this.
I feel bad moving the test, but I'm not sure what could be the cause of this. We don't manage connections at all, and the corresponding tests pass for libmongoc: https://evergreen.mongodb.com/task_log_raw/mongo_c_driver_gcc54_test_latest_replica_set_noauth_sasl_openssl_0acb4a539a62248c2a7f1e0b4a866340c463359c_19_08_27_00_33_53/0?type=T#L3694.