Skip to content

PYTHON-2457 Test that clients wait 500ms between failed heartbeat checks #524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 8, 2020
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
68 changes: 28 additions & 40 deletions test/test_streaming_protocol.py
Original file line number Diff line number Diff line change
Expand Up @@ -130,54 +130,42 @@ def changed_event(event):
self.assertEqual(1, len(events))
self.assertGreater(events[0].new_description.round_trip_time, 0)

@client_context.require_version_min(4, 9, -1)
@client_context.require_failCommand_appName
def test_monitor_waits_after_server_check_error(self):
hb_listener = HeartbeatEventListener()
client = rs_or_single_client(
event_listeners=[hb_listener], heartbeatFrequencyMS=500,
appName='waitAfterErrorTest')
self.addCleanup(client.close)
# Force a connection.
client.admin.command('ping')
address = client.address

# This test implements:
# https://github.com/mongodb/specifications/blob/6c5b2ac/source/server-discovery-and-monitoring/server-discovery-and-monitoring-tests.rst#monitors-sleep-at-least-minheartbeatfreqencyms-between-checks
fail_ismaster = {
'mode': {'times': 50},
'mode': {'times': 5},
'data': {
'failCommands': ['isMaster'],
'closeConnection': False,
'errorCode': 91,
# This can be uncommented after SERVER-49220 is fixed.
# 'appName': 'waitAfterErrorTest',
'errorCode': 1234,
'appName': 'SDAMMinHeartbeatFrequencyTest',
},
}
with self.fail_point(fail_ismaster):
time.sleep(2)

# Server should be selectable.
client.admin.command('ping')

def hb_started(event):
return (isinstance(event, monitoring.ServerHeartbeatStartedEvent)
and event.connection_id == address)

hb_started_events = hb_listener.matching(hb_started)
# Explanation of the expected heartbeat events:
# Time: event
# 0ms: create MongoClient
# 1ms: run monitor handshake, 1
# 2ms: run awaitable isMaster, 2
# 3ms: run configureFailPoint
# 502ms: isMaster fails for the first time with command error
# 1002ms: run monitor handshake, 3
# 1502ms: run monitor handshake, 4
# 2002ms: run monitor handshake, 5
# 2003ms: disable configureFailPoint
# 2004ms: isMaster succeeds, 6
# 2004ms: awaitable isMaster, 7
self.assertGreater(len(hb_started_events), 7)
# This can be reduced to ~15 after SERVER-49220 is fixed.
self.assertLess(len(hb_started_events), 40)
start = time.time()
client = single_client(
appName='SDAMMinHeartbeatFrequencyTest',
serverSelectionTimeoutMS=5000)
self.addCleanup(client.close)
# Force a connection.
client.admin.command('ping')
duration = time.time() - start
# Explanation of the expected events:
# 0ms: run configureFailPoint
# 1ms: create MongoClient
# 2ms: failed monitor handshake, 1
# 502ms: failed monitor handshake, 2
# 1002ms: failed monitor handshake, 3
# 1502ms: failed monitor handshake, 4
# 2002ms: failed monitor handshake, 5
# 2502ms: monitor handshake succeeds
# 2503ms: run awaitable isMaster
# 2504ms: application handshake succeeds
# 2505ms: ping command succeeds
self.assertGreaterEqual(duration, 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add comments here that explain the math? E.g.

# isMaster attempted 5 times with 500 ms between checks => duration > 4 * 500ms 
self.assertGreaterEqual(duration, 2)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a link to the spec and an explanation like the previous version of the test.

self.assertLessEqual(duration, 3.5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you arrive at 3.5s?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it as a generous upper bound to account for any timing differences in the various test runner implementatinos. Drivers should really take between 2 and 2.5 seconds (depending on when the timer was started and how fast the monitors start up).


@client_context.require_failCommand_appName
def test_heartbeat_awaited_flag(self):
Expand Down