-
Notifications
You must be signed in to change notification settings - Fork 913
GODRIVER-1825 Select the server with fewer in-progress operations. #867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GODRIVER-1825 Select the server with fewer in-progress operations. #867
Conversation
fff2d40
to
4dcc245
Compare
4dd7cd3
to
d212fa8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you for the inline comments describing the rationale for spec tests in topology_test.go and the small deviations from the spec.
Suggested one test change and a comment fix up.
x/mongo/driver/topology/pool.go
Outdated
// came from this pool. Do this in checkIn() instead of checkInNoEvent() because the latter is | ||
// called by createConnections() and is not necessarily a check-in of an in-use connection. Use | ||
// an int64 instead of a uint64 to mitigate the impact of any possible bugs that could cause the | ||
// uint64 to underflow, which would effectively make the server unselectable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the rationale for using an int64 rather than an uint64. I think that is a good precaution.
One suggested edit:
// uint64 to underflow, which would effectively make the server unselectable. | |
// uint64 to underflow, which would make the server much less selectable. |
A server could still be selected if it has the lowest average latency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion, will update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
8be0f17
to
3d75312
Compare
3d75312
to
2abcf79
Compare
bf81967
to
9904e77
Compare
9904e77
to
34cbc49
Compare
// Sleep for 100ms to allow all server state discovery to complete. We need both servers to | ||
// be selectable when we start running the test or the distribution of selected servers will | ||
// be skewed. Unfortunately there's not currently another signal we can block on. | ||
time.Sleep(100 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested offline, consider waiting for two ServerHeartbeatSucceededEvent
events rather than sleep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually used TopologyDescriptionChangedEvent
s to confirm that the topology has exactly 2 Mongos
type instances before continuing to run the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
x/mongo/driver/topology/server.go
Outdated
connImpl, err := s.pool.checkOut(ctx) | ||
// Increment the operation count before calling checkOut to make sure that all connection | ||
// requests are included in the operation count, including those in the wait queue. If we got an | ||
// error insted of a connection, immediately decrement the operation count. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// error insted of a connection, immediately decrement the operation count. | |
// error instead of a connection, immediately decrement the operation count. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, will update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
ce7b1c2
to
0a56eba
Compare
GODRIVER-1825
Update
Topology.SelectServer()
to pick 2 random servers from the list of selectable servers, and return the one with fewer in-progress operations to more evenly balance load across the selectable servers.Changes:
operationCount
counter inServer
that keeps track of the number of requests for connections. Note thatoperationCount
is not decremented while a connection is pinned to a transaction until the transaction is committed or aborted.Topology.SelectServer()
and return the one with fewer in-progress operations.