-
Notifications
You must be signed in to change notification settings - Fork 266
PHPLIB-1235 Implements Single-Doc Benchmarks #1161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the spec, Measurement is supposed to output MB/s, I'm not sure this is something we can do with PHPBench, but I can explore reporting features.
17b5b74
to
d3248b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A short note on the iterations: multiple iterations can be run to ensure that the benchmarks are consistent. For example, we could let phpbench fail the benchmark suite if any benchmark has a deviation of more than x%. We currently don't do this, but will eventually add such assertions in CI. Therefore, I would recommend not hardcoding any iteration number in the benchmarks, instead relying on what's configured in phpbench.json.dist
and phpbench.json
.
d54f390
to
92e1c1f
Compare
Running this new benchmark takes 35 minutes on my laptop. ❯ time ./vendor/bin/phpbench run --filter SingleDocBench
PHPBench (1.2.14) running benchmarks... #standwithukraine
with configuration file: /Users/jerome/Develop/mongo-php-library/phpbench.json.dist
with PHP version 8.2.9, xdebug ✔, opcache ❌
\MongoDB\Benchmark\DriverBench\SingleDocBench
benchRunCommand.........................I2 - Mo693.331ms (±1.44%)
benchFindOneById # default..............I2 - Mo1.097s (±6.82%)
benchFindOneById # bson typemap.........I2 - Mo968.745ms (±0.45%)
benchInsertOne # Small doc..............I2 - Mo2.812m (±1.93%)
benchInsertOne # Small BSON doc.........I2 - Mo2.928m (±0.58%)
benchInsertOne # Large doc..............I2 - Mo632.107ms (±4.04%)
benchInsertOne # Large BSON doc.........I2 - Mo709.506ms (±4.88%)
Subjects: 3, Assertions: 0, Failures: 0, Errors: 0
./vendor/bin/phpbench run --filter SingleDocBench 19.94s user 7.76s system 1% cpu 35:20.41 total |
yield 'Small doc WC 0' => [ | ||
'document' => self::readJsonFile(self::SMALL_FILE_PATH), | ||
'repeat' => 10_000, | ||
'options' => ['writeConcern' => new WriteConcern(0)], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that if my mongo is a replica and the write operation is done with the default writeConcern: majority
, the insertOne
operation is is slow. If I set writeConcern: {w: 0}
, this is very faster. And best of all if I use a single server it is almost instantaneous.
With replica
benchInsertOne # Small doc..............I2 - Mo3.014m (±1.60%)
benchInsertOne # Small BSON doc.........I2 - Mo3.194m (±1.51%)
benchInsertOne # Small doc WC 0.........I2 - Mo20.912s (±9.49%)
benchInsertOne # Large doc..............I2 - Mo648.394ms (±2.47%)
benchInsertOne # Large BSON doc.........I2 - Mo624.408ms (±1.23%)
benchInsertOne # Large doc WC 0.........I2 - Mo152.876ms (±1.91%)
With single server
benchInsertOne # Small doc..............I2 - Mo919.939ms (±2.23%)
benchInsertOne # Small BSON doc.........I2 - Mo931.763ms (±2.56%)
benchInsertOne # Small doc WC 0.........I2 - Mo259.724ms (±1.03%)
benchInsertOne # Large doc..............I2 - Mo310.960ms (±5.61%)
benchInsertOne # Large BSON doc.........I2 - Mo304.353ms (±2.49%)
benchInsertOne # Large doc WC 0.........I2 - Mo147.833ms (±10.32%)
In Github Action (single server)
subject # set | revs | its | mem_peak | mode | rstdev |
---|---|---|---|---|---|
benchInsertOne # Small doc | 1 | 3 | 1.566mb | 1.949s | ± 0.60% |
benchInsertOne # Small BSON doc | 1 | 3 | 1.566mb | 1.962s | ± 1.00% |
benchInsertOne # Small doc WC 0 | 1 | 3 | 1.567mb | 255.210ms | ± 1.19% |
benchInsertOne # Large doc | 1 | 3 | 19.796mb | 395.377ms | ± 0.47% |
benchInsertOne # Large BSON doc | 1 | 3 | 14.410mb | 393.043ms | ± 0.13% |
benchInsertOne # Large doc WC 0 | 1 | 3 | 14.410mb | 247.049ms | ± 1.02% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WriteConcern 0, aka "unacknowledged writes" is expected to be fast, as the driver sends the command to the server but never waits to hear back. This means that there are no guarantees that the data has actually been written. The default write concern on a replica set is "majority", which equals 2 in the default 3-member configuration.
Since the specification does not specify what configurations to run this on, I would suggest running benchmarks on a standalone with the default write concern only.
The main cause for this seems to be the "Small doc" benchmark:
I would suggest dropping the second data set, as decoding to raw BSON doesn't yield a significantly different result from the original, as network overhead is likely a big factor in the timing here. At the same time, limiting this test to a single iteration would make sense given the runtime. That would reduce the time to ~3 minutes, which is much more tolerable. However, given these results I'd still like to check with other drivers to see how long this process takes in their drivers and whether we should expect this kind of time. |
7e5ed90
to
db2ed88
Compare
db2ed88
to
a6d502c
Compare
a6d502c
to
13f2431
Compare
c52973d
to
8c49411
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GromNaN: Apologies. I realized I had some pending comments that were never published. Nothing major, so feel free to ignore or address in a separate commit as you like.
$collection = self::getCollection(); | ||
|
||
for ($i = $params['repeat']; $i > 0; $i--) { | ||
$collection->insertOne($params['document'], $params['options'] ?? []); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intentional that none of the data providers specify options
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it before the merge. Good catch.
$database = self::getDatabase(); | ||
|
||
for ($i = 0; $i < 10_000; $i++) { | ||
$manager->executeCommand($database, new Command(['hello' => true])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$manager->executeCommand($database, new Command(['hello' => true])); | |
$manager->executeCommand($database, new Command(['hello' => 1])); |
The key doesn't matter for these commands, but we typically use 1
as a value for things like hello
and ping
.
I also agree with @alcaeus earlier comment that you can probably move Command construction out of the benchmark if you want to remove more unrelated overhead. Either a data provider or a class property might work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec orders to use true explicitly.
For the command object creation. I think it's part of the driver task to send a command. I ça add an additional bench with excluded command instantiation from the loop.
Fix PHPLIB-1235
https://github.com/mongodb/specifications/blob/master/source/benchmarking/benchmarking.rst#single-doc-benchmarks
Result: