PHPLIB-1236 Implement Multi-Doc Benchmarks #1165
12822d3 to 4ff3c8a
 * @see https://github.com/mongodb/specifications/blob/ddfc8b583d49aaf8c4c19fa01255afb66b36b92e/source/benchmarking/benchmarking.rst#multi-doc-benchmarks
 */
#[AfterMethods('afterAll')]
final class GridFSBench
Extracted the GridFS benchmarks into a dedicated class because they are very different from the other collection benchmarks and need some properties of their own.
public static function getStream(int $size)
{
    $stream = fopen('php://memory', 'w+');
    fwrite($stream, str_repeat("\0", $size));
> The dataset, designated GRIDFS_LARGE (disk file 'gridfs_large.bin'), consists of a single file containing about 50 MB of random data.

I don't need to commit the 50 MB file from the spec, which is full of NULL characters. I can generate it on demand, in memory.
Agreed, thank you!
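For illustration, the on-demand generation discussed in this thread could look roughly like the sketch below. It is a standalone function rather than the PR's static method, and the `rewind()` call and the return of the stream are assumptions, since the excerpt above is cut off before the end of the method.

```php
<?php

/**
 * Sketch of an in-memory replacement for a committed data file.
 * Assumption: callers expect the stream positioned at its start.
 *
 * @return resource
 */
function getStream(int $size)
{
    // php://memory keeps the whole payload in RAM; nothing touches the disk.
    $stream = fopen('php://memory', 'w+');

    // Fill the stream with $size NULL bytes.
    fwrite($stream, str_repeat("\0", $size));

    // Assumed step: move the pointer back so the next read starts at byte 0.
    rewind($stream);

    return $stream;
}

// Usage: build a small stream and read it back.
$stream = getStream(1024);
$contents = stream_get_contents($stream);
fclose($stream);
```

The benchmark would pass the spec's size (about 50 MB) instead of the 1024 bytes used here.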
    return self::$client ??= new Client(self::getUri());
}

public static function getDatabase(): Database
Refactored to give access to an instance of each library object and to cache those instances.
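The caching relies on the null coalescing assignment operator (`??=`), as in the `getClient()` excerpt above. A minimal sketch of the pattern follows; `Connection` is a hypothetical stand-in for `MongoDB\Client` so that the example runs without the driver, and the `Utils` class name and the URI are assumptions too.

```php
<?php

// Hypothetical stand-in for MongoDB\Client; counts how often it is constructed.
final class Connection
{
    public static int $created = 0;

    public string $uri;

    public function __construct(string $uri)
    {
        $this->uri = $uri;
        self::$created++;
    }
}

final class Utils
{
    private static ?Connection $client = null;

    public static function getClient(): Connection
    {
        // Construct the object on the first call only; later calls reuse it.
        return self::$client ??= new Connection('mongodb://127.0.0.1/');
    }
}

// Usage: both calls return the same cached instance.
$first = Utils::getClient();
$second = Utils::getClient();
```

Each additional accessor (database, collection, and so on) can follow the same shape with its own static property.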
@@ -5,6 +5,7 @@
    "runner.bootstrap": "vendor/autoload.php",
    "runner.file_pattern": "*Bench.php",
    "runner.path": "benchmark",
    "runner.php_config": { "memory_limit": "1G" },
Downloading the 50 MB file into memory takes a lot more memory than expected.
I need to profile why this is taking so much memory.
Profiling result: it's the `substr` that uses the most memory.
mongo-php-library/src/GridFS/ReadableStream.php
Lines 166 to 174 in ec6c431
while (strlen($data) < $length) {
    if ($this->bufferOffset >= strlen($this->buffer) && ! $this->initBufferFromNextChunk()) {
        break;
    }

    $initialDataLength = strlen($data);
    $data .= substr($this->buffer, $this->bufferOffset, $length - $initialDataLength);
    $this->bufferOffset += strlen($data) - $initialDataLength;
}
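To make the loop above easier to follow in isolation, here is a simplified sketch of the same read logic. `ChunkReader` is a hypothetical class that takes its chunks from an array instead of a GridFS cursor; only the body of `readBytes()` mirrors the quoted lines.

```php
<?php

// Hypothetical, simplified stand-in for the library's ReadableStream.
final class ChunkReader
{
    /** @var string[] */
    private array $chunks;

    private string $buffer = '';

    private int $bufferOffset = 0;

    /** @param string[] $chunks */
    public function __construct(array $chunks)
    {
        $this->chunks = $chunks;
    }

    private function initBufferFromNextChunk(): bool
    {
        if ($this->chunks === []) {
            return false; // no more data
        }

        $this->buffer = array_shift($this->chunks);
        $this->bufferOffset = 0;

        return true;
    }

    public function readBytes(int $length): string
    {
        $data = '';

        while (strlen($data) < $length) {
            // Load the next chunk once the current buffer is exhausted.
            if ($this->bufferOffset >= strlen($this->buffer) && ! $this->initBufferFromNextChunk()) {
                break;
            }

            $initialDataLength = strlen($data);
            // substr() generally builds a new string, so the requested bytes
            // are copied out of the buffer before being appended to $data.
            $data .= substr($this->buffer, $this->bufferOffset, $length - $initialDataLength);
            $this->bufferOffset += strlen($data) - $initialDataLength;
        }

        return $data;
    }
}

// Usage: reads may span chunk boundaries and stop early at the end of the data.
$reader = new ChunkReader(['abcd', 'efgh', 'ij']);
$head = $reader->readBytes(6);
$tail = $reader->readBytes(6);
```

With large `$length` values, the temporary string returned by `substr()` and the growing `$data` both exist at the same time, which is consistent with the profiling result reported in this thread.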

4ff3c8a to 01951e0
Fix PHPLIB-1236
https://github.com/mongodb/specifications/blob/master/source/benchmarking/benchmarking.rst#multi-doc-benchmarks