PHPLIB-1236 Implement Multi-Doc Benchmarks #1165

Merged 2 commits on Sep 18, 2023

@GromNaN GromNaN commented Sep 15, 2023

Fix PHPLIB-1236

https://github.com/mongodb/specifications/blob/master/source/benchmarking/benchmarking.rst#multi-doc-benchmarks

  • Find many and empty the cursor
  • Small doc bulk insert
  • Large doc bulk insert
  • GridFS upload
  • GridFS download
\MongoDB\Benchmark\DriverBench\MultiDocBench

    benchFindMany # Driver default typemap..I2 - Mo133.957ms (±0.20%)
    benchFindMany # Raw BSON................I2 - Mo17.109ms (±0.86%)
    benchBulkInsert # Small doc.............I2 - Mo53.328ms (±0.25%)
    benchBulkInsert # Small BSON doc........I2 - Mo53.180ms (±0.36%)
    benchBulkInsert # Large doc.............I2 - Mo242.104ms (±2.10%)
    benchBulkInsert # Large BSON doc........I2 - Mo239.291ms (±0.63%)

\MongoDB\Benchmark\DriverBench\GridFSBench

    benchUpload.............................I2 - Mo31.448ms (±0.47%)
    benchDownload...........................I2 - Mo67.099ms (±11.71%)

Subjects: 4, Assertions: 0, Failures: 0, Errors: 0

@GromNaN GromNaN force-pushed the PHPLIB-1236 branch 2 times, most recently from 12822d3 to 4ff3c8a Compare September 15, 2023 20:59
* @see https://github.com/mongodb/specifications/blob/ddfc8b583d49aaf8c4c19fa01255afb66b36b92e/source/benchmarking/benchmarking.rst#multi-doc-benchmarks
*/
#[AfterMethods('afterAll')]
final class GridFSBench
Member Author

I extracted the GridFS benchmarks into a dedicated class because they are very different from the other collection benchmarks and need some properties of their own.

    public static function getStream(int $size)
    {
        $stream = fopen('php://memory', 'w+');
        fwrite($stream, str_repeat("\0", $size));
Member Author

The dataset, designated GRIDFS_LARGE (disk file 'gridfs_large.bin'), consists of a single file containing about 50 MB of random data.

I don't need to commit the spec's 50MB file, which is full of NULL characters. I can generate it on demand, in memory.
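As an illustration of generating the payload on demand, here is a minimal, self-contained sketch. It is not the code from this PR: the function name `createNullStream()` and the 1 MiB block size are assumptions made for the example, and it writes in blocks so that no full-size temporary string is needed.

```php
<?php

declare(strict_types=1);

/**
 * Builds a rewound php://memory stream containing $size NUL bytes.
 * createNullStream() is a hypothetical name used only for this sketch.
 *
 * @return resource
 */
function createNullStream(int $size)
{
    $stream = fopen('php://memory', 'w+');
    if ($stream === false) {
        throw new RuntimeException('Unable to open php://memory');
    }

    // Write in 1 MiB blocks (arbitrary size) instead of building one huge string.
    $block = str_repeat("\0", 1024 * 1024);
    for ($remaining = $size; $remaining > 0; $remaining -= strlen($block)) {
        fwrite($stream, $remaining < strlen($block) ? substr($block, 0, $remaining) : $block);
    }

    // Rewind so a consumer starts reading from the first byte.
    rewind($stream);

    return $stream;
}

// Usage: a small payload here; the benchmark dataset would be about 50 MB.
$stream = createNullStream(3 * 1024 * 1024 + 5);
$contents = stream_get_contents($stream);
fclose($stream);

printf("generated %d bytes\n", strlen($contents));
```

The returned resource can be handed to any API that accepts a readable stream, which avoids keeping a large binary fixture in the repository.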

Member

Agreed, thank you!

        return self::$client ??= new Client(self::getUri());
    }

    public static function getDatabase(): Database
Member Author

Refactored to give access to a cached instance of each library object.
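The caching pattern in the diff context relies on the null coalescing assignment operator (`??=`, PHP 7.4+), which only evaluates its right-hand side when the property is still null. A minimal sketch of that pattern follows; the `Connections` class, the `$created` counter, and the use of `ArrayObject` as a stand-in for a real client are all hypothetical so that the example runs without the MongoDB extension.

```php
<?php

declare(strict_types=1);

// Hypothetical helper class; ArrayObject stands in for a real client object.
final class Connections
{
    private static ?ArrayObject $client = null;

    /** Counts how many times the stand-in client was constructed. */
    public static int $created = 0;

    public static function getClient(): ArrayObject
    {
        // The right-hand side runs only while self::$client is null.
        return self::$client ??= self::createClient();
    }

    private static function createClient(): ArrayObject
    {
        self::$created++;

        return new ArrayObject(['uri' => 'mongodb://127.0.0.1:27017/']);
    }
}

$first = Connections::getClient();
$second = Connections::getClient();

printf("same instance: %s, constructed %d time(s)\n", var_export($first === $second, true), Connections::$created);
```

Each further accessor (for a database or collection, for example) can follow the same shape, so repeated benchmark iterations reuse one object instead of reconnecting.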

@@ -5,6 +5,7 @@
"runner.bootstrap": "vendor/autoload.php",
"runner.file_pattern": "*Bench.php",
"runner.path": "benchmark",
"runner.php_config": { "memory_limit": "1G" },
Member Author

Downloading the 50MB file into memory takes a lot more memory than expected.

Member Author

I need to profile why this is taking so much memory.

Member Author

@GromNaN GromNaN Sep 18, 2023

Profiling result: the substr() call is what uses the most memory.

    while (strlen($data) < $length) {
        if ($this->bufferOffset >= strlen($this->buffer) && ! $this->initBufferFromNextChunk()) {
            break;
        }

        $initialDataLength = strlen($data);
        $data .= substr($this->buffer, $this->bufferOffset, $length - $initialDataLength);
        $this->bufferOffset += strlen($data) - $initialDataLength;
    }
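
To experiment with the loop quoted above outside the library, the following is a simplified, hypothetical stand-in: `readFromChunks()` replaces the object state (`$this->buffer`, `$this->bufferOffset`, `initBufferFromNextChunk()`) with local variables and a plain array of chunk strings. It is a sketch for observing peak memory, not the library's implementation, and the measured numbers will vary by PHP version.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a chunked reader: copies up to $length bytes
 * out of a list of chunk strings, mirroring the loop quoted above.
 *
 * @param list<string> $chunks
 */
function readFromChunks(array $chunks, int $length): string
{
    $data = '';
    $buffer = '';
    $bufferOffset = 0;
    $next = 0;

    while (strlen($data) < $length) {
        if ($bufferOffset >= strlen($buffer)) {
            if (! isset($chunks[$next])) {
                break; // No more chunks: return what was read so far.
            }

            $buffer = $chunks[$next++];
            $bufferOffset = 0;
        }

        $initialDataLength = strlen($data);
        // substr() may allocate a temporary slice before it is appended to $data.
        $data .= substr($buffer, $bufferOffset, $length - $initialDataLength);
        $bufferOffset += strlen($data) - $initialDataLength;
    }

    return $data;
}

// 255 KiB matches the GridFS default chunk size; 8 chunks keeps the demo small.
$chunkSize = 255 * 1024;
$chunks = array_fill(0, 8, str_repeat("\0", $chunkSize));

$before = memory_get_peak_usage();
$data = readFromChunks($chunks, 8 * $chunkSize);
$peakDelta = memory_get_peak_usage() - $before;

printf("read %d bytes, peak memory grew by %d bytes\n", strlen($data), $peakDelta);
```

Comparing the peak growth with the number of bytes read gives a rough view of how much extra memory the copy-and-append approach needs for a given payload size.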

[Image: screenshot of the profiling result]

@GromNaN GromNaN marked this pull request as ready for review September 15, 2023 21:05
@GromNaN GromNaN requested a review from alcaeus September 15, 2023 21:05
@GromNaN GromNaN merged commit ec6c431 into mongodb:master Sep 18, 2023
@GromNaN GromNaN deleted the PHPLIB-1236 branch September 18, 2023 09:27