Skip to content

PHPC-2269: Add internal iterator logic for MongoDB\BSON\Iterator #1454

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 25, 2023

Conversation

alcaeus
Copy link
Member

@alcaeus alcaeus commented Aug 4, 2023

PHPC-2269

Adding to v1.17 only as we'd be backporting a number of compat macros from PHP itself, in addition to ZVAL_OBJ_COPY. Waiting on #1450 to be merged so I don't have to bother fixing the PHP 7.2 build.

@alcaeus alcaeus requested a review from jmikola August 4, 2023 09:19
@alcaeus alcaeus self-assigned this Aug 4, 2023
{
php_phongo_iterator_t* intern = Z_ITERATOR_OBJ_P(&iter->data);

return intern->valid ? SUCCESS : FAILURE;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm once again reminded how dumb PHP is for having SUCCESS defined as 0, when bools work exactly the other way around. Took me a moment to figure out why as I was initially returning intern->valid directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been bitten by this before, but I've made peace with assuming PHP's behavior is inspired by process exit codes.

@@ -263,10 +298,92 @@ static HashTable* php_phongo_iterator_get_properties(phongo_compat_object_handle
return php_phongo_iterator_get_properties_hash(object, false);
}

/* Iterator handlers */
static void php_phongo_iterator_it_dtor(zend_object_iterator* iter)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire logic was shamelessly lifted from spl_array.c

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "this entire logic" refer to all of the iterator functions in this file (not just dtor here)?

Copy link
Member Author

@alcaeus alcaeus Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - all of the php_phongo_iterator_it_* methods were heavily inspired by PHP's standard implementation.

{
php_phongo_iterator_t* intern = Z_ITERATOR_OBJ_P(&iter->data);

return php_phongo_iterator_get_current(intern);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike the current() method itself, there's no need to copy/deref the zval we're returning here. Another notable difference to current() is that we're leaving an IS_UNDEF zval as-is, whereas current returns null in those cases. I don't think it makes a significant difference, and I'm not sure why Cursor::current() (which inspired the BSON iterator) returns null for an undef zval. I know there are some weird shenanigans going on when returning a IS_UNDEF, but I don't think this will ever matter in practice, as having that value returned indicates that something went seriously wrong during the iteration and there should've been an exception anyways.

Comment on lines +367 to +371
if (by_ref) {
phongo_throw_exception(PHONGO_ERROR_LOGIC, "An iterator cannot be used with foreach by reference");
return NULL;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is precedent for this in PHP itself, although they emit an E_ERROR in such cases. I opted to throw instead as it's more consistent how we expose errors.

We can't support iteration by reference as the BSON structures themselves are read-only, so assigning to the decoded value makes no sense.

php_phongo_iterator_it_get_current_key,
php_phongo_iterator_it_move_forward,
php_phongo_iterator_it_rewind,
NULL,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the function pointer for invalidate_current, which isn't used. Quoting from phpinternalsbook:

The only handler that has no correspondence in userland is invalidate_current, which can be used to destroy the current key/value. The handler is largely unused though, in particular foreach won’t even call it.

We could point this to php_phongo_iterator_free_current, although I'm not sure what the benefits of that would be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

php_phongo_iterator_free_current is already called from the necessary places.

I'm fine with leaving invalidate_current unset, unless there's prior art for other iterator implementations using it.

I do think it'd be worth adding a /* invalidate_current */ comment here so we don't need to reference PHONGO_ITERATOR_FUNCS's definition to see what's being omitted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comment as suggested.

@alcaeus alcaeus force-pushed the phpc-2269-bson-iterator-internal branch from 3904abe to cbc8eb9 Compare August 4, 2023 10:40
@alcaeus alcaeus changed the base branch from v1.16 to master August 4, 2023 10:40
@alcaeus alcaeus force-pushed the phpc-2269-bson-iterator-internal branch from cbc8eb9 to 8171ae3 Compare August 4, 2023 10:41
@alcaeus
Copy link
Member Author

alcaeus commented Aug 4, 2023

Here are some results from the performance tests done using the tests added in mongodb/mongo-php-library#1141.

Results in PHP 8.2-debug:

$ ./vendor/bin/phpbench run benchmark/BSON/DocumentBench.php --config=phpbench.ci.json --filter="benchIteration$" --report=aggregate --ref=original
PHPBench (1.2.14) running benchmarks... #standwithukraine
with configuration file: phpbench.ci.json
with PHP version 8.2.8, xdebug ✔, opcache ❌
comparing [actual vs. original]

\MongoDB\Benchmark\BSON\DocumentBench

    benchIteration..........................I2 - [Mo23.925ms vs. Mo114.955ms] -79.19% (±0.19%)

Subjects: 1, Assertions: 0, Failures: 0, Errors: 0
+---------------+----------------+-----+------+-----+---------------+------------------+----------------+
| benchmark     | subject        | set | revs | its | mem_peak      | mode             | rstdev         |
+---------------+----------------+-----+------+-----+---------------+------------------+----------------+
| DocumentBench | benchIteration |     | 10   | 3   | 6.498mb 0.00% | 23.925ms -79.19% | ±0.19% -75.23% |
+---------------+----------------+-----+------+-----+---------------+------------------+----------------+

This shows an improvement of 79% with this new logic. Using a non-debug version of PHP however yields wildly different results:

$ ./vendor/bin/phpbench run benchmark/BSON/DocumentBench.php --config=phpbench.ci.json --filter="benchIteration$" --report=aggregate --ref=original
PHPBench (1.2.14) running benchmarks... #standwithukraine
with configuration file: phpbench.ci.json
with PHP version 8.2.8, xdebug ✔, opcache ❌
comparing [actual vs. original]

\MongoDB\Benchmark\BSON\DocumentBench

    benchIteration..........................I2 - [Mo75.472ms vs. Mo76.018ms] -0.72% (±0.37%)

Subjects: 1, Assertions: 0, Failures: 0, Errors: 0
+---------------+----------------+-----+------+-----+---------------+-----------------+----------------+
| benchmark     | subject        | set | revs | its | mem_peak      | mode            | rstdev         |
+---------------+----------------+-----+------+-----+---------------+-----------------+----------------+
| DocumentBench | benchIteration |     | 10   | 3   | 6.319mb 0.00% | 75.472ms -0.72% | ±0.37% -20.52% |
+---------------+----------------+-----+------+-----+---------------+-----------------+----------------+

Here, there is no improvement whatsoever, but interestingly enough the test is slower than in the debug version, which has me somewhat confused.

The extension was compiled the same way for both runs, always including the developer flags (--enable-mongodb-developer-flags).

@alcaeus alcaeus marked this pull request as draft August 4, 2023 15:51
@alcaeus alcaeus force-pushed the phpc-2269-bson-iterator-internal branch from 8171ae3 to 08fb5d9 Compare August 18, 2023 06:53
@alcaeus
Copy link
Member Author

alcaeus commented Aug 18, 2023

Here are the latest results comparing iteration in non-debug PHP:

benchIteratorToArray....................I0 - [Mo25.636ms vs. Mo77.726ms] -67.02% (±0.00%)

These are the results I was looking for - turns out make install does not respect a custom extension_dir (which I'll have to investigate separately) and I wasn't testing the improvements at all.

@alcaeus alcaeus marked this pull request as ready for review August 18, 2023 07:04
@alcaeus alcaeus force-pushed the phpc-2269-bson-iterator-internal branch from 08fb5d9 to a9a1be5 Compare August 22, 2023 12:41
Copy link
Member

@jmikola jmikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small comment suggestions but implementation LGTM.

void php_phongo_iterator_init_ce(INIT_FUNC_ARGS)
{
php_phongo_iterator_ce = register_class_MongoDB_BSON_Iterator(zend_ce_iterator);
php_phongo_iterator_ce->create_object = php_phongo_iterator_create_object;
php_phongo_iterator_ce->get_iterator = php_phongo_iterator_get_iterator;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoting PHPC-2269's description:

If this speeds up iteration, we should consider adding the same logic to The MongoDB\Cursor class, which may suffer from the same performance overhead.

Given that this does appear to speed things up, would you like to create a separate issue for making a similar change to Cursor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created PHPC-2278.

php_phongo_iterator_it_get_current_key,
php_phongo_iterator_it_move_forward,
php_phongo_iterator_it_rewind,
NULL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

php_phongo_iterator_free_current is already called from the necessary places.

I'm fine with leaving invalidate_current unset, unless there's prior art for other iterator implementations using it.

I do think it'd be worth adding a /* invalidate_current */ comment here so we don't need to reference PHONGO_ITERATOR_FUNCS's definition to see what's being omitted.

{
php_phongo_iterator_t* intern = Z_ITERATOR_OBJ_P(&iter->data);

return intern->valid ? SUCCESS : FAILURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been bitten by this before, but I've made peace with assuming PHP's behavior is inspired by process exit codes.

(invalidate_current), \
(get_gc), \
}
#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#else
#else /* PHP_VERSION_ID < 80000 */

This is used in some other longer #if blocks elsewhere in the file.

Note that we don't have prior art for this with #if(n)def, so I didn't suggest it below.

(rewind), \
(invalidate_current), \
}
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#endif
#endif /* PHP_VERSION_ID >= 80000 */

@@ -263,10 +298,92 @@ static HashTable* php_phongo_iterator_get_properties(phongo_compat_object_handle
return php_phongo_iterator_get_properties_hash(object, false);
}

/* Iterator handlers */
static void php_phongo_iterator_it_dtor(zend_object_iterator* iter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "this entire logic" refer to all of the iterator functions in this file (not just dtor here)?

@alcaeus alcaeus force-pushed the phpc-2269-bson-iterator-internal branch from a9a1be5 to 75d478f Compare August 25, 2023 07:31
@alcaeus alcaeus merged commit cc31f3a into mongodb:master Aug 25, 2023
@alcaeus alcaeus deleted the phpc-2269-bson-iterator-internal branch August 25, 2023 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants