Skip to content

Add zend_string_init_fast and related macros to speed up string creation #5684

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

twose
Copy link
Member

@twose twose commented Jun 8, 2020

Any better name?
BTW, I removed some spaces mixed in the tabs so the diff looks a bit strange

In theory, we can apply it to add_next_index_string*/add_assoc_string*...

Zend/zend_API.h Outdated
@@ -636,6 +636,20 @@ END_EXTERN_C()
ZVAL_PSTRINGL(z, "", 0); \
} while (0)

#define ZVAL_ONE_CHAR_STRING(z, c) do { \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about using ZVAL_ONE_CHAR_INTERNED_STR? So that we wouldn't lose the information that it creates an interned string?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about using ZVAL_ONE_CHAR_INTERNED_STR? So that we wouldn't lose the information that it creates an interned string?

but API named *_STR usually takes zend_string * as a parameter and ZVAL_ONE_CHAR_INTERNED_STRING is too long...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Length of a macro name is not a valid argument IMHO, readability is more important than just how many characters less you will type once.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Length of a macro name is not a valid argument IMHO, readability is more important than just how many characters less you will type once.

You're right, I prefer ZVAL_ONE_CHAR_INTERNED_STRING

Copy link
Member

@kocsismate kocsismate Jun 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twose Using str vs string has a semantic difference. Please see this here: #5647 (comment) Initially, I also argued against using abbreviations, but "unfortunately" they make sense in this case. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate I'm a little confused now, this macro accepts a parameter of type char, not zend_string *, so we named it *_ONE_CHAR_INTERNED_STRING instead of ONE_CHAR_INTERNED_STR, is that right?

Copy link
Member

@nikic nikic Jun 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest using just ZVAL_CHAR(zv, c), RETVAL_CHAR(c), RETURN_CHAR(c) ... :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest using just ZVAL_CHAR(zv, c) ... :)

I'm in favor of your proposal, the reason I didn’t do this at first was that there is no char type in PHP

#define RETVAL_EMPTY_STRING() ZVAL_EMPTY_STRING(return_value)
#define RETVAL_RES(r) ZVAL_RES(return_value, r)
#define RETVAL_ARR(r) ZVAL_ARR(return_value, r)
#define RETVAL_NULL() ZVAL_NULL(return_value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously I also tried to do something similar, but it turned out that tabs should only be used for indentation at the beginning of the line. Otherwise, spaces are preferred.

ZVAL_INTERNED_STR(z, ZSTR_CHAR((zend_uchar) _c)); \
} while (0)

#define ZVAL_STRINGL_FAST(z, s, l) do { \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a comment to clarify when the fast implementation should be used? I believe there are some people (including me) who would appreciate a little bit more information about its applicability/use-cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate
sorry for my poor English...😢
you can add comments if possible...

@Girgias
Copy link
Member

Girgias commented Jun 8, 2020

There seems to be different things going on in this PR, the ZVAL_ONE_CHAR_STRING seems to be completely orthogonal to the fast sting API proposal and I don't see why it's mixed up with this one, but that may be because there are a bazillion whitespace changes which doesn't make this PR easy to read.

And like Maté said I don't see when I should be using this over the non fast version as just reading the name of the function I would expect the point of it is to use as much as possible and in that case I would find it more logical to just rename the current zend_string_init() into a "slow" variant and make this fast version the one used by zend_string_init where we'll get most of the advantages of it for all extensions and internal code without needing to go through them.

And if that's not why it's trying to achieve than I'm having trouble grasping the meaning of "fast" here.

@twose
Copy link
Member Author

twose commented Jun 8, 2020

@Girgias I had the same idea as you at the beginning, but there are many zend_string_free/zend_string_efree in the kernel, they conclude that the string returned from zend_string_init/ZVAL_STRINGL is allocated by emalloc, these two optimizations conflict with each other, we should choose one of them or fix them carefully

@twose
Copy link
Member Author

twose commented Jun 8, 2020

for now, applying this optimization to RETVAL_STRING/RETURN_STRING is relatively easy

@nikic
Copy link
Member

nikic commented Jun 8, 2020

This all depends on how generally useful this optimization is. The places where it is used right now are there because zero or one character strings are common in those context. There are of course many cases where they aren't.

@twose
Copy link
Member Author

twose commented Jun 8, 2020

I think the cost of the inspection is small but the gain is big when the condition is hit
I am just conservatively adding new APIs, but I need everyone's consistent opinion then I could finish it...

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. I'm not really happy about the zend_string_init_fast name, but I don't have a better suggestion myself :(

@twose
Copy link
Member Author

twose commented Jun 8, 2020

This looks reasonable to me. I'm not really happy about the zend_string_init_fast name, but I don't have a better suggestion myself :(

There are only two hard things in Computer Science: cache invalidation and naming things

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take this as-is... for now the usage guideline here would be "use it if you have benchmarked and found it to be faster".

@php-pulls php-pulls closed this in 83a7701 Jun 8, 2020
@twose twose deleted the zend_string_init_fast branch June 8, 2020 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants