-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Add support for internal enums via stub generation #7212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a6b1741
to
53bdf98
Compare
40a7b1c
to
75f8e12
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use more tests, especially things like unserialize
. Can't comment on persistent objects.
build/gen_stub.php
Outdated
} | ||
|
||
return null; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary nesting.
build/gen_stub.php
Outdated
|
||
case "string": | ||
$suffix = "_string"; | ||
$additionalParams = ", \"$value\""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will only work for simple string values, but I guess that's enough for stubs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've been inspired by other places using string values… but an addslashes() would do no harm though
@@ -47,6 +48,22 @@ static zend_always_inline zval *zend_enum_fetch_case_value(zend_object *zobj) | |||
return OBJ_PROP_NUM(zobj, 1); | |||
} | |||
|
|||
ZEND_API zend_class_entry *zend_register_internal_enum(const char *name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need these variations? I often find it the many functions hard to navigate. Maybe that's just me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends … well, you have to know that IS_UNDEF means unbacked and NULL means no functions (i.e. you don't need to define an empty array with just ZEND_FE_END). It's not always obvious what you would have to do if these simpler functions wouldn't be there.
Especially the distinction backed vs. unbacked makes sense to me
Good point, didn't think about deserialization at all. Will check that out. Also need to add a test for assignment to simple constant (there shall be no constant propagation of enum objects). |
59f5465
to
892a15e
Compare
Deserialization indeed had only ZEND_COPY instead of ZEND_COPY_OR_DUP. Fixed that. Added test. Anything else you can think of where a test would be needed? @iluuu1994 |
This uses persistently malloc()'ed objects, which get duplicated to an request-unique object. Those request-unique objects are supposed to be immutable for the lifetime of the request, i.e. they can be just efree()'d without issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so here is my understanding of this patch:
- Persistent objects are
malloc
ed and get aGC_PERSISTENT
flag. We don't have space for this flag, so one bit of the GC root space is stolen. - Persistent objects cannot be returned into userland code and need to be duplicated first. It looks like you missed doing this at least in the from/tryFrom and cases implementation. Try running with ZEND_RC_DEBUG=1: https://gist.github.com/nikic/5bd0c0f1280c217c957ef8fa55c82f8e.
- Persistent objects are duplicated in the following way: There is a single allocation for all persistent objects, whose number and size is known at startup. The persistent object stores the offset into this allocation. If the refcount of the corresponding object is zero, it is initialized once using a shallow copy. All per-request objects are freed at once at the end of the request and cannot be individually freed.
I think the principal alternative to this approach is to do the same as in userland, i.e. use CONSTANT_AST constant values that evaluate to the (non-persistent objects). However, this would require support for mutable_data for internal classes. I think overall this may be the cleaner solution, because it brings handling of internal and userland classes closer and does not require introduction of any new concepts.
Anyway, I would like to hear @dstogov's thoughts on this patch and the persistent object concept.
return zend_register_internal_backed_enum_ex(name, IS_UNDEF, NULL); | ||
} | ||
|
||
ZEND_API zend_class_entry *zend_register_internal_enum_ex(const char *name, const zend_function_entry *functions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd combine these two functions. Especially as these are intended for use with stubs, having an _ex variant to save a NULL doesn't seem particularly useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I tend to agree, I should just distinguish backed vs unbacked.
zend_class_constant *c = zend_declare_class_constant_ex(ce, name, &casezv, ZEND_ACC_PUBLIC, NULL); | ||
ZEND_CLASS_CONST_FLAGS(c) |= ZEND_CLASS_CONST_IS_CASE; | ||
|
||
zend_string_release(name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you releasing the name here? APIs generally shouldn't release values they are passed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel it's unergonomic to do zend_string * ... = zend_string_init(..); zend_register_enum_case(...); zend_string_release(...);
it essentially becomes three statements instead of just one zend_register_enum_case(... zend_string_init (..))
… I particular I cannot imagine a case where you would want to retain the zend_string afterwards.
_zend_object_minimal_init(obj, ce); | ||
GC_TYPE_INFO(obj) |= IS_OBJ_PERSISTENT << GC_FLAGS_SHIFT; | ||
obj->handle = zend_objects_persistent_handle_counter; | ||
zend_objects_persistent_handle_counter += (alloc_size - 1) / sizeof(zend_object) + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're doing something like this anyway, why is persistent_objects/zend_objects_persistent_handle_counter quantized to zend_objects rather than just a raw memory region?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably does not make too much sense, the main reason is that I'm accessing EG(persistent_objects) + obj->handle
where EG(persistent_objects) is a zend_object-array. But could just cast to char and back to zend_object at that point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, why is it zend_object*
in the first place, rather than char*
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it contains objects :-D that was the logical thought I had, but obviously, this chunk of memory can just be made more generic and the zend_objects_persistent_handle_counter just be a counter. There's no reason why this chunk of memory cannot be just shared between multiple potential usages. (and then I'd also decouple that "handle_counter" from zend_objects.c)
Yes I did. Arghs. I checked around where constants are accessed, but forgot to check the parts where the enums are directly accessed :-/
You precisely found the issue. Doing the same as userland is not so easy though, because the constant gets replaced by the unique object. (in userland) ... In internal there is just one overall copy of the constants table. If we'd replace it there with ast evaluation, it would break. Creating a new object obviously is out of question as well. Also, how would you even implement that mutable data? In particular with ZTS, you need thread local mutable data. Which presumably would live somewhere in EG() in some mapping I guess. At least it's not possible to just do ce->thread_local_mutable_data = &zend_tls_marked_variable; because the address is unique per thread... (EDIT: technically you could do Maybe I'm overestimating what would need to be done, but yeah, how do you imagine that working out? If you find a nice easy way. |
zend_string_release(name); | ||
zend_string_release(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static zend_always_inline void zend_string_release(zend_string *s)
{
if (!ZSTR_IS_INTERNED(s)) {
if (GC_DELREF(s) == 0) {
pefree(s, GC_FLAGS(s) & IS_STR_PERSISTENT);
}
}
}
zend_string_release(name); | |
zend_string_release(str); |
str = zend_new_interned_string(str); | ||
|
||
zval namezv; | ||
ZVAL_STR_COPY(&namezv, name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ZVAL_STR_COPY(&namezv, name); | |
ZVAL_INTERNED_STR(&namezv, name); |
} | ||
|
||
zval zv; | ||
ZVAL_STR(&zv, str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ZVAL_STR(&zv, str); | |
ZVAL_INTERNED_STR(&zv, str); |
ZVAL_LONG(&zv, value); | ||
zend_object *case_obj = add_internal_enum_case(ce, name, &zv); | ||
|
||
zend_string_release(name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments on ZVAL_INTERNED_STR and not needing zval_string_released on zend_new_interned_string. It seems like both zend_string_init_interned_permanent and zend_new_interned_string_request return interned strings where refcounting or freeing isn't necessary.
The code works but it's longer than it needs to be?
I've posted an alternative implementation at #7302. |
This adds support for internal enums with the same basic approach as userland enums. Enum values are stored as CONSTANT_AST and objects created during constant updating at runtime. This means that we need to use mutable_data for internal enums. This just adds basic support and APIs, it does not include the stubs integration from #7212. Closes GH-7302.
Added the stub generation part in 32d4821, so internal enum support should now be complete. |
This uses persistently malloc()'ed objects, which get duplicated to an request-unique object.
Those request-unique objects are supposed to be immutable for the lifetime of the request, i.e. they can be just efree()'d without issues.
This patch requires nikic/PHP-Parser#792 (and a subsequent update of the tag) before merging.Done.