-
Notifications
You must be signed in to change notification settings - Fork 12.2k
Single allocation of encode_async block with non-ARC capture in ggml-metal.m #9747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great - thank you for the explanation!
Would Block_release
make more sense to be in ggml_metal_free()
instead?
diff --git a/ggml/src/ggml-metal.m b/ggml/src/ggml-metal.m
index 40835074..83bf658c 100644
--- a/ggml/src/ggml-metal.m
+++ b/ggml/src/ggml-metal.m
@@ -438,7 +438,6 @@ static struct ggml_backend_metal_context * ggml_metal_init(void) {
ctx->capture_scope = nil;
ctx->gf = nil;
- Block_release(ctx->encode_async);
ctx->encode_async = nil;
for (int i = 0; i < GGML_METAL_MAX_COMMAND_BUFFERS; ++i) {
ctx->command_buffers[i] = nil;
@@ -684,6 +683,8 @@ static void ggml_metal_free(struct ggml_backend_metal_context * ctx) {
[ctx->kernels[i].pipeline release];
}
+ Block_release(ctx->encode_async);
+
[ctx->queue release];
[ctx->device release];
In ggml_metal_init()
we only initialize the pointer with nil
and are never supposed to pass through that function a second time.
Oh indeed, I had thought that was the place where things were being de-init because of the nil assignment. It definitely belongs in the deallocation place instead. Will update in a sec. |
//ctx->encode_async = ^(size_t iter) { | ||
// ... | ||
//}; | ||
ctx->encode_async = Block_copy(^(size_t iter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, we might want to call ggml_backend_metal_set_n_cb()
multiple times in the future (although unlikely). So to make this more correct in this case, we should check if ctx->encode_async == nil
here and if not, call Block_release()
first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that had crossed my mind indeed, but didn't want to add more logic there than needed for now - worried about scope creep. I agree this makes total sense. Will add 👍
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m * Moving Block_release to the deallocation code * Release encode block when re-setting encoding buffer count if needed * Update ggml/src/ggml-metal.m --------- Co-authored-by: Georgi Gerganov <[email protected]>
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m * Moving Block_release to the deallocation code * Release encode block when re-setting encoding buffer count if needed * Update ggml/src/ggml-metal.m --------- Co-authored-by: Georgi Gerganov <[email protected]>
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m * Moving Block_release to the deallocation code * Release encode block when re-setting encoding buffer count if needed * Update ggml/src/ggml-metal.m --------- Co-authored-by: Georgi Gerganov <[email protected]>
This PR hopefully addresses the issue raised in a comment in
ggml-metal.m
:The reason that fails is because ObjC blocks in manual memory management mode are not retained upon creation, they need to be explicitly retained, or rather, copied after creation. Usually this is done by using a
copy
able property declaration on the owning object, but in this case the block is assigned to a plain C structure. In such cases we can reproduce the same behaviour by using theBlock_copy
function. Having done that, we also need to release the block once we are done with it, preferably just before setting that property tonil
.This PR makes this change, and seems to fix the crash mentioned, but please do give it a go to verify before merging, just in case :D I hope this helps!