Single allocation of encode_async block with non-ARC capture in ggml-metal.m #9747

ptsochantaris · 2024-10-05T01:07:28Z

This PR hopefully addresses the issue raised in a comment in ggml-metal.m:

// TODO: how to avoid this allocation? I tried initializing it in ggml_backend_metal_set_n_cb but it crashes.

The reason that fails is because ObjC blocks in manual memory management mode are not retained upon creation, they need to be explicitly retained, or rather, copied after creation. Usually this is done by using a copyable property declaration on the owning object, but in this case the block is assigned to a plain C structure. In such cases we can reproduce the same behaviour by using the Block_copy function. Having done that, we also need to release the block once we are done with it, preferably just before setting that property to nil.

This PR makes this change, and seems to fix the crash mentioned, but please do give it a go to verify before merging, just in case :D I hope this helps!

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

…metal.m

ggerganov

Great - thank you for the explanation!

Would Block_release make more sense to be in ggml_metal_free() instead?

diff --git a/ggml/src/ggml-metal.m b/ggml/src/ggml-metal.m
index 40835074..83bf658c 100644
--- a/ggml/src/ggml-metal.m
+++ b/ggml/src/ggml-metal.m
@@ -438,7 +438,6 @@ static struct ggml_backend_metal_context * ggml_metal_init(void) {
     ctx->capture_scope = nil;
 
     ctx->gf = nil;
-    Block_release(ctx->encode_async);
     ctx->encode_async = nil;
     for (int i = 0; i < GGML_METAL_MAX_COMMAND_BUFFERS; ++i) {
         ctx->command_buffers[i] = nil;
@@ -684,6 +683,8 @@ static void ggml_metal_free(struct ggml_backend_metal_context * ctx) {
         [ctx->kernels[i].pipeline release];
     }
 
+    Block_release(ctx->encode_async);
+
     [ctx->queue release];
     [ctx->device release];

In ggml_metal_init() we only initialize the pointer with nil and are never supposed to pass through that function a second time.

ptsochantaris · 2024-10-07T10:57:19Z

Oh indeed, I had thought that was the place where things were being de-init because of the nil assignment. It definitely belongs in the deallocation place instead. Will update in a sec.

ggerganov · 2024-10-07T11:15:24Z

ggml/src/ggml-metal.m

-    //ctx->encode_async = ^(size_t iter) {
-    //    ...
-    //};
+    ctx->encode_async = Block_copy(^(size_t iter) {


In theory, we might want to call ggml_backend_metal_set_n_cb() multiple times in the future (although unlikely). So to make this more correct in this case, we should check if ctx->encode_async == nil here and if not, call Block_release() first

Ah, that had crossed my mind indeed, but didn't want to add more logic there than needed for now - worried about scope creep. I agree this makes total sense. Will add 👍

ggml/src/ggml-metal.m

* Single allocation of encode_async block with non-ARC capture in ggml-metal.m * Moving Block_release to the deallocation code * Release encode block when re-setting encoding buffer count if needed * Update ggml/src/ggml-metal.m --------- Co-authored-by: Georgi Gerganov <[email protected]>

Single allocation of encode_async block with non-ARC capture in ggml-…

7403c05

…metal.m

ggerganov reviewed Oct 7, 2024

View reviewed changes

Moving Block_release to the deallocation code

5e63583

ggerganov reviewed Oct 7, 2024

View reviewed changes

Release encode block when re-setting encoding buffer count if needed

4af03de

ggerganov reviewed Oct 7, 2024

View reviewed changes

ggml/src/ggml-metal.m Outdated Show resolved Hide resolved

Update ggml/src/ggml-metal.m

594a07a

ggerganov merged commit 96b6912 into master Oct 7, 2024
53 checks passed

ggerganov deleted the ggml-metal-fix-single-allocation-crash branch October 7, 2024 12:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Single allocation of encode_async block with non-ARC capture in ggml-metal.m #9747

Single allocation of encode_async block with non-ARC capture in ggml-metal.m #9747

Uh oh!

ptsochantaris commented Oct 5, 2024

Uh oh!

ggerganov left a comment

Uh oh!

ptsochantaris commented Oct 7, 2024 •

edited

Loading

Uh oh!

ggerganov Oct 7, 2024 •

edited

Loading

Uh oh!

ptsochantaris Oct 7, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Single allocation of encode_async block with non-ARC capture in ggml-metal.m #9747

Single allocation of encode_async block with non-ARC capture in ggml-metal.m #9747

Uh oh!

Conversation

ptsochantaris commented Oct 5, 2024

Uh oh!

ggerganov left a comment

Choose a reason for hiding this comment

Uh oh!

ptsochantaris commented Oct 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ggerganov Oct 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ptsochantaris Oct 7, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ptsochantaris commented Oct 7, 2024 •

edited

Loading

ggerganov Oct 7, 2024 •

edited

Loading