Upgrade init_tensor API to return a ggml_status #11854


Merged: 2 commits into ggml-org:master on Feb 28, 2025

Conversation

WilliamTambellini (Contributor)

To prepare for an 'abort-free' ggml, as agreed with Diego in the ggml repo, upgrade the backend init_tensor APIs to return a ggml_status.
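Concretely, the buffer interface's init_tensor callback goes from returning void to returning an enum ggml_status, so callers can propagate failures instead of aborting. A minimal sketch of the change (treat the exact signatures as illustrative; the interface field lives in ggml-backend-impl.h):

    // before: the backend had no way to report a failure to the caller
    // void             (*init_tensor)(ggml_backend_buffer_t buffer, struct ggml_tensor * tensor);

    // after: tensor initialization reports a ggml_status
    enum ggml_status (*init_tensor)(ggml_backend_buffer_t buffer, struct ggml_tensor * tensor);

    // callers can then check and propagate instead of aborting:
    enum ggml_status status = ggml_backend_buffer_init_tensor(buffer, tensor);
    if (status != GGML_STATUS_SUCCESS) {
        GGML_LOG_WARN("%s: init_tensor failed: %s\n", __func__, ggml_status_to_string(status));
        return status; // assumes the enclosing function also returns a ggml_status
    }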


github-actions bot added labels: testing, Nvidia GPU, Vulkan, ggml, SYCL (Feb 14, 2025)
WilliamTambellini (Contributor Author)

@slaren review please. Thanks.

WilliamTambellini (Contributor Author)

Thanks @slaren.
Re-ready for review.

WilliamTambellini force-pushed the init_tensor branch 4 times, most recently from e2486eb to 51a0f6c on February 18, 2025.
graehl left a comment (marked as outdated)

OK, so ggml_backend_*_buffer_init_tensor can only return success for most backends, but since it's called through the interface init_tensor pointer they still need to return a status. Was the plan to eventually make cuda_init_tensor sometimes return an error?

WilliamTambellini (Contributor Author)

Thanks @graehl.

> so ggml_backend_*_buffer_init_tensor can only return success for most backends, but since it's called through the interface init_tensor pointer they still need to return a status. Was the plan to eventually make cuda_init_tensor sometimes return an error?
Yes, but that is another PR in the ggml repo.
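For most backends the implementation is then a one-liner that returns success, while a fallible backend such as CUDA could later report a real error. A hedged sketch of both cases (the function and helper names here are hypothetical, not this PR's code):

    // a backend whose tensor initialization cannot fail simply returns success
    static enum ggml_status my_backend_buffer_init_tensor(ggml_backend_buffer_t buffer,
                                                          struct ggml_tensor * tensor) {
        (void) buffer; (void) tensor;
        return GGML_STATUS_SUCCESS;
    }

    // a future fallible path, e.g. for CUDA (illustrative only, not part of this PR)
    static enum ggml_status my_cuda_buffer_init_tensor(ggml_backend_buffer_t buffer,
                                                       struct ggml_tensor * tensor) {
        if (!my_cuda_prepare_tensor(buffer, tensor)) { // hypothetical helper
            return GGML_STATUS_ALLOC_FAILED;           // e.g. out of device memory
        }
        return GGML_STATUS_SUCCESS;
    }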

WilliamTambellini (Contributor Author)

@slaren re-ready for review, please. Best.

matiaslin (Contributor) left a comment

Good step forward towards the goal of returning an error instead of crashing.

WilliamTambellini (Contributor Author)

@ggerganov review please.

Member commented on lines 959 to 983:

            enum ggml_status status = ggml_backend_view_init(t);
            if (status != GGML_STATUS_SUCCESS) {
                GGML_LOG_WARN("%s: failed to ggml_backend_view_init: %s\n", __func__, ggml_status_to_string(status));
                return false;
            }
        }
    } else {
        if (t->view_src != NULL && t->buffer == NULL) {
            // view of a pre-allocated tensor
            enum ggml_status status = ggml_backend_view_init(t);
            if (status != GGML_STATUS_SUCCESS) {
                GGML_LOG_WARN("%s: failed to ggml_backend_view_init: %s\n", __func__, ggml_status_to_string(status));
                return false;
            }

This will leak memory if it fails.
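That is: by this point a backend buffer has already been allocated for the graph, so bailing out with a bare return false abandons it. A hedged sketch of the kind of cleanup the reviewer is pointing at (the variable name 'buffer' is illustrative, standing in for whatever was allocated earlier in the function):

    enum ggml_status status = ggml_backend_view_init(t);
    if (status != GGML_STATUS_SUCCESS) {
        GGML_LOG_WARN("%s: failed to ggml_backend_view_init: %s\n", __func__, ggml_status_to_string(status));
        // release the already-allocated buffer before bailing out,
        // otherwise it is leaked on the error path
        ggml_backend_buffer_free(buffer);
        return false;
    }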

To prepare for an 'abort-free' ggml
(ggml not to abort on OOMs but to return an OOM status),
as agreed with Diego in the ggml repo,
upgrade the init_tensor() and view_init() APIs
to return a ggml_status.
slaren merged commit 70680c4 into ggml-org:master on Feb 28, 2025.
47 checks passed
ag2s20150909 mentioned this pull request on Mar 3, 2025.
WilliamTambellini (Contributor Author)

@ggerganov I now have to retouch my PR in ggml. Could you please trigger a sync of ggml from llama.cpp to the ggml repo?

ggerganov (Member)

@WilliamTambellini Should be good now.

mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request on Mar 8, 2025:

* Upgrade init_tensor API to return a ggml_status

To prepare for an 'abort-free' ggml
(ggml not to abort on OOMs but to return an OOM status),
as agreed with Diego in the ggml repo,
upgrade the init_tensor() and view_init() APIs
to return a ggml_status.

* misc fixes

---------

Co-authored-by: slaren <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request on Mar 19, 2025 (same commit message as above).
mostlyuseful pushed a commit to mostlyuseful/llama.cpp that referenced this pull request on May 12, 2025 (same commit message as above).