Skip to content

[CANN]Support operator SIN COS ARGMAX #12709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 3, 2025

Conversation

noemotiovon
Copy link
Contributor

Why is this PR needed?

Optimize the sin , cos, argmax operator in the CANN backend with the aclnn acceleration library.

Test

  # SIN
  SIN(type=f16,ne=[10,2,2,2]): OK
  SIN(type=f32,ne=[10,2,2,2]): OK
  5294/5294 tests passed
  Backend CANN0: OK
  
  # COS
  COS(type=f16,ne=[10,2,2,2]): OK
  COS(type=f32,ne=[10,2,2,2]): OK
  5294/5294 tests passed
  Backend CANN0: OK
  
  #ARGMAX
  ARGMAX(type=f32,ne=[32,1,1,1]): OK
  ARGMAX(type=f32,ne=[100,10,1,1]): OK
  ARGMAX(type=f32,ne=[1024,10,1,1]): OK
  ARGMAX(type=f32,ne=[1024,12,1,1]): OK
  ARGMAX(type=f32,ne=[2000,10,1,1]): OK
  ARGMAX(type=f32,ne=[5438,3,1,1]): OK
  5294/5294 tests passed
  Backend CANN0: OK

Signed-off-by: noemotiovon <[email protected]>
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Apr 2, 2025
ACL_CHECK(aclnnArgMax(workspaceAddr, workspaceSize, executor, ctx.stream()));

size_t cpy_size = ggml_nbytes(dst);
ACL_CHECK(aclrtMemcpyAsync(dst->data, cpy_size, buffer, cpy_size,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra copy here is necessary because the shape computed by the aclnn operator differs from dst, so a buffer is used to hold the data before copying it back to dst.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to alloc a extra buffer. use dst->data instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your suggestion! I forgot to directly modify dst->data's shape, haha.

@hipudding hipudding self-requested a review April 3, 2025 00:51
@hipudding hipudding added the Ascend NPU issues specific to Ascend NPUs label Apr 3, 2025

aclTensor* acl_src = ggml_cann_create_tensor(src0);
aclTensor* acl_dst = ggml_cann_create_tensor(dst);
aclnn_argmax(ctx, acl_src, acl_dst, dst);
Copy link
Collaborator

@hipudding hipudding Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems aclnn_argmax need dst, and acl_dst can be created in aclnn_argmax, and do not need pass as a parameter.
Besides, If aclnn_argmax is not used by other internal functions, these two function should combine together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will inline this method.

noemotiovon added 2 commits April 3, 2025 01:29
Signed-off-by: noemotiovon <[email protected]>
Signed-off-by: noemotiovon <[email protected]>
@hipudding hipudding merged commit 65cfe13 into ggml-org:master Apr 3, 2025
48 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ascend NPU issues specific to Ascend NPUs ggml changes relating to the ggml tensor library for machine learning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants