Implements dpctl.tensor.repeat, dpctl.tensor.tile #1381

Merged
merged 9 commits into from
Sep 13, 2023

ndgrigorian
Collaborator

@ndgrigorian ndgrigorian commented Aug 31, 2023

This pull request implements dpctl.tensor.repeat, as well as changes necessary to include it in dpctl.

This function repeats the elements of an array along a given axis, and accepts integers, tuples, and usm_ndarrays for the number of repetitions. The basic approach, where repeats is a scalar, is implemented as _repeat_by_scalar in the _tensor_impl submodule. The more complicated case, where repeats is a tuple, is implemented as _repeat_by_sequence.

To implement _repeat_by_sequence, kernels for cumulative sums were moved into a separate header, and a 1D cumulative sum of general integers (rather than just nonzero) was added.
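The role of the cumulative sum can be sketched on the host with NumPy. This is only an illustration of the idea, not the kernel code from this pull request: the function name `repeat_by_sequence_sketch` is hypothetical, and the sketch handles the 1D case only. The inclusive cumulative sum of `repeats` gives, for each source element, the position where its run of copies ends in the output.

```python
import numpy as np


def repeat_by_sequence_sketch(x, repeats):
    """Host-side illustration of a per-element repeat (hypothetical helper)."""
    x = np.asarray(x)
    repeats = np.asarray(repeats, dtype=np.int64)
    if x.ndim != 1 or repeats.shape != x.shape:
        raise ValueError("sketch supports 1D x with one repeat count per element")
    if np.any(repeats < 0):
        raise ValueError("repeats must be non-negative")

    # Inclusive cumulative sum: ends[i] is one past the last output slot
    # that belongs to x[i]; the last entry is the total output length.
    ends = np.cumsum(repeats)
    total = int(ends[-1]) if ends.size else 0

    out = np.empty(total, dtype=x.dtype)
    for i in range(x.size):
        start = ends[i] - repeats[i]
        out[start:ends[i]] = x[i]
    return out


result = repeat_by_sequence_sketch([10, 20, 30], [2, 0, 3])
```

Because each element's output range depends only on the cumulative sum, the per-element writes are independent of one another, which is presumably why a general integer cumulative sum was needed on the device side.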

An example of the new functionality:

In [1]: import dpctl.tensor as dpt, numpy as np

In [2]: x = dpt.reshape(dpt.arange(5*10), (5, 10))

In [3]: x
Out[3]:
usm_ndarray([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
             [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
             [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
             [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
             [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])

In [4]: dpt.repeat(x, 2, axis=0)
Out[4]:
usm_ndarray([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
             [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
             [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
             [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
             [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
             [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
             [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
             [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
             [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
             [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])

dpctl.tensor.tile is also implemented, as a mostly top-level function.
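One way a tile function can be composed from existing array operations is sketched below with NumPy. This is an assumption about what a "top-level" implementation might look like, not the code added in this pull request; `tile_sketch` is a hypothetical name. The idea is to insert a length-1 axis in front of every input axis, broadcast those axes to the repetition counts, and collapse each pair of axes back into one.

```python
import numpy as np


def tile_sketch(x, reps):
    """Tile x using reshape and broadcast only (hypothetical helper)."""
    x = np.asarray(x)
    reps = (reps,) if isinstance(reps, int) else tuple(reps)

    # Left-pad the shorter of (shape, reps) with ones so they align.
    nd = max(x.ndim, len(reps))
    shape = (1,) * (nd - x.ndim) + x.shape
    reps = (1,) * (nd - len(reps)) + reps

    # Shape (1, s0, 1, s1, ...) broadcast to (r0, s0, r1, s1, ...).
    expanded = x.reshape(tuple(v for s in shape for v in (1, s)))
    target = tuple(v for r, s in zip(reps, shape) for v in (r, s))
    tiled = np.broadcast_to(expanded, target)

    # Merging each (r, s) pair yields the final shape (r0*s0, r1*s1, ...).
    return tiled.reshape(tuple(r * s for r, s in zip(reps, shape)))


a = np.arange(6).reshape(2, 3)
tiled = tile_sketch(a, (2, 2))
```

Here `tiled` has shape `(4, 6)` and matches `np.tile(a, (2, 2))`.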

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you opening the PR as a draft?

@coveralls
Collaborator

coveralls commented Aug 31, 2023

Coverage Status

coverage: 85.774% (+0.1%) from 85.65% when pulling 459d209 on repeat-impl into 51d994a on master.

@github-actions

Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_40 ran successfully.
Passed: 915
Failed: 85
Skipped: 119

@oleksandr-pavlyk
Contributor

dpt.repeat(dpt.empty(tuple()), 2) requires axis to be specified, but the input array has no axes. Perhaps axis=None should also be allowed in this case.
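For comparison, NumPy defaults to axis=None, which flattens the input before repeating, so a zero-dimensional array is accepted. The snippet below shows only the NumPy behavior; whether dpctl.tensor.repeat ended up matching it is not stated in this thread.

```python
import numpy as np

# A zero-dimensional array has no axes to repeat along.
x0 = np.asarray(7)

# With the default axis=None, NumPy flattens the input first,
# so the result is one-dimensional.
r = np.repeat(x0, 2)
```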

@github-actions

github-actions bot commented Sep 2, 2023

Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_41 ran successfully.
Passed: 916
Failed: 84
Skipped: 119

@github-actions

github-actions bot commented Sep 3, 2023

Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_44 ran successfully.
Passed: 915
Failed: 85
Skipped: 119

        hev.wait()
    else:
        repeats = dpt.asarray(repeats, dtype="i8", sycl_queue=exec_q)
        if not dpt.all(repeats >= 0):
Contributor

Once we have tensor.min this could be made more efficient: dpt.min(repeats) >= 0
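The two forms of the check can be compared in NumPy as a stand-in for the dpctl calls. The function names are hypothetical. The `all` form first builds a temporary boolean array the size of `repeats`, while the `min` form reduces directly to a single value, which is presumably the efficiency gain the comment has in mind. One caveat, at least in NumPy, is that `min` of an empty array raises an error, so the sketch guards that case explicitly.

```python
import numpy as np


def nonnegative_via_all(repeats):
    # Builds a boolean temporary, then reduces it.
    return bool(np.all(repeats >= 0))


def nonnegative_via_min(repeats):
    # Single reduction; np.min raises ValueError on empty input,
    # so an empty array is treated as trivially valid.
    return repeats.size == 0 or bool(np.min(repeats) >= 0)


good = np.array([0, 3, 1], dtype=np.int64)
bad = np.array([2, -1, 4], dtype=np.int64)
empty = np.array([], dtype=np.int64)
```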

@oleksandr-pavlyk
Contributor

@ndgrigorian Since the repeats argument can be rather large and, in particular, may already have been computed as a usm_ndarray, it is wasteful to force users to bring it to the host and convert it to a tuple, only for it to be copied back into a usm_ndarray again.

I think we must support passing the repeats parameter as a usm_ndarray. A use case I have in mind is one where the repeats sequence is a sample from a multinomial distribution.
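The multinomial use case can be illustrated with NumPy arrays standing in for usm_ndarrays; the probabilities, sample size, and seed below are made up for the example. The counts array returned by the sampler is passed straight to repeat, with no conversion to a tuple in between.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw 10 items over 3 categories; counts[i] is how many fell in category i.
counts = rng.multinomial(10, [0.2, 0.3, 0.5])

# Expand the counts into an array of category labels by passing the
# counts array directly as the per-element number of repetitions.
labels = np.repeat(np.arange(3), counts)
```

The length of `labels` equals the number of draws, and counting the labels per category recovers `counts`.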

@ndgrigorian ndgrigorian changed the title from "Implements dpctl.tensor.repeat" to "Implements dpctl.tensor.repeat, dpctl.tensor.tile" Sep 12, 2023
@github-actions

Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_10 ran successfully.
Passed: 916
Failed: 84
Skipped: 119

@oleksandr-pavlyk
Contributor

Thank you @ndgrigorian! Looks great.

@ndgrigorian ndgrigorian merged commit 83fff33 into master Sep 13, 2023
@github-actions

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

@ndgrigorian ndgrigorian deleted the repeat-impl branch September 20, 2023 07:58
3 participants