Implements dpctl.tensor.repeat, dpctl.tensor.tile #1381
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1381/index.html
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_40 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_41 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_44 ran successfully.
        hev.wait()
    else:
        repeats = dpt.asarray(repeats, dtype="i8", sycl_queue=exec_q)
        if not dpt.all(repeats >= 0):
Once we have tensor.min, this could be made more efficient: dpt.min(repeats) >= 0
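To make the suggestion above concrete, here is a minimal sketch in plain Python (lists stand in for device arrays; the helper names are hypothetical and not part of dpctl). It assumes the efficiency gain comes from replacing an elementwise comparison followed by a reduction with a single min reduction, which the comment does not state explicitly. It also shows one difference to keep in mind: a min over an empty sequence has no defined result, so that case needs explicit handling.

```python
def nonnegative_via_all(repeats):
    # Mirrors `dpt.all(repeats >= 0)`: compare every element, then reduce.
    return all(r >= 0 for r in repeats)


def nonnegative_via_min(repeats):
    # Mirrors `dpt.min(repeats) >= 0`: one reduction, one scalar comparison.
    # Python's min() raises ValueError on an empty sequence, so guard it here.
    if len(repeats) == 0:
        return True
    return min(repeats) >= 0


for candidate in ([2, 0, 3], [1, -1, 4], []):
    print(candidate, nonnegative_via_all(candidate), nonnegative_via_min(candidate))
```

Both helpers agree on every input in the loop, including the empty one.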
@ndgrigorian Since I think we must support |
Doing this will make implementing more accumulators convenient
- Also adds a check that the sole element of a length 1 tuple is an integer before proceeding to the scalar case
52a44a9 to 459d209
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_10 ran successfully.
Thank you @ndgrigorian! Looks great.
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
This pull request implements dpctl.tensor.repeat, as well as the changes necessary to include it in dpctl.
This function repeats the elements of an array along a given axis, and accepts integers, tuples, and usm_ndarrays for the number of repetitions. The basic approach, where repeats is a scalar, is implemented as _repeat_by_scalar in the _tensor_impl submodule. The more complicated case, where repeats is a tuple, is implemented as _repeat_by_sequence.
To implement _repeat_by_sequence, kernels for cumulative sums were moved into a separate header, and a 1D cumulative sum of general integers (rather than just nonzero) was added.
An example of the new functionality:
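As an illustration only (this is not the example from the original description, and it does not use dpctl), the sketch below mimics the two cases on plain Python lists. The function names echo _repeat_by_scalar and _repeat_by_sequence but are hypothetical stand-ins. The way the cumulative sum is used here, to compute where each element's run ends in the output, is an assumption about why the cumulative-sum kernels were needed, not a description of the actual kernels.

```python
from itertools import accumulate


def repeat_by_scalar(values, reps):
    """Repeat every element of a 1D list `reps` times."""
    if reps < 0:
        raise ValueError("repeats must be non-negative")
    return [v for v in values for _ in range(reps)]


def repeat_by_sequence(values, reps):
    """Repeat values[i] exactly reps[i] times."""
    if len(values) != len(reps):
        raise ValueError("values and reps must have the same length")
    if any(r < 0 for r in reps):
        raise ValueError("repeats must be non-negative")
    # Inclusive cumulative sum: ends[i] is one past the last output
    # slot that belongs to values[i].
    ends = list(accumulate(reps))
    total = ends[-1] if ends else 0
    out = [None] * total
    for i, v in enumerate(values):
        start = ends[i] - reps[i]
        for j in range(start, ends[i]):
            out[j] = v
    return out


print(repeat_by_scalar([1, 2, 3], 2))            # [1, 1, 2, 2, 3, 3]
print(repeat_by_sequence([1, 2, 3], [2, 0, 3]))  # [1, 1, 3, 3, 3]
```

Because each element's output range depends only on the cumulative sum, every element can be written independently once the offsets are known, which is what makes this formulation a natural fit for a parallel kernel.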
dpctl.tensor.tile is also implemented as a mostly top-level function.
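For readers unfamiliar with tiling, the sketch below shows the intended result on a 2D nested list in plain Python. The helper tile_2d is hypothetical and says nothing about how dpctl.tensor.tile is implemented internally; it only illustrates that the whole array is repeated as a block along each axis, in contrast to repeat, which duplicates individual elements in place.

```python
def tile_2d(rows, reps):
    """Tile a 2D nested list; reps is (row_repetitions, column_repetitions)."""
    row_reps, col_reps = reps
    if row_reps < 0 or col_reps < 0:
        raise ValueError("repetitions must be non-negative")
    # Repeat each row's contents along the columns, then repeat the
    # resulting block of rows along the first axis.
    return [list(row) * col_reps for _ in range(row_reps) for row in rows]


for row in tile_2d([[1, 2], [3, 4]], (2, 3)):
    print(row)
# [1, 2, 1, 2, 1, 2]
# [3, 4, 3, 4, 3, 4]
# [1, 2, 1, 2, 1, 2]
# [3, 4, 3, 4, 3, 4]
```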