You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: CHANGELOG.md
+57-1Lines changed: 57 additions & 1 deletion
Original file line number
Diff line number
Diff line change
@@ -4,7 +4,63 @@ All notable changes to this project will be documented in this file.
4
4
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
5
5
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6
6
7
-
## [0.16.0] - MMM. DD, YYYY
7
+
## [0.17.0] - May. XX, 2024
8
+
9
+
This release features updated documentation web-page https://intelpython.github.io/dpctl/latest/index.html, adds cumulative reductions,
10
+
and complies with revision [2023.12](https://data-apis.org/array-api/2023.12/) of Python Array API specification.
11
+
12
+
### Added
13
+
14
+
* Added pybind11 caster for ``sycl::half`` to map to/from Python `float` to ``"dpctl4pybind11.hpp"`` header: [gh-1655](https://github.com/IntelPython/dpctl/pull/1655)
15
+
* Added support for DLPack data interchange per Python Array API 2023.12 specification: [gh-1667](https://github.com/IntelPython/dpctl/pull/1667)
16
+
* Implemented `tensor.cumulative_sum`, `tensor.cumulative_prod` and `tensor.cumulative_logsumexp`: [gh-1602](https://github.com/IntelPython/dpctl/pull/1602)
17
+
18
+
### Changed
19
+
20
+
* Expanded documentation for `dpctl`: [gh-1619](https://github.com/IntelPython/dpctl/pull/1619)
* Improved performance of `tensor.reshape` operation with `order="F"` when copying is needed, or requested: [gh-1677](https://github.com/IntelPython/dpctl/pull/1677)
26
+
27
+
### Fixed
28
+
29
+
* Fixed initialization of byte type constants in `dpctl_capi` Python/C API loader class in `"dpctl4pybind11.hpp"`: [gh-1665](https://github.com/IntelPython/dpctl/pull/1665)
30
+
* Fixed crash in `tensor.sort` reported for a CPU device and a CUDA device: [gh-1676](https://github.com/IntelPython/dpctl/pull/1676)
31
+
* Fixed race condition in accumulation kernel for custom operations that caused test failures with AMD CPUs: [gh-1624](https://github.com/IntelPython/dpctl/pull/1624)
32
+
* Fixed comparison operators for mixed signed and unsigned integral types: [gh-1650](https://github.com/IntelPython/dpctl/pull/1650)
33
+
* Support use of index arrays of different integral types in indexing operations: [gh-47](https://github.com/IntelPython/dpctl/pull/1647)
34
+
* Fixed source code to compile for NVidia(TM) GPUs with DPC++ 2024.1: [gh-1630](https://github.com/IntelPython/dpctl/pull/1630)
35
+
* Corrected `tensor.tile` for scalar inputs and empty repetitions: [gh-1628](https://github.com/IntelPython/dpctl/pull/1628)
36
+
* Fixed support for `out` keyword in `tensor.matmul`: [gh-1610](https://github.com/IntelPython/dpctl/pull/1610)
37
+
* Fixed bug in basic slicing of empty arrays: [gh-1680](https://github.com/IntelPython/dpctl/pull/1680)
38
+
* Fixed bug in `tensor.bitwise_invert` for boolean input array: [gh-1681](https://github.com/IntelPython/dpctl/pull/1681)
39
+
* Fixed bug in `tensor.repeat` on zero-size input arrays: [gh-1682](https://github.com/IntelPython/dpctl/pull/1682)
40
+
41
+
42
+
## [0.16.1] - Apr. 10, 2024
43
+
44
+
This is a bug-fix release, which also provides a change needed by ``numba_dpex`` project to support dispatching kernels
45
+
consuming instances of ``sycl::local_accessor`` template type.
46
+
47
+
### Changed
48
+
49
+
* Changed behavior of ``dpctl.tensor.usm_ndarray.__dlpack_device__`` method to return device id of the parent unpartitioned device if array is allocated on a sub-device instead of raising an exception: [#1604](https://github.com/IntelPython/dpctl/pull/1604)
50
+
* Array creation functions and the ``usm_ndarray`` constructor in `dpctl.tensor` submodule now use cached default-selected device to improve performance: [#1606](https://github.com/IntelPython/dpctl/pull/1606)
51
+
* Changed treatment of `axis` keyword for `dpctl.tensor.tensordot` and `dpctl.tensor.vecdot` to align with Python Array API 2023.12 specification: [#1608](https://github.com/IntelPython/dpctl/pull/1608)
52
+
* Changed implementation of `DPCTLQueue_SubmitRange`, `DPCTLQueue_SubmitNDRange` in DPCTLSyclInterface library to support ``sycl::local_accessor`` arguments needed by ``numba_dpex``; the enum `DPCTLKernelArgType` to correspond to C++ disjoint types: [#1609](https://github.com/IntelPython/dpctl/pull/1609), [#1611](https://github.com/IntelPython/dpctl/pull/1611), [#1612](https://github.com/IntelPython/dpctl/pull/1612)
53
+
54
+
### Fixed
55
+
56
+
* Fixed a crash on Windows platform during execution of getter of `dpctl.SyclPlatfom.default_context` property: : [#1604](https://github.com/IntelPython/dpctl/pull/1604)
57
+
* Fixed kernel submission error on NVidia CUDA GPUs during `dpctl.tensor.matmul` operation: [#1605](https://github.com/IntelPython/dpctl/pull/1605)
58
+
* Fixed corruption of context cache table entries: [#1607](https://github.com/IntelPython/dpctl/pull/1607)
59
+
* Fixed incorrect result from ``dpctl.tensor.tensordot`` reported in issue [#1570](https://github.com/IntelPython/dpctl/issues/1570): [#1608](https://github.com/IntelPython/dpctl/pull/1608)
60
+
* Fixed library name output by ``python -m dpctl --library``: [#1615](https://github.com/IntelPython/dpctl/pull/1615)
61
+
62
+
63
+
## [0.16.0] - Feb. 16, 2024
8
64
9
65
This release will require DPC++ 2024.1.0, which no longer supports Intel Gen9 integrated GPUs found in Intel CPUs of 10th generation and older.
0 commit comments