
Commit 53d1e55

[SYCL][Doc] Add release notes for July release (#4190)
1 parent 104b45e commit 53d1e55

1 file changed

+239
-0
lines changed

sycl/ReleaseNotes.md

Lines changed: 239 additions & 0 deletions
@@ -1,3 +1,242 @@
1+
# July'21 release notes
2+
3+
Release notes for commit range 6a49170027fb..962909fe9e78
4+
5+
## New features
6+
- Implemented SYCL 2020 specialization constants [07b27965] [ba3d657]
7+
[bd8dcf4] [d15b841]
8+
- Provided SYCL 2020 function objects [24a2ad89]
9+
- Added support for ITT notification in SYCL Runtime [a7b8daf] [8d3921e3]
10+
- Implemented SYCL 2020 sub_group algorithms [e8caf6c3]
11+
- Implemented SYCL 2020 `sycl::handler::host_task` method [75e5a269]
12+
- Implemented SYCL 2020 `sycl::sub_group` class [19dcac79]
13+
- Added support for AMD GPU devices [ec612228]
14+
- Implemented SYCL 2020 `sycl::is_device_copyable` type trait [44c1cbcd]
15+
- Implemented SYCL 2020 USM features [1df6873d]
16+
- Implemented support for Device UUID from [Intel's Extensions for Device Information](doc/extensions/IntelGPU/IntelGPUDeviceInfo.md) [25aee287]
17+
- Implemented SYCL 2020 `sycl::atomic_fence` [dcd59547]
18+
- Implemented `intel::loop_count_min`, `intel::loop_count_max`,
19+
`intel::loop_count_avg` attributes that allow specifying the number of loop
20+
iterations for FPGA [f74b4ef]
21+
- Implemented generation of compiler report for kernel arguments [201f902]
22+
- Implemented SYCL 2020 `[[reqd_sub_group_size]]` attribute [347e41c]
23+
- Implemented support for `[[intel::named_sub_group_size(primary)]]` attribute
24+
from [sub-group extension](doc/extensions/SubGroup/SYCL_INTEL_sub_group.asciidoc#attributes)
25+
[347e41c]
26+
- Implemented SYCL 2020 interoperability API [e6733e4]
27+
- Added [group sorting algorithm](doc/extensions/GroupAlgorithms/SYCL_INTEL_group_sort.asciidoc)
28+
extension specification [edaee9b]
29+
- Added [initial draft](doc/extensions/LevelZeroBackend/LevelZeroBackend.md)
30+
for querying free device memory in the Level Zero backend extension [fa428bf]
31+
- Added [InvokeSIMD](doc/extensions/InvokeSIMD/InvokeSIMD.asciidoc) and
32+
[Uniform](doc/extensions/Uniform/Uniform.asciidoc) extensions [72e1611]
33+
- Added [Matrix Programming Extension for DPC++ document](doc/extensions/Matrix/dpcpp-joint-matrix.asciidoc) [ace4c733]
34+
- Implemented SYCL 2020 `sycl::span` [9356d53]
35+
- Added [device-if](doc/extensions/DeviceIf/device_if.asciidoc) extension
36+
[4fb95fc]
37+
- Added a [programming guide](doc/MultiTileCardWithLevelZero.md) for
38+
multi-tile and multi-card under Level Zero backend [d581178a]
39+
- Implemented SYCL 2020 `sycl::bit_cast` [d4b66bd]
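
The `sycl::bit_cast` entry above refers to a function that reinterprets the object representation of one trivially copyable type as another type of the same size, much like C++20 `std::bit_cast`. The sketch below shows those semantics in plain C++ so that it compiles without a SYCL toolchain. The names `bit_cast_sketch` and `float_bits` are hypothetical and are not part of DPC++; the example also assumes a 32-bit IEEE 754 `float` and requires `To` to be default-constructible, which the real function may not.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in illustrating bit_cast semantics; not the DPC++ code.
// Copies the bytes of `from` into a fresh object of type To.
template <typename To, typename From>
To bit_cast_sketch(const From &from) {
  static_assert(sizeof(To) == sizeof(From),
                "source and destination must have the same size");
  static_assert(std::is_trivially_copyable<From>::value,
                "source must be trivially copyable");
  static_assert(std::is_trivially_copyable<To>::value,
                "destination must be trivially copyable");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Returns the raw bit pattern of a float (assumes 32-bit IEEE 754).
inline std::uint32_t float_bits(float f) {
  return bit_cast_sketch<std::uint32_t>(f);
}
```

In device code the same call shape would be written with `sycl::bit_cast<To>(from)`, assuming a compiler from this release or later.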
40+
41+
## Improvements
42+
### SYCL Compiler
43+
- Use `opencl-aot` instead of `aoc` when AOT flow for FPGA Emulator is
44+
triggered [3a99558]
45+
- Allowed for using an external host compiler [6f0ad1a]
46+
- Cleaned up preprocessing output when `-fsycl` option is passed [3a18db6]
47+
- Allowed kernel names in anonymous namespace [e47dbad]
48+
- Set default value of `-sycl-std` to 2020 for SYCL enabled compilations
49+
[680adc0]
50+
- Added implication of `-fPIC` compilation for wrapped object when using
51+
`-shared` [1754934]
52+
- Added a diagnostic for `-fsycl` and `-ffreestanding` as non-supported
53+
combination [a36c6720]
54+
- [ESIMD] Renamed `simd::format` to `simd::bit_cast_view` [653dede1]
55+
- Allowed `[[sycl::work_group_size_hint]]` to accept constant expr args
56+
[ef8e4019]
57+
- Deprecated the old-style SYCL attributes according to the SYCL 2020 spec
58+
[001bbd42]
59+
- Deprecated `[[intel::reqd_work_group_size]]` attribute spelling, please use
60+
`[[sycl::reqd_work_group_size]]` instead [8ef7eacc]
61+
- Enabled native FP atomics by default. Defining the
62+
`SYCL_USE_NATIVE_FP_ATOMICS` macro explicitly is no longer required - it is
63+
now automatically defined for hardware targets with "native" support for
64+
atomic functions. [0bbb68ee]
65+
- Switched to ignoring `-O0` option for device code when compiling for FPGA
66+
with hardware [7d94edf4]
67+
- Allowed for known aliases to be used for `-fsycl-targets`. Passing
68+
`*-unknown-unknown-sycldevice` components of the SYCL target triple is no
69+
longer necessary. [9778952a]
70+
- [ESIMD] Added support for half type in ESIMD intrinsics [d5958ebf]
71+
- Implemented `sycl::kernel::get_kernel_bundle` method [69a68a6d]
72+
- Added a diagnostic in case of timing issues for FPGA AOT [c69a3115]
73+
- Added support for C `memcpy` usages in the device code [76051ccf]
74+
- [ESIMD] Added support for vectorizing scalar function [3fc66cc]
75+
- Disabled vectorization and loop transformation passes because loop unrolling
76+
in "SYCL optimization mode" used default heuristic, which is tuned the code
77+
for CPU and might not have been profitable for other devices [ff6929e6]
78+
### SYCL Library
79+
- Added an exception throw if no matched device is found when
80+
`SYCL_DEVICE_FILTER` is set regardless of `device_selector` used [ef4e6dd]
81+
- Changed event status update to complete without waiting when run on CUDA
82+
devices [be7c1cb]
83+
- Improved performance when executing with dynamic batching on Level Zero
84+
backend [fa382d6]
85+
- Introduced pooling for USM and buffer allocations in Level Zero backend
86+
[4cffedd]
87+
- Added support for vectors with length of 3 and 16 elements in sub-group load
88+
and store operations [4e6452d]
89+
- Added interop types for images for Level Zero and OpenCL backends [a58cfef]
90+
- Improved plugins discovery - continue discovering even if a plugin fails to
91+
load [8c07803]
92+
- Implemented queries for IEEE rounded `sqrt`/`div` in Level Zero backend
93+
[91b35c4]
94+
- Added SYCL 2020 `interop_handle::get_backend()` method [041ca27]
95+
- [ESIMD] Deprecated `block_load`/`block_store` and
96+
`simd::copy_from`/`simd::copy_to` [5c41ed6]
97+
- Allowed for `const` and `volatile` pointer in sub-group `load` operation
98+
[50edee4]
99+
- Replaced use of `interop<>` with SYCL 2020 `backend_return_t<>` in
100+
`interop_handle` [d08c21a]
101+
- [ESIMD] Moved ESIMD APIs to `sycl::ext::intel::experimental::esimd` namespace
102+
[92da579]
103+
- Added global offset support for Level Zero backend [9ca2f911]
104+
- [ESIMD] Changed `simd::replicate` API by adding suffixes into the names to
105+
reflect the order of template arguments [e45408ad]
106+
- Introduced `SYCL_REDUCTION_DETERMINISTIC` macro which forces reduction
107+
algorithms to produce stable results [a3fc51a4]
108+
- Improved `SYCL_DEVICE_ALLOWLIST` format [9216b49d]
109+
- Added `SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING` macro to disable range
110+
rounding [5c4275ac]
111+
- Disabled range rounding by default when compiling for FPGA [5c4275ac]
112+
- Deprecated `sycl::buffer::get_count()`, please use `sycl::buffer::size()`
113+
instead [baf2ed9d]
114+
- Implemented `sycl::group_barrier` free function [48363902]
115+
- Added support of [SYCL_INTEL_enqueue_barrier extension](doc/extensions/EnqueueBarrier/enqueue_barrier.asciidoc) for CUDA backend [2e978482]
116+
- Deprecated `has_extension` method of `sycl::device` and `sycl::platform`
117+
classes, please use `has` method with aspects APIs instead [51c747da]
118+
- Deprecated `sycl::*_class` types, please use STL classes instead [51c747da]
119+
- Deprecated `sycl::nd_range` with an offset [51c747da]
120+
- Deprecated `barrier` and `mem_fence` methods of `sycl::nd_item` class,
121+
please use `sycl::group_barrier()` and `sycl::atomic_fence()` free functions
122+
instead [51c747da]
123+
- Deprecated `sycl::byte`, please use `std::byte` instead [51c747da]
124+
- Deprecated `sycl::info::device::max_constant_buffer_size` and
125+
`sycl::info::device::max_constant_args` [51c747da]
126+
- Deprecated `sycl::ext::intel::fpga_reg` taking non-trivially copyable
127+
structs [b4c322a8]
128+
- Added support for `sycl::property::queue::cuda::use_default_stream` queue
129+
property [08330525]
130+
- Switched to using atomic version of reductions if `sycl::aspect::atomic64`
131+
is available for a target [544fb7c8]
132+
- Added support for `sycl::aspect::fp16` for CUDA backend [db20bab3]
133+
- Deprecated `sycl::aspect::usm_system_allocator`, please use
134+
`sycl::aspect::usm_system_allocations` instead [000cc82d]
135+
- Optimized `sycl::queue::wait` to wait for batch of events rather than
136+
waiting for each event individually [7fe72dba]
137+
- Deprecated `sycl::ONEAPI::atomic_fence`, please use `sycl::atomic_fence`
138+
instead [dcd59547]
139+
- Added constexpr constructor for `sycl::half` type [5759e2a1]
140+
- Added support for more than 4 GB device allocations in Level Zero backend
141+
[fb1808b8]
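
The two range-rounding entries above (`SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING` and the FPGA default) are easier to follow with the general idea in view. The sketch below assumes that range rounding means launching a padded range that is a multiple of some factor and skipping the work-items that fall outside the user's range. This is an illustration in plain C++, not the runtime's implementation: `round_up`, `run_rounded`, and the factor 32 used in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Rounds n up to the next multiple of `multiple` (which must be non-zero).
inline std::size_t round_up(std::size_t n, std::size_t multiple) {
  assert(multiple > 0);
  return ((n + multiple - 1) / multiple) * multiple;
}

// Emulates a launch over the rounded range on the host: every id in the
// padded range is visited, but `body` runs only for ids inside the range
// the user asked for. Returns the number of ids that were "launched".
template <typename Body>
std::size_t run_rounded(std::size_t user_range, std::size_t multiple,
                        Body body) {
  const std::size_t launched = round_up(user_range, multiple);
  for (std::size_t id = 0; id < launched; ++id) {
    if (id < user_range) // guard that masks off the padding
      body(id);
  }
  return launched;
}
```

With a user range of 1000 and a factor of 32, `run_rounded` visits 1024 ids and calls the body 1000 times. A cost of this shape (extra guarded work-items) is one plausible reason to want the rounding disabled on some targets, though the release notes do not state the motivation.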
142+
143+
### Documentation
144+
- Updated [sub-group algorithms](doc/extensions/SubGroupAlgorithms/SYCL_INTEL_sub_group_algorithms.asciidoc)
145+
extension to use `marray` instead of `vec` [98715ae]
146+
- Updated data flow pipes extension to be based on SYCL 2020 [f22f2e0]
147+
- Updated [ESIMD documentation](doc/extensions/ExplicitSIMD/dpcpp-explicit-simd.md)
148+
reflecting recent API changes [1e0bd1ed]
149+
- Updated [devicelib](doc/extensions/C-CXX-StandardLibrary/C-CXX-StandardLibrary.rst)
150+
extension document with `scalbn`, `abs` and `div` (and their variants) as
151+
supported [febfb5a]
152+
- Addressed renaming of TBB dll to `tbb12.dll` in the
153+
[install script](tools/install.bat) [25433ba]
154+
155+
## Bug fixes
156+
### SYCL Compiler
157+
- Fixed a crash that could happen in corner cases when a null attribute was created
158+
[cec6469]
159+
- Fixed crash when lowering `__sycl_allocateLocalMemory` [4960e71]
160+
- Fixed workflow for multi-file compilation in AOT mode [a0099a5]
161+
- Fixed problem with unbundling from object for device archives for FPGA
162+
[25ea6e1]
163+
- Stopped implying `defaultlib msvcrt` for Linux based driver on Windows
164+
[d3dc212d]
165+
- Fixed handling of `[[intel::max_global_work_dim()]]` in case of
166+
redeclarations [9b615928]
167+
- Fixed incorrect diagnostics in the presence of OpenMP [cbec0b5f]
168+
- Fixed an issue with incorrect output project report when using `-o` option
169+
with FPGA AOT enabled [18ac1723]
170+
- Removed the restriction that prevented applying the
171+
`[[intel::use_stall_enable_clusters]]` attribute to ANY function [15da879d]
172+
- Fixed bugs with recursion in SYCL kernels - diagnostics won't be emitted on
173+
using recursion in a discarded branch and in constexpr context [9a9a018c]
174+
- Fixed handling of `intel::use_stall_enable_clusters` attribute [06e4ebc7]
175+
### SYCL Library
176+
- Fixed build issue when CUDA 11 is used [f7224f1]
177+
- Fixed caching of sub-devices in Level Zero backend [4c34f93]
178+
- Fixed requesting of USM memory allocation info on CUDA [691f842]
179+
- Fixed [`joint_matrix_mad`](doc/extensions/Matrix/dpcpp-joint-matrix.asciidoc)
180+
behaviour to return `A*B+C` instead of assigning the result to `C` [ea59c2b]
181+
- Worked around an issue in Level Zero backend where an event is never waited on
182+
for completion but is queried for its status in an infinite loop [bfef316]
183+
- Fixed persistent cache key comparison (esp. when there is `\0` symbol)
184+
[3e9ed1d]
185+
- Fixed a build issue when `sycl::kernel::get_native` is used [eb17836]
186+
- Fixed collisions of helper functions and SPIR-V operations when building with
187+
`-O0` or `-O1` [9f2fd98] [c2d6cfa]
188+
- [OpenCL] Fixed false-positive assertion trigger when allocation alignment is
189+
expected [3351916ad]
190+
- Aligned behavior of empty command groups with SYCL 2020 [1cf697bd]
191+
- Fixed build options handling when they come from different sources
192+
[67411472]
193+
- Fixed host task CUDA native memory handle [e9cf124b6]
194+
- Fixed a memory leak which could happen if a command submission fails
195+
[67eac4bd]
196+
- Fixed support for math functions `floor/rndd/rndu/rndz/rnde` in ESIMD mode
197+
[de694dd8]
198+
- Fixed memory allocations for multi-device contexts on Level Zero [f83c9356a]
199+
- Renamed `sycl::property::noinit` property to `sycl::property::no_init` in
200+
accordance to final SYCL 2020 specification, the old spelling is deprecated
201+
[ad46b641]
202+
- Use local size specified in `[[sycl::reqd_work_group_size]]` if no local
203+
size is explicitly passed [0a54bef2]
204+
- Disabled persistent device code caching by default since it doesn't reliably
205+
identify driver version change [48f6bc9e]
206+
- [ESIMD] Fixed a bug in `simd_view::operator--` [ccc97e23]
207+
- Fixed a memory leak for host USM allocations [c18c3456]
208+
- Fixed possible crashes that could happen when `sycl::free` is called while
209+
there are still running kernels [c74f05d6]
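
The persistent cache key entry above mentions keys that contain a `\0` symbol. The notes do not describe the fix itself, so the sketch below only illustrates the general class of bug under an assumption: a key compared as a C string is truncated at the first NUL byte, while a length-aware comparison sees the whole key. The function names are hypothetical and do not come from the runtime.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Compares keys as C strings: comparison stops at the first '\0', so keys
// that differ only after an embedded NUL are wrongly reported as equal.
inline bool equal_as_c_strings(const std::string &a, const std::string &b) {
  return std::strcmp(a.c_str(), b.c_str()) == 0;
}

// Compares keys as byte sequences: lengths must match and every byte,
// including any embedded '\0', takes part in the comparison.
inline bool equal_as_bytes(const std::string &a, const std::string &b) {
  return a.size() == b.size() &&
         std::memcmp(a.data(), b.data(), a.size()) == 0;
}
```

For two five-byte keys `"key\0A"` and `"key\0B"`, the first function reports a match and the second does not, which is the kind of false cache hit a length-aware comparison avoids.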
210+
211+
## API/ABI breakages
212+
- None
213+
214+
## Known issues
215+
- [new] The compiler generates a temporary source file which is used during
216+
host compilation. This source file will appear to be a source dependency
217+
and could break build environments (such as Bazel) which closely keep track
218+
of the generated files during a compilation. Build environments such as
219+
these will need to be configured in the DPC++ space to expect an additional
220+
intermediate file to be part of the compilation flow.
221+
- User-defined functions with the name and signature matching those of any
222+
OpenCL C built-in function (i.e. an exact match of arguments, return type
223+
doesn't matter) can lead to Undefined Behavior.
224+
- A DPC++ system that has FPGAs installed does not support multi-process
225+
execution. Creating a context opens the device associated with the context
226+
and places a lock on it for that process. No other process may use that
227+
device. Some queries about the device through `device.get_info<>()` also
228+
open up the device and lock it to that process since the runtime needs
229+
to query the actual device to obtain that information.
230+
- The format of the object files produced by the compiler can change between
231+
versions. The workaround is to rebuild the application.
232+
- Using `sycl::program`/`sycl::kernel_bundle` API to refer to a kernel defined
233+
in another translation unit leads to undefined behavior.
234+
- Linkage errors with the following message:
235+
`error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined`
236+
can happen when a SYCL application is built using MS Visual Studio 2019
237+
version below 16.3.0 and user specifies `-std=c++14` or `/std:c++14`.
238+
- Printing internal defines isn't supported on Windows [50628db]
239+
1240
# May'21 release notes
2241

3242
Release notes for commit range 2ffafb95f887..6a49170027fb
