Commit 80cb692

[SYCL][Doc] Add release notes for May release (#3590)
1 parent edaee9b commit 80cb692

1 file changed

+205
-0
lines changed

sycl/ReleaseNotes.md

Lines changed: 205 additions & 0 deletions
@@ -1,3 +1,208 @@
1+
# May'21 release notes
2+
3+
Release notes for commit range 2ffafb95f887..6a49170027fb
4+
5+
## New features
6+
- [ESIMD] Allowed ESIMD and regular SYCL kernels to coexist in the same
7+
translation unit and in the same program. The `-fsycl-explicit-simd` option
8+
is no longer required for compiling ESIMD code and was deprecated. DPCPP RT
9+
implicitly appends `-vc-codegen` compile option for ESIMD images.
10+
- [ESIMD] Added indirect read and write methods to ESIMD class [8208427]
11+
- Provided `sycl::ONEAPI::has_known_identity` type trait to determine if
12+
reduction interface supports user-defined type at compile-time [0c7bd24]
13+
[060fd50]
14+
- Added support for multiple reduction items [c042f9e]
15+
- Added support for `+=`, `*=`, `|=`, `^=`, `&=` operations for custom type
16+
reducers [b249099]
17+
- Added SYCL 2020 `sycl::kernel_bundle` support [5af118a] [dcfb6b1] [ae45333]
18+
[8335e17]
19+
- Added `sycl/sycl.hpp` entry header in compliance with SYCL 2020 [5edb228]
20+
[24d179c]
21+
- Added `__LIBSYCL_[MAJOR|MINOR|PATCH]_VERSION` macros, see
22+
[PreprocessorMacros](doc/PreprocessorMacros.md) for more information
23+
[9f3a74c]
24+
- Added support for SYCL 2020 reductions with `read_write` access mode to
25+
reduction variables [733d5e3]
26+
- Added support for SYCL 2020 reductions with
27+
`sycl::property::reduction::initialize_to_identity` property [3473c1a]
28+
- Implemented zero argument version of `sycl::buffer::reinterpret()` for
29+
SYCL 2020 [c0c3c80]
30+
- Added an initial AOT implementation of the experimental matrix extension on
31+
the CPU device to target AMX hardware. Base features are supported [35db973]
32+
- Added support for
33+
[SYCL_INTEL_local_memory extension](doc/extensions/LocalMemory/SYCL_INTEL_local_memory.asciidoc)
34+
[5a66fcb] [9a734f6]
35+
- Documented [Level Zero backend](doc/extensions/LevelZeroBackend/LevelZeroBackend.md)
36+
[8994e6d]
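As a rough illustration of the `__LIBSYCL_[MAJOR|MINOR|PATCH]_VERSION` macros listed above, the sketch below builds a version string from them and falls back to a placeholder when they are not defined. It assumes the macros become visible once the `sycl/sycl.hpp` entry header is included and that they expand to integer literals; see the PreprocessorMacros document for the authoritative description. The helper names `libsycl_version_string` and `demo_version` are hypothetical and not part of the runtime.

```cpp
#include <cassert>
#include <string>

// Include the SYCL 2020 entry header only when a SYCL compiler is in use,
// so this sketch also builds with a plain C++ host compiler.
#ifdef SYCL_LANGUAGE_VERSION
#include <sycl/sycl.hpp>
#endif

// Hypothetical helper: returns "major.minor.patch" when the __LIBSYCL_*
// macros are defined, otherwise the placeholder "unknown".
inline std::string libsycl_version_string() {
#if defined(__LIBSYCL_MAJOR_VERSION) && defined(__LIBSYCL_MINOR_VERSION) && \
    defined(__LIBSYCL_PATCH_VERSION)
  // Assumption: each macro expands to an integer literal.
  return std::to_string(__LIBSYCL_MAJOR_VERSION) + "." +
         std::to_string(__LIBSYCL_MINOR_VERSION) + "." +
         std::to_string(__LIBSYCL_PATCH_VERSION);
#else
  return "unknown";
#endif
}

// Hypothetical usage check: the helper always yields a non-empty string.
inline void demo_version() {
  assert(!libsycl_version_string().empty());
}
```

The same `#if defined(...)` guard can be used to enable code paths that depend on a particular runtime version while keeping older toolchains building.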
37+
38+
## Improvements
39+
### SYCL Compiler
40+
- Added support for math built-ins: `fmax`, `fmin`, `isinf`, `isfinite`,
41+
`isnormal`, `fpclassify` [1040b94]
42+
- The FPGA initiation interval attribute spelling `[[intel::ii]]` is
43+
deprecated. The new spelling is `[[intel::initiation_interval]]`. In
44+
addition, `[[intel::initiation_interval]]` may now be used as a function
45+
attribute; formerly its use was limited to a statement attribute [b04e6a0]
46+
- Added support for function attribute `[[intel::disable_loop_pipelining]]`
47+
and `[[intel::max_concurrency(n)]]` [7324b3e]
48+
- Enabled `-fsycl-id-queries-fit-in-int` by default [f27bb01]
49+
- Added support for stdlib functions: `abs`, `labs`, `llabs`, `div`, `ldiv`,
50+
`lldiv` [86716c5] [2e9d33c]
51+
- Enabled range rounding for ESIMD kernels [25b482b] [bb20b7b]
52+
- Improved diagnostics on invalid kernel names [0c0f4c5]
53+
- Improved compilation time by combining device code compilation and
54+
integration header generation into one step [f110dd4]
55+
- Added support for `sycl::queue::mem_advise` for the CUDA backend [2b56ac9]
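The math and stdlib functions named in the list above (`isfinite`, `abs`, `div`, and so on) are ordinary C and C++ library calls. The sketch below is host-only C++ showing the kind of small helper that relies on them; whether a given helper compiles in device code depends on the toolchain version in use. The names `RowCol`, `to_row_col`, and `sanitize` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical result type for splitting a linear index.
struct RowCol {
  int row;
  int col;
};

// Splits a linear index into (row, col) with a single std::div call,
// which yields quotient and remainder together.
inline RowCol to_row_col(int linear_index, int row_width) {
  assert(row_width != 0);
  const std::div_t qr = std::div(linear_index, row_width);
  return {qr.quot, qr.rem};
}

// Replaces infinities and NaNs with a fallback value using std::isfinite.
inline float sanitize(float x, float fallback) {
  return std::isfinite(x) ? x : fallback;
}
```

For example, `to_row_col(7, 3)` gives row 2 and column 1, and `sanitize` passes finite inputs through unchanged.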
56+
### SYCL Library
57+
- Specialized atomic `fetch_add`, `fetch_min` and `fetch_max` for
58+
floating-point types [37a9a2a] [59ceaf4]
59+
- Added support for accessors to array types [7ed4f58]
60+
- Added sub-group information queries on CUDA [c36fa65]
61+
- Added support for `sycl::queue::barrier` in Level Zero plugin [7c31f90]
62+
- Improved runtime memory usage in Level Zero plugin [c9d71d4] [2ce2ca6]
63+
[46e3c64]
64+
- Added Level Zero interoperability with ownership specification [41221e2]
65+
- Improved runtime memory usage when using USM [461fa02]
66+
- Provided facility for user to control execution range rounding [f6ac45f]
67+
- Ensured correct access mode in `sycl::handler::copy()` method [b489479]
68+
- Disallowed atomic accessors in `sycl::handler::copy()` method [14437db]
69+
- Provided move-assignability of `usm_allocator` class [05a805e]
70+
- Improved performance of copying data during native memory object creation
71+
on devices without host unified memory [ad8c9d1]
72+
- [ESIMD] Added implicit set up of fence before barrier as required by hardware
73+
[692228c]
74+
- Allowed using the interoperability program constructor with a multi-device
75+
context [c7f7674]
76+
- Allowed trace of Level Zero calls only with `SYCL_PI_TRACE=-1` [ea73219]
77+
- Added throw of `feature_not_supported` on an attempt to create a
78+
program using `create_program_with_source` with Level Zero or CUDA [ba77e3a]
79+
- Added support for `inline` `cl` namespace in debugger [8e441d4]
80+
- Added support for build with GCC 7 [d8fea22]
81+
- Added in-memory caching of programs built with custom build options
82+
[86b0e8d] [e152b0d]
83+
- Improved range rounding heuristics [7efb692]
84+
- Added `get_backend` methods to SYCL classes [ee7e99f]
85+
- Added `sycl::sub_group::load` and `sycl::sub_group::store` versions that
86+
take raw pointers [248f550]
87+
- Enabled caching of devices in `sycl::device` interoperability constructors
88+
[d3aeb4a]
89+
- Added a warning on using SYCL 1.2.1 OpenCL interoperability API when
90+
compiling in SYCL 2020 mode. It can be suppressed by defining
91+
`SYCL2020_DISABLE_DEPRECATION_WARNINGS` [a249316]
92+
- Added support for blitter engine in Level Zero plugin. Some memory
93+
operations are submitted to a Level Zero copy queue now [11ba5b5]
94+
- Improved `sycl::INTEL::lsu::load` and `sycl::INTEL::lsu::store` to take
95+
`sycl::multi_ptr` [697469f]
96+
- Added a diagnostic on attempt to compile a SYCL application without dynamic
97+
C++ RT on Windows [d4180f4]
98+
- Added support for `Queue Order Properties` extension for Level Zero [50005c7]
99+
- Improved plugin discovery mechanism: if a plugin fails to initialize, others
100+
will be discovered anyway [d513074]
101+
- Added support for `sycl::info::partition_affinity_domain::numa` in Level
102+
Zero plugin [2ba8e05]
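The list above mentions atomic `fetch_add`, `fetch_min` and `fetch_max` specialized for floating-point types. The sketch below is not how the DPC++ runtime implements them; it is a generic host-side illustration, using `std::atomic<float>` and a compare-exchange loop, of what such operations have to guarantee: the update is applied atomically and the previous value is returned. The function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Atomically adds `operand` and returns the value held before the update.
inline float atomic_fetch_add(std::atomic<float>& target, float operand) {
  float observed = target.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak refreshes `observed`, so the sum is
  // recomputed from the latest value on every retry.
  while (!target.compare_exchange_weak(observed, observed + operand,
                                       std::memory_order_relaxed)) {
  }
  return observed;
}

// Atomically stores max(current, operand) and returns the previous value.
inline float atomic_fetch_max(std::atomic<float>& target, float operand) {
  float observed = target.load(std::memory_order_relaxed);
  // Stop as soon as the stored value is already >= operand, or the swap wins.
  while (observed < operand &&
         !target.compare_exchange_weak(observed, operand,
                                       std::memory_order_relaxed)) {
  }
  return observed;
}

// Single-threaded usage check, kept small for brevity.
inline void demo_float_atomics() {
  std::atomic<float> value{1.0f};
  assert(atomic_fetch_max(value, 3.0f) == 1.0f);
  assert(atomic_fetch_add(value, 0.5f) == 3.0f);
  assert(value.load() == 3.5f);
}
```

A `fetch_min` variant follows the same pattern with the comparison reversed.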
103+
### Documentation
104+
- Updated TBB paths in `GetStartedGuide` [a9acb70]
105+
- Aligned linked allocation document with recent changes [22b9d01]
106+
- Updated `GetStartedGuide` for building with `libcxx` [d3a74c3]
107+
- Updated table of contents in `GetStartedGuide` [0f401bf]
108+
- Filled in address spaces handling section in design documentation [f782c2a]
109+
- Improved design document for program cache [ed4b4c4]
110+
- Updated compiler options [description](doc/UsersManual.md) [e56e576]
111+
- Updated
112+
[SYCL_INTEL_sub_group](doc/extensions/SubGroup/SYCL_INTEL_sub_group.asciidoc)
113+
extension document to use `automatic` instead of `auto` [c4d08f5]
114+
115+
## Bug fixes
116+
### SYCL Compiler
117+
- Suppressed link time warning on Windows that incorrectly diagnosed
118+
conflicting section names while linking device binaries [8e6a3ec]
119+
- Disabled code coverage for device compilations [12a0b11]
120+
- Fixed an issue when unbundling a fat static archive and targeting non-FPGA
121+
device [90c79c7]
122+
- Addressed inconsistencies when performing compilations by using the target
123+
triple for FPGA (`spir64_fpga-unknown-unknown-sycldevice`) vs using
124+
`-fintelfpga` [c9a65fc]
125+
- Fixed generation of the output report folder when performing FPGA AOT
126+
compilations from a previously generated AOCR archive [eab4791]
127+
- Addressed issues dealing with improper settings when performing
128+
preprocessing when offloading is enabled [d03de03]
129+
- Fixed issue when using `-fsycl-device-only` on Windows when specifying an
130+
output file with `/o` [d1d6c5d]
131+
- Fixed inlining functions called from an ESIMD kernel, which broke code
132+
generation in the Intel GPU vector back-end [65b459d]
133+
- Fixed JIT crash on ESIMD kernels compiled with `-fsycl-id-queries-fit-in-int`
134+
[ad86c34]
135+
- Fixed compiler crash on ESIMD kernels calling external functions with
136+
`gpu::simd` arguments [dfaaaed]
137+
- Fixed issue with generating preprocessed output when using
138+
`-fsycl-device-only` [3d2225a]
139+
### SYCL Library
140+
- Fixed race-condition happening on application exit [8eb00d7] [c9c1de9]
141+
- Fixed faulty behaviour that happened when accessing a buffer in different
142+
contexts using `discard_*` access mode [f75b439]
143+
- Fixed support for `SYCL_PROGRAM_LINK_OPTIONS` and
144+
`SYCL_PROGRAM_COMPILE_OPTIONS` environment variables when compiling/linking
145+
through `sycl::program` class [9d74846]
146+
- Fixed deadlock in Level Zero plugin when batching is enabled [645db17]
147+
- Fixed possible stack overflow in Level Zero plugin [ec6fbe1]
148+
- Fixed issues with empty wait list in Level Zero plugin [d8c8e08]
149+
- Added missing `double3` and `double4` support in geometric function `cross()`
150+
[b8afff4]
151+
- Fixed issue when using `std::vector<bool> &` argument for
152+
`sycl::buffer::set_final_data()` method [084d83a] [2a751bd]
153+
- Fixed support for `long long` in `sycl::vec::convert()` on Windows [5b49cd3]
154+
- Aligned local and image accessor with specification by allowing for property
155+
list in their constructor [88fab25]
156+
- Fixed support for offset in `parallel_for` for host device [1958715]
157+
- Added missing constructors for `sycl::buffer` class [bdfad9e]
158+
- Fixed coordinate conversion for `sampler` class on host device [cd6529f]
159+
- Fixed support for local accessors in debugger [fdacb75]
160+
- Fixed dropping of kernel attributes when execution range rounding is used
161+
[496f9a0] [677a7ea]
162+
- Added support for interoperability tasks that use `get_mem()` methods with
163+
Level Zero plugin [149f08d]
164+
- Fixed sub-device caching in the Level Zero plugin [0b18b49]
165+
- Fixed `get_native` methods to retain reference counter in case of OpenCL
166+
backend [ee7e99f]
167+
- Fixed sporadic failure happening due to illegal destruction of events before
168+
they have been signaled [2a76b2a]
169+
- Resolved a pinned host memory specific performance regression on CUDA that
170+
was introduced with the host unified behavior dependent logic [3be63ab]
171+
- Fixed illegal accesses that could happen when an application that uses host
172+
tasks exits without waiting for host tasks completion [552a521]
173+
- Fixed `sycl::event::get_info` queries that were working incorrectly when
174+
called on event without an encapsulated native handle [5d5a792]
175+
- Fixed compilation error with using multidimensional subscript for
176+
`sycl::accessor` with atomic access mode [0bfd34e]
177+
- Fixed a crash that happened when an accessor passed to a reduction was
178+
destroyed immediately after [b80f13e]
179+
- Fixed `sycl::device::get_info` with `sycl::info::device::max_mem_alloc_size`
180+
which was returning incorrect value in case of Level Zero backend [8dbaa53]
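Several entries in these notes refer to execution range rounding. As a conceptual sketch only, and assuming the general idea is to launch a range padded up to a convenient multiple while guarding the user's original range, the plain C++ below simulates that pattern on the host. The rounding factor, the function names, and the loop standing in for a kernel launch are all hypothetical; the actual heuristics used by the runtime are not described in these notes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rounds `size` up to the next multiple of `factor`.
inline std::size_t round_up(std::size_t size, std::size_t factor) {
  assert(factor > 0);
  return (size + factor - 1) / factor * factor;
}

// Simulates a launch over the padded range. Each index inside the user's
// range is visited exactly once; padded indices fall through the guard.
inline std::vector<int> run_rounded(std::size_t user_size,
                                    std::size_t factor) {
  std::vector<int> hits(user_size, 0);
  const std::size_t launched = round_up(user_size, factor);
  for (std::size_t id = 0; id < launched; ++id) {
    if (id < user_size) {
      ++hits[id];  // stands in for the user's kernel body
    }
  }
  return hits;
}
```

With a hypothetical factor of 32, a user range of 1000 would be launched as 1024 items, and the guard keeps the extra 24 from touching user data.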
181+
182+
## API/ABI breakages
183+
- None
184+
185+
## Known issues
186+
- GlobalWorkOffset is not supported by Level Zero backend [6f9e9a76]
187+
- User-defined functions with the same name and signature (exact match of
188+
arguments, return type doesn't matter) as an OpenCL C built-in
189+
function, can lead to Undefined Behavior.
190+
- A DPC++ system that has FPGAs installed does not support multi-process
191+
execution. Creating a context opens the device associated with the context
192+
and places a lock on it for that process. No other process may use that
193+
device. Some queries about the device through `device.get_info<>()` also
194+
open up the device and lock it to that process since the runtime needs
195+
to query the actual device to obtain that information.
196+
- The format of the object files produced by the compiler can change between
197+
versions. The workaround is to rebuild the application.
198+
- Using `sycl::program`/`sycl::kernel_bundle` API to refer to a kernel defined
199+
in another translation unit leads to undefined behavior
200+
- Linkage errors with the following message:
201+
`error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined`
202+
can happen when a SYCL application is built using MS Visual Studio 2019
203+
version below 16.3.0 and user specifies `-std=c++14` or `/std:c++14`.
204+
- Printing internal defines isn't supported on Windows [50628db]
205+
1206
# January'21 release notes
2207

3208
Release notes for commit range 5eebd1e4bfce..2ffafb95f887
