Commit 8890388

[SYCL] Release notes for August'20 DPCPP implementation update (#2269)
Increase major version of libsycl.so library as well
1 parent 7c73c11 commit 8890388

2 files changed: +186 -3 lines changed

sycl/CMakeLists.txt

Lines changed: 3 additions & 3 deletions
@@ -11,10 +11,10 @@ option(SYCL_ADD_DEV_VERSION_POSTFIX "Adds -V postfix to version string" ON)
 list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules")
 include(AddSYCLExecutable)
 
-set(SYCL_MAJOR_VERSION 2)
-set(SYCL_MINOR_VERSION 1)
+set(SYCL_MAJOR_VERSION 3)
+set(SYCL_MINOR_VERSION 0)
 set(SYCL_PATCH_VERSION 0)
-set(SYCL_DEV_ABI_VERSION 4)
+set(SYCL_DEV_ABI_VERSION 0)
 if (SYCL_ADD_DEV_VERSION_POSTFIX)
 set(SYCL_VERSION_POSTFIX "-${SYCL_DEV_ABI_VERSION}")
 endif()
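
The hunk above moves the library version from 2.1.0 with dev ABI postfix 4 to 3.0.0 with postfix 0. As a rough illustration of how these four values could combine into one version string, here is a small C++ sketch. It assumes the layout `MAJOR.MINOR.PATCH` followed by the postfix; only the `-${SYCL_DEV_ABI_VERSION}` postfix logic is visible in the hunk, so the rest of the layout and the names `SyclVersion` and `version_string` are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the four CMake variables touched by the hunk.
struct SyclVersion {
  int major_version;
  int minor_version;
  int patch_version;
  int dev_abi_version;
};

// Assumed layout: "MAJOR.MINOR.PATCH", plus "-DEV_ABI" when the postfix
// option (the equivalent of SYCL_ADD_DEV_VERSION_POSTFIX) is on.
std::string version_string(const SyclVersion &v, bool add_dev_postfix) {
  std::string s = std::to_string(v.major_version) + "." +
                  std::to_string(v.minor_version) + "." +
                  std::to_string(v.patch_version);
  if (add_dev_postfix)
    s += "-" + std::to_string(v.dev_abi_version);
  return s;
}
```

Under that assumption, the old values would yield `2.1.0-4` and the new ones `3.0.0-0`. Resetting the minor and dev ABI numbers to zero alongside the major bump matches the commit message's note that the major version of `libsycl.so` is increased.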

sycl/ReleaseNotes.md

Lines changed: 183 additions & 0 deletions
@@ -1,3 +1,186 @@
+# August'20 release notes
+
+Release notes for the commit range 75b3dc2..414c1e5
+
+## New features
+- Implemented basic support for the [Explicit SIMD extension](./sycl/doc/extensions/ExplicitSIMD/dpcpp-explicit-simd.md)
+  for low-level GPU performance tuning [84bf234] [32bf607] [a lot of others]
+- Implemented support for the [SYCL_INTEL_usm_address_spaces extension](https://github.com/intel/llvm/pull/1840)
+- Implemented support for the [Use Pinned Host Memory Property extension](doc/extensions/UsePinnedMemoryProperty/UsePinnedMemoryPropery.adoc) [e5ea144][aee2d6c][396759d]
+- Implemented aspects feature from the SYCL 2020 provisional Specification
+  [89804af]
+
+
+## Improvements
+### SYCL Compiler
+- [CUDA BE] Removed unnecessary memory fence in the `sycl::group::barrier`
+  implementation which should improve performance [e2fc1b8]
+- [CUDA BE] Added support for the sycl builtins from relational, geometric,
+  common and math categories [d4e7929] [d9bad0b] [0c9c9c0] [99957c5]
+- Added support for `C array` as a kernel parameter [00e7308]
+- [CUDA BE] Added support for kernel offset [c7bb288]
+- [CUDA BE] Added support for `sycl::half` type [8444189][8f39763]
+- Added support for SYCL kernel inheritance and nested arrays [0b2de9e]
+- Added a diagnostic on attempt to use const static data members that are not
+  const-initialized [bde1085]
+- Added support for a set of standard library functions for AOT compilation
+  [2bd5dab]
+- Allowed use of function declarators with empty parentheses [a4f2182]
+- The fallback implementation of standard library functions is now linked to
+  the device code only if such functions are used in kernels [9a8864c]
+- Added support for recursive function calls in a constexpr context [06f667a]
+- Added a diagnostic on attempt to capture `this` as a kernel parameter
+  [1b9f026]
+- Added `[[intel::reqd_sub_group_size()]]` attribute as a replacement for
+  `[[cl::reqd_sub_group_size()]]` which is now deprecated [b2da2c8]
+- Added propagation of attributes from transitive calls to the kernel [5c91609]
+- Changed the driver to pass corresponding device specific options when `-g`
+  or `-O0` is passed [31eb425]
+- The `sycl::usm_allocator` has been improved. Now it has equality operators
+  and can be used with `std::allocate_shared`. Disallowed usage with
+  device allocations [ce915ef]
+- Added support for lambda functions passed to reductions [115c1a0]
+
+
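
The `sycl::usm_allocator` entry above mentions equality operators and `std::allocate_shared`. The sketch below shows, in plain standard C++, what an allocator generally needs for that combination to work. `tagged_allocator` and `make_shared_int` are hypothetical stand-ins rather than the SYCL class, and the allocator uses ordinary `operator new` so the example builds without a SYCL toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical stand-in for a USM-style allocator (not the SYCL class).
template <typename T>
struct tagged_allocator {
  using value_type = T;

  int context_id;  // plays the role of the context an allocator is bound to

  explicit tagged_allocator(int id) noexcept : context_id(id) {}

  // Converting constructor: std::allocate_shared rebinds the allocator to an
  // internal control-block type, so this conversion must exist.
  template <typename U>
  tagged_allocator(const tagged_allocator<U> &other) noexcept
      : context_id(other.context_id) {}

  T *allocate(std::size_t n) {
    return static_cast<T *>(::operator new(n * sizeof(T)));
  }
  void deallocate(T *p, std::size_t) noexcept { ::operator delete(p); }
};

// Allocators compare equal when memory obtained from one can be released
// through the other; here that is modelled by matching context ids.
template <typename T, typename U>
bool operator==(const tagged_allocator<T> &a,
                const tagged_allocator<U> &b) noexcept {
  return a.context_id == b.context_id;
}
template <typename T, typename U>
bool operator!=(const tagged_allocator<T> &a,
                const tagged_allocator<U> &b) noexcept {
  return !(a == b);
}

// Builds a shared_ptr whose object and control block both come from the
// allocator bound to `context_id`.
inline std::shared_ptr<int> make_shared_int(int value, int context_id) {
  tagged_allocator<int> alloc(context_id);
  return std::allocate_shared<int>(alloc, value);
}
```

Calling `make_shared_int(42, 1)` returns a `std::shared_ptr<int>` holding 42. The equality operators are part of the standard allocator requirements, which is presumably why the release note lists them alongside `std::allocate_shared` support.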
+### SYCL Library
+- Added support for braced-init-list or a number as range for
+  `sycl::queue::parallel_for` family functions [17299ee]
+- Finished implementation of [parallel_for simplification extension](doc/extensions/ParallelForSimpification) [af792cb]
+- Added 64-bit type support to the `load` and `store` methods of
+  `sycl::intel::sub_group` [fe8d852]
+- [CUDA BE] Do not enable event profiling if it's not requested by passing
+  `sycl::property::queue::enable_profiling` property [bbe8457]
+- Sub-group support has been aligned with the latest changes to the extension
+  document [bea6aa2]
+- [CUDA BE] Optimized waiting for event completion by synchronizing with
+  latest event for a queue [d7ee359]
+- Finished implementation of the [Host task with interop capabilities](https://github.com/codeplaysoftware/standards-proposals/blob/master/host_task/host_task.md)
+  extension [f088e38]
+- Added builtins for one-element `sycl::vec` for host device [073a36b]
+- [L0 BE] Added support for specialization constants [be4e641]
+- Improved diagnostic on attempt to submit a kernel with local size which
+  doesn't match the value specified in the `sycl::intel::reqd_work_group_size`
+  attribute for the kernel [03ef819]
+- [CUDA BE] Changed active context to be persistent [296fa1a]
+- [CUDA BE] Changed default gpu architecture for device code to `SM_50`
+  [800e452]
+- Added a diagnostic on attempt to create a device accessor from zero-sized
+  buffer [80b2110]
+- Changed default backend to level zero [11ef88c]
+- Improved performance of the SYCL graph cleanup [c099e47]
+- [L0 BE] Added support for `sycl::sampler` [f3b8cdf]
+- Added support for `TriviallyCopyable` types to the
+  `sycl::intel::sub_group::shuffle` [d3c7b20]
+- Implemented range simplification for queue Shortcuts [4009b8b]
+- Changed `sycl::accessor::operator[]` to return const reference when access
+  mode is `sycl::access::mode::read_only` [03db009]
+- Exceptions thrown in a host task are now returned as asynchronous
+  exceptions [280b93c]
+- Fixed `sycl::buffer` constructor which takes a contiguous container to
+  enable copy back on destruction.
+- Added support for user-defined sub-group reductions [728429a]
+- The `sycl::backend::level0` has been renamed to `sycl::backend::level_zero`
+  [215f591]
+- Extended `sycl::broadcast` to support `TriviallyCopyable` types [df6d715]
+- Implemented `get_native` and `make_*` functions for Level Zero allowing to
+  query native handles of SYCL objects and to create SYCL objects by providing
+  a native handle: platform, device, queue, program. The feature is described
+  in the SYCL 2020 provisional specification [a51c333]
+- Added support for `sycl::intel::atomic_ref` from [SYCL_INTEL_extended_atomics extension](doc/extensions/ExtendedAtomics/SYCL_INTEL_extended_atomics.asciidoc)
+
+
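
One entry above changes `sycl::accessor::operator[]` to return a const reference for read-only access. The following standard C++ sketch shows one common way to make a return type depend on a compile-time access mode. `mini_accessor` and `access_mode` are hypothetical miniatures written for this illustration; the actual SYCL implementation may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical miniature of an access mode and accessor (not the SYCL types).
enum class access_mode { read_only, read_write };

template <typename T, access_mode Mode>
class mini_accessor {
public:
  // Read-only accessors hand out `const T &`; all others hand out `T &`.
  using reference =
      typename std::conditional<Mode == access_mode::read_only, const T &,
                                T &>::type;

  explicit mini_accessor(std::vector<T> &storage) : data_(storage.data()) {}

  reference operator[](std::size_t i) const { return data_[i]; }

private:
  T *data_;
};

// Compile-time check that the reference type follows the access mode.
static_assert(
    std::is_same<decltype(std::declval<
                          mini_accessor<int, access_mode::read_only>>()[0]),
                 const int &>::value,
    "read-only accessor must yield a const reference");
static_assert(
    std::is_same<decltype(std::declval<
                          mini_accessor<int, access_mode::read_write>>()[0]),
                 int &>::value,
    "read-write accessor must yield a mutable reference");
```

With this shape, writing through a read-only accessor (`ro[0] = 1;`) is rejected at compile time, while a read-write accessor over the same storage still permits assignment.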
+### Documentation
+- Added [SYCL_INTEL_accessor_properties](doc/extensions/accessor_properties/SYCL_INTEL_accessor_properties.asciidoc) extension specification [58fc414]
+- The documentation for the CUDA BE has been improved [928b815]
+- The [Queue Shortcuts extension](sycl/doc/extensions/QueueShortcuts/QueueShortcuts.adoc)
+  document has been updated [defac3c2]
+- Added [Use Pinned Host Memory Property extension](doc/extensions/UsePinnedMemoryProperty/UsePinnedMemoryPropery.adoc) specification [e5ea144]
+- Updated the [SYCL_INTEL_extended_atomics extension](doc/extensions/ExtendedAtomics/SYCL_INTEL_extended_atomics.asciidoc)
+  to describe `sycl::intel::atomic_accessor` [4968e7c]
+- The [SYCL_INTEL_sub_group extension](doc/extensions/SubGroup/SYCL_INTEL_sub_group.asciidoc)
+  document has been updated [067536e]
+- Added [FPGA lsu extension](sycl/doc/extensions/IntelFPGA/FPGALsu.md)
+  document [2c2b5f2]
+
+
+## Bug fixes
+### SYCL Compiler
+- Fixed the diagnostic on `cl::reqd_sub_group_size` attribute mismatches
+  [75b3dc2]
+- Fixed the issue with empty input for -foffload-static-lib option [8c8137f]
+- Fixed a problem with template instantiation during integration header
+  generation [4ba61d0]
+- Fixed a problem which could happen when using command lines with a large
+  number of files [87b94d5]
+- Fixed a crash when a kernel object field is an array of structures [b00fb7c]
+- Fixed an issue which could prevent the use of structures with constant-sized
+  arrays as a kernel parameter [a4a7950]
+- Fixed a bug in the pass for lowering hierarchical parallelism code
+  (SYCLLowerWGScope). The transformation was generating code where work items
+  hit the barrier in the loop a different number of times, which is illegal
+  [a4a7950]
+- Fixed a crash on attempt to use objects of `sycl::experimental::spec_constant`
+  in the struct [d5a7f20]
+
+### SYCL Library
+- Fixed a problem with waiting on the same events several times which could
+  happen when using USM [9bf602c]
+- Fixed a memory leak of `sycl::event` objects which happened when using USM
+  specific `sycl::queue` methods [a285b9d]
+- Fixed a problem which could lead to a crash or deadlock when using
+  `sycl::handler::codeplay_host_task` extension [e911de7]
+- Worked around the problem which happened when an application uses long kernel
+  names [b1b8510]
+- Fixed a race which could happen when submitting the same kernel from multiple
+  threads [95d3ec6]
+- [CUDA BE] Fixed a memory leak related to unreleased events [d0a148a]
+- [CUDA BE] Fixed diagnostic on attempt to fetch profiling info for commands
+  for which profiling is not enabled [76bf2ed]
+- [L0 BE] Fixed memory leaks of device objects [eae48f6][6acb812]
+- [CUDA BE] Fixed a problem where several operations were not profiled
+  when required [a420e7a]
+- Fixed a possible race which could happen when an application builds an
+  object of the `sycl::program` or submits kernels from multiple threads
+  [363ad5f]
+- Fixed a memory leak of queue and context handles, which happened when
+  backend is not OpenCL [9ddca50]
+- [CUDA BE] Fixed 3 dimensional buffer device to device copy [d917446]
+- Fixed one of the `sycl::queue` constructors which was ignoring
+  `sycl::property::queue::enable_profiling` property [7863c0b]
+- Fixed endless-loop in `sycl::intel::reduction` for the data types not having
+  fast atomics when the local size is 1 [e6b6ae7]
+- Fixed a compilation error which happened when using
+  `sycl::interop_handle::get_native_mem` method with an object of
+  `sycl::accessor` created for host target [280b93c]
+- Fixed a deadlock which could happen when multiple threads try to build a
+  program simultaneously
+- Aligned `sycl::handler::set_arg` with the SYCL specification [a6465c9]
+- Fixed an issue which could lead to "No kernel named was found" exception
+  when using `sycl::handler::set_arg` method [a08674e]
+- Fixed `sycl::device::get_info<cl::sycl::info::device::sub_group_sizes>`
+  which was returning incorrect data [e65841b]
+
+
+## API/ABI breakages
+- The memory_manager API has changed
+- The layout of internal classes for `sycl::sampler` and `sycl::stream` has been
+  changed
+
+## Known issues
+- The format of the object files produced by the compiler can change between
+  versions. The workaround is to rebuild the application.
+- The SYCL library doesn't guarantee stable API/ABI, so applications compiled
+  with an older version of the SYCL library may not work with a new one.
+  The workaround is to rebuild the application.
+  [ABI policy guide](doc/ABIPolicyGuide.md)
+- Using `cl::sycl::program` API to refer to a kernel defined in another
+  translation unit leads to undefined behavior
+- Linkage errors with the following message:
+  `error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined`
+  can happen when a SYCL application is built using MS Visual Studio 2019
+  version below 16.3.0.
+  The workaround is to enable `-std=c++17` for the failing MSVC version.
+
 # June'20 release notes
 
 Release notes for the commit range ba404be..24726df
