|
| 1 | +# October'19 release notes |
| 2 | + |
| 3 | +Release notes for commit 918b285d8dede6ab0561fccc622f71cb858849a6 |
| 4 | + |
| 5 | +## New features |
| 6 | + - `cl::sycl::queue::mem_advise` method was implemented [4828db5] |
| 7 | + - `cl::sycl::handler::memcpy` and `cl::sycl::handler::memset` methods that |
| 8 | + operate on USM pointer were implemented [d9e8467] |
| 9 | + - Implemented `ordered queue` [extension](doc/extensions/OrderedQueue/OrderedQueue.adoc) |
| 10 | + - Implemented support for half type in sub-group collectives: broadcast, |
| 11 | + reduce, inclusive_scan and exclusive_scan [0c78bc8] |
| 12 | + - Added `cl::sycl::intel::ctz` built-in. [6a96b3c] |
| 13 | + - Added support for SYCL_EXTERNAL macro. |
| 14 | + - Added support for passing device function pointers to a kernel [dc9db24] |
| 15 | + - Added support for USM on host device [5b0952c] |
| 16 | + - Enabled C++11 attribute spelling for clang `loop_unroll` attribute [2f1e243] |
| 17 | + - Added full support of images on host device |
| 18 | + - Added support for profiling info on host device [6c03c4f] |
| 19 | + - `cl::sycl::handler::prefetch` is implemented [feeacc1] |
| 20 | + - SYCL sub-buffers are now mapped to OpenCL sub-buffers |
| 21 | + |
| 22 | +## Improvements |
| 23 | +### SYCL Frontend and driver changes |
| 24 | + - Added Intel FPGA Command line interface support for Windows [55ebcae] |
| 25 | + - Added support for one-step compilation from source with `-fsycl-link` |
| 26 | + [55ebcae] |
| 27 | + - Enabled additional aoc options for dependency files input and output report |
| 28 | + [55ebcae] |
| 29 | + - Suppressed warning `"__declspec attribute 'dllexport' is not supported"` |
| 30 | + when run with `-fsycl`. An error is now emitted when an import function is |
| 31 | + called in a SYCL kernel. [b10bdbb] |
| 32 | + - Changed `-fsycl-device-only` to override `-fsycl` option [d429243] |
| 33 | + - Added user-friendly diagnostic for unsupported math built-in functions usage |
| 34 | + in kernel [0476352] |
| 35 | + - The linking stage is now skipped if the `-fsycl-device-only` option is passed |
| 36 | + [93178d1] |
| 37 | + - When unbundling static libraries on Windows, do not extract the host section |
| 38 | + as it is not being used. This fixes possible disk usage issues when working |
| 39 | + with fat static libraries [93ab97e] |
| 40 | + - Passing `-fsycl-help` with the `-###` option now prints the actual call to |
| 41 | + the tool being made. [8b8bfa9] |
| 42 | + - Allow for `-gN` to override default setting with `-fintelfpga` [3b20615] |
| 43 | + - Update sub-group reduce/scan syntax [cd8194d] |
| 44 | + - Prevent libraries from being considered for unbundling on Windows [3438a48] |
| 45 | + - Improved Windows behaviors for calling `lib.exe` when creating an archive |
| 46 | + for Intel FPGA AOT [e7afcb1] |
| 47 | + |
| 48 | +### SYCL headers and runtime |
| 49 | + - Removed suppression of exceptions thrown by async_handler from |
| 50 | + `cl::sycl::queue` destructor [61574d8] |
| 51 | + - Added support for the output operator for the half data type [6a2cd90] |
| 52 | + - Improved efficiency of stream output of `cl::sycl::h_item` for Intel FPGA |
| 53 | + device [80e97a0] |
| 54 | + - Added support for `std::numeric_limits<cl::sycl::half>` [6edca52] |
| 55 | + - Marked barrier flags as constexpr to avoid their extra runtime translation |
| 56 | + [5635959] |
| 57 | + - Added support for unary plus and minus for `cl::sycl::vec` class |
| 58 | + - Reversed mapping of SYCL range/ID dimensions to OpenCL, to provide expected |
| 59 | + performance through unit stride dimension. The highest dimension in SYCL |
| 60 | + (e.g. r2 in `cl::sycl::range<3> R(r0,r1,r2)`) now maps to the lowest dimension |
| 61 | + in OpenCL (e.g. an enqueue of `size_t[3] cl_R = {r2,r1,r0}`). The same applies |
| 62 | + to range and ID queries, in kernels defined through OpenCL interop. |
| 63 | + [40aa3f9] |
| 64 | + - Added support for constructing `cl::sycl::image` without host ptr but with |
| 65 | + pitch provided [d1931fd] |
| 66 | + - Added `sycld` library on Windows which is compiled using `/MDd` option. |
| 67 | + This library should be used when SYCL application is compiled with `/MDd` |
| 68 | + option to avoid ABI issues [71a75c0] |
| 69 | + - Added driver and runtime support for AOT-compiled images for multiple |
| 70 | + devices. This handles the case when the device code is AOT-compiled for |
| 71 | + multiple targets [0d4eb49] [bcf38cf] |
| 72 | + |
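The dimension reversal described in the "SYCL headers and runtime" improvements above can be illustrated with a minimal host-only sketch. It uses plain C++ rather than the SYCL runtime, and `to_opencl_order` is a hypothetical helper name introduced only for this illustration, not a function from the SYCL headers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the SYCL runtime): reverses a SYCL-style
// range so that the highest SYCL dimension becomes OpenCL dimension 0,
// which is the unit-stride dimension.
template <std::size_t Dims>
std::array<std::size_t, Dims>
to_opencl_order(const std::array<std::size_t, Dims> &sycl_range) {
  std::array<std::size_t, Dims> cl_range{};
  for (std::size_t i = 0; i < Dims; ++i)
    cl_range[i] = sycl_range[Dims - 1 - i];
  return cl_range;
}
```

Under this assumption, a range written as `(r0, r1, r2) = (4, 8, 16)` would be enqueued as `{16, 8, 4}`, matching the `cl_R = {r2,r1,r0}` example in the note.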
| 73 | +### Documentation |
| 74 | + - Get started [guide](doc/GetStartedWithSYCLCompiler.md) was reworked |
| 75 | + [9050a98] [94ee028] |
| 76 | + - Added SYCL compiler [command line guide](doc/SYCLCompilerUserManual.md) |
| 77 | + [af63c6e] |
| 78 | + - New [document](doc/SYCLPluginInterface.md) describing the SYCL Runtime |
| 79 | + Plugin Interface [bffdbcd] |
| 80 | + - Updated interfaces in [Sub-group extension specification](doc/extensions/SubGroupNDRange/SubGroupNDRange.md) |
| 81 | + [cc6e4ae] |
| 82 | + - Updated interfaces in [USM proposal](doc/extensions/USM/USM.adoc) |
| 83 | + [a6d7e12] [d9e8467] |
| 84 | + |
| 85 | +## Bug fixes |
| 86 | +### SYCL Frontend and driver changes |
| 87 | + - Fixed problem with using aliases as kernel names [a784071] |
| 88 | + - Fixed address space in generation of annotate attribute for static vars and |
| 89 | + global Intel FPGA annotation [800c8c0] |
| 90 | + - Suppressed emitting errors for TLS declarations [ddc1a7f] |
| 91 | + - Suppressed device code link warnings that happen during linking `fat` |
| 92 | + and `non-fat` object files [b38a8e0] |
| 93 | + - Fixed pointer width on 64-bit version of Windows [63e2b19] |
| 94 | + - Fixed integration header generation when kernel name type is defined in cl, |
| 95 | + sycl or detail namespaces [5d22a8e] |
| 96 | + - Fixed problem with incorrect generation of output filename caused by |
| 97 | + processing of libraries in SYCL device toolchain [d3d9d2c] |
| 98 | + - Fixed problem with generation of depfile information for Intel FPGA AOT |
| 99 | + compilation [fbe951f] |
| 100 | + - Fixed generation of help message in case of `-fsycl-help=get` option passed |
| 101 | + [8b8bfa9] |
| 102 | + - Improved use of `/Fo` on Windows in offload situations so intermediate |
| 103 | + temporary files are not renamed [6984794] |
| 104 | + - Resolved problem with unnamed lambdas having the same name [f4d182f] |
| 105 | + - Fixed `-fsycl-add-targets` option to support multiple triple:binary arguments |
| 106 | + and to emit diagnostics for invalid target triples [21fa901] |
| 107 | + - Fixed AOT compilation for GEN devices [cd2dd9b] |
| 108 | + |
| 109 | +### SYCL headers and runtime |
| 110 | + - Fixed problem with using a 32-bit integer type as the underlying type of |
| 111 | + `cl::sycl::vec` class when 64-bit integer types must be used on Windows |
| 112 | + [b4998f2] |
| 113 | + - `cl::sycl::aligned_alloc*` now returns nullptr in case of error [9266cd5] |
| 114 | + - Fixed bug in conversion from float to half in the host version of |
| 115 | + `cl::sycl::half` type [6a2cd90] |
| 116 | + - Corrected automatic/rte mode conversion of `cl::sycl::vec::convert` method |
| 117 | + [6a2cd90] |
| 118 | + - Fixed memory leak related to incorrectly destroying command group objects |
| 119 | + [d7b5c0d] |
| 120 | + - Fixed layout and alignment of 3-element `cl::sycl::vec` objects; |
| 121 | + they now occupy the memory of 4 elements underneath [32f0cd5] [8f7f4a0] |
| 122 | + - Fixed problem with reporting the same asynchronous exceptions multiple times |
| 123 | + [9040739] |
| 124 | + - Fixed a bug with a wrong success code being returned for non-blocking pipes, |
| 125 | + which resulted in incorrect array data passing through a pipe. [3339c45] |
| 126 | + - Fixed problem with calling atomic_load for float types in |
| 127 | + `cl::sycl::atomic::load`. It now bitcasts the float value to an integer and |
| 128 | + then calls atomic_load. [f4b7b17] |
| 129 | + - Fixed crash when an incorrect local size is passed. Now an exception is |
| 130 | + thrown in such cases. [1865c79] |
| 131 | + - `cl::sycl::vec` type aliases are now aligned with the SYCL specification. |
| 132 | + - Fixed `cl::sycl::rotate` method to correctly handle over-sized shift widths |
| 133 | + [d2e6a26] |
| 134 | + - Changed underlying address space of `cl::sycl::constant_ptr` from constant |
| 135 | + to global to avoid casts between constant and generic address spaces |
| 136 | + [38c2960] |
| 137 | + - Aligned `cl::sycl::range` class with the SYCL specification by removing its |
| 138 | + default constructor [d3b6a49] |
| 139 | + - Fixed several thread safety problems in `cl::sycl::queue` class [349a0d3] |
| 140 | + - Fixed compare_exchange_strong to properly update expected inout parameter |
| 141 | + [627a137] |
| 142 | + - Fixed issue with host version of `cl::sycl::sub_sat` function [7865dfc] |
| 143 | + - Fixed initialization of `cl::sycl::h_item` object when |
| 144 | + `cl::sycl::handler::parallel_for` method with flexible range is used |
| 145 | + [ab3e71e] |
| 146 | + - Fixed host version of `cl::sycl::mul_hi` built-in to correctly handle |
| 147 | + negative arguments [8a3b7d9] |
| 148 | + - Fixed host memory deallocation size of SYCL memory objects [866d634] |
| 149 | + - Fixed bug that prevented passing a structure containing an accessor to a |
| 150 | + kernel on some devices [1d72965] |
| 151 | + - Fixed bug that prevented using types from an "inline" namespace as kernel |
| 152 | + names [28d5931] |
| 153 | + - Fixed bug where a placeholder accessor behaved like a host accessor, fetching |
| 154 | + memory to be available on the host and blocking further operations on the |
| 155 | + accessed memory object [d8505ad] |
| 156 | + - Rectified precision issue with the float to half conversion [2de1379] |
| 157 | + - Fixed `cl::sycl::buffer::reinterpret` method which was working incorrectly |
| 158 | + with sub-buffers [7b2f630] [916c32d] [60b6e3f] |
| 159 | + - Fixed problem with allocating USM memory on the host [01869a0] |
| 160 | + - Fixed compilation issues of built-in functions. [6bcf548] |
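The `cl::sycl::atomic::load` fix above (bitcast the float to an integer, then load atomically) can be sketched on the host with standard C++ only. This is an illustration of the general technique, not the runtime's actual code; `float_to_bits`, `bits_to_float`, `store_float` and `load_float` are hypothetical names, and the sketch assumes a 32-bit `float`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical host-side helpers, not the SYCL runtime implementation.
// Copy the bit pattern of a float into a 32-bit integer and back.
inline std::uint32_t float_to_bits(float value) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

inline float bits_to_float(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// The float lives inside an integer atomic; every access goes through the
// integer atomic operation and converts at the boundary.
inline void store_float(std::atomic<std::uint32_t> &storage, float value) {
  storage.store(float_to_bits(value));
}

inline float load_float(const std::atomic<std::uint32_t> &storage) {
  return bits_to_float(storage.load());
}
```

Using `std::memcpy` for the bit copy avoids the undefined behavior of pointer-cast type punning while keeping the stored bit pattern unchanged.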
| 161 | + |
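For the `cl::sycl::rotate` entry above, one common way to handle over-sized shift widths is to reduce the count modulo the bit width before shifting. The sketch below is a host-only illustration for 32-bit unsigned values; `rotate_left` is a hypothetical name, and the modulo behavior is an assumption about the intended semantics rather than a description of the committed fix.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side sketch, not the SYCL runtime implementation.
// Rotates a 32-bit value left by `count` bits, accepting any count.
inline std::uint32_t rotate_left(std::uint32_t value, std::uint32_t count) {
  const std::uint32_t shift = count % 32u; // reduce over-sized counts
  if (shift == 0u)
    return value; // shifting a 32-bit value by 32 bits is undefined in C++
  return (value << shift) | (value >> (32u - shift));
}
```

With this reduction, a count of 33 behaves like a count of 1, and a count of 32 leaves the value unchanged.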
| 162 | +## Known issues |
| 163 | +- [new] The addition of the static keyword on an array in the presence of Intel |
| 164 | + FPGA memory attributes results in an empty kernel after translation. |
| 165 | +- [new] A loop's attribute in device code may be lost during compilation. |
| 166 | +- [new] Linkage errors with the following message: |
| 167 | + `error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined` |
| 168 | + can happen when a SYCL application is built using MS Visual Studio 2019 |
| 169 | + version below 16.3.0. |
| 170 | + |
| 171 | +## Prerequisites |
| 172 | +### Linux |
| 173 | +- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL |
| 174 | + support version |
| 175 | + [2019.10.10.0.1106_rel](https://github.com/intel/llvm/releases/download/2019-10/oclcpuexp-2019.10.10.0.1106_rel.tar.gz) |
| 176 | + is the recommended OpenCL CPU RT prerequisite for the SYCL compiler. |
| 177 | +- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version |
| 178 | + [19.43.14583](https://github.com/intel/compute-runtime/releases/tag/19.43.14583) |
| 179 | + is the recommended OpenCL GPU RT prerequisite for the SYCL compiler. |
| 180 | +### Windows |
| 181 | +- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL |
| 182 | + support version |
| 183 | + [2019.10.10.0.1106_rel](https://github.com/intel/llvm/releases/download/2019-10/win-oclcpuexp-2019.10.10.0.1106_rel.zip) |
| 184 | + is the recommended OpenCL CPU RT prerequisite for the SYCL compiler. |
| 185 | +- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version |
| 186 | + [100.7372](https://downloadmirror.intel.com/29127/a08/1910.1007372.exe) |
| 187 | + is the recommended OpenCL GPU RT prerequisite for the SYCL compiler. |
| 188 | + |
| 189 | +Please see the runtime installation guide [here](https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedWithSYCLCompiler.md#install-low-level-runtime) |
| 190 | + |
| 191 | + |
1 | 192 | # September'19 release notes
|
2 | 193 |
|
3 | 194 | Release notes for commit d4efd2ae3a708fc995e61b7da9c7419dac900372
|
|