|
| 1 | +# Specialization constants |
| 2 | + |
| 3 | +DPC++ implements this [proposal](https://github.com/codeplaysoftware/standards-proposals/blob/master/spec-constant/index.md) |
| 4 | +with some restrictions. See this [document](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/SpecConstants/README.md) for more details. |
| 5 | + |
| 6 | +#### Requirements: |
| 7 | + |
| 8 | +- must work with separate compilation and linking |
| 9 | +- must support AOT compilation |
| 10 | + |
| 11 | +Implementaion is based on SPIR-V specialization constants. But there is one |
| 12 | +important difference between SYCL and SPIR-V: in SYCL speciazation constants are |
| 13 | +identified by a type ID which is mapped to a symbolic name, in SPIR-V - by an |
| 14 | +ordinal number. This complicates the design, as the compiler |
| 15 | +1) needs to propagate symbolic =\> numeric ID correspondence to the runtime |
| 16 | +2) can assign numeric IDs only when linking due to the separate compilation |
| 17 | + |
| 18 | +Simple source code example: |
| 19 | + |
| 20 | +```cpp |
| 21 | +class MyInt32Const; |
| 22 | +... |
| 23 | + sycl::program p(q.get_context()); |
| 24 | + sycl::ONEAPI::experimental::spec_constant<int32_t, MyInt32Const> i32 = |
| 25 | + p.set_spec_constant<MyInt32Const>(rt_val); |
| 26 | + p.build_with_kernel_type<MyKernel>(); |
| 27 | + sycl::buffer<int, 1> buf(vec.data(), vec.size()); |
| 28 | + |
| 29 | + q.submit([&](cl::sycl::handler &cgh) { |
| 30 | + auto acc = buf.get_access<cl::sycl::access::mode::write>(cgh); |
| 31 | + cgh.single_task<MyKernel>( |
| 32 | + p.get_kernel<MyKernel>(), |
| 33 | + [=]() { |
| 34 | + acc[0] = i32.get(); |
| 35 | + }); |
| 36 | + }); |
| 37 | +... |
| 38 | +``` |
| 39 | +
|
| 40 | +## Design |
| 41 | +
|
| 42 | +This section describes the basic design used to support spec constants of |
| 43 | +primitive numeric types. POD types support is described further in the document. |
| 44 | +
|
| 45 | +#### Compiler |
| 46 | +
|
| 47 | +Key `spec_constant::get()` function implementation for the device code: |
| 48 | +
|
| 49 | +```cpp |
| 50 | +template <typename T, typename ID = T> class spec_constant { |
| 51 | +... |
| 52 | +public: |
| 53 | + T get() const { // explicit access. |
| 54 | +#ifdef __SYCL_DEVICE_ONLY__ |
| 55 | + const char *TName = __builtin_unique_stable_name(ID); |
| 56 | + return __sycl_getSpecConstantValue<T>(TName); |
| 57 | +#else |
| 58 | + return Val; |
| 59 | +#endif // __SYCL_DEVICE_ONLY__ |
| 60 | + } |
| 61 | +``` |
| 62 | + |
| 63 | +here `__builtin_unique_stable_name` is a compiler built-in used to translate |
| 64 | +types to unique strings. `__sycl_getSpecConstantValue<T>` is an "intrinsic" |
| 65 | +recognized by a special LLVM pass later. |
| 66 | + |
| 67 | +Compilation and subsequent linkage of device code results in a number of |
| 68 | +`__sycl_getSpecConstantValue` calls whose arguments are symbolic spec constant |
| 69 | +IDs. Before generating the a device binary, each linked device code LLVMIR |
| 70 | +module undergoes processing by the sycl-post-link tool which can run LLVMIR |
| 71 | +passes before passing the module onto the llvm-spirv translator. |
| 72 | + |
| 73 | +There is a `SpecConstants` LLVMIR pass which |
| 74 | +- assigns numeric IDs to the spec constants |
| 75 | +- brings IR to the form expected by the llvm-spirv translator |
| 76 | +- collects and provides \<Symbolic ID\> =\> \<numeric ID\> spec constant information |
| 77 | + to the sycl-post-link tool |
| 78 | +Particularly, it translates intrinsic calls to the |
| 79 | +`T __sycl_getSpecConstantValue*(const char *symbolic_id)` intrinsic into |
| 80 | +calls to `T __spirv_SpecConstant(int ID, T default_val)` intrinsic known to |
| 81 | +the llvm-spirv translator. Where `ID` is the numeric ID of the corresponding |
| 82 | +spec constant, `default_val` is its default value which will be used if the |
| 83 | +constant is not set at the runtime. |
| 84 | + |
| 85 | +After this pass the sycl-post-link tool will output the |
| 86 | +\<Symbolic ID\> =\> \<numeric ID\> spec constant mapping into a file for later |
| 87 | +attaching this info to the device binary image via the offload wrapper tool as |
| 88 | +a property set: |
| 89 | + |
| 90 | +```cpp |
| 91 | +struct pi_device_binary_struct { |
| 92 | +... |
| 93 | + // Array of preperty sets; e.g. specialization constants symbol-int ID map is |
| 94 | + // propagated to runtime with this mechanism. |
| 95 | + pi_device_binary_property_set PropertySetsBegin; |
| 96 | + pi_device_binary_property_set PropertySetsEnd; |
| 97 | +}; |
| 98 | +``` |
| 99 | +
|
| 100 | +SYCL runtime can then load and access info about particular spec constant using |
| 101 | +its name as a key into the appropriate property set (named "SYCL/specialization |
| 102 | +constants"). |
| 103 | +
|
| 104 | +##### Ahead of time compilation |
| 105 | +
|
| 106 | +With AOT everything is simplified - the `SpecConstants` pass simply replaces |
| 107 | +the `__sycl_getSpecConstantValue` calls with constants - default values of |
| 108 | +the spec constant's type. No maps are generated, and SYCL program can't change |
| 109 | +the value of a spec constant. |
| 110 | +
|
| 111 | +#### LLVMIR-SPIR-V translator |
| 112 | +
|
| 113 | +Given the `__spirv_SpecConstant` intrinsic calls produced by the |
| 114 | +`SpecConstants` pass: |
| 115 | +```cpp |
| 116 | +; Function Attrs: alwaysinline |
| 117 | +define dso_local spir_func i32 @get() local_unnamed_addr #0 { |
| 118 | + ; args are "ID" and "default value": |
| 119 | + %1 = tail call spir_func i32 @_Z20__spirv_SpecConstantii(i32 42, i32 0) |
| 120 | + ret i32 %1 |
| 121 | +} |
| 122 | +``` |
| 123 | + |
| 124 | +the translator will generate `OpSpecConstant` SPIR-V instructions with proper |
| 125 | +`SpecId` decorations: |
| 126 | + |
| 127 | +```cpp |
| 128 | + OpDecorate %i32 SpecId 42 ; ID |
| 129 | + %i32 = OpSpecConstant %int 0 ; Default value |
| 130 | + %1 = OpTypeFunction %int |
| 131 | + |
| 132 | + %get = OpFunction %int None %1 |
| 133 | + %2 = OpLabel |
| 134 | + OpReturnValue %i32 |
| 135 | + OpFunctionEnd |
| 136 | +``` |
| 137 | + |
| 138 | +#### SYCL runtime |
| 139 | + |
| 140 | +For each device binary compiler generates a map \<Symbolic ID\> =\> \<numeric ID\> |
| 141 | +("ID map"). The SYCL runtime imports that map when loading device binaries. |
| 142 | +It also maintains another map \<Spec const symbolic ID\> =\> \<its value\> |
| 143 | +("value map") per `sycl::program` object. The value map is updated upon |
| 144 | +`program::set_spec_constant<IDType>(val)` calls from the app. |
| 145 | + |
| 146 | +**_NOTE_** `IDType` gets translated to the symbolic ID using the integration |
| 147 | +header mechanism, similarly to kernel ID type. The reason why |
| 148 | +`__builtin_unique_stable_name` is not used here is because this code is |
| 149 | +compiled by the host compiler, which can be any C++ 14-compatible compiler |
| 150 | +unaware of the clang-specific built-ins. |
| 151 | + |
| 152 | +Before JIT-ing a program, the runtime "flushes" the spec constants: it iterates |
| 153 | +through the value map and invokes the |
| 154 | + |
| 155 | +```cpp |
| 156 | +pi_result piextProgramSetSpecializationConstant(pi_program prog, |
| 157 | + pi_uint32 spec_id, |
| 158 | + size_t spec_size, |
| 159 | + const void *spec_value); |
| 160 | +``` |
| 161 | +
|
| 162 | +Plugin Interface function for each entry, taking the `spec_id` from the ID map. |
| 163 | +
|
| 164 | +## "Plain Old Data" (POD) types support design |
| 165 | +
|
| 166 | +#### Source representation |
| 167 | +
|
| 168 | +Say, the POD type is |
| 169 | +
|
| 170 | +```cpp |
| 171 | +struct A { |
| 172 | + int x; |
| 173 | + float y; |
| 174 | +}; |
| 175 | +
|
| 176 | +struct POD { |
| 177 | + A a[2]; |
| 178 | + int b; |
| 179 | +}; |
| 180 | +``` |
| 181 | + |
| 182 | +and the user says |
| 183 | + |
| 184 | +```cpp |
| 185 | + POD gold{ |
| 186 | + { |
| 187 | + { goldi, goldf }, |
| 188 | + { goldi + 1, goldf + 1 }, |
| 189 | + }, |
| 190 | + goldi |
| 191 | + }; |
| 192 | + |
| 193 | + cl::sycl::ONEAPI::experimental::spec_constant<POD, MyConst> sc = program4.set_spec_constant<MyConst>(gold); |
| 194 | +``` |
| 195 | +
|
| 196 | +#### Compiler |
| 197 | +
|
| 198 | +##### The SpecConstant pass changes |
| 199 | +
|
| 200 | + - The SpecConstants pass in the post-link will have the following IR as input (`sret` conversion is omitted for clarity): |
| 201 | +
|
| 202 | +```cpp |
| 203 | + %spec_const = call %struct.POD __sycl_getCompositeSpecConstantValue<POD type mangling> ("MyConst_mangled") |
| 204 | +``` |
| 205 | + |
| 206 | +where `__sycl_getCompositeSpecConstantValue` is a new "intrinsic" |
| 207 | + (in addition to `__sycl_getSpecConstantValue`) recognized by SpecConstants pass, |
| 208 | + which creates a value of a composite (of non-primitive type) specialization constant. |
| 209 | + It does not need a default value, because its default value consists of default |
| 210 | + valued of its leaf specialization constants (see below). |
| 211 | + |
| 212 | + - after spec constant enumeration (symbolic -\> int ID translation), the SpecConstants pass |
| 213 | + will handle the `__sycl_getCompositeSpecConstantValue`. Given the knowledge of the composite |
| 214 | + specialization constant's type (`%struct.POD`), the pass will traverse its leaf |
| 215 | + fields and generate 5 "primitive" spec constants using already existing SPIR-V intrinsic: |
| 216 | + |
| 217 | +```cpp |
| 218 | +%gold_POD_a0x = call i32 __spirv_SpecConstant(i32 10, i32 0) |
| 219 | +%gold_POD_a0y = call float __spirv_SpecConstant(i32 11, float 0) |
| 220 | +%gold_POD_a1x = call i32 __spirv_SpecConstant(i32 12, i32 0) |
| 221 | +%gold_POD_a1y = call float __spirv_SpecConstant(i32 13, float 0) |
| 222 | +%gold_POD_b = call i32 __spirv_SpecConstant(i32 14, i32 0) |
| 223 | +``` |
| 224 | +
|
| 225 | +And 1 "composite" |
| 226 | +
|
| 227 | +```cpp |
| 228 | + %gold_POD = call %struct.POD __spirvCompositeSpecConstant<POD type mangling>(i32 10, i32 11, i32 12, i32 13, i32 14) |
| 229 | +``` |
| 230 | + |
| 231 | +where `__spirvCompositeSpecConstant<POD type mangling>` is a new SPIR-V intrinsic which |
| 232 | + represents creation of a composite specialization constant. Its arguments are spec |
| 233 | + constant IDs corresponding to the leaf fields of the POD type of the constant. |
| 234 | +Spec ID for the composite spec constant is not needed, as runtime will never use it - it will use IDs of the leaves instead. |
| 235 | + Yet, the SPIR-V specification does not allow `SpecID` decoration for composite spec constants. |
| 236 | + |
| 237 | +##### The post-link tool changes |
| 238 | + |
| 239 | +For composite specialization constants the post link tool will additionally |
| 240 | +generate linearized list of \<leaf spec ID,type,offset,size\> tuples (descriptors), |
| 241 | +where each tuple describes a leaf field, and store it together with the |
| 242 | +existing meta-information associated with the specialization constants and |
| 243 | +passed to the runtime. Also, for a composite specialization constant there is |
| 244 | +no ID map entry within the meta information, and the composite constant is |
| 245 | +referenced by its symbolic ID. For example: |
| 246 | + |
| 247 | +```cpp |
| 248 | +MyConst_mangled [10,int,0,4],[11,float,4,4],[12,int,8,4],[13,float,12,4],[14,int,16,4] |
| 249 | +``` |
| 250 | + |
| 251 | +#### LLVMIR-\>SPIR-V translator |
| 252 | + |
| 253 | +The translator aims to create the following code (pseudo-code) |
| 254 | + |
| 255 | +```cpp |
| 256 | +%gold_POD_a0x = OpSpecConstant(0) [SpecId = 10] |
| 257 | +%gold_POD_a0y = OpSpecConstant(0.0f) [SpecId = 11] |
| 258 | +%gold_POD_a1x = OpSpecConstant(0) [SpecId = 12] |
| 259 | +%gold_POD_a1y = OpSpecConstant(0.0f) [SpecId = 13] |
| 260 | +%gold_POD_b = OpSpecConstant(0) [SpecId = 14] |
| 261 | + |
| 262 | +%gold_POD_a0 = OpSpecConstantComposite( |
| 263 | + %gold_POD_a0x // gold.a[0].x |
| 264 | + %gold_POD_a0y // gold.a[0].y |
| 265 | +) |
| 266 | + |
| 267 | +%gold_POD_a1 = OpSpecConstantComposite( |
| 268 | + %gold_POD_a1x // gold.a[1].x |
| 269 | + %gold_POD_a1y // gold.a[1].y |
| 270 | +) |
| 271 | + |
| 272 | +%gold_POD = OpSpecConstantComposite( |
| 273 | + %gold_POD_a0, |
| 274 | + %gold_POD_a1, |
| 275 | + %gold_POD_b // gold.b |
| 276 | +} |
| 277 | +``` |
| 278 | + |
| 279 | +- First, `OpSpecConstant` instructions are created using already existing mechanism for |
| 280 | +primitive spec constants. |
| 281 | +- Then the translator will handle `__spirvCompositeSpecConstant*` intrinsic. |
| 282 | +It will recursively traverse the spec constant type structure in parallel with |
| 283 | +the argument list - which is a list of primitive spec constant SpecIds. |
| 284 | +When traversing, it will create all the intermediate OpSpecConstantComposite |
| 285 | +instructions as well as the root one (`%gold_POD`) using simple depth-first tree |
| 286 | +traversal with stack. This requires mapping from SpecId decoration number to |
| 287 | +\<id\> of the corresponding OpSpecConstant instruction, but this should be pretty |
| 288 | +straightforward. |
| 289 | + |
| 290 | +#### SYCL runtime |
| 291 | + |
| 292 | +First, when the runtime loads a binary it gets access to specialization |
| 293 | +constant information. So the mapping from a composite spec constant name to |
| 294 | +its constituents (descriptors of leaf fields) generated by the post-link tool |
| 295 | +will be available. |
| 296 | + |
| 297 | +Now, when the program invokes `program4.set_spec_constant<MyConst>(gold)`, |
| 298 | + SYCL runtime converts the call arguments (template and actual) to the following |
| 299 | + pair of datums: |
| 300 | + - the constant name - "MyConst_mangled" |
| 301 | + - the byte array representing the value of the constant (`gold` value) |
| 302 | + |
| 303 | +Then the runtime fetches the sequence of leaf field descriptors (primitive |
| 304 | +constituents) of the composite constant and iterates through each pair invoking |
| 305 | +`piextProgramSetSpecializationConstant` for each. The ID of the constant is |
| 306 | +taken from the sequence, the value - from the byte array obtained for the |
| 307 | +`gold`. |
0 commit comments