[SYCL] Fix alignment of emulated specialization constants #6132

npmiller · 2022-05-10T09:53:10Z

This patch fixes #6093 and completes the fix for #5911.

It doesn't change anything for native specialization constants, but correctly aligns emulated specialization constants based on type requirements.

Emulated specialization constants don't use the CompositeOffset mechanism, so this patch re-uses that field to communicate the necessary padding from the compiler pass in sycl-post-link to the runtime, which then applies it to ensure correct alignment.
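To illustrate the idea, here is a minimal sketch of how padding could be computed on the compiler side and consumed on the runtime side. The `SpecConstDesc` and `TypeInfo` structs and both function names are hypothetical and only mirror the description above; they are not the actual sycl-post-link or runtime code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical descriptor. In the patch, the padding travels in the field
// that native spec constants use as CompositeOffset.
struct SpecConstDesc {
  std::uint32_t ID;
  std::uint32_t Padding; // bytes to skip before this constant
  std::uint32_t Size;
};

// Hypothetical size/alignment pair for one specialization constant.
struct TypeInfo {
  std::uint32_t Size;
  std::uint32_t Align;
};

// Compiler side (sketch): derive per-constant padding from alignment.
std::vector<SpecConstDesc>
computeDescriptors(const std::vector<TypeInfo> &Types) {
  std::vector<SpecConstDesc> Descs;
  std::uint32_t Offset = 0;
  std::uint32_t ID = 0;
  for (const TypeInfo &T : Types) {
    assert(T.Align != 0 && "alignment must be non-zero");
    // Bytes needed to round Offset up to the next multiple of Align.
    std::uint32_t Padding = (T.Align - Offset % T.Align) % T.Align;
    Descs.push_back({ID++, Padding, T.Size});
    Offset += Padding + T.Size;
  }
  return Descs;
}

// Runtime side (sketch): rebuild the blob offsets from the descriptors.
std::vector<std::uint32_t>
blobOffsets(const std::vector<SpecConstDesc> &Descs) {
  std::vector<std::uint32_t> Offsets;
  std::uint32_t Offset = 0;
  for (const SpecConstDesc &D : Descs) {
    Offset += D.Padding; // skip the padding before the value
    Offsets.push_back(Offset);
    Offset += D.Size;
  }
  return Offsets;
}
```

For a `char`, an `int`, and a `double` declared in that order, this sketch places the values at offsets 0, 4, and 8 instead of the packed 0, 1, and 5.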

With this patch the SYCL-CTS specialization constant tests are all passing with the CUDA plugin:

% ./bin/test_specialization_constants  
===============================================================================
All tests passed (56 assertions in 46 test cases)

Note this is on top of #6125

aelovikov-intel · 2022-05-10T18:26:59Z

It seems that some changes are done in sync in both sycl-post-link and inside SYCL runtime. What would be the behavior if we'd use the compilation toolchain (including sycl-post-link) before this change and runtime after it? Is it correct to assume that nothing that wouldn't be failing with old runtime would fail in this scenario?

npmiller · 2022-05-11T08:34:45Z

> It seems that some changes are done in sync in both sycl-post-link and inside SYCL runtime. What would be the behavior if we'd use the compilation toolchain (including sycl-post-link) before this change and runtime after it? Is it correct to assume that nothing that wouldn't be failing with old runtime would fail in this scenario?

I believe that would work fine, yes.

This patch doesn't change the format used to communicate specialization constants between the compilation toolchain and the runtime. It simply re-uses a field that is unused for emulated spec constants to carry the padding, and that field was previously always set to 0 for emulated specialization constants.

So the new runtime using an old binary will read that field and add it as padding, but since it's always 0, it will end up with the exact same layout as the old runtime.
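The compatibility argument can be sketched as follows. This is an illustrative model only: `EmulatedDesc`, `oldRuntimeOffsets`, and `newRuntimeOffsets` are hypothetical names, and `Field` stands in for the re-used descriptor field.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical descriptor for an emulated spec constant. `Field` is the
// re-used slot: always 0 in old binaries, padding in new ones.
struct EmulatedDesc {
  std::uint32_t ID;
  std::uint32_t Field;
  std::uint32_t Size;
};

// Old runtime (sketch): ignores Field and packs values back to back.
std::vector<std::uint32_t>
oldRuntimeOffsets(const std::vector<EmulatedDesc> &Descs) {
  std::vector<std::uint32_t> Offsets;
  std::uint32_t Offset = 0;
  for (const EmulatedDesc &D : Descs) {
    Offsets.push_back(Offset);
    Offset += D.Size;
  }
  return Offsets;
}

// New runtime (sketch): treats Field as padding before each value.
std::vector<std::uint32_t>
newRuntimeOffsets(const std::vector<EmulatedDesc> &Descs) {
  std::vector<std::uint32_t> Offsets;
  std::uint32_t Offset = 0;
  for (const EmulatedDesc &D : Descs) {
    Offset += D.Field; // 0 for binaries built before this patch
    Offsets.push_back(Offset);
    Offset += D.Size;
  }
  return Offsets;
}
```

When every `Field` is 0, both functions return the same offsets, which is why a binary built before this patch keeps its layout under the new runtime.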

asudarsa · 2022-05-12T01:42:25Z

sycl/source/detail/program_manager/program_manager.cpp

-                  NativePrg, SpecIDDesc.ID, SpecIDDesc.Size,
-                  SpecConsts.data() + SpecIDDesc.BlobOffset);
-            }
+    if (!DeviceCodeWasInCache &&


I do not see any changes (other than some format changes) here. Can we do away with this change? Thanks

There is a change: the whole block is now conditioned on InputImpl->get_bin_image_ref()->supportsSpecConstants(), whereas before only the call to enableITTAnnotationsIfNeeded was. This is necessary because piextProgramSetSpecializationConstant is only supposed to be called for native specialization constants.

It's a little hard to see because of the way it was written originally, "one-line" if with no brackets followed by a scope.
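A minimal sketch of the pitfall, with placeholder strings standing in for the real calls (the function names and the recorded strings are hypothetical, not the code in program_manager.cpp):

```cpp
#include <string>
#include <vector>

// Shape of the code before the change: the brace-less `if` guards only the
// first statement; the scope that follows always runs.
std::vector<std::string> beforeShape(bool SupportsSpecConstants) {
  std::vector<std::string> Calls;
  if (SupportsSpecConstants)
    Calls.push_back("annotate");
  {
    Calls.push_back("set_spec_constant");
  }
  return Calls;
}

// Shape after the change: both statements are guarded by the condition.
std::vector<std::string> afterShape(bool SupportsSpecConstants) {
  std::vector<std::string> Calls;
  if (SupportsSpecConstants) {
    Calls.push_back("annotate");
    Calls.push_back("set_spec_constant");
  }
  return Calls;
}
```

With the condition false, `beforeShape` still records `set_spec_constant`, while `afterShape` records nothing.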

Also note that this is part of #6125; maybe I should merge the two PRs.

asudarsa · 2022-05-12T01:44:09Z

sycl/source/detail/device_image_impl.hpp

+            if (It[0] != std::numeric_limits<std::uint32_t>::max()) {
+              // The map is not locked here because updateSpecConstSymMap() is
+              // only supposed to be called from c'tor.
+              MSpecConstSymMap[std::string{SCName}].push_back(


This call can be moved out of the if-then-else (nitpick). Thanks

Extracted it, and also reworked this part of the code to make it more readable, mainly by renaming the iterators.