Skip to content

[SYCL] Implement specialization constants. #1356

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 27, 2020

Conversation

kbobrovs
Copy link
Contributor

@kbobrovs kbobrovs commented Mar 20, 2020

Implementation is done according to:

Known deficiencies:

  • the SPIRV interop part is not implemented yet
  • only primitive types are supported
  • specific spec constant setting has global effect - i.e. all programs which depend on it will use the latest set value; that's might be not what the specification proposal meant; will be fixed either with this proposed spec or with the upcoming new one

The first three commits are not really SYCL specific and separate PRs will be created. Those are also planned to go to llvm upstream.
property I/O: #1357
table library: #1358
file-table-tform: #1359

A lot of complexity stems from the fact that SPIRV identifies specialization constants by integer ID. It is impossible to assign integer IDs at translation unit compilation time because of separate compilation and potential ID conflict between TUs. So, clang instead generates special intrinsic aka get_spec_constant_value_by_name("spec constant name generated from its type ID into int header") .

Then after linkage, the sycl-post-link tool
(1) runs an LLVMIR pass which assigns integer ID to each spec constant and replaces the get_spec_constant_value_by_name intrinsic with a standard SPIRV intrinsic to get spec constant value with integer ID
(2) saves a map "name"->"int ID" into a file which is later inserted by the clang offload wrapper into the device binary descriptor created for device code (SPIRV module).

At runtime, SYCL runtime receives a request to set a spec constant with given name to some value , uses the "name"->ID map to resolve it to an integer and call piProgramSetSpecializationConstant with the corresponding int ID.

To store the "name"->ID map a "property set" library is developed, which can store arbitrary sets of properties. ClangOffloadWrapper is able to parse them and insert into the wrapper object. Later entries, manifest, build options, etc (as well as future information accompanying the device code) can move to this mechanism instead of having separate fields in the binary descriptor.

Another complexity is clang driver's inability to conveniently model multiple dynamic inputs/outputs in the action graph. The file-table-tform tool fills the gap.
See also comments in the c9a794c.

Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review for post-link part.
Waiting for tests for new functionality.

@kbobrovs kbobrovs force-pushed the spec-const-ALL branch 2 times, most recently from 2223f8c to dc40485 Compare March 23, 2020 03:53
@kbobrovs
Copy link
Contributor Author

kbobrovs commented Mar 23, 2020

@kbobrovs kbobrovs force-pushed the kbobrovs:spec-const-ALL branch from 2223f8c to dc40485 2 minutes ago:

  • Rebased to origin/sycl
  • Applied clang format
  • Added spec_const_redefine.cpp test (under XFAIL for now)
  • Re-enabled spec_const_hw.cpp on CPU device (required llvm-spirv fixes are now in place)

@kbobrovs kbobrovs force-pushed the spec-const-ALL branch 2 times, most recently from 4e0b174 to 316a855 Compare March 24, 2020 00:29
@kbobrovs
Copy link
Contributor Author

Dear reviewers - @AGindinson, @erichkeane, @AlexeySachkov, @Fznamznon, @sndmitriev, @smaslov-intel, @romanovvlad, @mdtoguchi, @sergey-semenov - please take another look. All prior comments have been addressed to the best of my knowledge.

@kbobrovs
Copy link
Contributor Author

@kbobrovs kbobrovs force-pushed the kbobrovs:spec-const-ALL branch from 4e0b174 to 316a855 38 minutes ago

rebase, squashed deploy dependence commit, replaced first three commits with updated after code review. 2 review comments from @erichkeane to be fixed in FE

@kbobrovs
Copy link
Contributor Author

kbobrovs force-pushed the kbobrovs:spec-const-ALL branch from 316a855 to 8766334 3 hours ago

two review comments in FE (SemaSYCL.cpp) addressed

Copy link
Contributor

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of small comments on the CFE parts, otherwise LGTM. Looks like others will have to approve as most of this is not CFE.

@kbobrovs kbobrovs force-pushed the spec-const-ALL branch 2 times, most recently from 49fa9bc to 3324fb3 Compare March 25, 2020 03:47
@kbobrovs
Copy link
Contributor Author

@kbobrovs kbobrovs force-pushed the kbobrovs:spec-const-ALL branch from 49fa9bc to 3324fb3 6 minutes ago

addressed review comments from @erichkeane , @Fznamznon , @mdtoguchi

@kbobrovs
Copy link
Contributor Author

@bader, when merging, please merge these first:
property I/O: #1357
table library: #1358
file-table-tform: #1359
then this one, excluding the first three:
@kbobrovs
Implement a property set I/O library. …
0f04616
@kbobrovs
Implement a simple string tabular data management library. …
36dff1c
@kbobrovs
Add a file-table-tform tool: manipulating tabular string data files. …
745170a

let me know if you need updates from me

Copy link
Contributor

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CFE Changes LGTM.

Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl-post-link part LGTM

@Fznamznon
Copy link
Contributor

BTW, It would be nice to have spec constants design explanation somewhere in this doc https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md someday.

@kbobrovs
Copy link
Contributor Author

@Fznamznon, @sndmitriev, @smaslov-intel, @romanovvlad, @mdtoguchi - please approve post-link, offload wrapper, PI, RT and driver changes if you are OK with current code

@Fznamznon
Copy link
Contributor

@Fznamznon, @sndmitriev, @smaslov-intel, @romanovvlad, @mdtoguchi - please approve post-link, offload wrapper, PI, RT and driver changes if you are OK with current code

Post link part is approved #1356 (review)

@mdtoguchi
Copy link
Contributor

LGTM for Driver. Thanks!

sndmitriev
sndmitriev previously approved these changes Mar 26, 2020
Copy link
Contributor

@sndmitriev sndmitriev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-offload-wrapper changes look okay,

smaslov-intel
smaslov-intel previously approved these changes Mar 26, 2020
Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PI-related changes look good to me.

@kbobrovs
Copy link
Contributor Author

BTW, It would be nice to have spec constants design explanation somewhere in this doc https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md someday.

yes, agree. But let's do this after the API and the implementation is finalized.

@kbobrovs kbobrovs dismissed stale reviews from smaslov-intel and sndmitriev via eb382f7 March 27, 2020 04:26
@kbobrovs
Copy link
Contributor Author

@kbobrovs kbobrovs force-pushed the kbobrovs:spec-const-ALL branch from 5ef5cae to eb382f7 5 minutes ago

resolved a conflict in comment due to change in #1359

Copy link
Contributor

@romanovvlad romanovvlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Minor comment.

Add a spec constant lowering pass to sycl-post-link tool. Support
file table output format.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
New PI API added:
pi_result piProgramSetSpecializationConstant(pi_program prog, pi_uint32 spec_id,
                                             size_t spec_size,
                                             const void *spec_value);

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
…I version.

This change breaks backward binary compatibility of device binary image descriptors.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
…d wrapper.

1. New option - "-properties=<file>". <file> must be a property set registry
file, as defined by llvm/Support/PropertySetIO.h. The wrapper will add the
property sets to the binary image descriptor and the them available to the
runtime.

2. New options - "-batch". With this option the only input can be a file table,
as defined by llvm/Support/SimpleTable.h. Column names are a part of interface
between this tool and the sycl-post-link, which produces the file table.

3. Binary image descriptor LLVM type updated to resemble changes in Plugin
Interface v1.2.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
1. Detect kernel lambda object captures corresponding to specialization
constants and (a) don't create kernel arguments for them (b) generate
specializations of the SpecConstantInfo structure into the integration
header.

2. Recognize the __sycl_fe_getStableUniqueTypeName intrinsic and replace
it with a string literal uniquely identifying the type of the typename
template parameter to this intrinsic.

3. FE-related changes in the runtime:
- new SpecConstantInfo templated struct for type->name translation for
  specialization constants used by integration header
- define the __sycl_fe_getStableUniqueTypeName intrinsic

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Based on
https://github.com/codeplaysoftware/standards-proposals/blob/master/spec-constant/index.md

1. Define SYCL API (sycl/include/CL/sycl/experimental/spec_constant.hpp)
2. Add convenience C++ wrappers for PI device binary structures and refactor
   runtime to use the wrappers. Get rid of custom deleters for binary images.
3. Implement SYCL spec constant APIs in program an program manager.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
Clang driver's design can't handily model
(1) multiple inputs/outputs in the action graph. Because of that, for
example, sycl-post-link tool is invoked twice - once to to split the code
and produce multiple bitcode files, and secondly - to generate symbol
files for the split modules.
(2) "Clusters" of inputs/outputs, when subsets of inputs/outputs are
associated and describe different aspects of the same data. Example of
such clustering is the split module + its symbol file above. Clustering
would require support both in the driver and the tools invoked in
response to actions.

This commit moves SYCL offload processing to the "file table concept."
sycl-post-link instead of
(1) being invoked n times, once per each output type requested (once for
    device split and once for symbol file generation)
(2) outputting multiple file lists each listing outputs from the
    corresponding invocation above
is now invoked once and produces single file table output. E.g.
  [Code|Symbols|Properties]
  a_0.bc|a_0.sym|a_0.props
  a_1.bc|a_1.sym|a_1.props
This solves both problems - multiple input/output and clustering.
Combined with the file-table-tform tool, this allows for efficent handling
of multiple clusters of files (each represented as a row in the table file)
in the clang driver infrastructure.
For example, there is a real offload processing problem:
step1. sycl-post-link outputs N clusters of files
step2. "Code" file of each cluster resuilting from step1 ({a_0.bc, a_1.bc}
       in the example above) must undergo further transformations -
       translation to SPIRV and optional ahead-of-time compilation.
step3. In each cluster resulting from step1 the "Code" file needs to be
       replaced with the result of step2
step4. All the clusters are processed by the ClangOffloadWrapper tool, which
       needs to know how files are distributed into clusters and what is
       the roles of each file in a cluster - whether it is "Code", "Symbol"
       or "Properties".
To solve this, the following action graph is constructed in the clang driver:

                        column:"Code"
t1 -> [file-table-tform:extract column] -> t1a -> [for-each:] -> t1b
                                                  llvm-spirv
                                                   aot-comp
t1
   \                  column:"Code"
    [file-table-tform:replace column] -> t2 -> [ClangOffloadWrapper]
   /
t1b

where t1b is ["Code"]  and t2 is  [Code|Symbols|Properties]
              a_0.bin             a_0.bin|a_0.sym|a_0.props
              a_1.bin             a_1.bin|a_1.sym|a_1.props

Note that the graph does not change with growing number of clusters, neither
it changes when more files are added to each cluster (e.g. a "Manifest" file).

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
CPU OpenCL Runtime on build machines is not updated yet.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@kbobrovs
Copy link
Contributor Author

@bader, I've got approvals from component owners - please take a look and merge.

@pvchupin pvchupin self-requested a review March 27, 2020 19:57
Copy link
Contributor

@pvchupin pvchupin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synced with @kbobrovs. CPU test disabled because requires CPU RT uplift required. He will enable it with followup commit. Other approvals taken.

@pvchupin pvchupin merged commit 29abe37 into intel:sycl Mar 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants