intel · pvchupin · Jun 30, 2021
@@ -317,13 +317,18 @@ used:
 The driver passes the `-device skl` parameter directly to the Gen device backend
 compiler `ocloc` without parsing it.
 
-**TBD:** Having multiple code forms for the same target in the fat binary might
-mean invoking device compiler multiple times. Multiple invocations are not
-needed if these forms can be dumped at various compilation stages by the single
-device compilation, like SPIR-V → visa → ISA. But if e.g. `gen9:visa3.2` and
-`gen9:visa3.3` are needed at the same time, then some mechanism is needed.
-Should it be a dedicated target triple for each needed visa version or Gen
-generation?
+`ocloc` is also capable of offline compilation for several ISA
+versions/Gen architectures. For example, to make the device binary
+compatible with all Intel Gen9 GPU platforms, one could use:
+
+```
+-fsycl -fsycl-targets=spir64_gen-unknown-unknown-sycldevice
+-Xsycl-target-backend "-device gen9"
+```
+
+For more details on supported platforms and argument syntax, refer to
+the GPU offline compiler manual: locate your local `ocloc` installation
+and run `ocloc compile --help`.
 
 #### Separate Compilation and Linking
 

@@ -29,21 +29,36 @@ specific build options.
 linked into the final binary, the compilation steps sequence is more
 complicated compared to the usual C++ flow.
 
-In general, we encourage our users to rely on the DPC++ Compiler for handling
-all of the compilation phases "under the hood". However, thorough understanding
-of the above-described steps may allow you to customize your compilation by
-invoking different phases manually. As an example, you could:
-1. preprocess your host code with another C++-capable compiler;
-2. turn to the DPC++ compiler for generating the integration header and
-compiling the device code for the needed target(s);
-3. use your preferred host compiler from 1) to compile your preprocessed host
-code and the integration header into a host object file;
-4. link the host object file and the device image(s) into the final executable.
-
-To learn more about the concepts behind this flow, and the DPC++ Compiler
-internals as such, we welcome you to study our
-[DPC++ Compiler and Runtime architecture design](CompilerAndRuntimeDesign.md)
-document.
+In general, we encourage our users to rely on the DPC++ Compiler for
+handling all of the compilation phases "under the hood". However,
+some use cases call for a third-party compiler for host-side
+compilation. For these, the DPC++ compiler provides the
+`-fsycl-host-compiler=<compiler_name>` option, which names the
+third-party host compiler to use. Usage example:
+
+```
+clang++ -fsycl -fsycl-host-compiler=g++ \
+  -fsycl-host-compiler-options="-g" test.cpp
+```
+
+
+Implicitly, the above command would:
+1. use the DPC++ compiler to compile the device code for the needed
+target(s) and to generate dependencies ("integration files") for the
+host side;
+2. locate your preferred host compiler (`g++`, in this case), then use
+it to compile your host code and the dependency files from 1) into
+host object file(s);
+3. link the device image(s) from 1) and the host object(s) from 2)
+into the final executable.
+
+To learn more about the compiler options mentioned, and the DPC++
+compiler command-line interface in general, please refer to the
+[DPC++ Compiler User Manual](UsersManual.md).
+To learn more about the concepts behind this flow, and the DPC++
+Compiler internals as such, we welcome you to study our
+[DPC++ Compiler and Runtime architecture design](
+CompilerAndRuntimeDesign.md) document.
 
 
 ## Using applications built with DPC++