[SYCL][CUDA] Enable multiple CUDA arch in the same SYCL build. #2087

Naghasan · 2020-07-10T11:32:00Z

The patch refactors SYCL offloading to account for the bound arch as well as the toolchain.
Each compilation target is represented as a pair of a toolchain (one toolchain per target triple) and a bound arch (which encodes the target arch).

The LTO step is now performed per toolchain/bound arch pair, and
the produced binaries are then bundled per toolchain.

The patch also makes the bundler/unbundler use the bound arch when SYCL offloading is enabled.

Signed-off-by: Victor Lomuller [email protected]
Co-authored-by: Alexander Johnston [email protected]
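
To make the pairing described above concrete, here is a minimal, self-contained sketch. It is not the driver code from this patch: `ToolChain` is a hypothetical stand-in for the real class, `linkDeviceImage` and `bundlePerToolChain` are made-up helper names, and `BoundArch` is simplified to a `std::string`. It only illustrates linking once per toolchain/bound arch pair and then grouping the results per toolchain.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a driver toolchain; one exists per target triple.
struct ToolChain {
  std::string Triple;
};

// A compilation target: a toolchain plus a bound arch. The bound arch is left
// empty when the toolchain has no arch variants (an assumption of this sketch).
struct SYCLTargetInfo {
  const ToolChain *TC;
  std::string BoundArch;
};

// Stand-in for the per-target link (LTO) step: yields one image name per pair.
std::string linkDeviceImage(const SYCLTargetInfo &Target) {
  assert(Target.TC && "every target needs a toolchain");
  return Target.BoundArch.empty() ? Target.TC->Triple
                                  : Target.TC->Triple + "-" + Target.BoundArch;
}

// Link once per toolchain/bound arch pair, then group the images per
// toolchain so that each toolchain ends up with a single bundle.
std::map<const ToolChain *, std::vector<std::string>>
bundlePerToolChain(const std::vector<SYCLTargetInfo> &Targets) {
  std::map<const ToolChain *, std::vector<std::string>> Bundles;
  for (const SYCLTargetInfo &Target : Targets)
    Bundles[Target.TC].push_back(linkDeviceImage(Target));
  return Bundles;
}
```

With two targets that share one toolchain but differ in bound arch (for example `sm_50` and `sm_70`), the sketch produces two images that land in the same bundle.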

bader · 2020-07-22T17:22:26Z

@mdtoguchi, @AGindinson, please, take a look at this change.

clang/test/Driver/sycl-offload-nvptx.cpp

mdtoguchi · 2020-07-22T21:09:00Z

clang/lib/Driver/Driver.cpp

+        auto &LI = LinkInputEnum.value();
+        const ToolChain *TC = SYCLTargetInfoList[LinkInputEnum.index()].TC;
+        const char *BoundArch =
+            SYCLTargetInfoList[LinkInputEnum.index()].BoundArch;


This for loop is pretty big; would it be better to assign BoundArch closer to where it is used, for clarity? I guess you could look at it either way, since LinkInputEnum would no longer be available at the point where BoundArch is used.
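
One possible middle ground, sketched under assumptions rather than taken from the patch: bind the matching target entry once at the top of the loop while the index is in scope, and read `BoundArch` from it only at the point of use. `ToolChain`, `describeLinkInputs`, and the label format below are hypothetical; only the `TC` and `BoundArch` field names come from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins; only the TC/BoundArch field names come from the diff.
struct ToolChain {
  std::string Triple;
};
struct SYCLTargetInfo {
  const ToolChain *TC;
  const char *BoundArch; // may be null when no arch is bound
};

// Builds one label per link input, pairing it with its target entry.
std::vector<std::string>
describeLinkInputs(const std::vector<std::string> &LinkInputs,
                   const std::vector<SYCLTargetInfo> &SYCLTargetInfoList) {
  assert(LinkInputs.size() == SYCLTargetInfoList.size());
  std::vector<std::string> Labels;
  for (std::size_t Index = 0; Index < LinkInputs.size(); ++Index) {
    // Bind the entry once, while the index is still in scope.
    const SYCLTargetInfo &TargetInfo = SYCLTargetInfoList[Index];

    // ... a long loop body would sit here ...

    // Read BoundArch from the bound entry only where it is needed.
    std::string Label = LinkInputs[Index] + " -> " + TargetInfo.TC->Triple;
    if (TargetInfo.BoundArch)
      Label += std::string(" (") + TargetInfo.BoundArch + ")";
    Labels.push_back(Label);
  }
  return Labels;
}
```

This keeps a single lookup next to the loop header and avoids carrying two separate locals through the whole body.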

mdtoguchi · 2020-07-22T21:11:41Z

clang/lib/Driver/Driver.cpp

+            // Note: this odd, but the test assert that a integration header
+            // is build no matter what. If only fsycl-add-targets is provided,


Suggested change

-            // Note: this odd, but the test assert that a integration header
-            // is build no matter what. If only fsycl-add-targets is provided,
+            // Note: this is odd, but the test asserts that an integration header
+            // is built no matter what. If only fsycl-add-targets is provided,

Work around the unsupported freeze insn by replacing uses of freeze's result with freeze's source (or with a random, but compilation-reproducible, constant if freeze's source is undef/poison) and then deleting the freeze insn. The long-term solution is to add a freeze instruction extension to SPIR-V. The issue is tracked in (#1140). Signed-off-by: Lu, John <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@ed25856

Naghasan requested review from AGindinson and mdtoguchi as code owners July 10, 2020 11:32

Naghasan force-pushed the victor/cuda-multi-arch branch from 451bbd3 to fe2b7e5 July 10, 2020 11:36

Ruyk added the cuda CUDA back-end label Jul 10, 2020

Ruyk requested a review from bader July 10, 2020 11:41

AGindinson reviewed Jul 10, 2020

clang/lib/Driver/Driver.cpp Outdated

clang/test/Driver/sycl-offload.c Outdated

clang/lib/Driver/Driver.cpp Outdated

Naghasan and others added 2 commits July 13, 2020 15:39

Apply suggestions from code review

bc459ba

Co-authored-by: Artem Gindinson <[email protected]>

[SYCL][CUDA] Use empty() instead of size()

190032a

Signed-off-by: Victor Lomuller <[email protected]>

bader requested a review from AGindinson July 16, 2020 16:33

bader approved these changes Jul 22, 2020

mdtoguchi reviewed Jul 22, 2020

clang/test/Driver/sycl-offload-nvptx.cpp

mdtoguchi reviewed Jul 22, 2020

AidanBeltonS mentioned this pull request Sep 28, 2021

[SYCL] Target ordering breaks compilation #3631

Closed

github-actions bot added the Stale label Feb 18, 2022

github-actions bot closed this Mar 21, 2022
