[SYCL][CUDA] Enable multiple CUDA arch in the same SYCL build. #2087
The patch refactors SYCL offloading to account for the bound arch as well as the toolchain.
Compilation targets are represented as a pair of a toolchain (one toolchain per target triple) and a bound arch (which encodes the target arch).
The LTO step is now performed per toolchain/bound arch pair, and the produced binaries are then bundled per toolchain.
The patch also makes the bundler/unbundler use the bound arch when SYCL offloading is enabled.
Signed-off-by: Victor Lomuller <[email protected]>
Co-authored-by: Alexander Johnston <[email protected]>
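The target model described above can be sketched outside the driver as follows. This is only an illustration under stated assumptions: `TargetInfo`, `linkDeviceImage`, and `bundlePerToolchain` are hypothetical names, not the clang driver's API, and the link step just returns a placeholder file name instead of running LTO.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's per-target record: one toolchain
// (identified here by its target triple) plus a bound arch. The bound arch
// is left empty for targets that do not need one.
struct TargetInfo {
  std::string Triple;
  std::string BoundArch;
};

// Placeholder for the device link (LTO) step, run once per
// toolchain/bound arch pair. It only builds a fake image name.
std::string linkDeviceImage(const TargetInfo &T) {
  return T.BoundArch.empty() ? T.Triple + ".img"
                             : T.Triple + "-" + T.BoundArch + ".img";
}

// Link every pair separately, then group the resulting images by
// toolchain so that each toolchain gets a single bundle.
std::map<std::string, std::vector<std::string>>
bundlePerToolchain(const std::vector<TargetInfo> &Targets) {
  std::map<std::string, std::vector<std::string>> Bundles;
  for (const TargetInfo &T : Targets)
    Bundles[T.Triple].push_back(linkDeviceImage(T));
  return Bundles;
}
```

With two CUDA archs on the same triple and one SPIR-V target, this yields two bundles: one holding both CUDA images and one holding the SPIR-V image. The triples used in such a call are illustrative only.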
451bbd3 to fe2b7e5
Co-authored-by: Artem Gindinson <[email protected]>
Signed-off-by: Victor Lomuller <[email protected]>
@mdtoguchi, @AGindinson, please take a look at this change.
auto &LI = LinkInputEnum.value();
const ToolChain *TC = SYCLTargetInfoList[LinkInputEnum.index()].TC;
const char *BoundArch =
    SYCLTargetInfoList[LinkInputEnum.index()].BoundArch;
This `for` loop is pretty big; would it be better to assign `BoundArch` closer to where it is used, for clarity? I guess you could look at it either way, since `LinkInputEnum` would be lost at the `BoundArch` use.
// Note: this odd, but the test assert that a integration header
// is build no matter what. If only fsycl-add-targets is provided,
Suggested change:
- // Note: this odd, but the test assert that a integration header
- // is build no matter what. If only fsycl-add-targets is provided,
+ // Note: this is odd, but the test asserts that an integration header
+ // is built no matter what. If only fsycl-add-targets is provided,
Work around the unsupported freeze insn by:
- replacing uses of freeze's result with freeze's source, or with a random (but compilation-reproducible) constant if freeze's source is undef/poison;
- deleting the freeze insn.
The long-term solution is to add a freeze instruction extension in SPIR-V. The issue is tracked in (#1140).
Signed-off-by: Lu, John <[email protected]>
Original commit: KhronosGroup/SPIRV-LLVM-Translator@ed25856