-
Notifications
You must be signed in to change notification settings - Fork 787
[Driver][SYCL] Improve dependency file behaviors for -fintelfpga #1145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Driver][SYCL] Improve dependency file behaviors for -fintelfpga #1145
Conversation
This patch improves the tool's diagnostic upon finding a SPIR kernel within an LLVM module. Despite that the tool's only current use is within the SYCL FPGA flow, it's important to make the message target-agnostic, so that the tool is not tied to a particular device BE. A related commit to the Clang driver has extended these diagnostics with SYCL FPGA specifics without affecting the tool itself. This patch also introduces testing for the return code value. For example, this should allow the Clang driver users/developers to differentiate between the two possible causes of llvm-no-spir-kernel failure. Signed-off-by: Artem Gindinson <[email protected]>
Signed-off-by: Alexey Bader <[email protected]>
intel#1141) Signed-off-by: Aleksander Fadeev <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, just a couple more nits
// RUN: %clangxx -### -fsycl -fintelfpga %t-1.o %t-2.o 2>&1 \ | ||
// RUN: | FileCheck -check-prefix=CHK-FPGA-DEP-FILES-OBJ -DINPUT1=%t-1.o -DINPUT2=%t-2.o %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a clang-cl run be also applicable? It may be worth adding as per the usual "just in case"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a clang_cl test here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add it for other tests? Sorry for missing this initially
// RUN: %clangxx -### -fsycl -fintelfpga %t-1.o %t-2.o 2>&1 \ | ||
// RUN: | FileCheck -check-prefix=CHK-FPGA-DEP-FILES-OBJ -DINPUT1=%t-1.o -DINPUT2=%t-2.o %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add it for other tests? Sorry for missing this initially
Move internal headers from include/CL/sycl to source directory to prevent implementation details leak to user application and enforce stable ABI. A few more changes were applied to make the movement possible: - addHostAccessorAndWait functions in accessor to avoid calls to RT internals from header file - Removed getImageInfo - Move buffer size acquisition from buffer constructor to SYCLMemObjT cpp to avoid calls to PI - getPluginFromContext function in context - Standard containers replaced with SYCL variants in sycl_mem_obj_i.hpp. Unique ptr replaced with shared - A few implementations moved from queue.hpp to queue.cpp - Some LIT tests temporarily include implementaion specific headers. They will be converted to unit tests later. Signed-off-by: Alexander Batashev <[email protected]>
intel#1144) Since we really just want to be able to memcpy the type to the device, 'is-trivially-copyable' is not the correct trait. Since CWG1734, If we want to support trivially copyable types, we would be required to create 1 of 4 different mechanisms for having a type on the device (depending on the way the type is structured). Additionally, 2 of these ways require us to ALSO have the type be default constructible. This patch transitions to trivially-copy-constructible , so that we can simply memcpy from the existing one into new memory. Signed-off-by: Erich Keane <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
intel#1118) Signed-off-by: James Brodman <[email protected]>
LowerWGScope pass performs required transformations to enable hierarchical parallelism semantics. This pass should not be skipped even if optimizations are disabled. Also some typos in the comments are fixed. Signed-off-by: Artur Gainullin <[email protected]>
…el#1156) After intel#1068 has included the Demangle header, this fix to CMakeLists should guarantee successful builds in all configurations Signed-off-by: Artem Gindinson <[email protected]>
SPIR-V OpGroupBroadcast accepts three forms of local ID: - scalar integer - vector integer with 2 components - vector integer with 3 components Signed-off-by: John Pennycook <[email protected]>
Also remove idle semicolon. Signed-off-by: Alexey Bader <[email protected]>
…#1162) Fix the cl_device_unified_shared_memory_capabilities_intel bitfield type name. Signed-off-by: Alexey Bader <[email protected]>
* [SYCL][LIBCLC] Additional libclc builtins to support SYCL work Adds builtins to libclc to support the CUDA backend for SYCL. Contributors Alexander Johnston <[email protected]> David Wood <[email protected]> Victor Lomuller <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] CMake and lit support for SYCL CUDA backend Adds defines CMake and lit variables used for SYCL CUDA backend development and test Contributors Alexander Johnston <[email protected]> Bjoern Knafla <[email protected]> Ruyman Reyes <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Local Accessor Support for CUDA Provides the LocalAccessorToSharedMemory compiler pass required for supporting SYCL local accessors in CUDA. Contributors Alexander Johnston <[email protected]> David Wood <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Change __spirv_BuiltIn.. to functions Changes the following builtins to functions __spirv_BuiltInGlobalSize __spirv_BuiltInWorkgroupSize __spirv_BuiltInNumWorkgroups __spirv_BuiltInLocalInvocationId __spirv_BuiltInWorkgroupId __spirv_BuiltInGlobalOffset Contributors David Wood <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Add SYCL CUDA support to clang driver Adds CUDA support for sycl compilation in the clang driver Contributors Alexander Johnston <[email protected]> David Wood <[email protected]> Victor Lomuller <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Initial Implementation of the CUDA backend Contributors Alan Forbes <[email protected]> Alexander Johnston <[email protected]> Bjoern Knafla <[email protected]> Daniel Soutar <[email protected]> David Wood <[email protected]> Kumudha Narasimhan <[email protected]> Mehdi Goli <[email protected]> Przemek Malon <[email protected]> Ruyman Reyes <[email protected]> Stuart Adams <[email protected]> Svetlozar Georgiev <[email protected]> Steffen Larsen <[email protected]> Victor Lomuller <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Update libclc install rules Have libclc install clc-* and libspirv-* to lib and share Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Inline cl namespace to simplify SYCL API usage Synchronise the CUDA backend with the general SYCL changes from intel#974. Signed-off-by: Andrea Bocci <[email protected]> * Added missing flags for device-side builtins Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Removing unnecessary tool from the tree Acked-by: Victor Lomuller <[email protected]> Signed-off-by: Ruyman <[email protected]> * [SYCL][PI] Fix kernel group info parameter conversion Signed-off-by: Steffen Larsen <[email protected]> * [SYCL][CUDA] Refactor __SYCL_INLINE macro Synchronise the CUDA backend with the general SYCL changes from intel#1121. Signed-off-by: Andrea Bocci <[email protected]> * [SYCL] Have default_selector consider SYCL_BE Have the default_selector consider the env var SYCL_BE when rating device scores to make choosing a backend easier. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Select GlobalPlugin based on SYCL_BE Rather than choose the last found plugin as GlobalPlugin, select it depending on the SYCL_BE env var. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Improve default device selection checks Better checks for CUDA and OpenCL devices to match with SYCL_BE in the default device selection, based on the platform version info. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Formatting update for device_selector.cpp Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Changed CUDA unit tests to call through plugin Signed-off-by: Steffen Larsen <[email protected]> * [SYCL] Pass SYCL_BE=PI_OPENCL in check-sycl To ensure that the check-sycl targets test OpenCL devices, pass SYCL_BE=PI_OPENCL. This mirrors the check-sycl-cuda target which passes SYCL_BE=PI_CUDA. Without this it is nondeterministic which device is tested by check-sycl. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Remove PI_CUDA specific details from clang Removes PI_CUDA specific code paths and tests from clang, opting to always enable them. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Disable linear_id/opencl-interop.cpp for cuda Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Further fixes to CUDA device selection Fix platform string comparison for CUDA platform detection. Fix device info platform query so that it uses the device's plugin, rather than the GlobalPlugin. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Code style and cleanup to CUDA support Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Enable asserts in all buildbot builds Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Minor test and build configuration Fix minor test and build configuration issues introduced in the development of the CUDA backend. Signed-off-by: Alexander Johnston <[email protected]> Co-authored-by: Andrea Bocci <[email protected]> Co-authored-by: Ruyman <[email protected]> Co-authored-by: Steffen Larsen <[email protected]>
Signed-off-by: Alexey Bader [email protected] Co-Authored-By: Alexander Batashev <[email protected]>
Error was reproducible in two cases: - using something like `numeric_limits<half>::min()` in within another `constexpr` - not treating SYCL headers as system ones with `-Winvalid-constexpr` treated as error Signed-off-by: Alexey Sachkov <[email protected]>
Signed-off-by: Sergey Kanaev <[email protected]>
Event type triggers are misspelled "open"->"opened", etc. Default event type triggers should work fine. Signed-off-by: Alexey Bader <[email protected]>
…1053) We had issue with wrong mangling of s_upsample. I fixed it a long time ago, so we can delete workaround now. Signed-off-by: Ilya Mashkov <[email protected]>
Signed-off-by: Igor Dubinov <[email protected]>
When compiling AOT for FPGA dependency files are generated that are used by the aoc compilation. Single step compilation is seamless as the dependency file is generated then immediately used. When compiling to object, keeping track of the dependency file is not so intuitive. To alleviate problems of not being able to find the dependency file, the original dependency file is bundled up with the destination fat object and when used is unbundled and passed to the aoc compilation. Signed-off-by: Michael D Toguchi <[email protected]>
Signed-off-by: Michael D Toguchi <[email protected]>
Signed-off-by: Michael D Toguchi <[email protected]>
Use new FPGA dependency type which is used for unbundling of the FPGA dependency from an object. Signed-off-by: Michael D Toguchi <[email protected]>
Signed-off-by: Michael D Toguchi <[email protected]>
Use TY_FPGA_Dependencies for generated file from unbundle Clean up comments and add clang_cl specific test for dep unbundle Signed-off-by: Michael D Toguchi <[email protected]>
Signed-off-by: Michael D Toguchi <[email protected]>
Signed-off-by: Michael D Toguchi <[email protected]>
Signed-off-by: Michael D Toguchi <[email protected]>
@mdtoguchi, it looks like something went wrong with your branch. It has irrelevant commits already present in the target branch. |
whoa. I don't know what happened. I rebased to fix the conflict. I'll start from scratch. |
Created a new PR: #1186 |
redo has been merged, closing this one. |
It is translated to a function with unmangled name __spirv_BuildNDRange_{1|2|3}D with struct return parameter and array arguments, since translator only translates it properly to SPIR-V with this signature. _ND postfix is requred because array arguments are mangled in the same way, so if there was no postfix, translator would produce functions with same name for different dimensions. Original commit: KhronosGroup/SPIRV-LLVM-Translator@a6ca745
When compiling AOT for FPGA dependency files are generated that are used
by the aoc compilation. Single step compilation is seamless as the dependency
file is generated then immediately used. When compiling to object, keeping
track of the dependency file is not so intuitive.
To alleviate problems of not being able to find the dependency file, the
original dependency file is bundled up with the destination fat object and
when used is unbundled and passed to the aoc compilation.
Signed-off-by: Michael D Toguchi [email protected]