[Driver][SYCL]Emit an error if c compilation is forced using -x c or -x c-header when -fsycl mode is used #1416
Closed
CONFLICT (content): Merge conflict in clang/lib/Sema/Sema.cpp
This patch improves the tool's diagnostic upon finding a SPIR kernel within an LLVM module. Although the tool is currently used only within the SYCL FPGA flow, it's important to keep the message target-agnostic, so that the tool is not tied to a particular device BE. A related commit to the Clang driver has extended these diagnostics with SYCL FPGA specifics without affecting the tool itself. This patch also introduces testing of the return code value. For example, this should allow Clang driver users/developers to differentiate between the two possible causes of an llvm-no-spir-kernel failure. Signed-off-by: Artem Gindinson <[email protected]>
Signed-off-by: Alexey Bader <[email protected]>
intel#1141) Signed-off-by: Aleksander Fadeev <[email protected]>
Signed-off-by: Dmitry Vodopyanov <[email protected]>
Move internal headers from include/CL/sycl to the source directory to prevent implementation details from leaking into user applications and to enforce a stable ABI. A few more changes were applied to make the move possible:
- addHostAccessorAndWait functions in accessor to avoid calls to RT internals from the header file
- Removed getImageInfo
- Move buffer size acquisition from the buffer constructor to SYCLMemObjT cpp to avoid calls to PI
- getPluginFromContext function in context
- Standard containers replaced with SYCL variants in sycl_mem_obj_i.hpp. Unique ptr replaced with shared
- A few implementations moved from queue.hpp to queue.cpp
- Some LIT tests temporarily include implementation-specific headers. They will be converted to unit tests later.
Signed-off-by: Alexander Batashev <[email protected]>
intel#1144) Since we really just want to be able to memcpy the type to the device, 'is-trivially-copyable' is not the correct trait. Since CWG1734, if we wanted to support trivially copyable types, we would be required to create one of four different mechanisms for getting a type onto the device (depending on how the type is structured). Additionally, two of these mechanisms would ALSO require the type to be default constructible. This patch transitions to trivially-copy-constructible, so that we can simply memcpy from the existing object into new memory. Signed-off-by: Erich Keane <[email protected]>
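The distinction between the two traits can be seen in plain standard C++. The following is only a sketch, not code from the patch: `KernelArg`, `copy_into`, and `example_copy` are hypothetical names, and the check uses the standard library traits rather than whatever the compiler does internally.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical kernel-argument type (not from the patch). Its copy
// constructor is trivial, but the user-provided copy assignment operator
// means the type is NOT trivially copyable.
struct KernelArg {
  int value;
  KernelArg() = default;
  KernelArg(const KernelArg &) = default;
  KernelArg &operator=(const KernelArg &other) {
    value = other.value;
    return *this;
  }
};

static_assert(std::is_trivially_copy_constructible<KernelArg>::value,
              "copy construction is a plain byte copy");
static_assert(!std::is_trivially_copyable<KernelArg>::value,
              "the stricter trait rejects this type");

// Hypothetical helper: copies the bytes of an existing object into fresh
// storage, which is what a trivial copy constructor amounts to.
template <typename T>
T *copy_into(void *storage, const T &src) {
  static_assert(std::is_trivially_copy_constructible<T>::value,
                "T must be trivially copy constructible");
  std::memcpy(storage, &src, sizeof(T));
  return static_cast<T *>(storage);
}

// Small usage example: the copy made with memcpy carries the same value.
inline int example_copy() {
  KernelArg src;
  src.value = 42;
  alignas(KernelArg) unsigned char storage[sizeof(KernelArg)];
  const KernelArg *copy = copy_into(storage, src);
  assert(copy->value == 42);
  return copy->value;
}
```

A type like `KernelArg` would be rejected by a trivially-copyable check even though copying it into new memory only needs the trivial copy constructor, which is the case the commit message describes.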
intel#1118) Signed-off-by: James Brodman <[email protected]>
LowerWGScope pass performs required transformations to enable hierarchical parallelism semantics. This pass should not be skipped even if optimizations are disabled. Also some typos in the comments are fixed. Signed-off-by: Artur Gainullin <[email protected]>
…el#1156) After intel#1068 has included the Demangle header, this fix to CMakeLists should guarantee successful builds in all configurations Signed-off-by: Artem Gindinson <[email protected]>
SPIR-V OpGroupBroadcast accepts three forms of local ID:
- scalar integer
- vector integer with 2 components
- vector integer with 3 components
Signed-off-by: John Pennycook <[email protected]>
Also remove idle semicolon. Signed-off-by: Alexey Bader <[email protected]>
…#1162) Fix the cl_device_unified_shared_memory_capabilities_intel bitfield type name. Signed-off-by: Alexey Bader <[email protected]>
* [SYCL][LIBCLC] Additional libclc builtins to support SYCL work Adds builtins to libclc to support the CUDA backend for SYCL. Contributors Alexander Johnston <[email protected]> David Wood <[email protected]> Victor Lomuller <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] CMake and lit support for SYCL CUDA backend Adds defines CMake and lit variables used for SYCL CUDA backend development and test Contributors Alexander Johnston <[email protected]> Bjoern Knafla <[email protected]> Ruyman Reyes <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Local Accessor Support for CUDA Provides the LocalAccessorToSharedMemory compiler pass required for supporting SYCL local accessors in CUDA. Contributors Alexander Johnston <[email protected]> David Wood <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Change __spirv_BuiltIn.. to functions Changes the following builtins to functions __spirv_BuiltInGlobalSize __spirv_BuiltInWorkgroupSize __spirv_BuiltInNumWorkgroups __spirv_BuiltInLocalInvocationId __spirv_BuiltInWorkgroupId __spirv_BuiltInGlobalOffset Contributors David Wood <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Add SYCL CUDA support to clang driver Adds CUDA support for sycl compilation in the clang driver Contributors Alexander Johnston <[email protected]> David Wood <[email protected]> Victor Lomuller <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Initial Implementation of the CUDA backend Contributors Alan Forbes <[email protected]> Alexander Johnston <[email protected]> Bjoern Knafla <[email protected]> Daniel Soutar <[email protected]> David Wood <[email protected]> Kumudha Narasimhan <[email protected]> Mehdi Goli <[email protected]> Przemek Malon <[email protected]> Ruyman Reyes <[email protected]> Stuart Adams <[email protected]> Svetlozar Georgiev <[email protected]> 
Steffen Larsen <[email protected]> Victor Lomuller <[email protected]> Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Update libclc install rules Have libclc install clc-* and libspirv-* to lib and share Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Inline cl namespace to simplify SYCL API usage Synchronise the CUDA backend with the general SYCL changes from intel#974. Signed-off-by: Andrea Bocci <[email protected]> * Added missing flags for device-side builtins Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Removing unnecessary tool from the tree Acked-by: Victor Lomuller <[email protected]> Signed-off-by: Ruyman <[email protected]> * [SYCL][PI] Fix kernel group info parameter conversion Signed-off-by: Steffen Larsen <[email protected]> * [SYCL][CUDA] Refactor __SYCL_INLINE macro Synchronise the CUDA backend with the general SYCL changes from intel#1121. Signed-off-by: Andrea Bocci <[email protected]> * [SYCL] Have default_selector consider SYCL_BE Have the default_selector consider the env var SYCL_BE when rating device scores to make choosing a backend easier. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Select GlobalPlugin based on SYCL_BE Rather than choose the last found plugin as GlobalPlugin, select it depending on the SYCL_BE env var. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Improve default device selection checks Better checks for CUDA and OpenCL devices to match with SYCL_BE in the default device selection, based on the platform version info. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Formatting update for device_selector.cpp Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Changed CUDA unit tests to call through plugin Signed-off-by: Steffen Larsen <[email protected]> * [SYCL] Pass SYCL_BE=PI_OPENCL in check-sycl To ensure that the check-sycl targets test OpenCL devices, pass SYCL_BE=PI_OPENCL. 
This mirrors the check-sycl-cuda target which passes SYCL_BE=PI_CUDA. Without this it is nondeterministic which device is tested by check-sycl. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Remove PI_CUDA specific details from clang Removes PI_CUDA specific code paths and tests from clang, opting to always enable them. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Disable linear_id/opencl-interop.cpp for cuda Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Further fixes to CUDA device selection Fix platform string comparison for CUDA platform detection. Fix device info platform query so that it uses the device's plugin, rather than the GlobalPlugin. Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Code style and cleanup to CUDA support Signed-off-by: Alexander Johnston <[email protected]> * [SYCL] Enable asserts in all buildbot builds Signed-off-by: Alexander Johnston <[email protected]> * [SYCL][CUDA] Minor test and build configuration Fix minor test and build configuration issues introduced in the development of the CUDA backend. Signed-off-by: Alexander Johnston <[email protected]> Co-authored-by: Andrea Bocci <[email protected]> Co-authored-by: Ruyman <[email protected]> Co-authored-by: Steffen Larsen <[email protected]>
Signed-off-by: Alexey Bader [email protected] Co-Authored-By: Alexander Batashev <[email protected]>
CONFLICT (content): Merge conflict in clang/lib/Sema/SemaChecking.cpp
CONFLICT (content): Merge conflict in clang/lib/Sema/SemaChecking.cpp
The error was reproducible in two cases:
- using something like `numeric_limits<half>::min()` within another `constexpr`
- not treating SYCL headers as system ones, with `-Winvalid-constexpr` treated as an error
Signed-off-by: Alexey Sachkov <[email protected]>
Signed-off-by: Sergey Kanaev <[email protected]>
Event type triggers are misspelled "open"->"opened", etc. Default event type triggers should work fine. Signed-off-by: Alexey Bader <[email protected]>
…1053) We had issue with wrong mangling of s_upsample. I fixed it a long time ago, so we can delete workaround now. Signed-off-by: Ilya Mashkov <[email protected]>
Signed-off-by: Igor Dubinov <[email protected]>
While building the x64 Debug configuration on Windows using scripts from the buildbot folder, there were two issues:
1. OpenCL ICD Loader failed to build because of the missing OpenCL headers
2. Fatal error C1128: clang\lib\Sema\SemaTemplateDeduction.cpp : number of sections exceeded object file format limit: compile with /bigobj
Signed-off-by: Dmitry Vodopyanov <[email protected]>
Signed-off-by: Dmitry Vodopyanov <[email protected]>
It turns out that my original implementation was correct and I just misunderstood the double-dot commit range description from ProGit https://git-scm.com/book/en/v2/Git-Tools-Revision-Selection. Signed-off-by: Alexey Bader <[email protected]>
CONFLICT (content): Merge conflict in clang/lib/Sema/SemaChecking.cpp
Signed-off-by: Alexey Sotkin <[email protected]>
Define __SPIRV_BUILTIN_DECLARATIONS__ when passing -fdeclare-spirv-builtins to clang. Signed-off-by: Victor Lomuller <[email protected]>
Added OpenCL SPIR-V extended set builtins bindings and part of the core SPIR-V (mostly missing Images and Pipes) Known vendor extensions are not implemented yet. Signed-off-by: Victor Lomuller <[email protected]> Co-Authored-By: Alexey Bader <[email protected]>
…l#1252) Implementation of piEventSetCallback with tests. GlueEvent now uses the correct plugins: the SYCL RT code for GlueEvent now calls the right plugin to create the event that triggers the dependency chain. Renamed variables to clarify the source code and avoid confusion between Context and Plugin. Signed-off-by: Ruyman Reyes <[email protected]> Signed-off-by: Stuart Adams <[email protected]> Signed-off-by: Steffen Larsen <[email protected]>
Signed-off-by: Stuart Adams <[email protected]>
…ntel#1381) Signed-off-by: gejin <[email protected]>
…#1376) NOTE: This flag is not exposed to the driver and not intended for users. It's added to make experiments and identify issues with optimizations. Signed-off-by: Alexey Bader <[email protected]>
…#1383) By emitting the legacy variant of the LLVM IR alongside the newer representation of the attribute, backwards compatibility with any existing BE implementation is restored. This gives the aforementioned BE a smooth transition period: until it is able to consume the new LLVM IR, it can simply ignore the unknown metadata. Signed-off-by: Artem Gindinson <[email protected]>
If the found alloca command is not a sub-buffer alloca, then it is the parent alloca, which has the same context. Signed-off-by: Ivan Karachun <[email protected]>
…ntel#1344) Signed-off-by: Michael Kinsner <[email protected]>
Signed-off-by: Alexey Sachkov <[email protected]>
Enable -fdeclare-spirv-builtins for SYCL device compilation mode
For device compilation, SPIR-V builtins are now looked up by the device compiler. They no longer need to be forward declared.
[SYCL-PTX] Revert manual mangling of some SPIR-V builtins
[SYCL-PTX] Add fmod builtin
[SYCL-PTX] Update Atomic mangling
Signed-off-by: Victor Lomuller <[email protected]>
…<dir> (intel#1346) When using /Fo<dir> the improper dependency file name was generated, causing the bundle step to not be able to locate the dependency file when compiling to object Signed-off-by: Michael D Toguchi <[email protected]>
This patch introduces the following loop attributes:
- loop_coalesce: Indicates that the loop nest should be coalesced into a single loop without affecting functionality
- speculated_iterations: Specifies the number of concurrent speculated iterations that will be in flight for a loop invocation
- disable_loop_pipelining: Disables pipelining of the loop data path, causing the loop to be executed serially
- max_interleaving: Places a maximum limit N on the number of interleaved invocations of an inner loop by an outer loop
Signed-off-by: Viktoria Maksimova <[email protected]>
Fixed the buffer constructor called with a pair of iterators. The current implementation has a problem due to ambiguous spec. The buffer should never write back data unless there is a call to set_final_data(), but the current implementation does it. I corrected the spec in KhronosGroup/SYCL-Docs#76. So, now we can change the buffer implementation according to the clarified spec. The test case buffer.cpp also needed change because of this change. The user should not expect the automatic write-back of data upon destruction of buffer. Signed-off-by: Byoungro So <[email protected]> Co-authored-by: Ronan Keryell <[email protected]>
A simple library for constructing and serializing/deserializing a sequence of typed property sets, where each property is a <name,typed value> pair. To be used in offload tools. Signed-off-by: Konstantin S Bobrovsky <[email protected]>
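The idea of named sets of <name,typed value> pairs that round-trip through a text form can be sketched as follows. This is an illustration only: the type aliases, the `serialize`/`deserialize` functions, and the `[set]` / `name=tag|value` text layout are all assumptions made up for this sketch and are not taken from the library added by the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical model: a property is a name plus a typed value, a property
// set is a named group of properties, and a registry is a sequence of sets.
using PropertyValue = std::variant<std::uint32_t, std::string>;
using PropertySet = std::map<std::string, PropertyValue>;
using PropertyRegistry = std::map<std::string, PropertySet>;

// Made-up text format: a "[set name]" line, then "name=<tag>|<value>" lines,
// where tag 'u' means unsigned integer and 's' means string.
std::string serialize(const PropertyRegistry &reg) {
  std::ostringstream out;
  for (const auto &[setName, props] : reg) {
    out << '[' << setName << "]\n";
    for (const auto &[name, value] : props) {
      out << name << '=';
      if (const auto *u = std::get_if<std::uint32_t>(&value))
        out << "u|" << *u;
      else
        out << "s|" << std::get<std::string>(value);
      out << '\n';
    }
  }
  return out.str();
}

// Parses the format above; names must not contain '=' in this sketch.
PropertyRegistry deserialize(const std::string &text) {
  PropertyRegistry reg;
  PropertySet *current = nullptr;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty())
      continue;
    if (line.front() == '[' && line.back() == ']') {
      current = &reg[line.substr(1, line.size() - 2)];
      continue;
    }
    const auto eq = line.find('=');
    if (!current || eq == std::string::npos || eq + 2 >= line.size() ||
        line[eq + 2] != '|')
      throw std::runtime_error("malformed property line: " + line);
    const std::string name = line.substr(0, eq);
    const std::string payload = line.substr(eq + 3);
    if (line[eq + 1] == 'u')
      (*current)[name] = static_cast<std::uint32_t>(std::stoul(payload));
    else if (line[eq + 1] == 's')
      (*current)[name] = payload;
    else
      throw std::runtime_error("unknown type tag in: " + line);
  }
  return reg;
}

// Builds a small registry used to demonstrate a round trip.
inline PropertyRegistry example_registry() {
  PropertyRegistry reg;
  reg["demo"]["threshold"] = std::uint32_t{7};
  reg["demo"]["label"] = std::string("fast");
  return reg;
}
```

Keeping the type tag next to each value is what makes the round trip lossless here: `deserialize(serialize(r))` compares equal to `r` for the example registry.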
) The library supports creating and serializing/deserializing tables of strings, inserting/deleting/replacing/renaming columns, and adding rows. To be used in offload tools. Signed-off-by: Konstantin S Bobrovsky <[email protected]>
This reverts commit d357add. Signed-off-by: Vladimir Lazarev <[email protected]>
Signed-off-by: Alexander Batashev <[email protected]>
…ntel#1359) Signed-off-by: Konstantin S Bobrovsky <[email protected]>
…for (intel#1348) The kernel callable being invoked from an nd_range parallel_for is accepting an id argument, while it should be nd_item. After my analysis, I found we check arguments' type for kernel_parallel_for instead of parallel_for. But that check is useless, because the compiler can still find a candidate for kernel_parallel_for with nd_range and id which is a wrong combination. In my solution, parallel_for with nd_range calls kernel_parallel_for_nd_range(...) which is only available for nd_item. Signed-off-by: Bing1 Yu <[email protected]>
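The dispatch described in this commit message can be sketched with mock types. Everything below is a stand-in written for illustration: `id`, `nd_item`, `range`, and `nd_range` are simplified structs rather than the SYCL classes, and the `accepts_nd_range` trait, the two kernel functors, and `sum_local_ids` are hypothetical helpers that exist only to show the effect.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Mock stand-ins for the SYCL index types (assumptions, not the real API).
struct id { int index; };
struct nd_item { int global; int local; };
struct range { int size; };
struct nd_range { int global_size; int local_size; };

// Range form: the kernel is invoked with an id.
template <typename Kernel>
void kernel_parallel_for(const Kernel &k, range r) {
  for (int i = 0; i < r.size; ++i)
    k(id{i});
}

// nd_range form: only participates in overload resolution when the kernel
// can be called with an nd_item.
template <typename Kernel>
auto kernel_parallel_for_nd_range(const Kernel &k, nd_range r)
    -> decltype(k(std::declval<nd_item>()), void()) {
  for (int g = 0; g < r.global_size; ++g)
    k(nd_item{g, g % r.local_size});
}

template <typename Kernel>
void parallel_for(range r, const Kernel &k) {
  kernel_parallel_for(k, r);
}

// The nd_range overload forwards to the nd_item-only helper, so an
// nd_range + id combination has no viable candidate.
template <typename Kernel>
auto parallel_for(nd_range r, const Kernel &k)
    -> decltype(kernel_parallel_for_nd_range(k, r)) {
  kernel_parallel_for_nd_range(k, r);
}

// Hypothetical detection trait used only to demonstrate the rejection.
template <typename Kernel, typename = void>
struct accepts_nd_range : std::false_type {};
template <typename Kernel>
struct accepts_nd_range<
    Kernel, decltype(parallel_for(std::declval<nd_range>(),
                                  std::declval<const Kernel &>()))>
    : std::true_type {};

struct ItemKernel { void operator()(nd_item) const {} };
struct IdKernel { void operator()(id) const {} };

static_assert(accepts_nd_range<ItemKernel>::value,
              "nd_range with an nd_item kernel is accepted");
static_assert(!accepts_nd_range<IdKernel>::value,
              "nd_range with an id kernel is rejected");

// Runs an nd_range launch and sums the local ids it observed.
inline int sum_local_ids() {
  int sum = 0;
  parallel_for(nd_range{4, 2}, [&sum](nd_item it) { sum += it.local; });
  return sum;
}
```

Putting the constraint on the helper that the nd_range overload forwards to means the wrong combination fails at the call to `parallel_for` itself, instead of silently selecting another candidate.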
Implements a few code simplification/unification for LowerWGScope. Signed-off-by: Victor Lomuller <[email protected]>
…tel#1405) For the NVPTX target, address space inference for kernel arguments and allocas happens in the backend (NVPTXLowerArgs and NVPTXLowerAlloca passes). After the frontend, these pointers are in LLVM default address space 0, which is the generic address space for the NVPTX target. Perform an address space cast of the pointer to the shadow global variable from the local to the generic address space before replacing all uses of a byval argument. Signed-off-by: Artur Gainullin <[email protected]>
- Adds static members to sub_group class.
- sub_group member functions marked deprecated, to be removed later.
- SPIR-V helpers expanded to convert SYCL group to SPIR-V scope.
- Add workaround for half types
Signed-off-by: John Pennycook <[email protected]>
Because it is not possible to generate a vector of bools in the FE, we have to change the return type of the corresponding instructions to a vector of bools in the SPIRV translator. The SPIRV translator already did this for some instructions; this patch extends the behaviour to handle more instructions.
Adding doxygen documentation to PI CUDA backend. Some code is re-ordered in the file to help sorting the doxygen. Co-Authored-By: Alexey Bader <[email protected]> Co-Authored-By: Alexander Batashev <[email protected]> Co-Authored-By: Romanov Vlad <[email protected]> Signed-off-by: Ruyman Reyes <[email protected]>
Based on https://github.com/codeplaysoftware/standards-proposals/blob/master/spec-constant/index.md

* [SYCL] PI changes:
  1. Add specialization constant API to the SYCL RT Plugin Interface. New PI API added: pi_result piProgramSetSpecializationConstant(pi_program prog, pi_uint32 spec_id, size_t spec_size, const void *spec_value);
  2. Add property set fields to the binary image descriptor, bump PI version. This change breaks backward binary compatibility of device binary image descriptors.
  3. Add convenience C++ wrappers for PI binary image hierarchy objects.

* [SYCL] Support device binary properties and file tables in the offload wrapper.
  1. New option - "-properties=<file>". <file> must be a property set registry file, as defined by llvm/Support/PropertySetIO.h. The wrapper will add the property sets to the binary image descriptor and make them available to the runtime.
  2. New option - "-batch". With this option the only input can be a file table, as defined by llvm/Support/SimpleTable.h. Column names are a part of the interface between this tool and sycl-post-link, which produces the file table.
  3. Binary image descriptor LLVM type updated to resemble changes in Plugin Interface v1.2.

* [SYCL] Specialization constants support in the Front End.
  1. Detect kernel lambda object captures corresponding to specialization constants and (a) don't create kernel arguments for them (b) generate specializations of the SpecConstantInfo structure into the integration header.
  2. Recognize the __unique_stable_name intrinsic and replace it with a string literal uniquely identifying the type of the typename template parameter to this intrinsic.
  3. FE-related changes in the runtime:
     - new SpecConstantInfo templated struct for type->name translation for specialization constants used by integration header
     - define the __sycl_fe_getStableUniqueTypeName intrinsic

* [SYCL] Add specialization constant support in SYCL runtime.
  1. Define SYCL API (sycl/include/CL/sycl/experimental/spec_constant.hpp)
  2. Add convenience C++ wrappers for PI device binary structures and refactor runtime to use the wrappers. Get rid of custom deleters for binary images.
  3. Implement SYCL spec constant APIs in program and program manager.

* [SYCL] Use file-table-tform in SYCL offload processing in clang driver.
  Clang driver's design can't handily model
  (1) multiple inputs/outputs in the action graph. Because of that, for example, the sycl-post-link tool is invoked twice - once to split the code and produce multiple bitcode files, and a second time to generate symbol files for the split modules.
  (2) "Clusters" of inputs/outputs, when subsets of inputs/outputs are associated and describe different aspects of the same data. An example of such clustering is the split module + its symbol file above. Clustering would require support both in the driver and in the tools invoked in response to actions.

  This commit moves SYCL offload processing to the "file table concept." sycl-post-link, instead of
  (1) being invoked n times, once per each output type requested (once for device split and once for symbol file generation)
  (2) outputting multiple file lists, each listing outputs from the corresponding invocation above
  is now invoked once and produces a single file table output. E.g.

    [Code|Symbols|Properties]
    a_0.bc|a_0.sym|a_0.props
    a_1.bc|a_1.sym|a_1.props

  This solves both problems - multiple input/output and clustering. Combined with the file-table-tform tool, this allows for efficient handling of multiple clusters of files (each represented as a row in the table file) in the clang driver infrastructure. For example, there is a real offload processing problem:
  step1. sycl-post-link outputs N clusters of files
  step2. The "Code" file of each cluster resulting from step1 ({a_0.bc, a_1.bc} in the example above) must undergo further transformations - translation to SPIRV and optional ahead-of-time compilation.
  step3. In each cluster resulting from step1 the "Code" file needs to be replaced with the result of step2
  step4. All the clusters are processed by the ClangOffloadWrapper tool, which needs to know how files are distributed into clusters and what the role of each file in a cluster is - whether it is "Code", "Symbol" or "Properties".

  To solve this, the following action graph is constructed in the clang driver:

                     column:"Code"
    t1 -> [file-table-tform:extract column] -> t1a -> [for-each:] -> t1b
                                                      llvm-spirv
                                                      aot-comp
    t1  \  column:"Code"
         [file-table-tform:replace column] -> t2 -> [ClangOffloadWrapper]
    t1b /

  where t1b is      and t2 is
    ["Code"]          [Code|Symbols|Properties]
    a_0.bin           a_0.bin|a_0.sym|a_0.props
    a_1.bin           a_1.bin|a_1.sym|a_1.props

  Note that the graph does not change with a growing number of clusters, nor does it change when more files are added to each cluster (e.g. a "Manifest" file).

* [SYCL] Process specialization constants in sycl-post-link tool.
  Add a spec constant lowering pass to sycl-post-link tool. Support file table output format.

* [SYCL] Temporarily disable spec_const_hw.cpp on CPU.
  CPU OpenCL Runtime on build machines is not updated yet.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
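The extract-column and replace-column steps on the `[Code|Symbols|Properties]` table from this commit message can be sketched as below. This is a simplified illustration, not the file-table-tform tool or the llvm/Support/SimpleTable.h API: `FileTable`, `parse`, `extract_column`, `replace_column`, and `to_bin` are hypothetical names, and `to_bin` merely renames the extension as a stand-in for the llvm-spirv / ahead-of-time compilation step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory form of a file table: named columns, rows of cells.
struct FileTable {
  std::vector<std::string> columns;
  std::vector<std::vector<std::string>> rows;
};

inline std::vector<std::string> split(const std::string &line, char sep) {
  std::vector<std::string> out;
  std::string cell;
  std::istringstream in(line);
  while (std::getline(in, cell, sep))
    out.push_back(cell);
  return out;
}

// Parses "[A|B|C]" followed by one "a|b|c" line per cluster.
inline FileTable parse(const std::string &text) {
  std::istringstream in(text);
  std::string line;
  FileTable t;
  if (!std::getline(in, line) || line.size() < 2 || line.front() != '[' ||
      line.back() != ']')
    throw std::runtime_error("missing [col|col] header");
  t.columns = split(line.substr(1, line.size() - 2), '|');
  while (std::getline(in, line)) {
    if (line.empty())
      continue;
    auto cells = split(line, '|');
    if (cells.size() != t.columns.size())
      throw std::runtime_error("row width mismatch: " + line);
    t.rows.push_back(std::move(cells));
  }
  return t;
}

inline std::size_t column_index(const FileTable &t, const std::string &name) {
  const auto it = std::find(t.columns.begin(), t.columns.end(), name);
  if (it == t.columns.end())
    throw std::runtime_error("no such column: " + name);
  return static_cast<std::size_t>(it - t.columns.begin());
}

// step2 input: pull one column out as a plain list (t1 -> t1a).
inline std::vector<std::string> extract_column(const FileTable &t,
                                               const std::string &name) {
  const std::size_t c = column_index(t, name);
  std::vector<std::string> out;
  for (const auto &row : t.rows)
    out.push_back(row[c]);
  return out;
}

// step3: put the transformed files back into their clusters (t1 + t1b -> t2).
inline FileTable replace_column(FileTable t, const std::string &name,
                                const std::vector<std::string> &values) {
  const std::size_t c = column_index(t, name);
  if (values.size() != t.rows.size())
    throw std::runtime_error("replacement length mismatch");
  for (std::size_t i = 0; i < values.size(); ++i)
    t.rows[i][c] = values[i];
  return t;
}

// Stand-in for the per-file transformation; it only renames the extension.
inline std::string to_bin(const std::string &file) {
  return file.substr(0, file.rfind('.')) + ".bin";
}
```

Because each row stays intact while only the "Code" cell changes, the association between a code file and its symbol and property files survives the transformation, which is the clustering property the commit message relies on.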
aelovikov-intel
pushed a commit
to aelovikov-intel/llvm
that referenced
this pull request
Feb 23, 2023
Test integration of kernel fusion into the SYCL runtime scheduler. Check that cancellation of the fusion happens if required by synchronization rules, as described in the [extension proposal](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_codeplay_kernel_fusion.asciidoc#synchronization-in-the-sycl-application). Spec: intel#7098 Implementation: intel#7531 Signed-off-by: Lukas Sommer <[email protected]>
No description provided.