LLVM and SPIRV-LLVM-Translator pulldown (WW27) #4006
Merged
OR, XOR and AND entries are added to the cost table. An extra cost is added when vector splitting occurs. This is done to address the issue of a missed SLP vectorization opportunity due to unreasonably high costs being attributed to the vector Or reduction (see: https://bugs.llvm.org/show_bug.cgi?id=44593). Differential Revision: https://reviews.llvm.org/D104538
This patch generalizes MatchBinaryAddToConst to support matching (A + C1), (A + C2), instead of just matching (A + C1), A. The existing cases can be handled by treating non-add expressions A as A + 0. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104634
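The generalization can be illustrated with a toy model. The sketch below is not LLVM's implementation: the tuple-based expression encoding and the names `split_add_const` and `match_binary_add_to_const` are hypothetical. They only show how viewing a plain `A` as `A + 0` lets one code path cover both the old `(A + C1), A` case and the new `(A + C1), (A + C2)` case.

```python
from typing import Optional, Tuple, Union

# Toy expression encoding (an assumption made for this sketch):
#   "a"              -> a plain value named a
#   ("add", "a", 3)  -> a + 3
Expr = Union[str, Tuple[str, str, int]]


def split_add_const(expr: Expr) -> Tuple[str, int]:
    """Return (base, constant), viewing a non-add expression A as A + 0."""
    if isinstance(expr, tuple) and expr[0] == "add":
        _, base, const = expr
        return base, const
    return expr, 0


def match_binary_add_to_const(lhs: Expr, rhs: Expr) -> Optional[Tuple[int, int]]:
    """Match lhs = A + C1 and rhs = A + C2 over the same base A.

    Returns (C1, C2) on success, or None when the bases differ.
    """
    lhs_base, c1 = split_add_const(lhs)
    rhs_base, c2 = split_add_const(rhs)
    if lhs_base != rhs_base:
        return None
    return c1, c2
```

With this encoding, `match_binary_add_to_const(("add", "a", 1), "a")` yields the constants `(1, 0)`, which is the case the older matcher handled directly.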
It looks like the fold introduced in 63f3383 can cause crashes if the type of the bitcasted value is not a valid vector element type, like x86_mmx. To resolve the crash, reject invalid vector element types. The way it is done in the patch is a bit clunky. Perhaps there's a better way to check? Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104792
Change --max-timeline-cycles=0 to mean no limit on the number of cycles. Use this in AMDGPU tests to show all instructions in the timeline view instead of having it arbitrarily truncated. Differential Revision: https://reviews.llvm.org/D104846
As a minor adjustment to the existing lowering of offset scatters, this extends any smaller-than-legal vectors into full vectors using a zext, so that the truncating scatters can be used. Due to the way MVE legalizes the vectors this should be cheap in most situations, and will prevent the vector from being scalarized. Differential Revision: https://reviews.llvm.org/D103704
This patch enables the salvaging of debug values that may be calculated from more than one SSA value, such as with binary operators that do not use a constant argument. The actual functionality for this behaviour is added in a previous commit (c727056), but with the ability to actually emit the resulting debug values switched off. The reason for this is that the prior patch has been reverted several times due to issues discovered downstream, some time after the actual landing of the patch. The patch in question is rather large and touches several widely used header files, and all issues discovered are more related to the handling of variadic debug values as a whole rather than the details of the patch itself. Therefore, to minimize the build time impact and risk of conflicts involved in any potential future revert/reapply of that patch, this significantly smaller patch (that touches no header files) will instead be used as the capstone to enable variadic debug value salvaging. The review linked to this patch is mostly implemented by the previous commit, c727056, but also contains the changes in this patch. Differential Revision: https://reviews.llvm.org/D91722
Extend the yaml code generation to support the index attributes that https://reviews.llvm.org/D104711 added to the OpDSL. Differential Revision: https://reviews.llvm.org/D104712
This patch handles sinking a replicate region after another replicate region. In that case, we can connect the sink region after the target region. This properly handles the case for which an assertion has been added in 337d765. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34842. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D103514
This adds the MemoryTagManager class and a specialisation of that class for AArch64 MTE tags. It provides a generic interface for various tagging operations: adding/removing tags, diffing tagged pointers, etc. Later patches will use this manager to handle memory tags in generic code in both lldb and lldb-server. Since it will be used in both, the base class header is in lldb/Target. (MemoryRegionInfo is another example of this pattern) Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D97281
Add an index_dim annotation to specify the shape to loop mapping of shape-only tensors. A shape-only tensor is not accessed within the body of the operation but is required to span the iteration space of certain operations such as pooling. Differential Revision: https://reviews.llvm.org/D104767
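The kind of operations such a manager exposes can be sketched in a few lines. This is an illustration only, not the lldb class: the function names are hypothetical, and the bit layout is the AArch64 MTE one (a 4-bit logical tag in bits 56-59, with the whole top byte carrying no address data).

```python
# Assumed layout for this sketch (AArch64 MTE): the logical tag is the
# 4-bit field in bits 56-59, and bits 56-63 hold no address data.
MTE_TAG_SHIFT = 56
MTE_TAG_MASK = 0xF
ADDRESS_MASK = (1 << 56) - 1


def get_logical_tag(ptr: int) -> int:
    """Extract the 4-bit logical tag from a tagged pointer."""
    return (ptr >> MTE_TAG_SHIFT) & MTE_TAG_MASK


def remove_non_address_bits(ptr: int) -> int:
    """Clear the top byte, which includes the tag and other non-address data."""
    return ptr & ADDRESS_MASK


def set_logical_tag(ptr: int, tag: int) -> int:
    """Return ptr with its logical tag replaced by tag."""
    if not 0 <= tag <= MTE_TAG_MASK:
        raise ValueError("tag does not fit in 4 bits")
    cleared = ptr & ~(MTE_TAG_MASK << MTE_TAG_SHIFT)
    return cleared | (tag << MTE_TAG_SHIFT)


def address_diff(ptr1: int, ptr2: int) -> int:
    """Difference between two pointers, ignoring tags and the top byte."""
    return remove_non_address_bits(ptr1) - remove_non_address_bits(ptr2)
```

Diffing on untagged addresses is what lets two pointers with different tags compare by location alone.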
Everything includes clang/Config/config.h by qualified "clang/Config/config.h" path, so there's no need for `-Igen/clang/include/clang/Config/clang/include`. No behavior change.
This feature "memory-tagging+" indicates that lldb-server supports memory tagging packets. (added in a later patch) We check HWCAP2_MTE to decide whether to enable this feature for Linux. Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D97282
Use %zu to print size_t vars.
…ion (7/n) scf::ForOp bufferization analysis proceeds just like for any other op (including FuncOp) at its boundaries; i.e. if:
1. The tensor operand is inplaceable.
2. The matching result has no subsequent read (i.e. all reads dominate the scf::ForOp).
3. Bufferizing in place does not create a RAW interference.
then it can bufferize inplace. Still, there are a few differences:
1. bbArgs for an scf::ForOp are always considered inplaceable when seen from ops inside the body. This is because a) either the matching tensor operand is not inplaceable and an alloc will be inserted (which makes bbArg itself inplaceable); or b) the tensor operand and bbArg are both already inplaceable.
2. Bufferization within the scf::ForOp body has implications for the outside world: the scf.yield terminator may well ping-pong values of the same type. This muddies the water for alias analysis and is not supported at the moment. Such cases result in a pass failure.
Differential revision: https://reviews.llvm.org/D104490
This adds memory tag reading using the new "qMemTags" packet and ptrace on AArch64 Linux. This new packet is following the one used by GDB. (https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html) On AArch64 Linux we use ptrace's PEEKMTETAGS to read tags and we assume that lldb has already checked that the memory region actually has tagging enabled. We do not assume that lldb has expanded the requested range to granules and expand it again to be sure. (although lldb will be sending aligned ranges because it happens to need them client side anyway) Also we don't assume untagged addresses. So for AArch64 we'll remove the top byte before using them. (the top byte includes MTE and other non address data) To do the ptrace read NativeProcessLinux will ask the native register context for a memory tag manager based on the type in the packet. This also gives you the ptrace numbers you need. (it's called a register context but it also has non register data, so it saves adding another per platform sub class) The only supported platform for this is AArch64 Linux and the only supported tag type is MTE allocation tags. Anything else will error. Ptrace can return a partial result but for lldb-server we will be treating that as an error. To succeed we need to get all the tags we expect. (Note that the protocol leaves room for logical tags to be read via qMemTags but this is not going to be implemented for lldb at this time.) Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D95601
This commit moves the type translator from LLVM to MLIR to a public header for use by external projects or other code. Unlike a previous attempt (https://reviews.llvm.org/D104726), this patch moves the type conversion into separate files which remedies the linker error which was only caught by CI. Differential Revision: https://reviews.llvm.org/D104834
Currently, when .llvm.call-graph-profile is created by LLVM, it explicitly encodes the symbol indices. This section is basically a black box for post-processing tools. For example, if we run strip -s on the object files, the symbol table changes but the indices in that section do not. In the non-visible failure mode the indices point to the wrong symbols; in the visible one they point outside of the symbol table and produce "invalid symbol index". This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The frequency (weight) stays in .llvm.call-graph-profile, but the symbol information moves to the relocation section. LLD uses information from both sections to reconstruct the call graph profile. The relocations themselves are never applied. With this approach, post-processing tools that handle relocations correctly also work for this section: tools can add or remove symbols, and as long as they handle relocation sections the information stays correct. In a quick experiment with clang-13, the aggregate size of all the input sections went up from 107KB to 322KB; the size of the clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly; the final binary size is not affected. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D104080
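A toy model shows why moving the symbol references into relocations helps. Everything here is an assumption made for illustration: symbol tables are plain lists of names, relocations are lists of symbol indices, each weight is paired with two consecutive relocations (from, to), and `remove_symbols` stands in for any tool that rewrites relocations when it edits the symbol table. It is not the ELF layout or LLD's code.

```python
from typing import List, Set, Tuple


def remove_symbols(symtab: List[str], relocs: List[int],
                   drop: Set[str]) -> Tuple[List[str], List[int]]:
    """Model a tool that drops symbols and fixes up relocation indices.

    Symbols referenced by a relocation are assumed not to be dropped.
    """
    new_symtab = [name for name in symtab if name not in drop]
    new_index = {name: i for i, name in enumerate(new_symtab)}
    new_relocs = [new_index[symtab[r]] for r in relocs]
    return new_symtab, new_relocs


def reconstruct_call_graph(symtab: List[str], relocs: List[int],
                           weights: List[int]) -> List[Tuple[str, str, int]]:
    """Combine weights with (from, to) relocation pairs into call edges."""
    edges = []
    for i, weight in enumerate(weights):
        caller = symtab[relocs[2 * i]]
        callee = symtab[relocs[2 * i + 1]]
        edges.append((caller, callee, weight))
    return edges
```

For a symbol table `["", "unused", "main", "foo"]` with one edge from `main` to `foo`, dropping `unused` shifts both indices down by one. The rewritten relocations still resolve to `main` and `foo`, whereas raw indices stored in the section would now name the wrong symbol or fall outside the table.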
This adds GDB client support for the qMemTags packet, which reads memory tags, following the design that was recently committed to GDB. https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets (look for qMemTags) lldb commands will use the new Process methods GetMemoryTagManager and ReadMemoryTags. The former takes a range and checks that:
* The current process architecture has an architecture plugin
* That plugin provides a MemoryTagManager
* The range of memory requested lies in a tagged range (it will expand it to granules for you)
If all of that is true, you get a MemoryTagManager you can give to ReadMemoryTags. This two-step process allows commands to get the tag manager without having to read tags as well. For example, you might just want to remove a logical tag, or error early if a range with tagged addresses is inverted. Note that getting a MemoryTagManager doesn't mean that the process or a specific memory range is tagged. Those are separate checks. Having a tag manager just means this architecture *could* have a tagging feature enabled. An architecture plugin has been added for AArch64 which will return a MemoryTagManagerAArch64MTE, which was added in a previous patch. Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D95602
This new command looks much like "memory read" and mirrors its basic behaviour.

    (lldb) memory tag read new_buf_ptr new_buf_ptr+32
    Logical tag: 0x9
    Allocation tags:
    [0x900fffff7ffa000, 0x900fffff7ffa010): 0x9
    [0x900fffff7ffa010, 0x900fffff7ffa020): 0x0

Important properties:
* The end address is optional and defaults to reading 1 tag if omitted.
* It is an error to try to read tags if the architecture or process doesn't support it, or if the range asked for is not tagged.
* It is an error to read an inverted range (end < begin). (Logical tags are removed for this check, so you can pass tagged addresses here.)
* The range will be expanded to fit the tagging granule, so you can get more tags than simply (end-begin)/granule size. Whatever you get back will always cover the original range.
Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D97285
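The granule expansion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the lldb implementation: the function names are hypothetical, addresses are taken to be already untagged, and the granule size is the 16 bytes used by AArch64 MTE. The "end omitted" default of reading one tag is not modelled.

```python
from typing import List, Tuple

MTE_GRANULE_SIZE = 16  # bytes covered by one allocation tag (AArch64 MTE)


def expand_to_granules(begin: int, end: int,
                       granule: int = MTE_GRANULE_SIZE) -> Tuple[int, int]:
    """Align begin down and end up so the result covers [begin, end)."""
    if end < begin:
        raise ValueError("inverted range: end < begin")
    aligned_begin = begin - (begin % granule)
    aligned_end = end + (-end % granule)
    return aligned_begin, aligned_end


def granule_ranges(begin: int, end: int,
                   granule: int = MTE_GRANULE_SIZE) -> List[Tuple[int, int]]:
    """List the half-open granule ranges covering [begin, end)."""
    aligned_begin, aligned_end = expand_to_granules(begin, end, granule)
    return [(addr, addr + granule)
            for addr in range(aligned_begin, aligned_end, granule)]
```

An unaligned request such as `granule_ranges(0x1004, 0x1011)` touches two granules even though it spans only 13 bytes, which is why the command can return more tags than (end-begin)/granule size suggests.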
Updates Bazel build files to match llvm/llvm-project@929189a499 Differential Revision: https://reviews.llvm.org/D104864
- Currently, the emitting of labels in the parsePrimaryExpr function is case dependent: it just takes the identifier and emits it as is. - However, for HLASM the emitting of labels is case independent. Labels are emitted in upper case only, to enforce case independence. So we need to ensure that the label is emitted in upper case at the time it is parsed (in `parseAsHLASMLabel`), and also when a PC-relative relocatable expression is processed (in `parsePrimaryExpr`). - To achieve this, a new MCAsmInfo attribute has been introduced, which the corresponding targets can override if needed. Reviewed By: abhina.sreeskantharajan, uweigand Differential Revision: https://reviews.llvm.org/D104715
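The idea of routing both emission sites through one target-controlled switch can be sketched like this. The function names and the `emit_labels_in_upper_case` flag are hypothetical stand-ins for the two parser entry points and the new MCAsmInfo attribute; this is not the LLVM code.

```python
def emit_label(identifier: str, emit_labels_in_upper_case: bool) -> str:
    """Return the label text to emit for an identifier."""
    return identifier.upper() if emit_labels_in_upper_case else identifier


def parse_label_definition(identifier: str, emit_labels_in_upper_case: bool) -> str:
    """Stand-in for the site that emits a label when it is defined."""
    return emit_label(identifier, emit_labels_in_upper_case)


def parse_label_reference(identifier: str, emit_labels_in_upper_case: bool) -> str:
    """Stand-in for the site that emits a label used in an expression."""
    return emit_label(identifier, emit_labels_in_upper_case)
```

With the flag set, a definition spelled `loop` and a reference spelled `Loop` both come out as `LOOP`, so they resolve to the same symbol. With the flag clear, the two spellings stay distinct.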
This is a follow up to D102732 which also expands the logic to Darwin. Differential Revision: https://reviews.llvm.org/D104764
…:ShrinkDemandedConstant. We don't constant fold based on demanded bits elsewhere in SimplifyDemandedBits, so I don't think we should shrink them either. The affected ARM test changes because a constant became non-opaque and eventually enabled some constant folding. This no longer happens. I checked and InstCombine is able to simplify this test. I'm not sure exactly what it was trying to test. Reviewed By: lebedev.ri, dmgreen Differential Revision: https://reviews.llvm.org/D104832
There were 3 assumptions made during translation of llvm.loop metadata into the LoopMerge instruction: 1. A latch block for a 'for' or 'while' loop ends with an unconditional branch instruction; 2. A latch block for a 'do-while' loop ends with a conditional branch instruction; 3. For a 'conditional' latch basic block, the first successor is the loop header and the second one is the exit block. All three can be violated when optimized IR is passed to the translator. In this case LoopMerge: a. can be placed in the wrong basic block; b. can have continue target and merge block parameters sharing the same id. This patch makes the translator assume less and do more checks via the LLVM LoopInfo infrastructure. Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@7670633
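The difference between assuming a successor order and checking the loop structure can be shown on a toy control-flow graph. The sketch below is an assumption-laden illustration, not the translator's code: the CFG is a dictionary from block name to successor list, the loop is given as a set of block names, and both helper names are hypothetical. In the translator, this information comes from LLVM's LoopInfo.

```python
from typing import Dict, List, Optional, Set


def find_latches(cfg: Dict[str, List[str]], header: str,
                 loop_blocks: Set[str]) -> List[str]:
    """Blocks inside the loop that branch back to the header."""
    return [block for block in sorted(loop_blocks)
            if header in cfg.get(block, [])]


def latch_exit_block(cfg: Dict[str, List[str]], latch: str,
                     loop_blocks: Set[str]) -> Optional[str]:
    """Successor of the latch that leaves the loop, or None if there is none.

    The answer comes from loop membership, not from successor position.
    """
    for successor in cfg[latch]:
        if successor not in loop_blocks:
            return successor
    return None
```

If an optimization has swapped the latch's successors so that the exit block comes first, a rule such as "the first successor is the header" picks the wrong block, while the membership check still finds the exit.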
Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@ce647ca
Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@2b8b4a8
Three metadata kinds map onto LoopControlLoopCountINTELMask. The mask has 3 parameters; their default values are -1. Signed-off-by: Leonid Pauzin <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@b4d6f0b
The SYCL/OpenCL specifications require that 3-element vectors are sized equally to the 4-element ones (see 4.10.2.1 and 4.10.2.6, khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf). Meanwhile, the logic for translating SPIR-V debug info into LLVM IR obeys a trivial `vec_size = num_elems * elem_size` formula, which fails for the 3-element edge case. Example LLVM IR pre-translation:
```
!0 = !DICompositeType(tag: DW_TAG_array_type, baseType: !3, size: 128, flags: DIFlagVector, elements: !1)
!1 = !{!2}
!2 = !DISubrange(count: 3)
!3 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
```
Faulty LLVM IR after bi-directional translation (note the size in `!0`):
```
!0 = !DICompositeType(tag: DW_TAG_array_type, baseType: !3, size: 96, flags: DIFlagVector, elements: !1)
!1 = !{!2}
!2 = !DISubrange(count: 3)
!3 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
```
Until SPIR-V DI instructions are re-designed to store the array size information, handle the 3-element case explicitly to favor OpenCL/SYCL requirements. Signed-off-by: Artem Gindinson <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@ddb5c96
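The special case amounts to a one-line adjustment of the size formula. The function below is a minimal sketch with a hypothetical name, not the translator's code. It assumes sizes are in bits, as in the `size:` fields of the debug metadata above.

```python
def vector_size_in_bits(num_elems: int, elem_size_bits: int) -> int:
    """Size of a vector type, with 3-element vectors sized like 4-element ones."""
    # The plain formula num_elems * elem_size gives 96 bits for <3 x i32>;
    # the SYCL/OpenCL rule rounds the element count up to 4.
    effective_elems = 4 if num_elems == 3 else num_elems
    return effective_elems * elem_size_bits
```

For the `int` vector in the example, `vector_size_in_bits(3, 32)` gives 128, matching the pre-translation `size: 128`, whereas the plain product gives the faulty 96.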
It should be a valid case if alias.scope and noalias mask/decoration are applied to the same instruction. Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@d5573a0
This refactoring phase comes down to moving the translation algorithm out of the already-cluttered `SPIRVToLLVM::setLLVMLoopMetadata()` body. For now, a static member-only class is employed: it provides encapsulation for helper functions while avoiding the unnecessary complexity that "true" class entities would bring. A simple `LoopsEmitted` set is used to guard against duplication of metadata for a particular loop by the callers. Signed-off-by: Artem Gindinson <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@348d3cf
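The duplication guard can be pictured as a set consulted before each emission. The class and method names below are hypothetical and the metadata is modelled as a plain string; this sketch only illustrates the role of a `LoopsEmitted`-style set, not the translator's class layout.

```python
from typing import Hashable, List, Set, Tuple


class LoopMetadataEmitter:
    """Emit metadata for each loop at most once."""

    def __init__(self) -> None:
        self.loops_emitted: Set[Hashable] = set()
        self.emitted: List[Tuple[Hashable, str]] = []

    def emit_once(self, loop_id: Hashable, metadata: str) -> bool:
        """Record metadata for loop_id unless that loop was already handled.

        Returns True if the metadata was emitted, False if it was skipped.
        """
        if loop_id in self.loops_emitted:
            return False
        self.loops_emitted.add(loop_id)
        self.emitted.append((loop_id, metadata))
        return True
```

A caller that visits the same loop through several of its blocks can call `emit_once` each time, and only the first call has an effect.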
Since there's a dedicated `// Enums` section in the `OCLUtil.h` header, move the enum definition there. Signed-off-by: Artem Gindinson <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@f8a2e49
Some LLVM optimizations started generating an ascast -> gep -> load sequence; this patch adds support for it. Original commit: KhronosGroup/SPIRV-LLVM-Translator@fc71302
/summary:run |
Bring spirv.hpp in sync with f95c3b3 ("Merge pull request intel#219 from cmarcelo/SPV_EXT_shader_atomic_float16_add", 2021-06-23) from github.com/KhronosGroup/SPIRV-Headers . Notably, this brings the SPV_KHR_integer_dot_product extension enum values and CPP_for_OpenCL source language enum value, together with other additions. There are some reorderings too. Original commit: KhronosGroup/SPIRV-LLVM-Translator@6847551
Regenerate SPIRVIsValidEnum.h and SPIRVNameMapEnum.h using tools/spirv-tool/gen_spirv.bash after syncing spirv.hpp and manually fix the following: - internal:: values are not handled, so they had to be added manually again. Move all internal values to the end of the generated enum/function, so that they are together. - NameMap entries for IOPipesINTEL and FuncParamIOKindINTEL do not follow the convention of the other values, so they had to be changed back to their original values. Original commit: KhronosGroup/SPIRV-LLVM-Translator@c4c9b0c
…insic The correct translation of this case will be done later, once the spec is updated. Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@a84f589
/summary:run |
LLVM: llvm/llvm-project@7c73c2ed
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@a84f589