LLVM and SPIRV-LLVM-Translator pulldown (WW27) #4006

vmaksimo · 2021-06-28T09:49:53Z

LLVM: llvm/llvm-project@7c73c2ed
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@a84f589

OR, XOR and AND entries are added to the cost table. An extra cost is added when vector splitting occurs. This is done to address the issue of a missed SLP vectorization opportunity due to unreasonably high costs being attributed to the vector Or reduction (see: https://bugs.llvm.org/show_bug.cgi?id=44593). Differential Revision: https://reviews.llvm.org/D104538

This patch generalizes MatchBinaryAddToConst to support matching (A + C1), (A + C2), instead of just matching (A + C1), A. The existing cases can be handled by treating non-add expressions A as A + 0. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104634
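The generalized match can be illustrated with a toy model. This is a minimal sketch, not LLVM's ScalarEvolution code: the tuple-based expression encoding and the helper names `split_add_const` and `match_binary_add_to_const` are assumptions made for illustration only.

```python
from typing import Optional, Tuple, Union

# Toy expression model (hypothetical, not LLVM's SCEV classes):
# a bare string is an opaque value A; ("add", A, C) stands for A + C.
Expr = Union[str, Tuple[str, str, int]]


def split_add_const(expr: Expr) -> Tuple[str, int]:
    """Return (base, constant); a non-add expression A is treated as A + 0."""
    if isinstance(expr, tuple):
        _, base, const = expr
        return base, const
    return expr, 0


def match_binary_add_to_const(x: Expr, y: Expr) -> Optional[Tuple[int, int]]:
    """Match x = A + C1 and y = A + C2 over the same base A; return (C1, C2)."""
    base_x, c1 = split_add_const(x)
    base_y, c2 = split_add_const(y)
    if base_x != base_y:
        return None
    return c1, c2
```

In this model the older pattern, (A + C1) against a bare A, is simply the case where the second constant comes back as 0.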

It looks like the fold introduced in 63f3383 can cause crashes if the type of the bitcasted value is not a valid vector element type, like x86_mmx. To resolve the crash, reject invalid vector element types. The way it is done in the patch is a bit clunky. Perhaps there's a better way to check? Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104792

Change --max-timeline-cycles=0 to mean no limit on the number of cycles. Use this in AMDGPU tests to show all instructions in the timeline view instead of having it arbitrarily truncated. Differential Revision: https://reviews.llvm.org/D104846

As a minor adjustment to the existing lowering of offset scatters, this extends any smaller-than-legal vectors into full vectors using a zext, so that the truncating scatters can be used. Due to the way MVE legalizes the vectors this should be cheap in most situations, and will prevent the vector from being scalarized. Differential Revision: https://reviews.llvm.org/D103704

This patch enables the salvaging of debug values that may be calculated from more than one SSA value, such as with binary operators that do not use a constant argument. The actual functionality for this behaviour is added in a previous commit (c727056), but with the ability to actually emit the resulting debug values switched off. The reason for this is that the prior patch has been reverted several times due to issues discovered downstream, some time after the actual landing of the patch. The patch in question is rather large and touches several widely used header files, and all issues discovered are more related to the handling of variadic debug values as a whole rather than the details of the patch itself. Therefore, to minimize the build time impact and risk of conflicts involved in any potential future revert/reapply of that patch, this significantly smaller patch (that touches no header files) will instead be used as the capstone to enable variadic debug value salvaging. The review linked to this patch is mostly implemented by the previous commit, c727056, but also contains the changes in this patch. Differential Revision: https://reviews.llvm.org/D91722

Extend the yaml code generation to support the index attributes that https://reviews.llvm.org/D104711 added to the OpDSL. Differential Revision: https://reviews.llvm.org/D104712

This patch handles sinking a replicate region after another replicate region. In that case, we can connect the sink region after the target region. This properly handles the case for which an assertion has been added in 337d765. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34842. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D103514

Differential Revision: https://reviews.llvm.org/D104245

This adds the MemoryTagManager class and a specialisation of that class for AArch64 MTE tags. It provides a generic interface for various tagging operations: adding/removing tags, diffing tagged pointers, etc. Later patches will use this manager to handle memory tags in generic code in both lldb and lldb-server. Since it will be used in both, the base class header is in lldb/Target (MemoryRegionInfo is another example of this pattern). Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D97281

Add an index_dim annotation to specify the shape to loop mapping of shape-only tensors. A shape-only tensor is not accessed within the body of the operation but is required to span the iteration space of certain operations such as pooling. Differential Revision: https://reviews.llvm.org/D104767

Everything includes clang/Config/config.h by qualified "clang/Config/config.h" path, so there's no need for `-Igen/clang/include/clang/Config/clang/include`. No behavior change.

The "memory-tagging+" feature indicates that lldb-server supports memory tagging packets (added in a later patch). We check HWCAP2_MTE to decide whether to enable this feature for Linux. Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D97282

Use %zu to print size_t vars.

…ion (7/n)

scf::ForOp bufferization analysis proceeds just like for any other op (including FuncOp) at its boundaries; i.e. if:
1. The tensor operand is inplaceable.
2. The matching result has no subsequent read (i.e. all reads dominate the scf::ForOp).
3. Bufferizing inplace does not create a RAW interference.

then it can bufferize inplace.

Still there are a few differences:
1. bbArgs for an scf::ForOp are always considered inplaceable when seen from ops inside the body. This is because
   a) either the matching tensor operand is not inplaceable and an alloc will be inserted (which makes bbArg itself inplaceable); or
   b) the tensor operand and bbArg are both already inplaceable.
2. Bufferization within the scf::ForOp body has implications for the outside world: the scf.yield terminator may well ping-pong values of the same type. This muddies the water for alias analysis and is not supported atm. Such cases result in a pass failure.

Differential revision: https://reviews.llvm.org/D104490

This adds memory tag reading using the new "qMemTags" packet and ptrace on AArch64 Linux. This new packet follows the one used by GDB (https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html).

On AArch64 Linux we use ptrace's PEEKMTETAGS to read tags and we assume that lldb has already checked that the memory region actually has tagging enabled.

We do not assume that lldb has expanded the requested range to granules and expand it again to be sure (although lldb will be sending aligned ranges because it happens to need them client side anyway). Also we don't assume untagged addresses, so for AArch64 we'll remove the top byte before using them (the top byte includes MTE and other non-address data).

To do the ptrace read, NativeProcessLinux will ask the native register context for a memory tag manager based on the type in the packet. This also gives you the ptrace numbers you need (it's called a register context but it also has non-register data, so it saves adding another per-platform subclass).

The only supported platform for this is AArch64 Linux and the only supported tag type is MTE allocation tags. Anything else will error.

Ptrace can return a partial result but for lldb-server we will be treating that as an error. To succeed we need to get all the tags we expect.

(Note that the protocol leaves room for logical tags to be read via qMemTags but this is not going to be implemented for lldb at this time.)

Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95601
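The two address fix-ups described here, dropping the top byte and widening a range to whole granules, can be sketched as follows. This is an illustrative sketch and not the lldb-server code: the function names are hypothetical, and it assumes the 16-byte MTE granule with the top byte occupying bits 63..56.

```python
from typing import Tuple

MTE_GRANULE_SIZE = 16          # bytes covered by one MTE allocation tag
TOP_BYTE_MASK = (1 << 56) - 1  # keeps bits 55..0, clears bits 63..56


def remove_top_byte(addr: int) -> int:
    """Strip the top byte, which carries the MTE tag and other non-address data."""
    return addr & TOP_BYTE_MASK


def expand_to_granules(addr: int, length: int) -> Tuple[int, int]:
    """Widen [addr, addr + length) so both ends fall on granule boundaries."""
    start = addr - addr % MTE_GRANULE_SIZE
    end = addr + length
    end = -(-end // MTE_GRANULE_SIZE) * MTE_GRANULE_SIZE  # round up
    return start, end - start
```

Expanding an already aligned range returns it unchanged, which is why repeating the expansion on the server side is harmless.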

This commit moves the type translator from LLVM to MLIR to a public header for use by external projects or other code. Unlike a previous attempt (https://reviews.llvm.org/D104726), this patch moves the type conversion into separate files which remedies the linker error which was only caught by CI. Differential Revision: https://reviews.llvm.org/D104834

Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post-processing tools. For example, if we run strip -s on the object files the symbol table changes, but the indices in that section do not. In the non-visible failure mode the indices point to the wrong symbols. In the visible failure mode the indices point outside of the symbol table: "invalid symbol index".

This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in .llvm.call-graph-profile, but the symbol information will be in the relocation section. In LLD, information from both sections is used to reconstruct the call graph profile. The relocations themselves will never be applied.

With this approach, post-processing tools that handle relocations correctly work for this section too. Tools can add/remove symbols, and as long as they handle relocation sections the information stays correct.

In a quick experiment with clang-13, the aggregate size of all the input sections went up from 107KB to 322KB. The size of the clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly; it will not impact final binary size.

Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D104080
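Combining the two sections can be pictured roughly as below. This is a sketch under stated assumptions, not LLD's implementation: it assumes each weight entry is accompanied by exactly two relocations in from/to order, that the relocations have already been resolved to symbol names, and the function name `reconstruct_call_graph` is hypothetical.

```python
from typing import Dict, List, Tuple


def reconstruct_call_graph(
    weights: List[int],
    relocation_symbols: List[str],
) -> Dict[Tuple[str, str], int]:
    """Pair each weight with the (from, to) symbols named by its two relocations."""
    if len(relocation_symbols) != 2 * len(weights):
        raise ValueError("expected exactly two relocations per weight entry")
    edges: Dict[Tuple[str, str], int] = {}
    for i, weight in enumerate(weights):
        caller = relocation_symbols[2 * i]
        callee = relocation_symbols[2 * i + 1]
        # Accumulate in case the same edge appears more than once.
        edges[(caller, callee)] = edges.get((caller, callee), 0) + weight
    return edges
```

Because the edges are keyed by what the relocations refer to rather than by raw symbol-table indices, a tool that rewrites the symbol table and updates relocations keeps the profile consistent.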

This adds GDB client support for the qMemTags packet which reads memory tags, following the design which was recently committed to GDB: https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets (look for qMemTags).

lldb commands will use the new Process methods GetMemoryTagManager and ReadMemoryTags. The former takes a range and checks that:
* The current process architecture has an architecture plugin
* That plugin provides a MemoryTagManager
* The range of memory requested lies in a tagged range (it will expand it to granules for you)

If all that is true you get a MemoryTagManager you can give to ReadMemoryTags.

This two-step process is done to allow commands to get the tag manager without having to read tags as well. For example you might just want to remove a logical tag, or error early if a range with tagged addresses is inverted.

Note that getting a MemoryTagManager doesn't mean that the process or a specific memory range is tagged. Those are separate checks. Having a tag manager just means this architecture *could* have a tagging feature enabled.

An architecture plugin has been added for AArch64 which will return a MemoryTagManagerAArch64MTE, which was added in a previous patch.

Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95602

This new command looks much like "memory read" and mirrors its basic behaviour.

(lldb) memory tag read new_buf_ptr new_buf_ptr+32
Logical tag: 0x9
Allocation tags:
[0x900fffff7ffa000, 0x900fffff7ffa010): 0x9
[0x900fffff7ffa010, 0x900fffff7ffa020): 0x0

Important properties:
* The end address is optional and defaults to reading 1 tag if omitted.
* It is an error to try to read tags if the architecture or process doesn't support it, or if the range asked for is not tagged.
* It is an error to read an inverted range (end < begin) (logical tags are removed for this check so you can pass tagged addresses here).
* The range will be expanded to fit the tagging granule, so you can get more tags than simply (end-begin)/granule size. Whatever you get back will always cover the original range.

Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97285
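The range rules listed above can be modelled in a few lines. This is a hedged sketch rather than the command's implementation: the function names are hypothetical, it assumes a 16-byte granule with the logical tag in bits 59..56 of the pointer, and it models a missing end address as a one-byte request so that exactly one granule is covered. Unlike the command output, it returns ranges with the top byte removed.

```python
from typing import List, Optional, Tuple

GRANULE = 16                  # bytes per MTE allocation tag
ADDRESS_MASK = (1 << 56) - 1  # clears the top byte (bits 63..56)


def logical_tag(pointer: int) -> int:
    """Extract the 4-bit logical tag stored in bits 59..56 of the pointer."""
    return (pointer >> 56) & 0xF


def tag_read_ranges(begin: int, end: Optional[int] = None) -> List[Tuple[int, int]]:
    """Return the granule-sized [start, end) ranges a tag read would cover."""
    begin &= ADDRESS_MASK
    # Assumption: with no end address, request one byte, i.e. one granule.
    end = begin + 1 if end is None else end & ADDRESS_MASK
    # Tags were stripped above, so tagged pointers compare by address only.
    if end < begin:
        raise ValueError("inverted range: end < begin")
    first = begin - begin % GRANULE
    last = -(-end // GRANULE) * GRANULE  # round end up to a granule boundary
    return [(g, g + GRANULE) for g in range(first, last, GRANULE)]
```

An unaligned request such as `tag_read_ranges(0x1008, 0x1012)` spans two granules even though it is only 10 bytes long, which is the "more tags than (end-begin)/granule size" case.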

Updates Bazel build files to match llvm/llvm-project@929189a499 Differential Revision: https://reviews.llvm.org/D104864

- Currently, the emitting of labels in the parsePrimaryExpr function does not take case into account: it just takes the identifier and emits it.
- However, for HLASM the emitting of labels is case independent. We emit them in upper case only, to enforce case independence. So we need to ensure that at the time of parsing the label we emit it in upper case (in `parseAsHLASMLabel`), and also that when we process a PC-relative relocatable expression we emit it in upper case (in `parsePrimaryExpr`).
- To achieve this, a new MCAsmInfo attribute has been introduced which the corresponding targets can override if needed.

Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D104715

This is a follow up to D102732 which also expands the logic to Darwin. Differential Revision: https://reviews.llvm.org/D104764

…:ShrinkDemandedConstant. We don't constant fold based on demanded bits elsewhere in SimplifyDemandedBits, so I don't think we should shrink them either. The affected ARM test changes because a constant became non-opaque and eventually enabled some constant folding. This no longer happens. I checked and InstCombine is able to simplify this test. I'm not sure exactly what it was trying to test. Reviewed By: lebedev.ri, dmgreen Differential Revision: https://reviews.llvm.org/D104832

There were 3 assumptions made during translation of llvm.loop metadata into the LoopMerge instruction:
1. A latch block for a 'for' or 'while' loop ends with an unconditional branch instruction;
2. A latch block for a 'do-while' loop ends with a conditional branch instruction;
3. For a 'conditional' latch basic block, the first successor is the loop header and the second one is the exit block.

All three of them can be violated when optimized IR is passed to the translator. In this case LoopMerge:
a. can be placed in the wrong basic block;
b. can have continue target and merge block parameters sharing the same id.

This patch makes the translator assume less and do more checks via the LLVM LoopInfo infrastructure.

Signed-off-by: Dmitry Sidorov <[email protected]>
Original commit: KhronosGroup/SPIRV-LLVM-Translator@7670633

Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@ce647ca

Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@2b8b4a8

There are 3 metadata that map onto LoopControlLoopCountINTELMask. The mask has 3 parameters, which default to -1. Signed-off-by: Leonid Pauzin <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@b4d6f0b

Original commit: KhronosGroup/SPIRV-LLVM-Translator@ff19818

The SYCL/OpenCL specifications require that 3-element vectors are sized equally to the 4-element ones (see 4.10.2.1 and 4.10.2.6, khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf). Meanwhile, the logic for translating SPIR-V debug info into LLVM IR obeys a trivial `vec_size = num_elems * elem_size` formula, which fails for the 3-element edge case.

Example LLVM IR pre-translation:
```
!0 = !DICompositeType(tag: DW_TAG_array_type, baseType: !3, size: 128, flags: DIFlagVector, elements: !1)
!1 = !{!2}
!2 = !DISubrange(count: 3)
!3 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
```

Faulty LLVM IR after bi-directional translation (note the size in `!0`):
```
!0 = !DICompositeType(tag: DW_TAG_array_type, baseType: !3, size: 96, flags: DIFlagVector, elements: !1)
!1 = !{!2}
!2 = !DISubrange(count: 3)
!3 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
```

Until SPIR-V DI instructions are re-designed to store the array size information, handle the 3-element case explicitly to favor OpenCL/SYCL requirements.

Signed-off-by: Artem Gindinson <[email protected]>
Original commit: KhronosGroup/SPIRV-LLVM-Translator@ddb5c96
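The arithmetic behind the 96 versus 128 mismatch can be written out as a small sketch. The function names are hypothetical and this is not the translator's code; it only encodes the rule stated above that a 3-element vector is sized like a 4-element one.

```python
def naive_vector_size_bits(num_elems: int, elem_bits: int) -> int:
    """The trivial vec_size = num_elems * elem_size formula."""
    return num_elems * elem_bits


def opencl_vector_size_bits(num_elems: int, elem_bits: int) -> int:
    """Size a 3-element vector as if it had 4 elements; other counts are unchanged."""
    storage_elems = 4 if num_elems == 3 else num_elems
    return storage_elems * elem_bits
```

For three 32-bit elements the naive formula gives 96 bits while the special-cased one gives 128 bits, matching the two `size:` values in the IR above.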

Original commit: KhronosGroup/SPIRV-LLVM-Translator@9a81ba8

It should be a valid case if alias.scope and noalias mask/decoration are applied to the same instruction. Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@d5573a0

Original commit: KhronosGroup/SPIRV-LLVM-Translator@2eaec4b

This refactoring phase comes down to moving the translation algorithm out of the already-cluttered `SPIRVToLLVM::setLLVMLoopMetadata()` body. For now, a static member-only class is employed: it provides encapsulation for helper functions while avoiding the unnecessary complexity that "true" class entities would bring. A simple `LoopsEmitted` set is used to guard against duplication of metadata for a particular loop by the callers. Signed-off-by: Artem Gindinson <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@348d3cf

Since there's a dedicated `// Enums` section in the `OCLUtil.h` header, move the enum definition there. Signed-off-by: Artem Gindinson <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@f8a2e49

Some LLVM optimizations started generating an ascast -> gep -> load sequence; this patch adds support for it. Original commit: KhronosGroup/SPIRV-LLVM-Translator@fc71302

vmaksimo · 2021-06-28T09:50:26Z

/summary:run

Original commit: KhronosGroup/SPIRV-LLVM-Translator@23a28b9

Bring spirv.hpp in sync with f95c3b3 ("Merge pull request intel#219 from cmarcelo/SPV_EXT_shader_atomic_float16_add", 2021-06-23) from github.com/KhronosGroup/SPIRV-Headers. Notably, this brings the SPV_KHR_integer_dot_product extension enum values and the CPP_for_OpenCL source language enum value, together with other additions. There are some reorderings too. Original commit: KhronosGroup/SPIRV-LLVM-Translator@6847551

Regenerate SPIRVIsValidEnum.h and SPIRVNameMapEnum.h using tools/spirv-tool/gen_spirv.bash after syncing spirv.hpp and manually fix the following: - internal:: values are not handled, so they had to be added manually again. Move all internal values to the end of the generated enum/function, so that they are together. - NameMap entries for IOPipesINTEL and FuncParamIOKindINTEL do not follow the convention of the other values, so they had to be changed back to their original values. Original commit: KhronosGroup/SPIRV-LLVM-Translator@c4c9b0c

…insic The correct translation of this case will be done later, once the spec is updated. Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@a84f589

vmaksimo · 2021-06-28T15:50:10Z

/summary:run

RosieSumpter and others added 30 commits June 24, 2021 12:02

[NFC][SimplifyCFG] Add basic test for tail-merging resume function …

9f5f917

…terminators

[mlir][linalg][python] Add attribute support to the YAML codegen.

25bb616

Extend the yaml code generation to support the index attributes that https://reviews.llvm.org/D104711 added to the OpDSL. Differential Revision: https://reviews.llvm.org/D104712

[GlobalISel] Describe undefined values for G_SBFX/G_UBFX operands

927b809

Differential Revision: https://reviews.llvm.org/D104245

[gn build] Remove an unneeded -I flag

d57a587

Everything includes clang/Config/config.h by qualified "clang/Config/config.h" path, so there's no need for `-Igen/clang/include/clang/Config/clang/include`. No behavior change.

Add documentation for compound assignment and type conversion of matr…

cd256c8

…ix types

[lldb][AArch64] Fix unpack tags test case

cc05418

Use %zu to print size_t vars.

[AArch64] Precommit extending load tests for D104782. NFC.

c74aea4

[gn build] Fix a comment typo and a comment copy-pasto

b1061e3

[VPlan] Fix indentation of check lines in sinking test (NFC).

f6ba845

[mlir] remove repeated use of TypeToLLVM.cpp in cmake targets

10b8eb4

Update Bazel build for 929189a

1ca4cf9

Updates Bazel build files to match llvm/llvm-project@929189a499 Differential Revision: https://reviews.llvm.org/D104864

[CMake] Don't LTO optimize targets on Darwin either

aac4de9

This is a follow up to D102732 which also expands the logic to Darwin. Differential Revision: https://reviews.llvm.org/D104764

vmaksimo and others added 14 commits June 28, 2021 09:00

Merge remote-tracking branch 'otcshare_llvm/sycl-web' into llvmspirv_…

555dfb2

…pulldown

Merge commit '7c73c2ede8088802adb8191d05cad09e3ad88539' into llvmspir…

5724776

…v_pulldown

Apply suggestion

d97e60f

Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@ce647ca

Fix comments

6805f21

Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@2b8b4a8

Implemented SPIR-V support for loopcount attributes

90b1ef0

There are 3 metadata that map onto LoopControlLoopCountINTELMask. The mask has 3 parameters, which default to -1. Signed-off-by: Leonid Pauzin <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@b4d6f0b

Remove unncessary LLVM_FALLTHROUGH statements for empty switch cases

d9ab923

Original commit: KhronosGroup/SPIRV-LLVM-Translator@ff19818

Fix SPIRV-IR for group opcodes

ec624d7

Original commit: KhronosGroup/SPIRV-LLVM-Translator@9a81ba8

Remove alias.scope/noalias exclusive restriction

d377c5c

It should be a valid case if alias.scope and noalias mask/decoration are applied to the same instruction. Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@d5573a0

Fix TopologicalSort memory leak

9598f11

Original commit: KhronosGroup/SPIRV-LLVM-Translator@2eaec4b

Move FPGA memory access enum definition

2501f39

Since there's a dedicated `// Enums` section in the `OCLUtil.h` header, move the enum definition there. Signed-off-by: Artem Gindinson <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@f8a2e49

Add support of new IR pattern to transOCLBuiltinFromVariable

691d657

Some LLVM optimizations started generating an ascast -> gep -> load sequence; this patch adds support for it. Original commit: KhronosGroup/SPIRV-LLVM-Translator@fc71302

svenvh and others added 4 commits June 28, 2021 18:29

Emit spirv_internal.hpp include

e9bec61

Original commit: KhronosGroup/SPIRV-LLVM-Translator@23a28b9

Fix a crash when alias metadata is attached to a call to lifetime inr…

b5c4b14

…insic The correct translation of this case will be done later, once the spec is updated. Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@a84f589

vmaksimo marked this pull request as ready for review June 29, 2021 08:08

vmaksimo requested review from AaronBallman, AGindinson, AlexeySachkov, AlexeySotkin, bader, elizabethandrews, mdtoguchi and premanandrao as code owners June 29, 2021 08:08

vladimirlaz merged commit e946a0f into intel:sycl Jun 29, 2021
