LLVM and SPIRV-LLVM-Translator pull down #1140

vladimirlaz · 2020-02-18T08:04:12Z

LLVM: c7fa409
SPIRV-LLVM-Translator: 0401a329

Summary: Recursion is a powerful tool, but like any tool, it can be dangerous when used without care. For example, if the recursion is unbounded, you will eventually run out of stack and crash. You can of course track the recursion depth, but if the limit is hardcoded, there can always be some other environment where that depth is too large, so said magic number would need to be env-dependent. But then your program's behavior is suddenly more env-dependent. Also, while recursion does not outright stop optimization, recursive calls are less great than normal calls; for example, they hinder inlining.

Recursion is banned in some coding guidelines:
* SEI CERT DCL56-CPP. Avoid cycles during initialization of static objects
* JPL 2.4 Do not use direct or indirect recursion.
* I'd say it is frowned upon in LLVM, although not banned

And is plain unsupported in some cases:
* OpenCL 1.2, 6.9 Restrictions: i. Recursion is not supported.

So there are clearly a lot of reasons why one might want to avoid recursion and replace it with worklist handling. It would be great to have an enforcement for it though. This implements such a check. Here we detect both direct and indirect recursive calls, although since clang-tidy (unlike clang static analyzer) is CTU-unaware, if the recursion transcends a single standalone TU, we will naturally not find it :/

The algorithm is pretty straightforward:
1. Build the call graph for the entire TU. For that, the existing `clang::CallGraph` is re-used, although it had to be modified to also track the location of the call.
2. Then, the hard problem: how do we detect recursion? Since we have a graph, let's just do the sane thing, and look for Strongly Connected Function Declarations - widely known as `SCC`. For that LLVM provides `llvm::scc_iterator`, which is internally a Tarjan's DFS algorithm, and is used throughout LLVM, so this should be as performant as possible.
3. Now that we've got SCCs, we discard those that don't contain loops. Note that there may be more than one loop in an SCC!
4. For each loopy SCC, we call out each function, and print a single example call graph that shows recursion; it didn't seem worthwhile enumerating every possible loop in the SCC, although I suppose it could be implemented.
   * To come up with that call graph cycle example, we start at the first SCC node, see which callee of the node is within the SCC (and is thus known to be in a cycle), and recurse into it until we hit a callee that is already in the call stack.

Reviewers: JonasToth, aaron.ballman, ffrankies, Eugene.Zelenko, erichkeane, NoQ Reviewed By: aaron.ballman Subscribers: Charusso, Naghasan, bader, riccibruno, mgorny, Anastasia, xazax.hun, cfe-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D72362
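The steps described in this commit message can be sketched outside of clang on a toy call graph. The following is only an illustration, not the check's implementation: the graph is a plain dictionary from function name to callee names, and the helper names (`strongly_connected_components`, `recursive_components`, `example_cycle`) are hypothetical. It uses a textbook recursive formulation of Tarjan's algorithm for brevity, where the real check relies on `llvm::scc_iterator`.

```python
from typing import Dict, List, Set


def strongly_connected_components(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Tarjan's algorithm; graph maps each function to the functions it calls."""
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    sccs: List[List[str]] = []

    def visit(node: str) -> None:
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for callee in graph.get(node, []):
            if callee not in index:
                visit(callee)
                lowlink[node] = min(lowlink[node], lowlink[callee])
            elif callee in on_stack:
                lowlink[node] = min(lowlink[node], index[callee])
        if lowlink[node] == index[node]:
            # node is the root of an SCC: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            sccs.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return sccs


def recursive_components(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Keep only SCCs that contain a loop: several nodes, or a direct self-call."""
    return [
        sorted(scc)
        for scc in strongly_connected_components(graph)
        if len(scc) > 1 or scc[0] in graph.get(scc[0], [])
    ]


def example_cycle(graph: Dict[str, List[str]], scc: List[str]) -> List[str]:
    """Follow callees that stay inside the SCC until one is already on the call stack."""
    members = set(scc)
    call_stack = [scc[0]]
    while True:
        callee = next(c for c in graph[call_stack[-1]] if c in members)
        if callee in call_stack:
            return call_stack[call_stack.index(callee):] + [callee]
        call_stack.append(callee)
```

For a graph such as `{"a": ["b"], "b": ["c"], "c": ["a"]}`, `example_cycle` reports one cycle through the component rather than enumerating every loop, which mirrors the choice described in step 4.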

…lGraphNode::CallRecord Summary: Storing not just the callee, but the actual call may be interesting for some use-cases. In particular, D72362 would like that to better pretty-print the cycles in call graph. Reviewers: NoQ, erichkeane Reviewed By: NoQ Subscribers: martong, Charusso, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D74081

Summary: Due to Unity, we had to reduce our region sizes, but in some rare situations, some programs (mostly tests AFAICT) manage to fill up a region for a given size class. So this adds a workaround that attempts to allocate the block from the immediately larger size class, wasting some memory but allowing the application to keep going. Reviewers: pcc, eugenis, cferris, hctim, morehouse Subscribers: #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D74567
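The fallback described above can be pictured with a toy model. This is a sketch under stated assumptions, not the allocator's code: `Region` and `allocate` are hypothetical names, regions are assumed to be sorted by ascending block size, and only the immediately larger size class is tried, as the summary says.

```python
from typing import List, Optional


class Region:
    """Toy region holding a fixed number of blocks of a single size class."""

    def __init__(self, block_size: int, capacity: int) -> None:
        self.block_size = block_size
        self.free_blocks = capacity

    def pop_block(self) -> bool:
        """Take one block; return False if the region is exhausted."""
        if self.free_blocks == 0:
            return False
        self.free_blocks -= 1
        return True


def allocate(regions: List[Region], size: int) -> Optional[int]:
    """Return the block size actually used, or None if the allocation fails."""
    for class_id, region in enumerate(regions):
        if region.block_size >= size:
            break
    else:
        return None  # no size class is large enough
    if region.pop_block():
        return region.block_size
    # Workaround: the region is full, so try the immediately larger class once.
    if class_id + 1 < len(regions) and regions[class_id + 1].pop_block():
        return regions[class_id + 1].block_size
    return None
```

The second request in a sequence like `allocate(regions, 10)` against a one-block 16-byte region is served from the 32-byte class, which wastes memory but lets the program continue.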

Summary: Fixes a crash in the backend where optimizations produce calls to the cbrt runtime functions. Fixes PR 44227. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74259

…-bits on KNL. If we widen the compare we might trigger a spurious exception from the garbage data. We have two choices here. Explicitly force the upper bits to zero. Or use a legacy VEX vcmpps/pd instruction and convert the XMM/YMM result to mask register. I've chosen to go with the second option. I'm not sure which is really best. In some cases we could get rid of the zeroing since the producing instruction probably already zeroed it. But we lose the ability to fold a load. So which is best is dependent on surrounding code. Differential Revision: https://reviews.llvm.org/D74522

binop (extelt X, C), (extelt Y, C) --> extelt (binop X, Y), C

This is a transform that has been considered for canonicalization (instcombine) in the past because it reduces instruction count. But as shown in the x86 tests, it's impossible to know if it's profitable without a cost model. There are many potential target constraints to consider. We have implemented similar transforms in the backend (DAGCombiner and target-specific), but I don't think we have this exact fold there either (and if we did it in SDAG, it wouldn't work across blocks). Note: this patch was intended to handle the more general case where the extract indexes do not match, but it got too big, so I scaled it back to this pattern for now. Differential Revision: https://reviews.llvm.org/D74495
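To make the fold concrete, here is a small sketch that models vectors as Python lists and checks that both forms produce the same scalar. The function names are hypothetical, and the model says nothing about cost, which is the part the patch leaves to the target's cost model.

```python
import operator
from typing import Callable, Sequence


def extract_then_binop(binop: Callable, x: Sequence, y: Sequence, c: int):
    """binop (extelt X, C), (extelt Y, C): two extracts, then one scalar op."""
    return binop(x[c], y[c])


def binop_then_extract(binop: Callable, x: Sequence, y: Sequence, c: int):
    """extelt (binop X, Y), C: one lane-wise vector op, then one extract."""
    vector_result = [binop(a, b) for a, b in zip(x, y)]
    return vector_result[c]
```

The second form trades one extract for a full vector operation, which is why instruction count alone does not settle whether the transform is profitable.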

replaceDbgDeclare is used to update the descriptions of stack variables when they are moved (e.g. by ASan or SafeStack). A side effect of replaceDbgDeclare is that it moves dbg.declares around in the instruction stream (typically by hoisting them into the entry block). This behavior was introduced in llvm/r227544 to fix an assertion failure (llvm.org/PR22386), but no longer appears to be necessary. Hoisting a dbg.declare generally does not create problems. Usually, dbg.declare either describes an argument or an alloca in the entry block, and backends have special handling to emit locations for these. In optimized builds, LowerDbgDeclare places dbg.values in the right spots regardless of where the dbg.declare is. And no one uses replaceDbgDeclare to handle things like VLAs. However, there doesn't seem to be a positive case for moving dbg.declares around anymore, and this reordering can get in the way of understanding other bugs. I propose getting rid of it. Testing: stage2 RelWithDebInfo sanitized build, check-llvm rdar://59397340 Differential Revision: https://reviews.llvm.org/D74517

…r.contract Summary: This sets the basic framework for lowering vector.contract progressively into simpler vector.contract operations until a direct vector.reduction operation is reached. More details will be filled out progressively as well. Reviewers: nicolasvasilache Reviewed By: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74520

…rogram. Mach allows you to suspend and resume other threads within a program, so debugserver has to be careful not to interfere with this when it goes to suspend and resume threads while stepping over breakpoints and calling functions. Even trickier, if you call a function on a suspended thread, it has to resume the thread to get the expression to run, and then suspend it properly when done. This all works already, but there wasn't a test for it. Adding that here. This same test could be written for a unix that supports pthread_{suspend,resume}_np, but macOS doesn't support these calls, only the mach version. It doesn't look like a lot of Linuxes support this (AIX does, apparently...). And IIUC Windows allows you to suspend and resume other threads, but the code for that would look pretty different from this main.c. So for simplicity's sake I wrote this test for Darwin only.

Summary: Also make return calls terminator instructions so epilogues are inserted before them rather than after them. Together, these changes make WebAssembly's tail call optimization more stack-safe. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73943

Add fixits for messaging self in MRR or using super, as the intent is clear, and it turns out people do that a lot more than expected. Allow for objc_direct_members on main interfaces, it's extremely useful for internal only classes, and proves to be quite annoying for adoption. Add some better warnings around properties direct/non-direct clashes (it was done for methods but properties were a miss). Add some errors when direct properties are marked @dynamic. Radar-Id: rdar://problem/58355212 Signed-off-by: Pierre Habouzit <[email protected]> Differential Revision: https://reviews.llvm.org/D73755

The norecurse function attribute indicates that the function is not called recursively, directly or indirectly. Add norecurse to OpenCL functions, SYCL functions in device compilation, and CUDA/HIP kernels. Although there is an LLVM pass that adds norecurse to functions, it only works for whole-program compilation. Also, having the FE add norecurse can make that pass run faster, since functions with norecurse do not need to be checked again. Differential Revision: https://reviews.llvm.org/D73651

On PowerPC, make instruction count the first priority of LSR by default. Add an option, ppc-lsr-no-insns-cost, to return to the default LSR cost model. Reviewed By: steven.zhang, jsji Differential Revision: https://reviews.llvm.org/D72683

…ravese This patch removes the explicit call graph for CUDA/HIP/OpenMP deferred diagnostics generated during parsing, since it is error prone due to incomplete information about function declarations during parsing. Instead, this patch does a post-parsing AST traversal and emits deferred diagnostics based on the use graph implicitly generated during the traversal. Differential Revision: https://reviews.llvm.org/D70172

This commit mirrors llvm/llvm-project@e956952 "DebugInfo: Flag Dwarf Version metadata for merging during LTO When the Dwarf Version metadata was initially added (r184276) there was no support for Module::Max - though the comment suggested that was the desired behavior. The original behavior was Module::Warn which would warn and then pick whichever version came first - which is pretty arbitrary/luck-based if the consumer has some need for one version or the other. Now that the functionality's been added (r303590) this change updates the implementation to match the desired goal. The general logic here is - if you compile /some/ of your program with a more recent DWARF version, you must have a consumer that can handle it, so might as well use it for /everything/. The only place where this might fall down is if you have a need to use an old tool (supporting only the older DWARF version) for some subset of your program. In which case now it'll all be the higher version. That seems pretty narrow (& the inverse could happen too - you specifically /need/ the higher DWARF version for some extra expressivity, etc, in some part of the program)" Signed-off-by: Alexey Sotkin <[email protected]>
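The difference between the two merge behaviors described in the quoted commit can be sketched in a few lines. This is a toy model only: `merge_warn` and `merge_max` are hypothetical helpers that take the "Dwarf Version" values of the modules being linked, in link order, and do not reflect LLVM's module-flag API.

```python
from typing import List, Tuple


def merge_warn(versions: List[int]) -> Tuple[int, bool]:
    """Old behavior: keep whichever version came first, warn on a mismatch."""
    chosen = versions[0]
    warned = any(v != chosen for v in versions)
    return chosen, warned


def merge_max(versions: List[int]) -> int:
    """New behavior: use the highest DWARF version for the whole program."""
    return max(versions)
```

With inputs `[2, 4]` and `[4, 2]`, `merge_warn` picks 2 and 4 respectively, which is the order-dependent outcome the commit calls arbitrary, while `merge_max` picks 4 in both cases.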

Commit 10dee68 ("Fix translation of 64-bit atomics to OpenCL 1.2", 2019-08-15) changes the "atomic_" prefix to "atom_" for 64-bit types, but this caused the "atom_" builtins to no longer have the volatile qualifier to their pointer argument. Update the atomics_int64.spt test to check for the fully mangled names; these should include the volatile qualifier.

Work around the unsupported freeze insn by:
* replacing uses of freeze's result with freeze's source, or with a random (but compilation-reproducible) constant if freeze's source is undef/poison
* deleting the freeze insn.

The long-term solution is to add a freeze instruction extension in SPIR-V. Issue is tracked in (#1140) Signed-off-by: Lu, John <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@ed25856
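As an illustration of the workaround, the sketch below rewrites a toy instruction list. Everything here is an assumption made for the example: instructions are dictionaries with `name`, `op` and `args`, definitions appear before their uses, and reproducibility is modeled with a fixed-seed `random.Random`; the translator's actual mechanism is not shown in this message.

```python
import random
from typing import Dict, List


def lower_freeze(instructions: List[Dict], seed: int = 0) -> List[Dict]:
    """Drop freeze instructions and forward their operand to every user."""
    rng = random.Random(seed)  # fixed seed keeps the chosen constants reproducible
    replacements: Dict[str, str] = {}
    kept: List[Dict] = []
    for inst in instructions:
        # Rewrite operands that refer to an already-removed freeze.
        args = [replacements.get(arg, arg) for arg in inst["args"]]
        if inst["op"] == "freeze":
            source = args[0]
            if source in ("undef", "poison"):
                # No usable source value: pick an arbitrary constant instead.
                source = str(rng.randrange(2**32))
            replacements[inst["name"]] = source
            continue  # the freeze itself is deleted
        kept.append({**inst, "args": args})
    return kept
```

Running the function twice on the same input yields the same constants, which is the property the commit message asks for when it says "compilation reproducible".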

arsenm and others added 30 commits February 13, 2020 15:25

AMDGPU/GlobalISel: Make G_TRUNC legal

5adbf7d

This is required to be legal. I'm not sure how we were getting away without defining any rules for it.

Fix handling of --version in lit

1d48493

There's no reason why we should require a directory when asking for the version. Differential Revision: https://reviews.llvm.org/D74553

[OPENMP][DOCS]Fix misprint, NFC.

7ecf066

[gn build] Port 49bffa5

f888ae7

Fix lit version test

f8b8a1c

Looks like on some systems the version is printed on stderr, and on others it's on stdout...

Remove unnecessary typedef that GCC doesn't like

e3548e2

[llvm][TextAPI/MachO] Extend TBD_V4 unittest to verify writing

c6e8bfe

Same as D73328 but for TBD_V4. One notable tidbit is that the swift abi version for swift 1 & 2 is emitted as a float which is considered invalid input. Differential revision: https://reviews.llvm.org/D73330

[llvm][TextAPI/MachO] Extract common code into unittest helper (NFC)

5810ed5

This extracts common code between the 4 TBD formats into a header that can be shared. Differential revision: https://reviews.llvm.org/D73332

Document third option to python synthetic type summary

1287977

callback unconditionally; it was added to lldb five years ago and we don't need to qualify its availability.

Small reformat to avoid tripping up possible formatting.

14d6863

[GWP-ASan] Silence gcc error

ef7488e

Summary: It complains about reaching the end of a non-void returning function. Reviewers: eugenis, hctim, morehouse Subscribers: #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D74578

[clang-format] Add new option BeforeLambdaBody in Allman style.

fa0118e

This option adds a line break when a lambda is inside a function call. Reviewers: djasper, klimek, krasimir, MyDeveloperDay Reviewed By: MyDeveloperDay Differential Revision: https://reviews.llvm.org/D44609

[AsmPrinter] De-capitalize Emit{Function,BasicBlock]* and Emit{Start,…

0dce409

…End}OfAsmFile

[AsmPrinter] De-capitalize some AsmPrinter::Emit* functions

0bc77a0

Similar to rL328848.

[clang] Fix bad line ending (DOS instead of Unix) inside the release …

f7e2227

…notes.

[build] Fix shared lib builds.

fe36127

Add dbgs() output to help track down missing DW_AT_location bugs, NFC

3091049

Revert "Revert "Revert "Change clang option -ffp-model=precise to sel…

88ec01c

…ect ffp-contract=on""" This reverts commit abd0905. It's causing internal buildbot fails on ppc Conflicts: clang/lib/Driver/ToolChains/Clang.cpp

[AArch64][NFC] Update test checks.

b23ec43

This NFC commit replaces the checks in several llc tests with automatically generated ones.

Reland D74436 "Change clang option -ffp-model=precise to select ffp-c…

0a1123e

…ontract=on" Buildbots are failing with the current revert status. So reland with a fix to fp-model.c

MaskRay and others added 22 commits February 16, 2020 13:14

[IR] Change maybeSetDSOLocal to isImplicitDSOLocal

a35b728

This allows some simplification.

[X86] Add more avx512 instructions to llvm-mca resource tests

c636f69

[X86] Increase latency of port5 masked compares and kshift/kadd/kunpc…

20c5968

…k instructions in SKX scheduler model Uops.info shows these as 4 cycle latency.

[gn build] use -Xclang form for fdebug-comp-dir for now

e8e078c

The driver version of the flag seems to confuse goma.

AMDGPU/GlobalISel: Move lambdas to normal function

044d40e

These aren't using any local state

AMDGPU/GlobalISel: Add some missing tests for non-power-of-2 cases

24c1561

AMDGPU/GlobalISel: Fix non-power-of-2 G_SITOFP/G_UITOFP

295bbea

This wouldn't work for s33-s63 sources.

[X86] Remove unnecessary check for null SDValue. NFC

464729c

[CUDA][HIP][OpenMP] Add lib/Sema/UsedDeclVisitor.h after D70172

c7fa409

Merge from 'master' to 'sycl-web' (intel#2)

8c779b7

CONFLICT (content): Merge conflict in clang/lib/Sema/Sema.cpp

Merge commit 'c7fa409bcad' into sycl-web

0b61f04

Enable NoGlobalOffsetINTEL execution mode translation

37e7a78

This patch implements the rest of SPV_INTEL_kernel_attributes extension. NoGlobalOffsetINTEL execution mode indicates that the global offset is always (0, 0, 0). Only valid with the Kernel Execution Model.

Update DebugInfo test after LLVM change 9f6ff07

771ac92

Update for 9f6ff07 ("[DebugInfo] Enable the debug entry values feature by default", 2020-02-10).

Fix -Wunused-variable warnings

7df0e1f

These were introduced by commit 33d4946 ("Enable NoGlobalOffsetINTEL execution mode translation", 2020-01-16).

Translate LLVM's cmpxchg instruction to SPIR-V

c78bbd0

Signed-off-by: Alexey Sotkin <[email protected]>

[SYCL] Fix LIT after LLVM change in community

793895c

Signed-off-by: Vladimir Lazarev <[email protected]>

vladimirlaz merged commit 322d86e into intel:sycl Feb 18, 2020

vladimirlaz deleted the private/vladimirlaz/llvmspirv_pulldown branch February 18, 2020 12:48
