Skip to content

LLVM and SPIR-V translator pulldown #1105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1,946 commits into from
Feb 7, 2020

Conversation

vladimirlaz
Copy link
Contributor

LLVM: 67905fc
SPIRV-LLVM-Translator: e980e04

MaskRay and others added 30 commits February 4, 2020 09:42
…2,64}

Similar to D73680 (AArch64 BTI).

A local linkage function whose address is not taken does not need ENDBR32/ENDBR64. Placing the patch label after ENDBR32/ENDBR64 has the advantage that code does not need to differentiate whether the function has an initial ENDBR.

Also, add 32-bit tests and test that .cfi_startproc is at the function
entry. The line information has a general implementation and is tested
by AArch64/patchable-function-entry-empty.mir

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D73760
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
Existing tests:
rG5d04e008f708
rG2a191cf8500f
...should verify that the underlying analysis doesn't improve
too much without updating this user code.
Summary:
Add a debug check for frequency queries for unknown blocks (typically blocks
that are created after BFI is computed but their frequencies are not
communicated to BFI.)

This is useful for detecting and debugging missed BFI updates.

This is debug build only and disabled behind a flag.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73920
…ExpandedFromMacro`.

Summary: Spells out some `auto`s explicitly and adds another test for the matcher `isExpandedFromMacro`.

Reviewers: aaron.ballman

Subscribers: gribozavr, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73975
We weren't account for load latencies in the SSE42/AES/CLMUL schedule classes
The usage of the Imm out argument from SelectSMRDOffset is pretty
confusing. Stop trying to reject CI immediates in the case where the
offset field can be used. It's not an illegal way to encode the
immediate, so just prefer the better encoding pattern with
AddedComplexity.

We probably don't even really need the different opcodes for the
different offset types anymore, but that will be more work to cleanup.

The SMRD non-buffer load patterns could also use a cleanup to be done
separately.
Make usage more consistent, and make it possible to enable LongOptionsUseDoubleDash.
…TailCallFrames

In order to synthesize tail call frames, the stack frame list must not
be empty (otherwise, there is no "previous" frame to infer a tail call
from).

This case is hard to hit. To trigger it, we must first fail to push
`unwind_frame_sp` because we either fail to get its SymbolContext, or
given its SymbolContext the GetParentOfInlineScope call fails. This
causes m_concrete_frames_fetched to be incremented while m_frames
remains empty. Then, the next frame in the stack may fail within
SynthesizeTailCallFrames. This crash arose during a kernel debugging
session.

rdar://59147051
Prepare to accurately track the future denormal-fp-math attribute
changes. The way to actually set these separately is not wired in yet.

This is just a mechanical change, and mostly still assumes the input
and output mode match. This should be refined for some cases. For
example, fcanonicalize lowering should use the flushing variant if
either input or output flushing is enabled
Summary: Add comments to the list of tokens that can follow the ']' at the end of a C# attribute specifier to prevent comments after attribute specifiers from being formatted as continuations.

Reviewers: MyDeveloperDay, krasimir

Reviewed By: MyDeveloperDay

Tags: #clang-format

Differential Revision: https://reviews.llvm.org/D73977
Summary:
A few details were missing in the description. These
changes makes the documented code "compile".

Reviewers: nicolasvasilache, andydavis1

Reviewed By: nicolasvasilache, andydavis1

Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73923
shouldOptimizeForSize is showing up in a profile, spending around 10%
of the pass time in one function. This should probably not be so slow,
but the much cheaper attribute check should be done first anyway.
Fixes a wimpy-mode CTS failure for asin(float).

Passes non-wimpy for both float/double on RX580.

Signed-off-by: Aaron Watry <[email protected]>
Tested-by: Jan Vesely <[email protected]>
Reviewed-by: Jan Vesely <[email protected]>
Summary:
Changes:
 - Calls to consteval function are now evaluated in constant context but IR is still generated for them.
 - Add diagnostic for taking address of a consteval function in non-constexpr context.
 - Add diagnostic for address of consteval function accessible at runtime.
 - Add tests

Reviewers: rsmith, aaron.ballman

Reviewed By: rsmith

Subscribers: mgrang, riccibruno, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D63960
Reviewers: sivachandra, abrachet

Reviewed By: sivachandra, abrachet

Subscribers: libc-commits, MaskRay, tschuett

Tags: #libc-project, #llvm

Differential Revision: https://reviews.llvm.org/D72248
Summary:
It can be useful to tune the default inline threshold without overriding other inlining thresholds (e.g. in code compiled for size).

The existing `-inline-threshold` flag overrides other thresholds, so it is insufficient in codebases where there is a mix of code compiled for size and speed.

Patch by Michael Holman <[email protected]>

Reviewers: eraman, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mtrofin, davidxl, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73217
This allows for reusing the internal state of the printer, which is more
efficient and also allows for using type aliases
This time with correct types for the data result from the SUB.

Original commit message:

Our normal lowering for ISD::SETCC uses X86ISD::SUB to enable
CSE unless the RHS is 0. optimizeCompareInstr called by the peephole
pass can turn subs with unused results into cmps to clean this up.

This commit makes other places that create X86ISD::CMP have the
same behavior.
Summary:
- The device compilation needs to have a consistent source code compared
  to the corresponding host compilation. If macros based on the
  host-specific target processor is not properly populated, the device
  compilation may fail due to the inconsistent source after the
  preprocessor. So far, only the host triple is used to build the
  macros. If a detailed host CPU target or certain features are
  specified, macros derived from them won't be populated properly, e.g.
  `__SSE3__` won't be added unless `+sse3` feature is present. On
  Windows compilation compatible with MSVC, that missing macros result
  in that intrinsics are not included and cause device compilation
  failure on the host-side source.

- This patch addresses this issue by introducing two `cc1` options,
  i.e., `-aux-target-cpu` and `-aux-target-feature`. If a specific host
  CPU target or certain features are specified, the compiler driver will
  append them during the construction of the offline compilation
  actions. Then, the toolchain in `cc1` phase will populate macros
  accordingly.

- An internal option `--gpu-use-aux-triple-only` is added to fall back
  the original behavior to help diagnosing potential issues from the new
  behavior.

Reviewers: tra, yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73942
hctim and others added 27 commits February 5, 2020 16:46
Constant is too large to fit into uintptr_t on 32-bit.
Summary:
Previously, vector.contract did not allow an empty set of
free or batch dimensions (K = 0) which defines a basic
reduction into a scalar (like a dot product). This CL
relaxes that restriction. Also adds constraints on
element type of operands and results. With tests.

Reviewers: nicolasvasilache, andydavis1, rriddle

Reviewed By: andydavis1

Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74014
Summary: NFC - See title

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: merge_guards_bot, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D74100
Summary:
This revision adds basic support for emitting line table information when exporting to LLVMIR. We don't yet have a story for supporting all of the LLVM debug metadata, so this revision stubs some features(like subprograms) to enable emitting line tables.

Differential Revision: https://reviews.llvm.org/D73934
This is the imported target that find_package(ZLIB) defines.
classes.

This allows for the `LLVM::ModuleTranslation::translateModule` to properly access the constructors of the derived classes.
The developer mode check is now working.

Add another check for user id mismatch, e.g. a regular user
trying to attach to something running as root, and describe
the problem for the user.
…il after ICP"

This reverts commit 748bb5a.

Due to Chromium CFI+ThinLTO test crashes reported on patch.
Some SB API methods returns strings through a char* and a length. This
is a problem for the deserializer, which considers a single type at a
time, and therefore cannot know how many bytes to allocate for the
character buffer.

We can solve this problem by implementing a custom replayer, which
ignores the passed-in char* and allocates a buffer of the correct size
itself, before invoking the original API method or function.

This patch adds three new macros to register a custom replayer for
methods that take a char* and a size_t. It supports arbitrary return
values (some functions return a bool while others return a size_t).
…cit.

Commit 777180a "[ADT] Make StringRef's std::string conversion operator explicit"
caused Polly's GPU code generator to not compile anymore. The rest of
Polly has already been fixed in commit
0257a9 "Fix polly build after StringRef change."
…ile unit

Summary:
This is a preparatory patch to re-enable DWP support in lldb (we already
have code claiming to do that, but it has been completely broken for a
while now).

The idea of the new approach is to make the SymbolFileDWARFDwo class
handle both dwo and dwo files, similar to how llvm uses one DWARFContext
to handle the two.

The first step is to remove the assumption that a SymbolFileDWARFDwo
holds just a single compile unit, i.e. the GetBaseCompileUnit method.
This requires changing the way how we reach the skeleton compile unit
(and the lldb_private::CompileUnit) from a dwo unit, which was
previously done via GetSymbolFile()->GetBaseCompileUnit() (and some
virtual dispatch).

The new approach reuses the "user data" mechanism of DWARFUnits, which
was used to link dwarf units (both skeleton and split) to their
lldb_private counterparts. Now, this is done only for non-dwo units, and
instead of that, the dwo units holds a pointer to the relevant skeleton
unit.

Reviewers: JDevlieghere, aprantl, clayborg

Reviewed By: JDevlieghere, clayborg

Subscribers: arphaman, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D73781
mlir-opt needs to link against MLIRLoopAnalysis
This shouldn't be needed but MLIR "hack" for
"whole-archive" linking is not compatible with
CMake transitive dependencies management.

Differential Revision: https://reviews.llvm.org/D74097
For the methods taking a char* and a length that have a custom replayer,
ignore the incoming string in the instrumentation macro. This prevents
potentially reading garbage and blowing up the SB API log.
This patch allows the index does to provide a way to distinguish
implicit references (e.g. coming from macro expansions) from the spelled
ones. The corresponding flag was added to RefKind and symbols that are
referenced without spelling their name explicitly are now marked
implicit. This allows fixing incorrect behavior when renaming a symbol
that was referenced in macro expansions would try to rename macro
invocations.

Differential Revision: D72746

Reviewed by: hokein
D72746 was missing a part of the patch before landing.
Update for 41206b6 ("[DebugInfo] Re-instate LiveDebugVariables
scope trimming", 2020-02-04).
Update tests that still called llvm.mem intrinsics with an alignment
argument.  The alignment argument has been removed by 1e68724
("Remove alignment argument from memcpy/memmove/memset in favour of
alignment attributes (Step 1)", 2018-01-19).
Change return value from `void` to `bool`.
Return `false` to indicate a failure. Return `true` on success.

Signed-off-by: Alexey Sotkin <[email protected]>
For 16-bit floating point constants we have to emit the Float16
capability. The implementation is based on checking if cl_khr_fp16
extension from a source langue (OpenCL C) was enabled.
This commit fixes a bug: we should check for *source* extension instead
of SPIR-V extension.

Signed-off-by: Alexey Sotkin <[email protected]>
Translate SPIRV-friendly LLVM IR builtins represnting specialization constants
`__spirv_SpecConstant` to corresponding SPIR-V instructions.

This approach works only for scalar constants, so support for
`OpSpecConstantComposite` and `OpSpecConstantOp` is not implemented in this
patch.

Signed-off-by: Alexey Sotkin <[email protected]>
@vladimirlaz vladimirlaz merged commit aabe58f into intel:sycl Feb 7, 2020
@vladimirlaz vladimirlaz deleted the private/vlazarev/llvmspirv_pulldown branch February 7, 2020 08:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.