LLVM and SPIR-V translator pulldown #1105
Merged
vladimirlaz
merged 1,946 commits into
intel:sycl
from
vladimirlaz:private/vlazarev/llvmspirv_pulldown
Feb 7, 2020
…2,64} Similar to D73680 (AArch64 BTI). A local linkage function whose address is not taken does not need ENDBR32/ENDBR64. Placing the patch label after ENDBR32/ENDBR64 has the advantage that code does not need to differentiate whether the function has an initial ENDBR. Also, add 32-bit tests and test that .cfi_startproc is at the function entry. The line information has a general implementation and is tested by AArch64/patchable-function-entry-empty.mir Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73760
AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.
Existing tests: rG5d04e008f708 rG2a191cf8500f ...should verify that the underlying analysis doesn't improve too much without updating this user code.
Summary: Add a debug check for frequency queries for unknown blocks (typically blocks that are created after BFI is computed but their frequencies are not communicated to BFI.) This is useful for detecting and debugging missed BFI updates. This is debug build only and disabled behind a flag. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73920
…ExpandedFromMacro`. Summary: Spells out some `auto`s explicitly and adds another test for the matcher `isExpandedFromMacro`. Reviewers: aaron.ballman Subscribers: gribozavr, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73975
We weren't accounting for load latencies in the SSE42/AES/CLMUL schedule classes.
The usage of the Imm out argument from SelectSMRDOffset is pretty confusing. Stop trying to reject CI immediates in the case where the offset field can be used. It's not an illegal way to encode the immediate, so just prefer the better encoding pattern with AddedComplexity. We probably don't even really need the different opcodes for the different offset types anymore, but that will be more work to clean up. The SMRD non-buffer load patterns could also use a cleanup, to be done separately.
Make usage more consistent, and make it possible to enable LongOptionsUseDoubleDash.
…TailCallFrames In order to synthesize tail call frames, the stack frame list must not be empty (otherwise, there is no "previous" frame to infer a tail call from). This case is hard to hit. To trigger it, we must first fail to push `unwind_frame_sp` because we either fail to get its SymbolContext, or given its SymbolContext the GetParentOfInlineScope call fails. This causes m_concrete_frames_fetched to be incremented while m_frames remains empty. Then, the next frame in the stack may fail within SynthesizeTailCallFrames. This crash arose during a kernel debugging session. rdar://59147051
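The guard described above can be sketched as follows. This is a minimal illustration and not lldb's actual code: `Frame`, `synthesizeTailCallFrames`, and the `InferredPath` input are hypothetical names, and the inference of the tail-call path is assumed to happen elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a debugger stack frame.
struct Frame {
  std::string Function;
  bool Artificial; // true for frames synthesized from tail-call inference
};

// Appends one artificial frame per inferred callee and returns the number
// of frames added. InferredPath is assumed to be computed elsewhere.
std::size_t synthesizeTailCallFrames(std::vector<Frame> &Frames,
                                     const std::vector<std::string> &InferredPath) {
  // With no concrete frame pushed yet there is no "previous" frame to
  // infer a tail call from, so bail out instead of reading Frames.back().
  if (Frames.empty())
    return 0;

  const std::size_t Before = Frames.size();
  for (const std::string &Fn : InferredPath)
    Frames.push_back(Frame{Fn, /*Artificial=*/true});
  assert(Frames.size() == Before + InferredPath.size());
  return InferredPath.size();
}
```

The early return keeps the empty-list case a no-op, which matches the situation in the message where the concrete-frame counter has advanced but the frame list is still empty.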
Prepare to accurately track the future denormal-fp-math attribute changes. The way to actually set these separately is not wired in yet. This is just a mechanical change, and mostly still assumes the input and output mode match. This should be refined for some cases. For example, fcanonicalize lowering should use the flushing variant if either input or output flushing is enabled
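A rough sketch of what tracking input and output denormal handling separately could look like. The type and function names here (`DenormKind`, `DenormMode`, `inputsTreatedAsZero`, `needsFlushingCanonicalize`) are hypothetical and are not claimed to match LLVM's own declarations; the last helper only encodes the "either input or output" rule mentioned for fcanonicalize.

```cpp
#include <cassert>

// Hypothetical names for illustration only.
enum class DenormKind { IEEE, PreserveSign, PositiveZero };

// Input and output treatment are tracked as two independent fields.
struct DenormMode {
  DenormKind Input;  // how denormal operands are read
  DenormKind Output; // how denormal results are written
};

// Any mode other than IEEE flushes denormals to some form of zero.
constexpr bool flushes(DenormKind K) { return K != DenormKind::IEEE; }

// A combine that only inspects operands cares about the input side.
constexpr bool inputsTreatedAsZero(DenormMode M) { return flushes(M.Input); }

// Canonicalization needs the flushing variant if either side flushes.
constexpr bool needsFlushingCanonicalize(DenormMode M) {
  return flushes(M.Input) || flushes(M.Output);
}
```

Keeping the two fields separate lets a query like `inputsTreatedAsZero` stay correct even when only output flushing is enabled.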
Summary: Add comments to the list of tokens that can follow the ']' at the end of a C# attribute specifier to prevent comments after attribute specifiers from being formatted as continuations. Reviewers: MyDeveloperDay, krasimir Reviewed By: MyDeveloperDay Tags: #clang-format Differential Revision: https://reviews.llvm.org/D73977
Summary: A few details were missing in the description. These changes make the documented code "compile". Reviewers: nicolasvasilache, andydavis1 Reviewed By: nicolasvasilache, andydavis1 Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73923
shouldOptimizeForSize is showing up in a profile, spending around 10% of the pass time in one function. This should probably not be so slow, but the much cheaper attribute check should be done first anyway.
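The ordering point can be illustrated with a short sketch. Everything here is hypothetical (`Fn`, `profileSaysOptimizeForSize`, `shouldOptForSize`, and the call counter); it only shows how short-circuit evaluation lets a cheap attribute test skip the expensive profile-based query.

```cpp
#include <cassert>

// Hypothetical stand-in for a function with an "optimize for size" attribute.
struct Fn {
  bool HasOptSizeAttr;
};

// Counts how often the expensive path runs, for demonstration only.
int ExpensiveQueries = 0;

// Stand-in for a costly profile-driven query.
bool profileSaysOptimizeForSize(const Fn &) {
  ++ExpensiveQueries;
  return false;
}

// Cheap attribute check first; || short-circuits, so the expensive query
// only runs when the attribute is absent.
bool shouldOptForSize(const Fn &F) {
  return F.HasOptSizeAttr || profileSaysOptimizeForSize(F);
}
```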
Fixes a wimpy-mode CTS failure for asin(float). Passes non-wimpy for both float/double on RX580. Signed-off-by: Aaron Watry <[email protected]> Tested-by: Jan Vesely <[email protected]> Reviewed-by: Jan Vesely <[email protected]>
Summary: Changes: - Calls to consteval function are now evaluated in constant context but IR is still generated for them. - Add diagnostic for taking address of a consteval function in non-constexpr context. - Add diagnostic for address of consteval function accessible at runtime. - Add tests Reviewers: rsmith, aaron.ballman Reviewed By: rsmith Subscribers: mgrang, riccibruno, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D63960
Reviewers: sivachandra, abrachet Reviewed By: sivachandra, abrachet Subscribers: libc-commits, MaskRay, tschuett Tags: #libc-project, #llvm Differential Revision: https://reviews.llvm.org/D72248
Summary: It can be useful to tune the default inline threshold without overriding other inlining thresholds (e.g. in code compiled for size). The existing `-inline-threshold` flag overrides other thresholds, so it is insufficient in codebases where there is a mix of code compiled for size and speed. Patch by Michael Holman <[email protected]> Reviewers: eraman, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mtrofin, davidxl, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73217
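A sketch of the distinction between an overriding threshold and a tunable default. The struct, enum, function name, and the numeric size thresholds are illustrative assumptions and are not taken from the patch.

```cpp
#include <cassert>

// Hypothetical model of the two kinds of flags discussed above.
struct ThresholdFlags {
  bool HasOverride;     // models an override flag such as -inline-threshold
  int Override;         // value used for every function when HasOverride is set
  int DefaultThreshold; // separately tunable default for code built for speed
};

enum class SizeLevel { None, OptSize, MinSize };

// Illustrative values for size-optimized code (assumptions).
const int OptSizeThreshold = 50;
const int MinSizeThreshold = 5;

int pickThreshold(const ThresholdFlags &F, SizeLevel L) {
  // An explicit override wins everywhere, including size-optimized code.
  if (F.HasOverride)
    return F.Override;
  switch (L) {
  case SizeLevel::MinSize:
    return MinSizeThreshold;
  case SizeLevel::OptSize:
    return OptSizeThreshold;
  case SizeLevel::None:
    break;
  }
  // Only code not compiled for size sees the tuned default.
  return F.DefaultThreshold;
}
```

With this shape, raising `DefaultThreshold` leaves the size-optimized thresholds untouched, whereas setting the override changes all of them at once.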
This allows for reusing the internal state of the printer, which is more efficient and also allows for using type aliases
This time with correct types for the data result from the SUB. Original commit message: Our normal lowering for ISD::SETCC uses X86ISD::SUB to enable CSE unless the RHS is 0. optimizeCompareInstr called by the peephole pass can turn subs with unused results into cmps to clean this up. This commit makes other places that create X86ISD::CMP have the same behavior.
are equally constrained.
Summary: - The device compilation needs to have a consistent source code compared to the corresponding host compilation. If macros based on the host-specific target processor are not properly populated, the device compilation may fail due to the inconsistent source after the preprocessor. So far, only the host triple is used to build the macros. If a detailed host CPU target or certain features are specified, macros derived from them won't be populated properly, e.g. `__SSE3__` won't be added unless the `+sse3` feature is present. On Windows compilation compatible with MSVC, those missing macros mean that intrinsics are not included, which causes device compilation to fail on the host-side source. - This patch addresses this issue by introducing two `cc1` options, i.e., `-aux-target-cpu` and `-aux-target-feature`. If a specific host CPU target or certain features are specified, the compiler driver will append them during the construction of the offline compilation actions. Then, the toolchain in the `cc1` phase will populate macros accordingly. - An internal option `--gpu-use-aux-triple-only` is added to fall back to the original behavior to help diagnose potential issues from the new behavior. Reviewers: tra, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73942
Constant is too large to fit into uintptr_t on 32-bit.
Summary: Previously, vector.contract did not allow an empty set of free or batch dimensions (K = 0) which defines a basic reduction into a scalar (like a dot product). This CL relaxes that restriction. Also adds constraints on element type of operands and results. With tests. Reviewers: nicolasvasilache, andydavis1, rriddle Reviewed By: andydavis1 Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74014
Summary: NFC - See title Reviewers: eugenis Reviewed By: eugenis Subscribers: merge_guards_bot, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D74100
…ging JIT'd code Differential Revision: https://reviews.llvm.org/D73932
Summary: This revision adds basic support for emitting line table information when exporting to LLVMIR. We don't yet have a story for supporting all of the LLVM debug metadata, so this revision stubs some features(like subprograms) to enable emitting line tables. Differential Revision: https://reviews.llvm.org/D73934
This is the imported target that find_package(ZLIB) defines.
classes. This allows for the `LLVM::ModuleTranslation::translateModule` to properly access the constructors of the derived classes.
The developer mode check is now working. Add another check for user id mismatch, e.g. a regular user trying to attach to something running as root, and describe the problem for the user.
…il after ICP" This reverts commit 748bb5a. Due to Chromium CFI+ThinLTO test crashes reported on patch.
Some SB API methods return strings through a char* and a length. This is a problem for the deserializer, which considers a single type at a time, and therefore cannot know how many bytes to allocate for the character buffer. We can solve this problem by implementing a custom replayer, which ignores the passed-in char* and allocates a buffer of the correct size itself, before invoking the original API method or function. This patch adds three new macros to register a custom replayer for methods that take a char* and a size_t. It supports arbitrary return values (some functions return a bool while others return a size_t).
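The idea of a custom replayer for a (char*, size_t) pair can be sketched as below. This is not the lldb reproducer code or its macros: `getName` and `replayCharPtrAndSize` are hypothetical, and the sketch only covers free functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical API in the style described above: writes up to Len bytes
// (including the terminator) into Buf and returns the characters written.
std::size_t getName(char *Buf, std::size_t Len) {
  static const char Name[] = "example";
  if (!Buf || Len == 0)
    return 0;
  const std::size_t N = std::min(Len - 1, sizeof(Name) - 1);
  std::memcpy(Buf, Name, N);
  Buf[N] = '\0';
  return N;
}

// Custom replayer: the recorded pointer is meaningless during replay, so it
// is ignored and a fresh buffer of the recorded size is allocated instead.
// Ret is a template parameter so bool and size_t results both work.
template <typename Ret>
Ret replayCharPtrAndSize(Ret (*Fn)(char *, std::size_t),
                         char * /*RecordedPtr*/, std::size_t Len) {
  std::vector<char> Buf(Len);
  return Fn(Buf.data(), Len);
}
```

Because the buffer is sized from the recorded length rather than from the pointer, the replayer never needs to know what the original caller's buffer looked like.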
…cit. Commit 777180a "[ADT] Make StringRef's std::string conversion operator explicit" caused Polly's GPU code generator to not compile anymore. The rest of Polly has already been fixed in commit 0257a9 "Fix polly build after StringRef change."
…ile unit Summary: This is a preparatory patch to re-enable DWP support in lldb (we already have code claiming to do that, but it has been completely broken for a while now). The idea of the new approach is to make the SymbolFileDWARFDwo class handle both dwo and dwp files, similar to how llvm uses one DWARFContext to handle the two. The first step is to remove the assumption that a SymbolFileDWARFDwo holds just a single compile unit, i.e. the GetBaseCompileUnit method. This requires changing the way we reach the skeleton compile unit (and the lldb_private::CompileUnit) from a dwo unit, which was previously done via GetSymbolFile()->GetBaseCompileUnit() (and some virtual dispatch). The new approach reuses the "user data" mechanism of DWARFUnits, which was used to link dwarf units (both skeleton and split) to their lldb_private counterparts. Now, this is done only for non-dwo units; the dwo units instead hold a pointer to the relevant skeleton unit. Reviewers: JDevlieghere, aprantl, clayborg Reviewed By: JDevlieghere, clayborg Subscribers: arphaman, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D73781
mlir-opt needs to link against MLIRLoopAnalysis. This shouldn't be needed, but the MLIR "hack" for "whole-archive" linking is not compatible with CMake transitive dependencies management. Differential Revision: https://reviews.llvm.org/D74097
For the methods taking a char* and a length that have a custom replayer, ignore the incoming string in the instrumentation macro. This prevents potentially reading garbage and blowing up the SB API log.
This patch allows the index to provide a way to distinguish implicit references (e.g. coming from macro expansions) from the spelled ones. The corresponding flag was added to RefKind, and symbols that are referenced without spelling their name explicitly are now marked implicit. This fixes incorrect behavior where renaming a symbol that was referenced in macro expansions would try to rename macro invocations. Differential Revision: D72746 Reviewed by: hokein
D72746 was missing a part of the patch before landing.
Differential Revision: https://reviews.llvm.org/D74084
Update for 41206b6 ("[DebugInfo] Re-instate LiveDebugVariables scope trimming", 2020-02-04).
Update tests that still called llvm.mem intrinsics with an alignment argument. The alignment argument has been removed by 1e68724 ("Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)", 2018-01-19).
Change return value from `void` to `bool`. Return `false` to indicate a failure. Return `true` on success. Signed-off-by: Alexey Sotkin <[email protected]>
For 16-bit floating point constants we have to emit the Float16 capability. The implementation is based on checking if the cl_khr_fp16 extension from the source language (OpenCL C) was enabled. This commit fixes a bug: we should check for the *source* extension instead of the SPIR-V extension. Signed-off-by: Alexey Sotkin <[email protected]>
Translate SPIRV-friendly LLVM IR builtins representing specialization constants `__spirv_SpecConstant` to corresponding SPIR-V instructions. This approach works only for scalar constants, so support for `OpSpecConstantComposite` and `OpSpecConstantOp` is not implemented in this patch. Signed-off-by: Alexey Sotkin <[email protected]>
Signed-off-by: Alexey Sotkin <[email protected]>
LLVM: 67905fc
SPIRV-LLVM-Translator: e980e04