LLVM and SPIRV-LLVM-Translator pulldown (WW26) #3961
Merged
…m-readobj. NFC. The --coff-exports option to llvm-readobj prints the exported symbols from a DLL/EXE; it doesn't do anything with regard to an import library. Differential Revision: https://reviews.llvm.org/D104214
The existing tests only check that some options (but not e.g. arm) are accepted; they don't test the functional effect of those options on the generated object files. Differential Revision: https://reviews.llvm.org/D104215
Also use the default LLVM target as default for dlltool. This matches how GNU dlltool behaves; it is compiled with one default target, which is used if no option is provided. Extend the anonymous namespace in the implementation file instead of using static functions. Based on a patch by Mateusz Mikuła. The effect of the default LLVM target, if neither the -m option nor a tool triple prefix is provided, isn't tested, as we can't make assumptions about what it is set to. (We could make the default be forced to one of the four supported architectures if the default triple is another arch, and then just test that llvm-dlltool without an -m option is able to produce an import library, without checking the actual architecture though.) Differential Revision: https://reviews.llvm.org/D104212
The following class isn't part of the export table; there's a second correctly placed comment about the things that actually belong to the export table.
After D77330, the comments are inconsistent with the disassembled code. As the value of `far` has been changed, a thunk to reach it is now generated, and target addresses of branch instructions are different from what was initially expected. The patch fixes that and makes the test closer to what it was originally. Differential Revision: https://reviews.llvm.org/D104286
Do not use ultimate symbols in DescriptorInquiry. Using the ultimate symbol may lead to issues later for at least two reasons:
- The original symbols may have volatile/asynchronous attributes that the ultimate may not have. Later phases working on the DescriptorInquiry would then not apply potential care required by these attributes.
- HostAssociatedDetails symbols are used by OpenMP for symbols with special OpenMP attributes inside an OpenMP region (e.g. variables with the private attribute), so it is very important to preserve this aspect in the DescriptorInquiry, which would otherwise apply to the symbol outside of the region.
Differential Revision: https://reviews.llvm.org/D104385
As noted in PR45210: https://bugs.llvm.org/show_bug.cgi?id=45210 ...the bug is triggered, as Eli says, when sext(idx) * ElementSize overflows.
```
// assume that GV is an array of 4-byte elements
GEP = gep GV, 0, Idx // this is accessing Idx * 4
L = load GEP
ICI = icmp eq L, value
=>
ICI = icmp eq Idx, NewIdx
```
The foldCmpLoadFromIndexedGlobal function simplifies a GEP+load operation to an icmp. This is a problem because Idx * ElementSize can overflow. Let's assume that the wanted value is at offset 0. Then there are actually four possible values for Idx that match offset 0: 0x00..00, 0x40..00, 0x80..00, 0xC0..00. We should return true for all these values, but currently the new icmp only returns true for 0x00..00. This problem can be solved by masking off (trailing zeros of ElementSize) bits from Idx.
```
...
=>
Idx' = and Idx, 0x3F..FF
ICI = icmp eq Idx', NewIdx
```
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D99481
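The overflow can be reproduced in a toy model. The sketch below is an illustration only, not code from the patch: it assumes an 8-bit index and byte offsets that wrap at 8 bits, and the names `byte_offset`, `naive_fold` and `masked_fold` are hypothetical.

```python
# Toy model (assumption): 8-bit index, 4-byte elements, offsets wrap at 8 bits.
BITS = 8
ELEMENT_SIZE = 4
ALL_ONES = (1 << BITS) - 1


def byte_offset(idx: int) -> int:
    """Offset actually addressed: idx * ELEMENT_SIZE, wrapped to BITS bits."""
    return (idx * ELEMENT_SIZE) & ALL_ONES


def trailing_zeros(n: int) -> int:
    """Number of trailing zero bits of a positive integer."""
    return (n & -n).bit_length() - 1


# Clear as many high bits of the index as ELEMENT_SIZE has trailing zeros.
INDEX_MASK = ALL_ONES >> trailing_zeros(ELEMENT_SIZE)  # 0x3F in this model


def naive_fold(idx: int, new_idx: int) -> bool:
    """The folded compare without masking: icmp eq Idx, NewIdx."""
    return idx == new_idx


def masked_fold(idx: int, new_idx: int) -> bool:
    """The folded compare with masking: icmp eq (and Idx, mask), NewIdx."""
    return (idx & INDEX_MASK) == new_idx


# Every index whose wrapped byte offset is 0.
aliases_of_zero = [i for i in range(1 << BITS) if byte_offset(i) == 0]
```

In this model `aliases_of_zero` holds the four indices 0x00, 0x40, 0x80 and 0xC0. `naive_fold(i, 0)` accepts only the first, while `masked_fold(i, 0)` accepts all four and rejects every other index.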
Using short granule tags as poison causes a UaF to read the referenced memory to retrieve the tag, and means we do not detect the UaF if the last granule's tag is still around. This only increases the chance of not catching a UaF from 0.39 % (1 / 256) to 0.42 % (1 / (256 - 17)). Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D104304
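The two percentages can be checked with a few lines of arithmetic. This only restates the numbers in the message; the constant names are made up for the illustration.

```python
TAG_VALUES = 256      # total tag values, from "1 / 256" in the message
EXCLUDED_VALUES = 17  # values removed from the pool, from "256 - 17"

# Probability that a stale pointer's tag matches by coincidence.
miss_before = 1 / TAG_VALUES
miss_after = 1 / (TAG_VALUES - EXCLUDED_VALUES)

# The same probabilities as percentages rounded to two decimals.
percent_before = round(miss_before * 100, 2)
percent_after = round(miss_after * 100, 2)
```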
Before: ADDR is located -320 bytes to the right of 1072-byte region
After: ADDR is located 752 bytes inside 1072-byte region
Reviewed By: eugenis, walli99 Differential Revision: https://reviews.llvm.org/D104412
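The two messages are consistent with a three-way classification of the address relative to the region, since 752 - 1072 = -320. The following is a minimal sketch of such a classification, not the sanitizer's actual reporting code; the function name and the "to the left" wording are assumptions.

```python
def describe_location(addr: int, region_start: int, region_size: int) -> str:
    """Describe where addr falls relative to [region_start, region_start + region_size)."""
    offset = addr - region_start
    if offset < 0:
        return f"{-offset} bytes to the left of {region_size}-byte region"
    if offset < region_size:
        return f"{offset} bytes inside {region_size}-byte region"
    return f"{offset - region_size} bytes to the right of {region_size}-byte region"


# Hypothetical base address; the offset and size come from the commit message.
message = describe_location(0x1000 + 752, 0x1000, 1072)
```

Skipping the "inside" branch and always subtracting the region size would give the negative distance quoted in the "Before" line.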
…aces This functionality is similar to delayed registration of dialect interfaces. It allows external interface models to be registered before the dialect containing the attribute/operation/type interface is loaded, or even before the context is created. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D104397
The idea is now that AppendError<...> will set eReturnStatusFailed for you so you don't have to call SetStatus again. Previously, if the error message was empty, the status wouldn't be set. I don't think there are any situations where the message is in fact empty, but it potentially could be, depending on where we get the string from. So let's set the status up front, then return early if the message is empty. Reviewed By: teemperor Differential Revision: https://reviews.llvm.org/D104380
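The ordering described here (set the status first, then return early on an empty message) can be sketched with a small mock. This is a hypothetical stand-in written for illustration, not LLDB's class or API; `CommandResult`, `ReturnStatus` and `append_error` are assumed names.

```python
from enum import Enum


class ReturnStatus(Enum):
    INVALID = 0
    SUCCESS = 1
    FAILED = 2


class CommandResult:
    """Hypothetical stand-in for a command result object (not LLDB's class)."""

    def __init__(self) -> None:
        self.status = ReturnStatus.INVALID
        self.errors: list[str] = []

    def append_error(self, message: str) -> None:
        # Set the failure status up front so it holds even for an empty message.
        self.status = ReturnStatus.FAILED
        if not message:
            return
        self.errors.append(message)


result = CommandResult()
result.append_error("")
```

After the call, `result.status` is `ReturnStatus.FAILED` even though nothing was appended, so callers do not need a separate status call.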
Since https://reviews.llvm.org/D103701 AppendError<...> sets this for you. This change includes all of the non-command uses. Some uses remain where it's either tricky to reason about the logic, or they aren't paired with AppendError calls. Reviewed By: teemperor Differential Revision: https://reviews.llvm.org/D104379
This patch adds a test case showing how a single extra .loc can cause binary differences when using -x86-pad-for-align=true. The issue has been discussed in D94542, PR42138, PR48742.
We now generate as many benchmarks as there are implementations. Differential Revision: https://reviews.llvm.org/D102156
Make sure llvm-mc is invariant with respect to debug locations in the test (checks update to use the -x86-pad-for-align default value)
Differential Revision: https://reviews.llvm.org/D104449
…er to add additional select(setcc,x,y) folds. NFCI. I need to add some additional handling to address some of the regressions from D101074
LLVM_DEBUG in headers is awkward, better avoid it. DEBUG_TYPE in a header results in a lot of macro redefinition warnings.
This is part 2, covering the commands source. Some uses remain where it's tricky to see what the logic is or they are not used with AppendError. Reviewed By: teemperor Differential Revision: https://reviews.llvm.org/D104448
…a value This patch fixes an issue where builds of programs with multiple dbg.values with DIArgList locations could have non-deterministic output. This issue was caused by ReplaceableMetadataImpl::getAllArgListUsers, which returned DIArgList pointers in a random order; the output of this function would later be used to insert dbg.values, causing the order of insertion to be non-deterministic. This patch changes getAllArgListUsers to return pointers in a fixed order. Differential Revision: https://reviews.llvm.org/D104105
Fixed crash when doing pointer math on a void pointer. Also, reworked test to use -verify rather than FileCheck. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D104424
…sers of a value" Commit caused build errors on buildbots with [-Werror,-Wreturn-std-move] enabled. This reverts commit fa1de88.
In D103169 I'm adding to InstSimplify support for NaN to constrained intrinsics that have a regular FP IR instruction counterpart. Precommit the tests for clarity when that ticket lands.
We need to dedup archive loads (similar to what we do for dylib loads). I noticed this issue after building some Swift stuff that used `-force_load_swift_libs`, as it caused some Swift archives to be loaded many times. Reviewed By: #lld-macho, thakis, MaskRay Differential Revision: https://reviews.llvm.org/D104353
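Deduplicating loads usually comes down to a cache keyed by the archive path. The sketch below shows that idea only; it is not lld's implementation, and `ArchiveLoader` and its `read_archive` callback are hypothetical.

```python
class ArchiveLoader:
    """Loads each archive path at most once (illustrative sketch)."""

    def __init__(self, read_archive) -> None:
        self._read_archive = read_archive  # callback that does the real work
        self._loaded = {}                  # path -> previously loaded archive

    def load(self, path: str):
        # Reuse the cached result when the same path is requested again.
        if path not in self._loaded:
            self._loaded[path] = self._read_archive(path)
        return self._loaded[path]


read_calls = []


def fake_read(path: str) -> str:
    """Stand-in reader that records every real load."""
    read_calls.append(path)
    return f"archive:{path}"


loader = ArchiveLoader(fake_read)
first = loader.load("libswiftCore.a")
second = loader.load("libswiftCore.a")
```

Requesting the same path twice reaches `fake_read` once, and both calls return the same object.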
Summary: This patch, as a follow-up of D95505, adds support for writing long symbol names by implementing the StringTable. Only XCOFF32 is supported for now. Reviewed By: jhenderson, shchenz Differential Revision: https://reviews.llvm.org/D103455
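A string table for long names can be sketched as follows. This is an illustration, not the patch's code, and it rests on stated assumptions about the layout: an 8-byte inline name field, big-endian integers, a table that starts with a 4-byte length covering the whole table, and a long name stored as four zero bytes followed by a 4-byte table offset. The class and function names are hypothetical.

```python
import struct

NAME_FIELD_SIZE = 8  # assumed width of the inline name field


class StringTable:
    """Collects null-terminated names behind a 4-byte length prefix."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._offsets = {}

    def add(self, name: str) -> int:
        """Return the offset of name in the table, adding it on first use."""
        if name not in self._offsets:
            # Offsets count from the table start, so skip the length field.
            self._offsets[name] = 4 + len(self._data)
            self._data += name.encode("ascii") + b"\0"
        return self._offsets[name]

    def serialize(self) -> bytes:
        """Length field (which includes itself) followed by the names."""
        return struct.pack(">I", 4 + len(self._data)) + bytes(self._data)


def encode_name_field(name: str, table: StringTable) -> bytes:
    """Store short names inline; store long names as zeros plus a table offset."""
    raw = name.encode("ascii")
    if len(raw) <= NAME_FIELD_SIZE:
        return raw.ljust(NAME_FIELD_SIZE, b"\0")
    return struct.pack(">II", 0, table.add(name))


table = StringTable()
short_field = encode_name_field("short", table)
long_field = encode_name_field("a_very_long_symbol", table)
blob = table.serialize()
```

Adding the same long name twice returns the same offset, so repeated symbols do not grow the table.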
Differential Revision: https://reviews.llvm.org/D103789
There does not seem to be any use of these functions. They just put the value to a local which is never used again. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D104512
The main motivation behind pointer replacement of LDS uses within non-kernel functions is to *prevent* the subsequent LDS lowering pass from directly packing LDS (assumed to be large) into a struct type, which would otherwise allocate a huge amount of memory for a struct instance within every kernel. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D103225
This revision adds a BufferizationAliasInfo which maintains and updates information about which tensors will alias once bufferized, which bufferized tensors are equivalent to others and how to handle clobbers.
Bufferization greedily tries to bufferize inplace by:
1. first trying to bufferize SubTensorInsertOp inplace, in reverse order (these are deemed the most expensive).
2. then trying to bufferize all non SubTensorOp / SubTensorInsertOp, in reverse order.
3. lastly trying to bufferize all SubTensorOp in reverse order.
Reverse order is a heuristic that seems to work nicely because structured tensor codegen very often proceeds by:
1. take a subset of a tensor
2. compute on that subset
3. insert the result subset into the full tensor and yield a new tensor.
BufferizationAliasInfo + equivalence sets + clobber analysis allows bufferizing nested subtensor/compute/subtensor_insert sequences inplace to a certain extent. To fully realize inplace bufferization, additional container-containee analysis will be necessary and is left for a subsequent commit.
Differential revision: https://reviews.llvm.org/D104110
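The three-phase visiting order can be sketched on a flat list of operations. This models only the ordering heuristic, not the alias or clobber analysis, and it is not MLIR code; the `(name, kind)` tuples and the kind strings are hypothetical.

```python
def inplace_candidate_order(ops):
    """Return ops in the order the greedy inplace attempt would visit them.

    ops is a list of (name, kind) tuples in program order.
    """
    rev = list(reversed(ops))
    # Phase 1: subtensor_insert ops, last to first.
    inserts = [op for op in rev if op[1] == "subtensor_insert"]
    # Phase 2: everything that is neither subtensor nor subtensor_insert.
    others = [op for op in rev if op[1] not in ("subtensor", "subtensor_insert")]
    # Phase 3: subtensor ops, last to first.
    subtensors = [op for op in rev if op[1] == "subtensor"]
    return inserts + others + subtensors


# Two take-subset / compute / insert sequences in program order.
program = [
    ("t0", "subtensor"), ("c0", "compute"), ("i0", "subtensor_insert"),
    ("t1", "subtensor"), ("c1", "compute"), ("i1", "subtensor_insert"),
]
visit_order = [name for name, _ in inplace_candidate_order(program)]
```

For this program the inserts are visited first (`i1`, `i0`), then the compute ops, then the subtensor ops.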
This pass aims to optimize VGPR live-range in a typical divergent if-else control flow. For example:
    def(a)
    if(cond)
      use(a)
      ... // A
    else
      use(a)
As AMDGPU accesses VGPRs with respect to the active mask, we can mark `a` as dead in region A. For details, please refer to the comments in the implementation file. The pass is enabled by default; the frontend can disable it through "-amdgpu-opt-vgpr-liverange=false". Differential Revision: https://reviews.llvm.org/D102212
getSpecializationCost was returning INT_MAX for a case when specialisation shouldn't happen, but this wasn't properly checked if specialisation was forced. Differential Revision: https://reviews.llvm.org/D104461
Modify the `SPIRVShuffleVector` constructor to allow passing a nullptr basic block (as is the case for variable initializers). Modify the `SPIRVShuffleVector` constructor to take `SPIRVId`s instead of `SPIRVValue`s, which better reflects what is stored in the class, and saves us an unnecessary ID-to-Value-to-ID round trip in `createInstFromSpecConstantOp`. Original commit: KhronosGroup/SPIRV-LLVM-Translator@72f99e3
Modify the constructors to allow passing a nullptr basic block (as is the case for variable initializers). Modify the constructors to take `SPIRVId`s instead of `SPIRVValue`s, which better reflects what is stored in the class, and saves us an unnecessary ID-to-Value-to-ID round trip in `createInstFromSpecConstantOp`. Modify test/opundef.spt to return the constructed value, so that it does not get optimized out. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9bc1d5a
When a constant expression is used both as an operand of a regular instruction and as an operand of another constant expression, the current lowering algorithm can produce an incorrect order of instructions, because it is not possible to add an instruction operand to a constant expression. Consider the following pseudo code example:
```
call foo(constexpr2(constexpr1), constexpr1)
// After first loop iteration through operands of the call instruction:
A = constexpr2op(constexpr1)
call foo(A, constexpr1)
// Ok, instruction A is now a user of constexpr1, so, when second
// operand is processed, all uses of constexpr1 are updated, but the
// instruction that represents constexpr1 is inserted before call
// instruction because it is now being processed, so it will look like
// this:
A = constexpr2(B)
B = constexpr1
call foo(A, B)
// So, instruction B needs to be moved before all its users to get a
// valid module:
B = constexpr1
A = constexpr2(B)
call foo(A, B)
```
Original commit: KhronosGroup/SPIRV-LLVM-Translator@390aba9
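The final reordering in the example, where `B` ends up ahead of `A` and the call, can be sketched on a list of instructions. This is a toy model, not the translator's code; the `(result, operands)` tuples and the helper name `hoist_before_users` are hypothetical.

```python
def hoist_before_users(block, name):
    """Move the definition of name in front of its earliest user.

    block is a list of (result, operands) tuples in program order and is
    modified in place.
    """
    def_idx = next(i for i, (res, _) in enumerate(block) if res == name)
    first_user = min(
        (i for i, (_, operands) in enumerate(block) if name in operands),
        default=def_idx,
    )
    if def_idx > first_user:
        # The definition follows one of its users, so move it ahead of that user.
        block.insert(first_user, block.pop(def_idx))
    return block


# The invalid order from the example: A uses B before B is defined.
block = [("A", ["B"]), ("B", []), (None, ["A", "B"])]
hoist_before_users(block, "B")
results_in_order = [res for res, _ in block]
```

After the call the block reads `B`, `A`, call, which matches the valid module shown at the end of the pseudo code.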
/summary:run |
LLVM: llvm/llvm-project@342bbb7
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@8679b96