Skip to content

LLVM and SPIRV-LLVM-Translator pulldown (WW26) #3961

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 480 commits into from
Jun 22, 2021

Conversation

vmaksimo
Copy link
Contributor

mstorsjo and others added 30 commits June 17, 2021 13:02
…m-readobj. NFC.

The --coff-exports option to llvm-readobj prints the exported symbols
from a DLL/EXE, it doesn't do anything with regards to an import
library.

Differential Revision: https://reviews.llvm.org/D104214
The existing tests only test that some options (but not e.g. arm)
are accepted, but it doesn't test their functional effect of
affecting the generated object files.

Differential Revision: https://reviews.llvm.org/D104215
Also use the default LLVM target as default for dlltool. This
matches how GNU dlltool behaves; it is compiled with one default
target, which is used if no option is provided.

Extend the anonymous namespace in the implementation file instead
of using static functions.

Based on a patch by Mateusz Mikuła.

The effect of the default LLVM target, if neither the -m option
nor a tool triple prefix is provided, isn't tested, as we can't
make assumptions about what it is set to.

(We could make the default be forced to one of the four supported
architectures if the default triple is another arch, and then just
test that llvm-dlltool without an -m option is able to produce an
import library, without checking the actual architecture though.)

Differential Revision: https://reviews.llvm.org/D104212
The following class isn't part of the export table; there's a
second correctly placed comment about the things that actually
belong to the export table.
After D77330, the comments are inconsistent with the disassembled code.
As the value of `far` has been changed, a thunk to reach it is now
generated, and target addresses of branch instructions are different
from what was initially expected.

The patch fixes that and makes the test closer to what it was originally.

Differential Revision: https://reviews.llvm.org/D104286
Do not use ultimate symbols in DescriptorInquiry. Using the ultimate
symbol may lead to issues later for at least two reasons:

- The original symbols may have volatile/asynchronous attributes that
  the ultimate may not have. Later phases working on the DescriptorInquiry
  would then not apply potential care required by these attributes.
- HostAssociatedDetails symbols are used by OpenMP for symbols with
  special OpenMP attributes inside OpenMP region (e.g variables with
  private attribute), so it is very important to preserve this
  aspect in the DescriptorInquiry, that would otherwise apply on the
  symbol outside of the region.

Differential Revision: https://reviews.llvm.org/D104385
As noted in PR45210: https://bugs.llvm.org/show_bug.cgi?id=45210
...the bug is triggered as Eli say when sext(idx) * ElementSize overflows.

```
   // assume that GV is an array of 4-byte elements
   GEP = gep GV, 0, Idx // this is accessing Idx * 4
   L = load GEP
   ICI = icmp eq L, value
 =>
   ICI = icmp eq Idx, NewIdx
```

The foldCmpLoadFromIndexedGlobal function simplifies GEP+load operation to icmp.
And there is a problem because Idx * ElementSize can overflow.

Let's assume that the wanted value is at offset 0.
Then, there are actually four possible values for Idx to match offset 0: 0x00..00, 0x40..00, 0x80..00, 0xC0..00.
We should return true for all these values, but currently, the new icmp only returns true for 0x00..00.

This problem can be solved by masking off (trailing zeros of ElementSize) bits from Idx.

```
   ...
 =>
   Idx' = and Idx, 0x3F..FF
   ICI = icmp eq Idx', NewIdx
```

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D99481
Short granule tags as poison cause a UaF to read the referenced
memory to retrieve the tag, and means we do not detect the UaF
if the last granule's tag is still around.

This only increases the change of not catching a UaF from
0.39 % (1 / 256) to 0.42 % (1 / (256 - 17)).

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D104304
Before: ADDR is located -320 bytes to the right of 1072-byte region
After: ADDR is located 752 bytes inside 1072-byte region

Reviewed By: eugenis, walli99

Differential Revision: https://reviews.llvm.org/D104412
…aces

This functionality is similar to delayed registration of dialect interfaces. It
allows external interface models to be registered before the dialect containing
the attribute/operation/type interface is loaded, or even before the context is
created.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D104397
The idea is now that AppendError<...> will set eReturnStatusFailed
for you so you don't have to call SetStatus again.

Previously if the error message was empty, the status wouldn't
be set.

I don't think there are any sitautions where the message is in
fact empty but it potentially could be depending on where
we get the string from.

So let's set the status up front then return early if the message is empty.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D104380
Since https://reviews.llvm.org/D103701 AppendError<...>
sets this for you.

This change includes all of the non-command uses.

Some uses remain where it's either tricky to reason about
the logic, or they aren't paired with AppendError calls.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D104379
This patch adds a test case showing how a single extra .loc can cause
binary differences when using -x86-pad-for-align=true.

The issue has been discussed in D94542, PR42138, PR48742.
We now generate as many benchmarks as there are implementations.

Differential Revision: https://reviews.llvm.org/D102156
Make sure llvm-mc is invariant with respect to debug locations in the
test (checks update to use the -x86-pad-for-align default value)
…er to add additional select(setcc,x,y) folds. NFCI.

I need to add some additional handling to address some of the regressions from D101074
LLVM_DEBUG in headers is awkward, better avoid it. DEBUG_TYPE in a
header results in a lot of macro redefinition warnings.
This is part 2, covering the commands source.

Some uses remain where it's tricky to see what the
logic is or they are not used with AppendError.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D104448
…a value

This patch fixes an issue where builds of programs with multiple dbg.values
with DIArgList locations could have non-deterministic output. This issue
was caused by ReplaceableMetadataImpl::getAllArgListUsers, which
returned DIArgList pointers in a random order; the output of this
function would later be used to insert dbg.values, causing the order of
insertion to be non-deterministic. This patch changes getAllArgListUsers
to return pointers in a fixed order.

Differential Revision: https://reviews.llvm.org/D104105
Fixed crash when doing pointer math on a void pointer.

Also, reworked test to use -verify rather than FileCheck.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D104424
…sers of a value"

Commit caused build errors on buildbots with [-Werror,-Wreturn-std-move]
enabled.

This reverts commit fa1de88.
In D103169 I'm adding to InstSimplify support for NaN to constrained
intrinsics that have a regular FP IR instruction counterpart. Precommit
the tests for clarity when that ticket lands.
We need to dedup archive loads (similar to what we do for dylib
loads).

I noticed this issue after building some Swift stuff that used
`-force_load_swift_libs`, as it caused some Swift archives to be loaded
many times.

Reviewed By: #lld-macho, thakis, MaskRay

Differential Revision: https://reviews.llvm.org/D104353
Esme-Yi and others added 17 commits June 21, 2021 05:09
Summary: This patch, as a follow-up of D95505, adds
support for writing the long symbol name by implementing
the StringTable. Only XCOFF32 is suppoted now.

Reviewed By: jhenderson, shchenz

Differential Revision: https://reviews.llvm.org/D103455
There does not seem to be any use of these functions. They just
put the value to a local which is never used again.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D104512
The main motivation behind pointer replacement of LDS use within non-kernel
functions is - to *avoid* subsequent LDS lowering pass from directly packing
LDS (assume large LDS) into a struct type which would otherwise cause allocating
huge memory for struct instance within every kernel.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103225
This revision adds a BufferizationAliasInfo which maintains and updates information about which tensors will alias once bufferized, which bufferized tensors are equivalent to others and how to handle clobbers.

Bufferization greedily tries to bufferize inplace by:

1. first trying to bufferize SubTensorInsertOp inplace, in reverse order (these are deemed the most expensives).
2. then trying to bufferize all non SubTensorOp / SubTensorInsertOp, in reverse order.
3. lastly trying to bufferize all SubTensorOp in reverse order.

Reverse order is a heuristic that seems to work nicely because structured tensor codegen very often proceeds by:

1. take a subset of a tensor
2. compute on that subset
3. insert the result subset into the full tensor and yield a new tensor.

BufferizationAliasInfo + equivalence sets + clobber analysis allows bufferizing nested
subtensor/compute/subtensor_insert sequences inplace to a certain extent.
To fully realize inplace bufferization, additional container-containee analysis will be necessary and is left for a subsequent commit.

Differential revision: https://reviews.llvm.org/D104110
This pass aims to optimize VGPR live-range in a typical divergent if-else
control flow. For example:

def(a)
if(cond)
  use(a)
  ... // A
else
  use(a)

As AMDGPU access vgpr with respect to active-mask, we can mark `a` as
dead in region A. For details, please refer to the comments in
implementation file.

The pass is enabled by default, the frontend can disable it through
"-amdgpu-opt-vgpr-liverange=false".

Differential Revision: https://reviews.llvm.org/D102212
getSpecializationCost was returning INT_MAX for a case when specialisation
shouldn't happen, but this wasn't properly checked if specialisation was
forced.

Differential Revision: https://reviews.llvm.org/D104461
Modify the `SPIRVShuffleVector` constructor to allow passing a nullptr
basic block (as is the case for variable initializers).

Modify the `SPIRVShuffleVector` constructor to take `SPIRVId`s instead
of `SPIRVValue`s, which better reflects what is stored in the class,
and saves us an unnecessary ID-to-Value-to-ID round trip in
`createInstFromSpecConstantOp`.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@72f99e3
Modify the constructors to allow passing a nullptr basic block (as is
the case for variable initializers).

Modify the constructors to take `SPIRVId`s instead of `SPIRVValue`s,
which better reflects what is stored in the class, and saves us an
unnecessary ID-to-Value-to-ID round trip in
`createInstFromSpecConstantOp`.

Modify test/opundef.spt to return the constructed value, so that it
does not get optimized out.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@9bc1d5a
When some constant expression is used as operand of regular instruction
and operand of other constant expression, the current algorithm of
lowering could produce incorrect order of instructions because it is not
possible to add instruction operand to a constant expression. Consider
following pseudo code example:
```
call foo(constexpr2(constexpr1), constexpr1)

// After first loop iteration through operands of the call instruction:

A = constexpr2op(constexpr1)
call foo(A, constexpr1)

// Ok, instruction A is now a user of constexpr1, so, when second
// operand is processed, all uses of constexpr1 are updated, but the
// instruction that represents constexpr1 is inserted before call
// instruction because it is now being processed, so it will look like
// this:

A = constexpr2(B)
B = constexpr1
call foo(A, B)

// So, instruction B needs to be moved after all its users to get a
// valid module:

B = constexpr1
A = constexpr2(B)
call foo(A, B)

```

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@390aba9
@vmaksimo
Copy link
Contributor Author

/summary:run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.