Skip to content

add a patch to fix https://github.com/JuliaLang/julia/issues/45702 #10

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10,000 commits into from

Conversation

KristofferC
Copy link
Member

Bad commit message but I don't know the code well enough to formulate a better one.

Shao-Ce SUN and others added 30 commits February 18, 2022 00:12
This patch added the MC layer support of Zfinx extension.

Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D93298

(cherry picked from commit 7798ecc)
For the release/14.x branch.

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D119889
This is a simplified c12d49c in main
which just fixes the bug but does not affect the -O2 deduplication.
While matching widening multiply, if we matched an extend from i8->i32,
i16->i64 or i8->i64, we need to reintroduce a narrower extend. If we're
matching a vwmulsu we need to use a sext for op0 and a zext for op1.

This bug exists in LLVM 14 and will need to be backported.

Differential Revision: https://reviews.llvm.org/D119618

(cherry picked from commit 478c237)
…promote vNi1 vectors.

This patch updates the handling of vectors in getPreferredVectorAction():

For single-element and scalable vectors, fall back to default vector legalization
handling. For vNi1 vectors, add handling to either split or promote them in
order to prevent the production of wide v256i1/v512i1 types.

The following assertion is fixed by this patch, as we ended up producing the
wide vector types (that are used for MMA) in the backend prior to this fix.

```
Assertion failed: VT.getSizeInBits() == Operand.getValueSizeInBits() &&
"Cannot BITCAST between types of different sizes!"
```

Differential Revision: https://reviews.llvm.org/D119521

(cherry picked from commit ac5a5a9)
Get rid of non-constant and undef indices of insertelements
at `buildTree()` stage. Fix bugs.

Differential Revision: https://reviews.llvm.org/D119623
There was a fixme in the code pertaining to attributing functions as
noreturn.  By using reachability, if none of the blocks that are
reachable from the entry return, then the function is noreturn.

Previously, the code only checked if any blocks returned. If they're
unreachable, then they don't matter.

This improves codegen for the Linux kernel.

Fixes: ClangBuiltLinux/linux#1563

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D119571

(cherry picked from commit 9dcb006)
b07b5bd adds a use of `__comp_ref_type.h` to `std::min`. When libc++ is built with `-D_LIBCPP_DEBUG=0`, this enables `std::__debug_less`, which is only marked constexpr after c++17.

`std::min` itself is marked as being `constexpr` as of c++14, so by extension, `std::__debug_less` should also be marked `constexpr` for the same versions so that `std::min` can use it. This change lowers the guard from `> 17` to `> 11`.

Reproducer in godbolt: https://godbolt.org/z/ans3TGsj8

```

constexpr int x() { return std::min<int>({1, 2, 3, 4}); }

static_assert(x() == 1);
```

Reviewed By: #libc, philnik, Quuxplusone, ldionne

Differential Revision: https://reviews.llvm.org/D118940

(cherry picked from commit 99e5c52)
Use existing functionality to strip constant offsets that works well
with AS casts and avoids the code duplication.

Since we strip AS casts during the computation of the offset we also
need to adjust the APInt properly to avoid mismatches in the bit width.
This code ensures the caller of `compute` sees APInts that match the
index type size of the value passed to `compute`, not the value result
of the strip pointer cast.

Fixes llvm#53559.

Differential Revision: https://reviews.llvm.org/D118727

(cherry picked from commit 29c8eba)
The oversight caused us to ignore call sites that are effectively dead
when we computed reachability (or more precise the call edges of a
function). The problem is that loads in the readonly callee might depend
on stores prior to the callee. If we do not track the call edge we
mistakenly assumed the store before the call cannot reach the load.
The problem is nicely visible in:
  `llvm/test/Transforms/Attributor/ArgumentPromotion/basictest.ll`

Caused by D118673.

Fixes llvm#53726
This fixes building the unit tests on OpenBSD. OpenBSD does not support RLIMIT_AS.

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D119989

(cherry picked from commit f374c8d)
With https://reviews.llvm.org/D105466 the tree does not build on OpenBSD/amd64.
Moritz suggested only building this code on Linux.

Reviewed By: MoritzS

Differential Revision: https://reviews.llvm.org/D119991

(cherry picked from commit 7db1d4d)
This patch adds the '_kmpc_get_hardware_num_threads_in_block'
OpenMP RTL function to the externalization RAII struct. This was getting
optimized out and then being replaced with an undefined value once added
back in, causing bugs for complex reductions.

Fixes llvm#53909.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120076

(cherry picked from commit 74cacf2)
The `IsSPMD` global can only be read by threads other than the main
thread *after* initialization is complete. To allow usage of
`mapping::getBlockSize` before initialization is done, we can pass the
`IsSPMD` state explicitly. This is similar to other APIs that take
`IsSPMD` explicitly to avoid such a race, e.g.,
`mapping::isInitialThreadInLevel0(IsSPMD)`

Fixes llvm#53857

(cherry picked from commit 57b4c52)
When we move an allocation from the heap to the stack we need to
allocate it in the alloca AS and then cast the result. This also
prevents us from inserting the alloca after the allocation call but
rather right before.

Fixes llvm#53858

(cherry picked from commit 8ad39fb)
Instead of doing an inbounds strip first and another non-inbounds
strip afterward for equality comparisons, directly do a single
inbounds or non-inbounds strip based on whether we have an equality
predicate or not.

This is NFC-ish in that the alloca equality codepath is the only
part that sees additional non-inbounds offsets now, and for that
codepath it doesn't matter whether or not the GEP is inbounds, as
it does a stronger check itself. InstCombine would infer inbounds
for such GEPs.

(cherry picked from commit f35af77)
While testing LLVM 14.0.0 rc1 on Solaris, compilation of `FAIL`ed with

  /var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/lib/Analysis/Presburger/Utils.cpp: In lambda function:
  /var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/lib/Analysis/Presburger/Utils.cpp:48:58: error: call of overloaded ‘floor(int64_t)’ is ambiguous
     48 |                  [gcd](int64_t &n) { return floor(n / gcd); });
        |                                                          ^
  ...
  /usr/gcc/10/lib/gcc/sparcv9-sun-solaris2.11/10.3.0/include-fixed/iso/math_iso.h:201:21:
note: candidate: ‘long double std::floor(long double)’
    201 |  inline long double floor(long double __X) { return __floorl(__X); }
        |                     ^~~~~
  /usr/gcc/10/lib/gcc/sparcv9-sun-solaris2.11/10.3.0/include-fixed/iso/math_iso.h:165:15:
note: candidate: ‘float std::floor(float)’
    165 |  inline float floor(float __X) { return __floorf(__X); }
        |               ^~~~~
  /usr/gcc/10/lib/gcc/sparcv9-sun-solaris2.11/10.3.0/include-fixed/iso/math_iso.h:78:15:
note: candidate: ‘double std::floor(double)’
     78 | extern double floor __P((double));
        |               ^~~~~

The same issue had already occured in the past, cf. D108750
<https://reviews.llvm.org/D108750>, and the solution is the same: cast the
`floor` arg to `double`.

Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D119324

(cherry picked from commit 9159675)
Large COFF section names are moved into the string table and the
section header field is the offset into the string table encoded in
ASCII for offset smaller than 7 digits and in base64 for larger
offsets.

The operation of taking the string table offsets is done in a few
places in the codebase, so it is helpful to move this operation into
`BinaryFormat` so that it can be shared everywhere it's done.

So this patch takes the implementation of this operation from
`llvm/lib/MC/WinCOFFObjectWriter.cpp` and moves it into `BinaryFormat`.

Reviewed By: jhenderson, rnk

Differential Revision: https://reviews.llvm.org/D118793

(cherry picked from commit 85f4023)
The section name encoding for `llvm-objcopy` had two main issues, the
first is that the size used for the `snprintf` in the original code is
incorrect because `snprintf` adds a null byte, so this code was only
able to encode offsets of 6 digits - `/`, `\0` and 6 digits of the
offset - rather than the 7 digits it should support.

And the second part is that it didn't support the base64 encoding for
offsets larger than 7 digits.

This issue specifically showed up when using the `clang-offload-bundler`
with a binary containing a lot of symbols/sections, since it uses
`llvm-objcopy` to add the sections containing the offload code.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D118692

(cherry picked from commit ddf528b)
Currently, loading from or storing to a stack location with a structured load
or store crashes in isAArch64FrameOffsetLegal as the opcodes are not handled by
getMemOpInfo. This patch adds the opcodes for structured load/store instructions
with an immediate index to getMemOpInfo & getLoadStoreImmIdx, setting appropriate
values for the scale, width & min/max offsets.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D119338

(cherry picked from commit fc1b212)
…::ReconstructShuffle

Previously the code in AArch64TargetLowering::ReconstructShuffle assumed
the input vectors were always fixed-width, however this is not always
the case since you can extract elements from scalable vectors and insert
into fixed-width ones. We were hitting crashes here for two different
cases:

1. When lowering a fixed-length vector extract from a scalable vector
with i1 element types. This happens due to the fact the i1 elements
get promoted to larger integer types for fixed-width vectors and leads
to sequences of INSERT_VECTOR_ELT and EXTRACT_VECTOR_ELT nodes. In this
case AArch64TargetLowering::ReconstructShuffle will still fail to make
a transformation, but at least it no longer crashes.
2. When lowering a sequence of extractelement/insertelement operations
on mixed fixed-width/scalable vectors.

For now, I've just changed AArch64TargetLowering::ReconstructShuffle to
bail out if it finds a scalable vector.

Tests for both instances described above have been added here:

  (1) CodeGen/AArch64/sve-extract-fixed-vector.ll
  (2) CodeGen/AArch64/sve-fixed-length-reshuffle.ll

Differential Revision: https://reviews.llvm.org/D116602

(cherry picked from commit a57a7f3)
…tionPHI

The code was relying upon the implicit conversion of TypeSize to
uint64_t and assuming the type in question was always fixed. However,
I discovered an issue when running the canon-freeze pass with some
IR loops that contains scalable vector types. I've changed the code
to bail out if the size is unknown at compile time, since we cannot
compute whether the step is a multiple of the type size or not.

I added a test here:

  Transforms/CanonicalizeFreezeInLoops/phis.ll

Differential Revision: https://reviews.llvm.org/D118696

(cherry picked from commit 1badfbb)
The lowering code for shuffle_vector has a code path that looks through
extract_subvector, this code path did not properly account for the
potential presense of larger than Neon vector types and could produce
unselectable DAG nodes.

Differential Revision: https://reviews.llvm.org/D119252

(cherry picked from commit 98936ae)
While testing LLVM 14.0.0 rc1 on Solaris, I ran into a compile failure:

                   from /var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/lib/ExecutionEngine/SparseTensorUtils.cpp:22:
  /usr/include/sys/types.h:103:16: error: conflicting declaration ‘typedef short int index_t’
    103 | typedef short  index_t;
        |                ^~~~~~~
  In file included from
/var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/lib/ExecutionEngine/SparseTensorUtils.cpp:17:
  /var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/include/mlir/ExecutionEngine/SparseTensorUtils.h:26:7:
note: previous declaration as ‘using index_t = uint64_t’
     26 | using index_t = uint64_t;
        |       ^~~~~~~

The same issue had already occured in the past and fixed in D72619
<https://reviews.llvm.org/D72619>.  More detailed explanation can also be
found there.

Tested on `amd64-pc-solaris2.11` and `sparcv9-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D119323

(cherry picked from commit d2215e7)
Even after D86621 <https://reviews.llvm.org/D86621>, `clang -m32` on
Solaris/sparcv9 doesn't inline atomics with 8-byte operands, unlike `gcc`.
This leads to many link failures in the testsuite (undefined references to
`__atomic_load_8` and `__sync_val_compare_and_swap_8`.  Until a proper
codegen fix can be implemented, this patch works around the first of those
by linking with `-latomic`.

Tested on `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D118021

(cherry picked from commit a6afa9e)
CLANG_DEFAULT_PIE_ON_LINUX=on will soon become the default. The purpose
of these tests has gone.

(cherry picked from commit 0ac6be6)
trofi and others added 25 commits June 7, 2022 17:39
Without the change llvm build fails on this week's gcc-13 snapshot as:

    [  0%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Signals.cpp.o
    In file included from llvm/lib/Support/Signals.cpp:14:
    llvm/include/llvm/Support/Signals.h:119:8: error: variable or field 'CleanupOnSignal' declared void
      119 |   void CleanupOnSignal(uintptr_t Context);
          |        ^~~~~~~~~~~~~~~

(cherry picked from commit ff1681d)
Without the change llvm build fails on this week's gcc-13 snapshot as:

    [ 91%] Building CXX object unittests/Support/CMakeFiles/SupportTests.dir/Base64Test.cpp.o
    In file included from llvm/unittests/Support/Base64Test.cpp:14:
    llvm/include/llvm/Support/Base64.h: In function 'std::string llvm::encodeBase64(const InputBytes&)':
    llvm/include/llvm/Support/Base64.h:29:5: error: 'uint32_t' was not declared in this scope
       29 |     uint32_t x = ((unsigned char)Bytes[i] << 16) |
          |     ^~~~~~~~

(cherry picked from commit 5e9be93)
The MVE shuffle costing for VREV instructions was making incorrect
assumptions as to legalized vector types remaining as vectors. Add a
quick check to ensure they are indeed vectors before attempting to get
the number of elements.

(cherry picked from commit 53be6ab)
Commit dd5991c modified the aliasing checks here to allow transforming
a memcpy where the source and destination point into the same object.
However, the change accidentally made the code skip the alias check for
other operations in the loop.

Instead of completely skipping the alias check, just skip the check for
whether the memcpy aliases itself.

Differential Revision: https://reviews.llvm.org/D126486

(cherry picked from commit abdf0da)
Previously, the choice between the instruction selection of ISD::FABS was
decided at the point of setting the MIPS target lowering operation choice
either `Custom` lowering or `Legal`. This lead to instruction selection
failures as functions could be marked as having no NaNs.

Changing the lowering to always be `Custom` and directly handling the
the cases where MIPS selects the instructions for ISD::FABS resolves
this crash.

Thanks to kray for reporting the issue and to Simon Atanasyan for producing
the reduced test case.

This resolves PR/53722.

Differential Revision: https://reviews.llvm.org/D124651

(cherry picked from commit 938ed8a)
Factor in the TBAA of adjacent stores instead of just the head store
when merging stores into a memset. We were seeing GVN remove a load that
had a TBAA that matched the 2nd store because GVN determined it didn't
match the TBAA of the memset. The memset had the TBAA of only the first
store.

i.e. Loading the field pi_ of shared_count after memset to create an
array of shared_ptr

template<class T>
class shared_ptr {
  T *p;
  shared_count refcount;
};

class shared_count {
  sp_counted_base *pi_;
};

Differential Revision: https://reviews.llvm.org/D122205

(cherry picked from commit e02f497)
…tation.

Fixes llvm#55407.

Given configuration:
```
UseTab: Always
PointerAlignment: Right
AlignConsecutiveDeclarations: true
```

Before, the pointer was misaligned in this code:
```
void f() {
	unsigned long long big;
	char	      *ptr; // misaligned
	int		   i;
}
```

That was due to the fact that when handling right-aligned pointers, the Spaces were changed but StartOfTokenColumn was not.

Also, a tab was used not only for indentation but for spacing too when using `UseTab: ForIndentation` config option:
```
void f() {
	unsigned long long big;
	char	      *ptr; // \t after char
	int                i;
}
```

Reviewed By: owenpan

Differential Revision: https://reviews.llvm.org/D125528
Fixes llvm#53799.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D119680
…da captures

Previously, we were treating a move in the lambda capture as if it happened
within the body of the lambda, not within the function that defines the lambda.

This fixes the same bug as https://reviews.llvm.org/D119165 (which it appears
may have been abandoned by the author?) but does so more simply.

Reviewed By: njames93

Differential Revision: https://reviews.llvm.org/D126780

(cherry picked from commit 8b90b25)
Consider the form of the first operand of a class assignment not the
second operand when implicitly starting the lifetimes of union members.
Also add a missing check that the assignment call actually came from a
syntactic assignment, not from a direct call to `operator=`.

(cherry picked from commit 30baa5d)
A crash was seen in CastValueChecker due to a null pointer dereference.

The fix uses QualType::getAsString to avoid the null dereference
when a CXXRecordDecl cannot be obtained. A small reproducer is added,
and cast value notes LITs are updated for the new debug messages.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D127105

(cherry picked from commit c7fa4e8)
Since D110065, the 'R' profile support is added to LLVM. It turns the
`generic` cpu into the intersection of v8-a and v8-r. However, this
makes some backward compatibility problems. The original patch makes
the clang driver implicitly pass -march=armv8-a when only the triple
is specified. Since it only applies to clang, other tools like
llvm-objdump still faces the backward compatibility problem.

This patch applies the same idea to MC related tools by enabling '+v8a'
feature when nothing is specified (both CPU and FS are empty) for
MCSubtargetInfo creation.

This patch should fix PR53956.

Reviewed by: labrinea

Differential Revision: https://reviews.llvm.org/D124319

(cherry picked from commit 4a31af8)
…nces"

This reverts commit 5a5ac65.

(cherry picked from commit ae2638d)
(cherry picked from commit 05848b6)
Julia uses addressspaces for GC and we want these to be sanitized as well.

(cherry picked from commit 3f53397)
(cherry picked from commit 58df73b)
Otherwise clang installs all of its tools into `bin/` while
LLVM installs its tools into (LLVM_TOOLS_INSTALL_DIR).
I could swear this used to work (and in fact the julia build system
assumes it), but I can't pin down a specific commit that would
have broken this, and julia has been relying on pre-compiled binaries
for a while now (that don't use this setting), so it may have been
broken for quite a while.

Differential Revision: https://reviews.llvm.org/D88630

(cherry picked from commit 6104e14)
(cherry picked from commit f252e17)
(cherry picked from commit 9039ce8)
Differential Revision: https://reviews.llvm.org/D92210

(cherry picked from commit 822b19a)
(cherry picked from commit fcdf8f2)
(cherry picked from commit 2208d33)
IIUC we can't emit `memcmp` between pointers in addressspaces,
doing so will trigger an assertion since the signature of the memcmp
will not match it's arguments (https://bugs.llvm.org/show_bug.cgi?id=48661).

This PR disables the attempt to merge icmps,
when the pointer is in an addressspace.

Differential Revision: https://reviews.llvm.org/D94813

(cherry picked from commit 458b259)
(cherry picked from commit aaf2d27)
Removes the code responsible for causing https://bugs.llvm.org/show_bug.cgi?id=49357.
A fix is in progress upstream, but I don't think it's easy, so this
fixes the bug in the meantime. The optimization it does is minor.

(cherry picked from commit e4f1085)
(cherry picked from commit 7627d61)
…egality

In general, anywhere we might need to insert a blind bitcast, we need to make sure the types are losslessly convertible.

This fixes pr54634.

(cherry picked from commit 88de27e)
Co-authored-by: Valentin Churavy <[email protected]>
Co-authored-by: Julian P Samaroo <[email protected]>
(cherry picked from commit a0defe0)
Follow-up on https://reviews.llvm.org/D118970 FP_XSTATE_MAGIC1 is only available on glibc 2.27 and upwards

Differential Revision: https://reviews.llvm.org/D124770
We need to force the emission of the EH Frame section (currently done via SupportsCompactUnwindWithoutEHFrame in the MCObjectFileInfo for the target), since libunwind doesn't yet support dynamically registering compact unwind information at run-time.

(cherry picked from commit 60e0418)
(cherry picked from commit 6275013)
@vchuravy vchuravy changed the base branch from julia-release/14.x to julia-release/13.x June 30, 2022 11:12
@vchuravy
Copy link
Member

Urgh GitHub...

This is for 1.8, right? Can you rebase onto julia-Release/13.x?

Message maybe: "Fix non-integral AS handling in simplify-cfg"

@vchuravy vchuravy closed this Jun 30, 2022
@KristofferC KristofferC deleted the patch_45702 branch June 30, 2022 21:27
stahta01 pushed a commit to stahta01/llvm-project that referenced this pull request Feb 9, 2023
We experienced some deadlocks when we used multiple threads for logging
using `scan-builds` intercept-build tool when we used multiple threads by
e.g. logging `make -j16`

```
(gdb) bt
#0  0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
JuliaLang#1  0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
JuliaLang#2  0x00007f2bb3d152e4 in ?? ()
JuliaLang#3  0x00007ffcc5f0cc80 in ?? ()
JuliaLang#4  0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
JuliaLang#5  0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
JuliaLang#6  0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
JuliaLang#7  0x00007f2bb3d144ee in ?? ()
JuliaLang#8  0x746e692f706d742f in ?? ()
JuliaLang#9  0x692d747065637265 in ?? ()
JuliaLang#10 0x2f653631326b3034 in ?? ()
JuliaLang#11 0x646d632e35353532 in ?? ()
JuliaLang#12 0x0000000000000000 in ?? ()
```

I think the gcc's exit call caused the injected `libear.so` to be unloaded
by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`.
That tried to acquire an already locked mutex which was left locked in the
`bear_report_call()` call, that probably encountered some error and
returned early when it forgot to unlock the mutex.

All of these are speculation since from the backtrace I could not verify
if frames 2 and 3 are in fact corresponding to the `libear.so` module.
But I think it's a fairly safe bet.

So, hereby I'm releasing the held mutex on *all paths*, even if some failure
happens.

PS: I would use lock_guards, but it's C.

Reviewed-by: NoQ

Differential Revision: https://reviews.llvm.org/D118439

(cherry picked from commit d919d02)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.