Skip to content

LLVM and SPIRV-LLVM-Translator pulldown (WW32) #6494

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 956 commits into from
Aug 1, 2022

Conversation

dwoodwor-intel
Copy link
Contributor

chapuni and others added 30 commits July 27, 2022 22:41
This reverts commit 7132bcd.

It is the likely cause of a clang-tools-extra test regression. Reverting
until I can investigate what's going on.
This patch changes legacy LTO to set data-sections by default. The user can
explicitly unset data-sections. The reason for this patch is to match the
behaviour of lld and gold plugin. Both lld and gold plugin have data-sections on
by default.

This patch also fixes the forwarding of the clang options -fno-data-sections and
-fno-function-sections to libLTO. Now, when -fno-data/function-sections are
specified in clang, -data/function-sections=0 will be passed to libLTO to
explicitly unset data/function-sections.

Reviewed By: w2yehia, MaskRay

Differential Revision: https://reviews.llvm.org/D129401
The changes ensure the VPPredInstPHIRecipes are actually used and cannot
be remove by VP-DCE.
The behaviour of this patch is not great, but it has some side-effects
that are required for OpenMPOpt to work. The problem is that when we use
`-mlink-builtin-bitcode` we only import used symbols from the runtime.
Then OpenMPOpt will insert calls to symbols that were not previously
included. This patch removed this implicit behaviour as these functions
were kept alive by the `noinline` simply because it kept calls to them
in the module. This caused regression in some tests that relied on some
OpenMPOpt passes without using LTO. Reverting for the LLVM15 release but
will try to fix it more correctly on main.

This reverts commit d61d72d.

Fixes #56752
The header file OpenMPDialect.h is added in Bridge.cpp in D130027,
but it is unused. Remove it.

Differential Revision: https://reviews.llvm.org/D130625
It errors out in the Bazel CI:

AMDGPULowerModuleLDSPass.cpp:384:12: error: chosen constructor is
explicit in copy-initialization
    return {SGV, std::move(Map)};

Reviewed By: rupprecht

Differential Revision: https://reviews.llvm.org/D130623
Adds the papers and LWG issues voted in during the July 2022 plenary.

Note the updating of the project based statuses is left to the active
contributors of these projects.

Reviewed By: #libc, huixie90, philnik

Differential Revision: https://reviews.llvm.org/D130595
…rinsic transforms

This should fix issue #56383 (at least when compiled with -O3 because this pass is only
run at -O3 currently).
There is post comment of adding TODO/FIXME for privatization of loop
bounds in D127137. D127137 fixes the bug in OpenMP firstprivate clause,
which should be refactored later according to the post comment. Add
FIXME for it.

Differential Revision: https://reviews.llvm.org/D130625
Remove some no longer relevant details and adds the C++23 papers voted
in at the last plenary.
…tics

The options -f{no-}color-diagnostics have been supported in driver. This
supports the behaviors in scanning, parsing, and semantics, and the
behaviors are exactly the same as the driver.

To illustrate the added behaviour, consider the following input file:
```! file.f90
program m
  integer :: i = k
end
```
In the following invocations, "error: Must be a constant value" _will be_
formatted:
```
$ flang-new file.f90
error: Semantic errors in file.f90
./file.f90:2:18: error: Must be a constant value
    integer :: i = k
```
Note that "error: Semantic errors in file.f90" is also formatted, which
is supported in https://reviews.llvm.org/D126164.

Also note that only "error", "warning" and "portability" are formatted.
Check the following input file:
```! file2.f90
program m
  integer :: i =
end
```
```
$ flang-new test2.f90
error: Could not parse test2.f90
./test2.f90:2:11: error: expected '('
    integer :: i =
            ^
./test2.f90:2:3: in the context: statement function definition
    integer :: i =
    ^
...
```
The "error: Could not parse test2.f90" and "error: expected '('" are
formatted. Others such as "in the context" are not formatted yet, which
may or may not be supported.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D126166
…rual operator.

Sparse compiler failed on the provided test (when the sparse kernel is nested in a scf structrual operator).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D130609
All of our other tests are functionality tests constrained to some
specific configuration.  This one is intended to float with the
default configuration so that changes in that default are visible
in reviews.  Note that our current default does not enable
vectorization at all; thus the current output is unvectorized.
After evaluating the new style I noticed inner namespaces are now
indented. I am not fond of that style and I've seen some other review
comment in this regard so I propose we remove this option and use the
LLVM default not to indent it.

Reviewed By: ldionne, philnik, var-const, #libc

Differential Revision: https://reviews.llvm.org/D129441
This addresses a request during the review of D128929.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D129310
Change `sinf` range reduction to mod pi/16 to be shared with `cosf`.

Previously, `sinf` used range reduction `mod pi`, but this cannot be used to implement `cosf` since the minimax algorithm for `cosf` does not converge due to critical points at `pi/2`.  In order to be able to share the same range reduction functions for both `sinf` and `cosf`, we change the range reduction to `mod pi/16` for the following reasons:
- The table size is sufficiently small: 32 entries for `sin(k * pi/16)` with `k = 0..31`.  It could be reduced to 16 entries if we treat the final sign separately, with an extra multiplication at the end.
- The polynomials' degrees are reduced to 7/8 from 15, with extra computations to combine `sin` and `cos` with trig sum equality.
- The number of exceptional cases reduced to 2 (with FMA) and 3 (without FMA).
- The latency is reduced while maintaining similar throughput as before.

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D130629
We can't guarantee the long always 64 bits like WINDOWS or LLP64 data
model (rare but we should consider).

So use int64_t from inttypes.h and safe in this case.

Fixes llvm/llvm-project#55911 .
A mul by a negated power of 2 is a slli followed by neg. This doesn't
require any constant materialization and may be lower latency than mul.
The neg may also be foldable into other arithmetic.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D130047
We can use slli.uw by C followed by sh1add. Similar can be done
for multiples of 5 and 9. We need to make sure that C is less than
32 to stay in bounds of the 5-bit immediate for slli.uw.

We have existing patterns for (mul X, 3<<C) that use sh1add
followed by slli. That order doesn't allow the and to be folded.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D130146
I don't have any evidence these particular uses are actually causing any
issues, but we should avoid accidentally truncating immediate values
depending on the host.
D98179 added a mechanism to sort tests by test time to run slow tests
early, increasing potential parallelism. It also added a feature where
negative tests would be marked as negative, allowing subsequent test
runs to run them earlier. Unfortunately it never actually stored the
negative time, even if all the other code seemed to be inplace to sort
them early. Luckily the fix seems simple.

Differential Revision: https://reviews.llvm.org/D130570
I am playing with the LoopDataPrefetch pass and found out that it
bails to work with a pointer in a non-zero address space. This
patch adds the target callback to check if an address space is to
be considered for prefetching. Default implementation still only
allows address space 0, so this is NFCI.

This does not currently affect any known targets, but seems to be
generally useful for the future.

Differential Revision: https://reviews.llvm.org/D129795
…an. Part 3

Improve LLDB reliability by fixing the following "uninitialized variables" static code inspection warnings from
scan.coverity.com/projects/llvm:

1355854, 1347549, 1316348, 1372028, 1431625,
1315634, 1315637, 1355855, 1364803, 1420505,
1420563, 1420685, 1366014, 1203966, 1204029,
1204031, 1204032, 1328411, 1325969, 1325968,
1374921, 1094809

Differential Revision: https://reviews.llvm.org/D130602
This adds similar heuristics to G_GLOBAL_VALUE, querying the cost of
materializing a specific constant in code size. Doing so prevents us from
sinking constants which require multiple instructions to generate into
use blocks.

Code size savings on CTMark -Os:
Program                                       size.__text
                                              before         after           diff
ClamAV/clamscan                               381940.00      382052.00       0.0%
lencod/lencod                                 428408.00      428428.00       0.0%
SPASS/SPASS                                   411868.00      411876.00       0.0%
kimwitu++/kc                                  449944.00      449944.00       0.0%
Bullet/bullet                                 463588.00      463556.00      -0.0%
sqlite3/sqlite3                               284696.00      284668.00      -0.0%
consumer-typeset/consumer-typeset             414492.00      414424.00      -0.0%
7zip/7zip-benchmark                           595244.00      594972.00      -0.0%
mafft/pairlocalalign                          247512.00      247368.00      -0.1%
tramp3d-v4/tramp3d-v4                         372884.00      372044.00      -0.2%
                           Geomean difference                               -0.0%

Differential Revision: https://reviews.llvm.org/D130554
Erich Keane and others added 10 commits July 29, 2022 05:54
In preperation of the deferred instantation progress, this patch
propagates the multi-level template argument lists further through the
API to reduce the size of that patch.
If the subregister uses were dead, this would leave the main range
segment pointing to a deleted instruction.

Not sure if this should try to avoid shrinking if we know we don't
have dead components.
Since 814a0ab, this would break if we
had a function in the module that becomes dead in any codegen IR
pass. The function wasn't deleted since it was initially used in dead
code, but is detached from the call graph and doesn't appear in the PO
traversal. Do a second walk over the module to populate the resources
of any functions which weren't already processed.
  CONFLICT (content): Merge conflict in clang/lib/Driver/ToolChains/Gnu.cpp
…sary computations. NFC.

Many of these cases, an early-out on the much cheaper getOpcode() check will avoid us needing to call hasOneUse() entirely.
When constant folding "ANDNP(C,X) -> AND(~C,X)" we hit cases such as this where we interfered with the "OR(AND(X,C),AND(Y,~C)) -> OR(AND(X,C),ANDNP(C,Y))" fold in canonicalizeBitSelect
This avoids the need for casts at each call site.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@70ebd9b
@dwoodwor-intel dwoodwor-intel added the disable-lint Skip linter check step and proceed with build jobs label Jul 29, 2022
@dwoodwor-intel dwoodwor-intel requested review from a team and pvchupin as code owners July 29, 2022 15:37
@dwoodwor-intel
Copy link
Contributor Author

The CUDA failure seems to be a known flaky test: #8800

I've re-started the CUDA testing.

@dwoodwor-intel
Copy link
Contributor Author

ProtexIP seems to have had problems again after 11 hours and 18 minutes; there shouldn't be any IP issues with our added changes here, so I'll go ahead with the merge

@dwoodwor-intel
Copy link
Contributor Author

/merge

@bb-sycl
Copy link
Contributor

bb-sycl commented Aug 1, 2022

Mon 01 Aug 2022 03:24:59 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Copy link
Contributor

bb-sycl commented Aug 1, 2022

Mon 01 Aug 2022 03:28:51 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

@bb-sycl bb-sycl merged commit 6485ec4 into intel:sycl Aug 1, 2022
@pvchupin
Copy link
Contributor

pvchupin commented Aug 1, 2022

@dwoodwor-intel, looks like we have --shared-libs build broken after this merge...
https://github.com/intel/llvm/runs/7614635174?check_suite_focus=true

pvchupin pushed a commit that referenced this pull request Aug 1, 2022
Guess it happened after a recent pulldown:
#6494
@dwoodwor-intel
Copy link
Contributor Author

We should do our testing for sycl-web and sycl pulldowns with --shared-libs builds too; I'll update the process docs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disable-lint Skip linter check step and proceed with build jobs
Projects
None yet
Development

Successfully merging this pull request may close these issues.