-
Notifications
You must be signed in to change notification settings - Fork 787
LLVM and SPIRV-LLVM-Translator pulldown (WW33) #6536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… enough for the amount mask (PR56859) matchRotateSub is given shift amounts that will already have stripped any/zero-extend nodes from - so make sure those values are wide enough to take a mask.
I went over the output of the following mess of a command: `(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)` and proceeded to spend a few days looking at it to find probable typos and fixed a few hundred of them in all of the llvm project (note, the ones I found are not anywhere near all of them, but it seems like a good start). Reviewed By: #libc, philnik Spies: philnik, libcxx-commits, mgorny, arichardson Differential Revision: https://reviews.llvm.org/D130905
… `explicit`) Reviewed By: #libc, philnik, Mordante Spies: Mordante, jloser, philnik, libcxx-commits Differential Revision: https://reviews.llvm.org/D130785
Multiple top-level MLIR documents did not have a table of contents tag, making them harder to nagivate.
This patch adds constant folder for TanhOp which only supports single and double precision floating-point. Differential Revision: https://reviews.llvm.org/D130960
TestInlineStepping.py is flaky while TestUseSourceCache.py fails on Windows 11 only. Marked them skipped to make buildbot happy.
This folds a v4i32 Mul(And(Srl(X, 15), 0x10001), 0xffff) into a v8i16 CMLTz instruction. The Srl and And extract the top bit (whether the input is negative) and the Mul sets all values in the i16 half to all 1/0 depending on if that top bit was set. This is equivalent to a v8i16 CMLTz instruction. The same applies to other sizes with equivalent constants. Differential Revision: https://reviews.llvm.org/D130874
Add a new IRBuilderBase::CreateIntrinsic which takes the return type and argument values for the intrinsic call but does not take an explicit list of types to mangle. Instead the builder works this out from the intrinsic declaration and the types of the supplied arguments. This means that the mangling is hidden from the client, which in turn means that intrinsic definitions can change which arguments are mangled without requiring any changes to the client code. Differential Revision: https://reviews.llvm.org/D130776
If you're having trouble getting a yaml2obj macro expansion to do what you want, it's useful to be able to print the output of the preprocessing to see what your macros expanded to //before// going into the YAML processing phase. yaml2obj has its own preprocessing system which isn't the same as any other well-known thing like cpp. So there's no way to do this macro expansion via another tool: yaml2obj will have to do it itself. In this commit I add an `-E` flag to yaml2obj to do that. Differential Revision: https://reviews.llvm.org/D130981
This reverts commit 0cc3c18. The changes did not account for templated code where one instantiation may trigger the diagnostic but other instantiations will not, as in: ``` template <int I, class T> void foo(int x) { bool b1 = (x & sizeof(T)) == 8; bool b2 = (x & I) == 8; bool b3 = (x & 4) == 8; } void run(int x) { foo<4, int>(8); } ```
I went over the output of the following mess of a command: `(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)` and proceeded to spend a few days looking at it to find probable typos and fixed a few hundred of them in all of the llvm project (note, the ones I found are not anywhere near all of them, but it seems like a good start). Differential Revision: https://reviews.llvm.org/D130982
s/true/ArchSpec::ExactMatch s/false/ArchSpec::CompatibleMatch Differential Revision: https://reviews.llvm.org/D121290
…lems The problem Alexander reported on D127982 was caused by an optimization for AVX512-FP16 instruction. We must limit it to the feature enabled only. During the investigation, I found we didn't expand for fp_round/fp_extend without F16C. This may result runtime crash, so change them too. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130817
As Fortran 2018 18.2.3.6, the intrinsic `c_loc(x)` gets the C address of argument `x`. It returns the scalar of type C_PTR. As defined in iso_c_binding in `flang/module/__fortran_builtins.f90`, C_PTR is the derived type with only one component of integer 64. This supports the lowering of intrinsic module procedure `c_loc` by converting the address of argument into integer 64, where the argument is lowered as Box and the address is generated using fir.box_addr. The lowering of intrinsic `c_funloc` has the similar characteristic and will be supported later. The execution tests for various data types are in issue llvm/llvm-project#56552. Reviewed By: Jean Perier Differential Revision: https://reviews.llvm.org/D129659
The semantic checks and runtime have been supported. This supports the lowering of intrinsic ABORT. `gfortran` prints a backtrace before abort, unless `-fno-backtrace` is given. This is good to use. The intrinsic BACKTRACE is not supported yet, so add TODO in the runtime. This extention is needed in SPEC2017 521.wrf_r in llvm/llvm-project#55955. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D130439
This supports checks in C1801-C1805 for derived type with BIND attribute. The other compilers such as 'gfortran' and 'ifort' do not report error for C1802 and C1805, so emit warnings for them. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D130438
This extends the handling of uniform memory operations to handle the case where a store is storing a loop invariant value. Unlike the general case of a store to an invariant address where we must use the last active lane, in this case we can use any lane since all lanes must produce the same result. For context, the basic structure of the existing code and how the change fits in: * First, we select a widening strategy. (The result is irrelevant for this patch.) * Then we determine if a computation is uniform within all lanes of VF. (Note this is the uniform-per-part definition, not LAI's uniform across all unrolled iterations definition.) * If it is, we overrule the widening strategy, and unconditionally scalarize. * VPReplicationRecipe - which is what actually does the scalarization - knows how to handle unform-per-part values including for scalable vectors. However, we do need to know that the expression is safe to execute without predication - e.g. the uniform mem op was unconditional in the original loop. (This part was split off and already landed.) An obvious question is why not simply implement the generic case? The answer is that I'm going to, but doing so without a canonicalization towards uniform causes regressions due to bad interaction with scalarization/uniformity of values feeding the uniform mem-op. This patch is needed to avoid those regressions. Differential Revision: https://reviews.llvm.org/D130364
The Clang compiler generates internal functions for OpenMP. Current patch marks these functions as artificial. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D111521
This improves a corner case where v_fmac can be converted to v_fma on GFX10+ even if it has a literal operand. Differential Revision: https://reviews.llvm.org/D130992
…n bodies; this makes reductions go faster
This pass seems to have very little effect because all it does is hoist some instructions, but it is followed later in the codegen pipeline by the IR CodeSinking pass which does the opposite. Differential Revision: https://reviews.llvm.org/D130258
In the 2e29b01 we introduce a specific solving algorithm that analyzes the VGPR to SGPR copies use chains and either lowers the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms. Same time we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs in case they produce SGPR but have VGPRs input operands. In case the REG_SEQUENCE and PHIs are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert copy to v_readfistlane_b32, further lowering them to VALU leads to several kinds of issues. At first, we have v_readfistlane_b32 which is completely useless because most parts of its use chain were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF because of the weird mixing of SALU and VALU instructions. This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs, we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common VGPR to SGPR lowering algorithm. This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D130367
Reviewed By: monkchiang Differential Revision: https://reviews.llvm.org/D130938
I think these pseudos will exist when the post-RA scheduler runs so they should have sched classes. Reviewed By: monkchiang Differential Revision: https://reviews.llvm.org/D130945
The macro is only enabled when the Clang is used with -fexperimental-library. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D130792
This improves the formatting of the generated files. That allows it to remove the clang-format step in D129668. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D130911
The ZERO register should be exposed as a constant physical register through the interface TargetRegisterInfo::isConstantPhysReg. Differential Revision: https://reviews.llvm.org/D130932
Remove redundant code that can never execute due to preceeding logic checks in the code. Differential Revision: https://reviews.llvm.org/D130929
Follow-up to D129097. It is no longer a requirement that the `QualType` passed to to `DataflowAnalysisContext::getStableStorageLocation()` is not null. A null type pass as an argument is only applicable as the pointee type of a `std::nullptr_t` pointer. Differential Revision: https://reviews.llvm.org/D131109
Adds the details for P2286 and its status.
`SymbolTable::lookupSymbolIn` is an expensive operation and we do not want to do it twice Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D131145
Original commit: KhronosGroup/SPIRV-LLVM-Translator@06709c1
Split the .clang-tidy check lists out over multiple lines to improve readability. Suppress `misc-non-private-member-variables-in-classes` as the code currently contains many instances that fail this check. Drop `constexpr` from `LoopControlLoopCountINTELMask` after clang started diagnosing this with b364535 ("[Clang] Diagnose ill-formed constant expression when setting ...", 2022-07-28). Removing `constexpr` is just a workaround, the long term fix would be to upstream the new enum value to `spirv.hpp`. Original commit: KhronosGroup/SPIRV-LLVM-Translator@c5b29f2
In SYCL 2020 the "::cl::sycl" has been changed to "::sycl". Update SYCL half type support to account for that. Original commit: KhronosGroup/SPIRV-LLVM-Translator@a827fb0
@dbudanov-cmplr, please update PR title and description similar to #6358 |
/summary:run |
\merge |
/merge |
Thu 11 Aug 2022 09:04:36 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes. |
Thu 11 Aug 2022 09:08:31 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later. |
I think some translator patches a lost. |
LLVM: llvm/llvm-project@ec7f4a7
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@a827fb0