LLVM and SPIRV-LLVM-Translator pulldown (WW47 2024) #16165

iclsrc · 2024-11-22T23:42:00Z

LLVM: llvm/llvm-project@f8bae3a
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@9207ef2aa150773

Some tests were including LibcTest.h directly. Instead you should include Test.h which does proper indirection for other test frameworks we support (zxtest, gtest). Also added some license headers to tests that were missing them.

- create a clang built-in in Builtins.td
- link dot4add_i8packed in hlsl_intrinsics.h
- add lowering to spirv backend through expansion of operation as OPSDot is missing up to SPIRV 1.6 in SPIRVInstructionSelector.cpp
- add lowering to spirv backend using OpSDot in applicable SPIRV version or if SPV_KHR_integer_dot_product is enabled
- add dot4add_i8packed intrinsic to IntrinsicsDirectX.td and mapping to DXIL.td op Dot4AddI8Packed
- add tests for HLSL intrinsic lowering to dx/spv intrinsic in dot4add_i8packed.hlsl
- add tests for sema checks in dot4add_i8packed-errors.hlsl
- add test of spir-v lowering in SPIRV/dot4add_i8packed.ll
- add test to dxil lowering in DirectX/dot4add_i8packed.ll

Resolves #99220
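For readers unfamiliar with the intrinsic, the "expansion of operation" mentioned above can be pictured as a scalar reference computation. The following is only a sketch, not code from the patch: the function name `dot4add_i8packed_ref` and the lowest-byte-first lane order are assumptions made for illustration, and accumulator overflow is not handled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar reference: each 32-bit operand packs four signed
// 8-bit lanes (lane 0 in the lowest byte, an assumption of this sketch).
// The result is acc + sum over lanes of a[lane] * b[lane].
int32_t dot4add_i8packed_ref(uint32_t a, uint32_t b, int32_t acc) {
  int32_t sum = acc;
  for (int lane = 0; lane < 4; ++lane) {
    // Extract one byte and reinterpret it as a signed 8-bit value.
    const int8_t ai = static_cast<int8_t>((a >> (8 * lane)) & 0xFFu);
    const int8_t bi = static_cast<int8_t>((b >> (8 * lane)) & 0xFFu);
    sum += static_cast<int32_t>(ai) * static_cast<int32_t>(bi);
  }
  return sum;
}

// Small usage example: lanes (4, 3, 2, 1) dotted with (1, 1, 1, 1), plus 10.
void dot4addDemo() {
  assert(dot4add_i8packed_ref(0x01020304u, 0x01010101u, 10) == 20);
}
```

A backend without a native dot-product instruction would emit the equivalent of this loop (extract, sign-extend, multiply, add) as individual operations.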

When you set a "next branch breakpoint" and run to it while stepping, you have to claim the stop at that breakpoint to be the top of the inlined call stack, or you will seem to "step in" and then plans might try to step back out again. This records the PrefferedLineEntry for next branch breakpoints and adds a test to make sure this works.

For GFX10+, image_gather4 instructions that have v[254:255] as dst reg and the d16 bit on can be assembled correctly but the generated binary fails to disassemble (e.g. image_gather4 v[254:255], v[1:2], s[8:15], s[12:15] dmask:0x8 dim:SQ_RSRC_IMG_2D d16). This patch fixes this problem.

Until now, suppression of `DT_DEBUG` has been hardcoded as a downstream patch in lld. This can instead be achieved by passing `-z rodynamic`. Have the driver do this so that the private patch can be removed. If the scope of lld's `-z rodynamic` is broadened (within reason) to do more in future, that's likely to be fine as `PT_DYNAMIC` isn't writable on PlayStation. PS5 only. On PS4, the equivalent hardcoded configuration will remain in the proprietary linker. SIE tracker: TOOLCHAIN-16704

LLVM support for the attribute has been implemented already, so this patch just plumbs it through to the CUDA front-end. One notable difference from NVCC is that the attribute can be used regardless of the targeted GPU. On older GPUs it will just be ignored. The attribute is a performance hint, and does not warrant a hard error if the compiler can't benefit from it on a particular GPU variant.

Assumed-rank arrays are represented by DIGenericSubrange in debug metadata. We have to provide 2 things:

1. An expression to get the rank value at runtime from the descriptor.
2. Assuming the dimension number for which we want the array information has been put on the DWARF expression stack, expressions which will extract the lowerBound, count and stride information from the descriptor for the said dimension.

With this patch in place, this is how I see an assumed_rank variable being evaluated by GDB.

```
function mean(x) result(y)
  integer, intent(in) :: x(..)
  ...
end

program main
  use mod
  implicit none

  integer :: x1,xvec(3),xmat(3,3),xtens(3,3,3)

  x1 = 5
  xvec = 6
  xmat = 7
  xtens = 8

  print *,mean(xvec), mean(xmat), mean(xtens), mean(x1)
end program main

(gdb) p x
$1 = (6, 6, 6)
(gdb) p x
$2 = ((7, 7, 7) (7, 7, 7) (7, 7, 7))
(gdb) p x
$3 = (((8, 8, 8) (8, 8, 8) (8, 8, 8)) ((8, 8, 8) (8, 8, 8) (8, 8, 8)) ((8, 8, 8) (8, 8, 8) (8, 8, 8)))
(gdb) p x
$4 = 5
```

…114559) Currently, `FoldTensorCastProducerOp` incorrectly folds the following:

```mlir
%pack = tensor.pack %src
    padding_value(%pad : i32)
    inner_dims_pos = [0, 1]
    inner_tiles = [%c8, 1]
    into %cast : tensor<7x?xi32> -> tensor<1x1x?x1xi32>
%res = tensor.cast %pack : tensor<1x1x?x1xi32> to tensor<1x1x8x1xi32>
```

as (note the static trailing dim in the result and dynamic tile dimension that corresponds to that):

```mlir
%res = tensor.pack %src
    padding_value(%pad : i32)
    inner_dims_pos = [0, 1]
    inner_tiles = [%c8, 1]
    into %cast : tensor<7x?xi32> -> tensor<1x1x8x1xi32>
```

This triggers an Op verification failure and is due to the fact that the folder does not update the inner tile sizes in the pack Op. This PR addresses that. Note, supporting other Ops with size-like attributes is left as a TODO.
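The idea of updating the inner tile sizes can be illustrated outside of MLIR. The sketch below is not the upstream fix: `TileSize`, `kDynamic` and `refineInnerTiles` are hypothetical names, and it assumes that the trailing dimensions of the packed result shape correspond one-to-one to the inner tiles, as in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical marker for a '?' dimension in a shape.
constexpr int64_t kDynamic = -1;
// std::nullopt models a tile given as a runtime value (such as %c8 above);
// a concrete number models a static tile size.
using TileSize = std::optional<int64_t>;

// When a cast makes the result shape more static, replace each dynamic tile
// with the now-static trailing dimension so tiles and result type agree.
std::vector<TileSize> refineInnerTiles(std::vector<TileSize> tiles,
                                       const std::vector<int64_t>& castedShape) {
  if (castedShape.size() < tiles.size())
    return tiles;  // malformed input: leave the tiles untouched
  const std::size_t offset = castedShape.size() - tiles.size();
  for (std::size_t i = 0; i < tiles.size(); ++i) {
    const int64_t dim = castedShape[offset + i];
    if (!tiles[i].has_value() && dim != kDynamic)
      tiles[i] = dim;
  }
  return tiles;
}

// Usage mirroring the example: tiles [dynamic, 1], casted shape 1x1x8x1.
void refineDemo() {
  const std::vector<TileSize> refined =
      refineInnerTiles({std::nullopt, 1}, {1, 1, 8, 1});
  assert(refined[0].value() == 8);
  assert(refined[1].value() == 1);
}
```

With the tiles refined this way, the folded op would carry a static tile of 8 that matches the static trailing dimension, which is the consistency the verifier checks for.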

Finish hooking up ClangIR code gen into the Clang control flow, initializing enough that basic code gen is possible. Add an almost empty `cir.func` op to the ClangIR dialect. Currently the only property of the function is its name. Add the code necessary to code gen a cir.func op. Create essentially empty files clang/lib/CIR/Dialect/IR/{CIRAttrs.cpp,CIRTypes.cpp}. These will be filled in later as attributes and types are defined in the ClangIR dialect. (Part of upstreaming the ClangIR incubator project into LLVM.)

I have to check for the sc list size being changed by the call-site search, not just that it had more than one element. Added a test for multiple CU's with the same name in a given module, which would have caught this mistake. We were also doing all the work to find call sites when the found decl and specified decl's only difference was a column, but the incoming specification hadn't specified a column (column number == 0).

…e"" (#115034) In C++ it's UB to use undeclared values as an enum. However, supporting __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE requires such values. So use `int` in the TSAN interface, mask out irrelevant bits, and cast to the enum ASAP. `ThreadSanitizer.cpp` already declares the morder parameters in these functions as `i32`. This may look like a slight change, as we previously didn't mask out additional bits for `fmo` and the `NoTsanAtomic` call. But from the implementation it's clear that they expect the exact enum. Reverts llvm/llvm-project#115032 Reapply llvm/llvm-project#114724
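The "take an `int`, mask, then cast" pattern described above might look roughly like the following. This is a minimal sketch rather than the TSAN sources: `MemOrder`, `toMemOrder`, the mask value and the two flag constants are all assumptions chosen for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the sanitizer's memory-order enum.
enum class MemOrder : int {
  Relaxed = 0, Consume = 1, Acquire = 2, Release = 3, AcqRel = 4, SeqCst = 5
};

// Assumed flag bits that a caller may OR into the memory-order argument.
constexpr int kHleAcquireFlag = 1 << 16;
constexpr int kHleReleaseFlag = 1 << 17;
// Assumed mask that keeps only the bits encoding the ordering itself.
constexpr int kOrderMask = 0x7fff;

// The interface accepts a plain int, so extra flag bits never exist as an
// enum value; they are masked off before the cast to the enum.
constexpr MemOrder toMemOrder(int raw) {
  return static_cast<MemOrder>(raw & kOrderMask);
}

// Usage: an acquire ordering combined with a hint flag decodes to Acquire.
void memOrderDemo() {
  assert(toMemOrder(2 | kHleAcquireFlag) == MemOrder::Acquire);
}
```

Doing the mask on the integer side keeps every enum value that the rest of the code sees within the declared enumerators.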

…5023) Data transfer from a variable with a descriptor to a pointer. We create a descriptor for the pointer so we can use the flang runtime to perform the transfer. The Assign function handles all corner cases. We add a new entry point `CUFDataTransferDescDescNoRealloc` to avoid reallocation since the variable on the LHS is not an allocatable.

…groups (#113751) 0 does not make sense as a value for this to be, much less the default. Also stop emitting each individual field if it is the default, rather than if any element was the default. Also fix the name of the test since it didn't exactly match the real attribute name.

jsji · 2024-12-02T02:05:49Z

This is ready for review.

Update sycl test using new splat syntax @intel/dpcpp-kernel-fusion-reviewers @intel/llvm-reviewers-runtime
Update enqueue_kernel.cl after 88b18319c943 @intel/dpcpp-spirv-reviewers
libclc: fix clcmaro.h path for rsqrt.cl @frasercrmck @intel/llvm-reviewers-cuda
[NVPTX] Update test w/ nvvm.ldg.global.* removal in fb33af0 @JackAKirk @intel/llvm-reviewers-cuda

frasercrmck

libclc LGTM

jsji · 2024-12-03T01:48:02Z

Ping @intel/llvm-reviewers-runtime @intel/dpcpp-spirv-reviewers @intel/llvm-reviewers-cuda

MrSidims · 2024-12-03T10:17:21Z

llvm-spirv/test/transcoding/enqueue_kernel.cl

-// CHECK-SPIRV: Name [[#InvokeFunc3:]] "__device_side_enqueue_block_invoke_3_kernel"
-// CHECK-SPIRV: Name [[#InvokeFunc4:]] "__device_side_enqueue_block_invoke_4_kernel"
-// CHECK-SPIRV: Name [[#InvokeFunc5:]] "__device_side_enqueue_block_invoke_5_kernel"
+// CHECK-SPIRV: Name [[#InvokeFunc1:]] "__device_side_enqueue_block_invoke"


It's different from https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/main/test/transcoding/enqueue_kernel.cl . I'm ok to merge it as is, I've created internal tracker to investigate this

Thanks @MrSidims

uditagarwal97

Changes in sycl/test/check_device_code/vector/vector_math_ops.cpp due to use of splat LGTM.

jsji · 2024-12-03T13:52:50Z

@intel/llvm-gatekeepers I think this is ready for merge. Thanks.

sarnex · 2024-12-03T15:15:26Z

/merge

sarnex · 2024-12-03T15:29:02Z

Is the command wrong or is the bot broken?

jsji · 2024-12-03T15:53:37Z

> Is the command wrong or is the bot broken?

Looks like the bot is failing again due to old git version. @DoyleLi

bb-sycl · 2024-12-04T01:18:19Z

Wed 04 Dec 2024 01:18:19 AM UTC --- Merge failed with error: PR is not clean for merge. Please examine ci check status before merge.

DoyleLi · 2024-12-04T01:19:56Z

> > Is the command wrong or is the bot broken?
>
> Looks like the bot is failing again due to old git version. @DoyleLi

Hi, The issue is resolved. Will take effect in next pulldown.

ldionne and others added 30 commits November 5, 2024 13:18

[libc++][NFC] Remove unused header in test/support

76f993b

[flang][OpenMP] Deprecation message for DESTROY with no argument (#11…

5f8b83e

…4988) [5.2:625:17] The syntax of the DESTROY clause on the DEPOBJ construct with no argument was deprecated.

[Clang] [NFC] Introduce DynamicRecursiveASTVisitor (#110040)

ff5551c

See #105195 as well as the big comment in DynamicRecursiveASTVisitor.cpp for more context.

[X86] SimplifyDemandedBitsForTargetNode - cleanup SSE shift-by-immedi…

02e5c25

…ate handlers. NFC. Cleanup the SHLI/SRLI/SRAI handlers to be more consistent - prep for a future patch.

[X86] SimplifyDemandedBitsForTargetNode - call SimplifyMultipleUseDem…

61d5add

…andedBits on SSE shift-by-immediate nodes. Attempt to peek through multiple-use SHLI/SRLI/SRAI source vectors.

[LLDB] Retry Add a target.launch-working-dir setting

e952728

This retries the PR 113521 skipping a test in a remote environment.

AMDGPU: Rename test file

ce067c5

[flang] Tweak a SCALE/IEEE_SCALB folding overflow warning message (#1…

592c0fe

…14994)

[mlir] Fix a warning

d02d9ce

This patch fixes: mlir/lib/Dialect/Tensor/IR/TensorOps.cpp:4781:17: error: unused variable 'tileSize' [-Werror,-Wunused-variable]

[gn build] Port ff5551c

a33d42a

[RISCV][GISel] Remove s32 support for G_ABS on RV64.

e566ae8

I plan to remove s32 as a legal type to match SelectionDAG and to remove i32 from the GPR regclass on RV64.

[SystemZ] Make lit test more specific (#115050)

8b65973

The lit test fmuladd-soft-float.ll only specifies s390x as platform, but the test is Linux specific, causing problems when run on z/OS. This change updates the triple to fix this.

[SystemZ][XRay] XRay runtime support for SystemZ (#113252)

db1882e

Adds the runtime support routines for XRay on SystemZ. Only function entry/exit is implemented.

[SystemZ][XRay] Implement XRay instrumentation for SystemZ (#113253)

4a37799

Expands pseudo instructions PATCHABLE_FUNCTION_ENTER and PATCHABLE_RET into a small instruction sequence which calls into the XRay library.

clang/AMDGPU: Emit grid size builtins with range metadata (#113038)

0c60573

These cannot be 0.

[SystemZ][XRay] Enable XRay for SystemZ in clang (#113254)

0428f2c

With the support for xray for SystemZ in place, the option can now be enabled in clang.

[AMDGPU][True16][MC] VOP2 update instructions with fake16 format (#11…

e8644e3

…4436) Some old "t16" VOP2 instructions are actually in fake16 format. Correct and update test file

libclc: fix clcmaro.h path for rsqrt.cl

69da5c5

Merge branch 'sycl' into llvmspirv_pulldown

f6e2549

[NVPTX] Update test w/ nvvm.ldg.global.* removal in fb33af0

16e0fe5

jsji force-pushed the llvmspirv_pulldown branch from 8fe0d1c to 16e0fe5 Compare December 1, 2024 21:04

frasercrmck approved these changes Dec 2, 2024

MrSidims reviewed Dec 3, 2024

MrSidims approved these changes Dec 3, 2024

uditagarwal97 approved these changes Dec 3, 2024

sarnex merged commit d7b2605 into sycl Dec 3, 2024
24 of 30 checks passed

bb-sycl approved these changes Dec 4, 2024
