Commit 67da385

[SYCL][NFCI] Refactor device code split implementation once again (#8833)
#### Intro

This is a refactoring of how we perform device code split in `sycl-post-link`, which is intended to solve several existing issues with the current implementation:

1. increased peak RAM consumption by `sycl-post-link`
2. bad scaling as more and more split "dimensions" are added
3. increased test maintenance cost due to the non-deterministic order (between commits) of output files produced by `sycl-post-link`

#### A bit more context about the issues above:

(1) Increased peak RAM consumption is caused by the fact that we currently preserve **all** splits in memory, even though we can process them one by one and discard each split as soon as it has been stored to disk. One-by-one processing was implemented as a memory consumption optimization in #5021, but it was accidentally reverted in #7302 in an attempt to work around (2).

(2) is pretty much summarized in our source code: https://github.com/intel/llvm/blob/afebb2543ccecb89f83c84b68fba7616bbab89ac/llvm/tools/sycl-post-link/sycl-post-link.cpp#L806-L811

(3) is caused by a bad implementation decision made in #7302: because every split is now identified by a hash, every time you add a new split "dimension" or take a new feature into account, existing tests end up with different hashes. Just look at how many unrelated tests had to be updated in #7512, #8056 and #8167.

#### Now to the PR itself:

It introduces a new infrastructure for categorizing/grouping kernel functions: instead of using hashes, we now build a string description for each kernel function and then group kernels with the same description string together.

The string description is built by a new entity: it accepts a set of rules, where each rule is a simple function which returns a string for the passed `llvm::Function`. The results of all rules are concatenated together, and rules are invoked in the stable order of their registration.

There is a simple API for building those rules. It provides some predefined rules for the most popular use cases, like turning a function attribute or a piece of metadata into a string descriptor for the function. It is also possible to pass a custom callback to implement more complicated logic.

#### How does this PR help with issues above?

(1) and (2) are fixed in conjunction: `sycl-post-link` was refactored to avoid storing more than one split module at a time, and that is possible because the PR unifies the per-scope and optional-kernel-features splitters into a single generic splitter. The new API for kernel categorization seems to be flexible enough to provide that infrastructure, so the merged splitters still look OK code-wise.

(3) is fixed by using string identifiers instead of hashes, as well as by using a data structure which sorts identifiers.

#### Any other benefits from this PR?

About 50 fewer lines of code to support :)

Extending device code split for more optional features would be even easier than it is now: instead of making several changes in various places around the `UsedOptionalFeatures` structure, it will take just 1-3 added lines of code.

Please also note that `UsedOptionalFeatures` contains tons of inconsistencies in its implementation, which will all be gone with this PR: in `operator==` we don't use the hash and instead compare certain fields directly (and we do miss some of them); the `generateModuleName` method skips some of the optional features and ignores them.

Cross-module `device_global` usage checks should now work across all split dimensions (except for ESIMD).

#### Any potential downsides?

With the current `UsedOptionalFeatures` there is a possibility to embed various information (used aspects, the `large-grf` flag, etc.) directly during device code split, to avoid re-gathering that information later when we generate properties. With the suggested approach this would be harder to do, because it doesn't seem to fit naturally into the proposed infrastructure: see the changes I made around `large-grf` in this PR.

However, we have never actually implemented this, and re-querying some metadata from a function doesn't seem like a bottleneck, so it should really be a very minor and only theoretical downside.
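The grouping scheme described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the code added by this PR: `Function` is a hypothetical stand-in for `llvm::Function`, and the names `Categorizer`, `registerRule`, `registerAttrRule`, `describe` and `groupKernels`, as well as the `-` separator, are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Function: a name plus string attributes.
struct Function {
  std::string Name;
  std::map<std::string, std::string> Attrs;
};

// Builds a description string for a function by invoking rules in the order
// they were registered and concatenating their results.
class Categorizer {
public:
  using Rule = std::function<std::string(const Function &)>;

  // Custom callback rule for logic that predefined rules do not cover.
  void registerRule(Rule R) { Rules.push_back(std::move(R)); }

  // Predefined rule: the value of a string attribute (empty if absent).
  void registerAttrRule(const std::string &Attr) {
    registerRule([Attr](const Function &F) {
      auto It = F.Attrs.find(Attr);
      return It == F.Attrs.end() ? std::string() : It->second;
    });
  }

  std::string describe(const Function &F) const {
    std::string Desc;
    for (const Rule &R : Rules)
      Desc += R(F) + "-"; // separator keeps adjacent rule outputs apart
    return Desc;
  }

private:
  std::vector<Rule> Rules;
};

// std::map keeps its keys sorted, so the order of groups depends only on the
// description strings, not on hash values or on the input order.
using Groups = std::map<std::string, std::vector<std::string>>;

Groups groupKernels(const Categorizer &C, const std::vector<Function> &Kernels) {
  Groups G;
  for (const Function &F : Kernels) {
    assert(!F.Name.empty() && "kernel must have a name");
    G[C.describe(F)].push_back(F.Name);
  }
  return G;
}
```

In this sketch, adding a new split "dimension" amounts to one more `registerAttrRule` or `registerRule` call, and the groups can be visited one at a time in sorted key order.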
1 parent 33facd0 commit 67da385

20 files changed, +432 -385 lines changed

llvm/include/llvm/SYCLLowerIR/SYCLUtils.h

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ namespace llvm {
 namespace sycl {
 namespace utils {
 constexpr char ATTR_SYCL_MODULE_ID[] = "sycl-module-id";
+constexpr char ATTR_SYCL_OPTLEVEL[] = "sycl-optlevel";
 
 using CallGraphNodeAction = ::std::function<void(Function *)>;
 using CallGraphFunctionFilter =

llvm/test/tools/sycl-post-link/assert/property-1.ll

Lines changed: 3 additions & 3 deletions
@@ -12,9 +12,9 @@
 ; RUN: FileCheck %s -input-file=%t_0.prop --implicit-check-not TheKernel2
 ;
 ; RUN: sycl-post-link -split=kernel -symbols -S < %s -o %t.table
-; RUN: FileCheck %s -input-file=%t_0.prop --check-prefixes=CHECK-K1
-; RUN: FileCheck %s -input-file=%t_1.prop --check-prefixes=CHECK-K2
-; RUN: FileCheck %s -input-file=%t_2.prop --check-prefixes=CHECK-K3
+; RUN: FileCheck %s -input-file=%t_0.prop --check-prefixes=CHECK-K3
+; RUN: FileCheck %s -input-file=%t_1.prop --check-prefixes=CHECK-K1
+; RUN: FileCheck %s -input-file=%t_2.prop --check-prefixes=CHECK-K2
 
 ; SYCL source:
 ; void foo() {

llvm/test/tools/sycl-post-link/device-code-split/per-aspect-split-1.ll

Lines changed: 8 additions & 8 deletions
@@ -10,29 +10,29 @@
 ; RUN: sycl-post-link -split=auto -symbols -S < %s -o %t.table
 ; RUN: FileCheck %s -input-file=%t_0.ll --check-prefixes CHECK-M0-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_2.ll --check-prefixes CHECK-M1-IR \
+; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-M1-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-M2-IR \
+; RUN: FileCheck %s -input-file=%t_2.ll --check-prefixes CHECK-M2-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
 ; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-M0-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M1-SYMS \
+; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-M1-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-M2-SYMS \
+; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M2-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
 
 ; RUN: sycl-post-link -split=source -symbols -S < %s -o %t.table
 ; RUN: FileCheck %s -input-file=%t_0.ll --check-prefixes CHECK-M0-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_2.ll --check-prefixes CHECK-M1-IR \
+; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-M1-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-M2-IR \
+; RUN: FileCheck %s -input-file=%t_2.ll --check-prefixes CHECK-M2-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
 ; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-M0-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M1-SYMS \
+; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-M1-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-M2-SYMS \
+; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M2-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
 
 ; RUN: sycl-post-link -split=kernel -symbols -S < %s -o %t.table

llvm/test/tools/sycl-post-link/device-code-split/per-aspect-split-2.ll

Lines changed: 4 additions & 4 deletions
@@ -21,12 +21,12 @@
 ; CHECK-TABLE-NEXT: _2.sym
 ; CHECK-TABLE-EMPTY:
 
-; CHECK-M0-SYMS: kernel0
+; CHECK-M0-SYMS: kernel3
 
-; CHECK-M1-SYMS: kernel3
+; CHECK-M1-SYMS: kernel1
+; CHECK-M1-SYMS: kernel2
 
-; CHECK-M2-SYMS: kernel1
-; CHECK-M2-SYMS: kernel2
+; CHECK-M2-SYMS: kernel0
 
 target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
 target triple = "spir64-unknown-linux"

llvm/test/tools/sycl-post-link/device-code-split/per-reqd-wg-size-split-1.ll

Lines changed: 8 additions & 8 deletions
@@ -10,29 +10,29 @@
 ; RUN: sycl-post-link -split=auto -symbols -S < %s -o %t.table
 ; RUN: FileCheck %s -input-file=%t_0.ll --check-prefixes CHECK-M0-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_2.ll --check-prefixes CHECK-M1-IR \
+; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-M1-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-M2-IR \
+; RUN: FileCheck %s -input-file=%t_2.ll --check-prefixes CHECK-M2-IR \
 ; RUN: --implicit-check-not kernel1 --implicit-check-not kernel2
 ; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-M0-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M1-SYMS \
+; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-M1-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel2
-; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-M2-SYMS \
+; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M2-SYMS \
 ; RUN: --implicit-check-not kernel1 --implicit-check-not kernel2
 
 ; RUN: sycl-post-link -split=source -symbols -S < %s -o %t.table
 ; RUN: FileCheck %s -input-file=%t_0.ll --check-prefixes CHECK-M0-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_2.ll --check-prefixes CHECK-M1-IR \
+; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-M1-IR \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel2
-; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-M2-IR \
+; RUN: FileCheck %s -input-file=%t_2.ll --check-prefixes CHECK-M2-IR \
 ; RUN: --implicit-check-not kernel1 --implicit-check-not kernel2
 ; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-M0-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
-; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M1-SYMS \
+; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-M1-SYMS \
 ; RUN: --implicit-check-not kernel0 --implicit-check-not kernel2
-; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-M2-SYMS \
+; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M2-SYMS \
 ; RUN: --implicit-check-not kernel1 --implicit-check-not kernel2
 
 ; RUN: sycl-post-link -split=kernel -symbols -S < %s -o %t.table

llvm/test/tools/sycl-post-link/device-code-split/per-reqd-wg-size-split-2.ll

Lines changed: 9 additions & 9 deletions
@@ -4,29 +4,29 @@
 ; RUN: sycl-post-link -split=auto -symbols -S < %s -o %t.table
 ; RUN: FileCheck %s -input-file=%t.table --check-prefix CHECK-TABLE
 ;
-; RUN: FileCheck %s -input-file=%t_1.sym --check-prefix CHECK-M0-SYMS \
-; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1 \
+; RUN: FileCheck %s -input-file=%t_0.sym --check-prefix CHECK-M0-SYMS \
+; RUN: --implicit-check-not kernel0 --implicit-check-not kernel2
+;
+; RUN: FileCheck %s -input-file=%t_1.sym --check-prefix CHECK-M1-SYMS \
+; RUN: --implicit-check-not kernel1 --implicit-check-not kernel3 \
 ; RUN: --implicit-check-not kernel2
 ;
 ; RUN: FileCheck %s -input-file=%t_2.sym --check-prefix CHECK-M2-SYMS \
-; RUN: --implicit-check-not kernel0 --implicit-check-not kernel3
-;
-; RUN: FileCheck %s -input-file=%t_0.sym --check-prefix CHECK-M1-SYMS \
 ; RUN: --implicit-check-not kernel1 --implicit-check-not kernel2 \
-; RUN: --implicit-check-not kernel3
+; RUN: --implicit-check-not kernel0
 
 ; CHECK-TABLE: Code
 ; CHECK-TABLE-NEXT: _0.sym
 ; CHECK-TABLE-NEXT: _1.sym
 ; CHECK-TABLE-NEXT: _2.sym
 ; CHECK-TABLE-EMPTY:
 
-; CHECK-M0-SYMS: kernel3
+; CHECK-M0-SYMS: kernel1
+; CHECK-M0-SYMS: kernel2
 
 ; CHECK-M1-SYMS: kernel0
 
-; CHECK-M2-SYMS: kernel1
-; CHECK-M2-SYMS: kernel2
+; CHECK-M2-SYMS: kernel3
 
 target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
 target triple = "spir64-unknown-linux"

llvm/test/tools/sycl-post-link/device-code-split/split-with-kernel-declarations.ll

Lines changed: 3 additions & 3 deletions
@@ -8,9 +8,9 @@
 ;
 ; RUN: sycl-post-link -split=kernel -symbols -S < %s -o %t1.table
 ; RUN: FileCheck %s -input-file=%t1.table --check-prefix CHECK-PER-KERNEL-TABLE
-; RUN: FileCheck %s -input-file=%t1_0.sym --check-prefix CHECK-PER-KERNEL-SYM0
-; RUN: FileCheck %s -input-file=%t1_1.sym --check-prefix CHECK-PER-KERNEL-SYM1
-; RUN: FileCheck %s -input-file=%t1_2.sym --check-prefix CHECK-PER-KERNEL-SYM2
+; RUN: FileCheck %s -input-file=%t1_0.sym --check-prefix CHECK-PER-KERNEL-SYM1
+; RUN: FileCheck %s -input-file=%t1_1.sym --check-prefix CHECK-PER-KERNEL-SYM2
+; RUN: FileCheck %s -input-file=%t1_2.sym --check-prefix CHECK-PER-KERNEL-SYM0
 
 ; With per-source split, there should be two device images
 ; CHECK-PER-SOURCE-TABLE: [Code|Properties|Symbols]

llvm/test/tools/sycl-post-link/device-globals/test_global_variable_many_kernels_in_one_module.ll

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 ; RUN: sycl-post-link --device-globals --split=source -S < %s -o %t.files.table
-; RUN: FileCheck %s -input-file=%t.files_0.ll --check-prefix CHECK-MOD0
-; RUN: FileCheck %s -input-file=%t.files_1.ll --check-prefix CHECK-MOD1
+; RUN: FileCheck %s -input-file=%t.files_0.ll --check-prefix CHECK-MOD1
+; RUN: FileCheck %s -input-file=%t.files_1.ll --check-prefix CHECK-MOD0
 
 ; This test is intended to check that sycl-post-link generates no errors
 ; when a device global variable with the 'device_image_scope' property

llvm/test/tools/sycl-post-link/device-globals/test_global_variable_many_modules_no_dev_global.ll

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 ; RUN: sycl-post-link --device-globals --split=source -S < %s -o %t.files.table
-; RUN: FileCheck %s -input-file=%t.files_0.ll --check-prefix CHECK-MOD0
-; RUN: FileCheck %s -input-file=%t.files_1.ll --check-prefix CHECK-MOD1
-; RUN: FileCheck %s -input-file=%t.files_2.ll --check-prefix CHECK-MOD2
+; RUN: FileCheck %s -input-file=%t.files_0.ll --check-prefix CHECK-MOD2
+; RUN: FileCheck %s -input-file=%t.files_1.ll --check-prefix CHECK-MOD0
+; RUN: FileCheck %s -input-file=%t.files_2.ll --check-prefix CHECK-MOD1
 
 ; This test is intended to check that sycl-post-link generates no error if the
 ; 'device_image_scope' property is attached to not a device global variable.

llvm/test/tools/sycl-post-link/device-globals/test_global_variable_many_modules_no_dev_img_scope.ll

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 ; RUN: sycl-post-link --device-globals --split=source -S < %s -o %t.files.table
-; RUN: FileCheck %s -input-file=%t.files_0.ll --check-prefix CHECK-MOD0
-; RUN: FileCheck %s -input-file=%t.files_1.ll --check-prefix CHECK-MOD1
-; RUN: FileCheck %s -input-file=%t.files_2.ll --check-prefix CHECK-MOD2
+; RUN: FileCheck %s -input-file=%t.files_0.ll --check-prefix CHECK-MOD2
+; RUN: FileCheck %s -input-file=%t.files_1.ll --check-prefix CHECK-MOD0
+; RUN: FileCheck %s -input-file=%t.files_2.ll --check-prefix CHECK-MOD1
 
 ; ModuleID = 'llvm/test/tools/sycl-post-link/device-globals/test_global_variable_many_modules_no_dev_img_scope.ll'
 source_filename = "llvm/test/tools/sycl-post-link/device-globals/test_global_variable_many_modules_no_dev_img_scope.ll"

llvm/test/tools/sycl-post-link/device-globals/test_global_variable_many_modules_two_vars_ok.ll

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 ; RUN: sycl-post-link --device-globals --split=source -S < %s -o %t.files.table
-; RUN: FileCheck %s -input-file=%t.files_0.ll --check-prefix CHECK-MOD0
-; RUN: FileCheck %s -input-file=%t.files_1.ll --check-prefix CHECK-MOD1
+; RUN: FileCheck %s -input-file=%t.files_1.ll --check-prefix CHECK-MOD0
+; RUN: FileCheck %s -input-file=%t.files_2.ll --check-prefix CHECK-MOD1
 
 ; This test is intended to check that sycl-post-link generates no errors
 ; when each device global variable with the 'device_image_scope' property

llvm/test/tools/sycl-post-link/device-requirements/aspects.ll

Lines changed: 2 additions & 2 deletions
@@ -17,8 +17,8 @@
 ; RUN: FileCheck %s -input-file=%t.files_0.prop --check-prefix CHECK-PROP-AUTO-SPLIT
 
 ; RUN: sycl-post-link -split=kernel < %s -o %t.files.table
-; RUN: FileCheck %s -input-file=%t.files_0.prop --check-prefix CHECK-PROP-KERNEL-SPLIT-0
-; RUN: FileCheck %s -input-file=%t.files_1.prop --check-prefix CHECK-PROP-KERNEL-SPLIT-1
+; RUN: FileCheck %s -input-file=%t.files_0.prop --check-prefix CHECK-PROP-KERNEL-SPLIT-1
+; RUN: FileCheck %s -input-file=%t.files_1.prop --check-prefix CHECK-PROP-KERNEL-SPLIT-0
 
 ; CHECK-PROP-AUTO-SPLIT: [SYCL/device requirements]
 ; CHECK-PROP-AUTO-SPLIT-NEXT: aspects=2|gCAAAAAAAAAAAAAABAAAAYAAAAQCAAAAMAAAAA

llvm/test/tools/sycl-post-link/emit_exported_symbols.ll

Lines changed: 7 additions & 7 deletions
@@ -6,17 +6,17 @@
 ;
 ; Per-module split
 ; RUN: sycl-post-link -symbols -split=source -emit-exported-symbols -S < %s -o %t.per_module.files.table
-; RUN: FileCheck %s -input-file=%t.per_module.files_0.prop -implicit-check-not="NotExported" --check-prefix=CHECK-PERMODULE-0-PROP
-; RUN: FileCheck %s -input-file=%t.per_module.files_1.prop -implicit-check-not="NotExported" --check-prefix=CHECK-KERNELONLY-PROP
+; RUN: FileCheck %s -input-file=%t.per_module.files_0.prop -implicit-check-not="NotExported" --check-prefix=CHECK-KERNELONLY-PROP
+; RUN: FileCheck %s -input-file=%t.per_module.files_1.prop -implicit-check-not="NotExported" --check-prefix=CHECK-PERMODULE-0-PROP
 ; RUN: FileCheck %s -input-file=%t.per_module.files_2.prop -implicit-check-not="NotExported" --check-prefix=CHECK-PERMODULE-2-PROP
 ;
 ; Per-kernel split
 ; RUN: sycl-post-link -symbols -split=kernel -emit-exported-symbols -S < %s -o %t.per_kernel.files.table
-; RUN: FileCheck %s -input-file=%t.per_kernel.files_0.prop --implicit-check-not="NotExported" --check-prefix=CHECK-PERKERNEL-0-PROP
-; RUN: FileCheck %s -input-file=%t.per_kernel.files_1.prop --implicit-check-not="NotExported" --check-prefix=CHECK-PERKERNEL-1-PROP
-; RUN: FileCheck %s -input-file=%t.per_kernel.files_2.prop --implicit-check-not="NotExported" --check-prefix=CHECK-PERKERNEL-2-PROP
-; RUN: FileCheck %s -input-file=%t.per_kernel.files_3.prop --implicit-check-not="NotExported" --check-prefix=CHECK-KERNELONLY-PROP
-; RUN: FileCheck %s -input-file=%t.per_kernel.files_4.prop --implicit-check-not="NotExported" --check-prefix=CHECK-KERNELONLY-PROP
+; RUN: FileCheck %s -input-file=%t.per_kernel.files_0.prop --implicit-check-not="NotExported" --check-prefix=CHECK-KERNELONLY-PROP
+; RUN: FileCheck %s -input-file=%t.per_kernel.files_1.prop --implicit-check-not="NotExported" --check-prefix=CHECK-KERNELONLY-PROP
+; RUN: FileCheck %s -input-file=%t.per_kernel.files_2.prop --implicit-check-not="NotExported" --check-prefix=CHECK-PERKERNEL-0-PROP
+; RUN: FileCheck %s -input-file=%t.per_kernel.files_3.prop --implicit-check-not="NotExported" --check-prefix=CHECK-PERKERNEL-1-PROP
+; RUN: FileCheck %s -input-file=%t.per_kernel.files_4.prop --implicit-check-not="NotExported" --check-prefix=CHECK-PERKERNEL-2-PROP
 
 target triple = "spir64-unknown-unknown"

llvm/test/tools/sycl-post-link/sycl-esimd-large-grf.ll

Lines changed: 7 additions & 7 deletions
@@ -9,16 +9,16 @@
 
 ; RUN: sycl-post-link -split=source -symbols -split-esimd -lower-esimd -S < %s -o %t.table
 ; RUN: FileCheck %s -input-file=%t.table
-; RUN: FileCheck %s -input-file=%t_esimd_large_grf_1.ll --check-prefixes CHECK-ESIMD-LargeGRF-IR --implicit-check-not='RegisterAllocMode'
-; RUN: FileCheck %s -input-file=%t_esimd_large_grf_1.prop --check-prefixes CHECK-ESIMD-LargeGRF-PROP
-; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-SYCL-SYM
-; RUN: FileCheck %s -input-file=%t_esimd_0.sym --check-prefixes CHECK-ESIMD-SYM
-; RUN: FileCheck %s -input-file=%t_esimd_large_grf_1.sym --check-prefixes CHECK-ESIMD-LargeGRF-SYM
+; RUN: FileCheck %s -input-file=%t_esimd_0.ll --check-prefixes CHECK-ESIMD-LargeGRF-IR --implicit-check-not='RegisterAllocMode'
+; RUN: FileCheck %s -input-file=%t_esimd_0.prop --check-prefixes CHECK-ESIMD-LargeGRF-PROP
+; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-SYCL-SYM
+; RUN: FileCheck %s -input-file=%t_esimd_1.sym --check-prefixes CHECK-ESIMD-SYM
+; RUN: FileCheck %s -input-file=%t_esimd_0.sym --check-prefixes CHECK-ESIMD-LargeGRF-SYM
 
 ; CHECK: [Code|Properties|Symbols]
-; CHECK: {{.*}}esimd-large-grf.ll.tmp_0.ll|{{.*}}esimd-large-grf.ll.tmp_0.prop|{{.*}}esimd-large-grf.ll.tmp_0.sym
-; CHECK: {{.*}}esimd-large-grf.ll.tmp_esimd_0.ll|{{.*}}esimd-large-grf.ll.tmp_esimd_0.prop|{{.*}}esimd-large-grf.ll.tmp_esimd_0.sym
+; CHECK: {{.*}}_0.ll|{{.*}}_0.prop|{{.*}}_0.sym
 ; CHECK: {{.*}}_1.ll|{{.*}}_1.prop|{{.*}}_1.sym
+; CHECK: {{.*}}_esimd_1.ll|{{.*}}_esimd_1.prop|{{.*}}_esimd_1.sym
 
 ; CHECK-ESIMD-LargeGRF-PROP: isEsimdImage=1|1
 ; CHECK-ESIMD-LargeGRF-PROP: isLargeGRF=1|1

llvm/test/tools/sycl-post-link/sycl-large-grf.ll

Lines changed: 5 additions & 5 deletions
@@ -9,13 +9,13 @@
 
 ; RUN: sycl-post-link -split=source -symbols -split-esimd -lower-esimd -S < %s -o %t.table
 ; RUN: FileCheck %s -input-file=%t.table
-; RUN: FileCheck %s -input-file=%t_large_grf_1.ll --check-prefixes CHECK-LARGE-GRF-IR
-; RUN: FileCheck %s -input-file=%t_large_grf_1.prop --check-prefixes CHECK-LARGE-GRF-PROP
-; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-SYCL-SYM
-; RUN: FileCheck %s -input-file=%t_large_grf_1.sym --check-prefixes CHECK-LARGE-GRF-SYM
+; RUN: FileCheck %s -input-file=%t_0.ll --check-prefixes CHECK-LARGE-GRF-IR
+; RUN: FileCheck %s -input-file=%t_0.prop --check-prefixes CHECK-LARGE-GRF-PROP
+; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-SYCL-SYM
+; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-LARGE-GRF-SYM
 
 ; CHECK: [Code|Properties|Symbols]
-; CHECK: {{.*}}-large-grf.ll.tmp_0.ll|{{.*}}-large-grf.ll.tmp_0.prop|{{.*}}-large-grf.ll.tmp_0.sym
+; CHECK: {{.*}}_0.ll|{{.*}}_0.prop|{{.*}}_0.sym
 ; CHECK: {{.*}}_1.ll|{{.*}}_1.prop|{{.*}}_1.sym
 
 ; CHECK-LARGE-GRF-PROP: isLargeGRF=1|1
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+; This test checks, that 'optLevel' property is only emitted based on the module
+; entry points
+;
+; In this test we have functions 'foo' and 'boo' defined in different
+; translation units. They are both entry points and 'foo' calls 'boo'.
+; As a result, we expect two modules:
+; - module with 'foo' (as entry point) and 'bar' (included as dependency) with
+; 'optLevel' set to 1 (taken from 'foo')
+; - module with 'bar' (as entry point) with 'optLevel' set to 2 (taken from
+; 'bar')
+
+; RUN: sycl-post-link -split=source -symbols -S < %s -o %t.table
+; RUN: FileCheck %s -input-file=%t.table
+; RUN: FileCheck %s -input-file=%t_0.prop --check-prefixes CHECK-OPT-LEVEL-PROP-0
+; RUN: FileCheck %s -input-file=%t_1.prop --check-prefixes CHECK-OPT-LEVEL-PROP-1
+; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-SYM-0
+; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-SYM-1
+; RUN: FileCheck %s -input-file=%t_0.ll --check-prefix CHECK-IR-0
+; RUN: FileCheck %s -input-file=%t_1.ll --check-prefix CHECK-IR-1
+
+; CHECK: [Code|Properties|Symbols]
+; CHECK: {{.*}}_0.ll|{{.*}}_0.prop|{{.*}}_0.sym
+; CHECK: {{.*}}_1.ll|{{.*}}_1.prop|{{.*}}_1.sym
+; CHECK-EMPTY:
+
+; CHECK-OPT-LEVEL-PROP-0: optLevel=1|1
+; CHECK-OPT-LEVEL-PROP-1: optLevel=1|2
+; CHECK-SYM-0: _Z3fooii
+; CHECK-SYM-0-EMPTY:
+; CHECK-SYM-1: _Z3barii
+;
+; CHECK-IR-0-DAG: define {{.*}} @_Z3fooii{{.*}} #[[#ATTR0:]]
+; CHECK-IR-0-DAG: define {{.*}} @_Z3barii{{.*}} #[[#ATTR1:]]
+; CHECK-IR-0-DAG: attributes #[[#ATTR0]] = { {{.*}} "sycl-optlevel"="1" }
+; CHECK-IR-0-DAG: attributes #[[#ATTR1]] = { {{.*}} "sycl-optlevel"="2" }
+;
+; CHECK-IR-1: define {{.*}} @_Z3barii{{.*}} #[[#ATTR0:]]
+; CHECK-IR-1: attributes #[[#ATTR0]] = { {{.*}} "sycl-optlevel"="2" }
+
+target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64"
+target triple = "spir64-unknown-unknown"
+
+define dso_local spir_func noundef i32 @_Z3fooii(i32 noundef %a, i32 noundef %b) local_unnamed_addr #0 {
+entry:
+  %call = call i32 @_Z3barii(i32 %a, i32 %b)
+  %sub = sub nsw i32 %a, %call
+  ret i32 %sub
+}
+
+define dso_local spir_func noundef i32 @_Z3barii(i32 noundef %a, i32 noundef %b) #1 {
+entry:
+  %retval = alloca i32, align 4
+  %a.addr = alloca i32, align 4
+  %b.addr = alloca i32, align 4
+  %retval.ascast = addrspacecast i32* %retval to i32 addrspace(4)*
+  %a.addr.ascast = addrspacecast i32* %a.addr to i32 addrspace(4)*
+  %b.addr.ascast = addrspacecast i32* %b.addr to i32 addrspace(4)*
+  store i32 %a, i32 addrspace(4)* %a.addr.ascast, align 4
+  store i32 %b, i32 addrspace(4)* %b.addr.ascast, align 4
+  %0 = load i32, i32 addrspace(4)* %a.addr.ascast, align 4
+  %1 = load i32, i32 addrspace(4)* %b.addr.ascast, align 4
+  %add = add nsw i32 %0, %1
+  ret i32 %add
+}
+
+attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "sycl-module-id"="test3.cpp" "sycl-optlevel"="1" }
+attributes #1 = { convergent mustprogress noinline norecurse nounwind optnone "sycl-module-id"="test2.cpp" "sycl-optlevel"="2" }
+

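The behaviour this new test expects (a module's `optLevel` property is taken from its entry points, not from functions pulled in as dependencies) could be modelled roughly as follows. This is only a sketch and not the `sycl-post-link` implementation: the `Fn` struct and the `moduleOptLevel` helper are hypothetical, and the sketch assumes that device code split has already placed entry points with different `sycl-optlevel` values into different modules.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified model of a function inside one split module.
struct Fn {
  std::string Name;
  bool IsEntryPoint;
  std::optional<int> OptLevel; // parsed "sycl-optlevel" attribute, if present
};

// Returns the value for the module's 'optLevel' property, or std::nullopt if
// no entry point carries the attribute. Only entry points are inspected.
std::optional<int> moduleOptLevel(const std::vector<Fn> &Module) {
  std::optional<int> Result;
  for (const Fn &F : Module) {
    if (!F.IsEntryPoint || !F.OptLevel)
      continue; // dependencies and unannotated functions do not contribute
    // Assumption: the split already separated entry points by opt level.
    assert((!Result || *Result == *F.OptLevel) && "mixed opt levels");
    Result = F.OptLevel;
  }
  return Result;
}
```

Applied to the two modules in the test, the module with entry point `_Z3fooii` (level 1) and dependency `_Z3barii` (level 2) yields 1, while the module whose only entry point is `_Z3barii` yields 2.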