[ESIMD] Implement stateless memory accesses enforcement #6287
Merged
18 commits
f1bf868 [ESIMD] Implement stateless memory accesses enforcement (v-klochkov)
eadc644 Add the bool argument EnforceStateless to ESIMDVerifier pass (v-klochkov)
1568f79 clang-format (v-klochkov)
cda2d69 Address reviewer's comments (all except more tests and documentation) (v-klochkov)
225d5a1 Merge remote-tracking branch 'intel_llvm/sycl' into esimd_stateless_acc (v-klochkov)
e6bb2e9 Address reviewer's comments (couple tests + func for repeated code pa… (v-klochkov)
ceeed8d Merge remote-tracking branch 'intel_llvm/sycl' into esimd_stateless_acc (v-klochkov)
6034502 Fix an error in scatter implementation (v-klochkov)
50e06cb Merge remote-tracking branch 'intel_llvm/sycl' into esimd_stateless_acc (v-klochkov)
da4184f Fix LIT test (v-klochkov)
d4383ad Disable intrinsics that are not supported with __ESIMD_FORCE_STATELES… (v-klochkov)
55b3c24 Add the description for -fsycl-esimd-force-stateless-mem to user manual (v-klochkov)
1f3f757 Marked the new option with the "EXPERIMENTAL" keyword (v-klochkov)
8559274 Update the option description/definition in driver. Update UserManual. (v-klochkov)
80b009d Merge remote-tracking branch 'intel_llvm/sycl' into esimd_stateless_acc (v-klochkov)
fc687f6 Address reviewer's comments. (v-klochkov)
e910230 Merge remote-tracking branch 'intel_llvm/sycl' into esimd_stateless_acc (v-klochkov)
11d9c6d Address reviewer's comment (fix in UserManual.md only) (v-klochkov)
@@ -0,0 +1,14 @@

/// Verify that the driver option is translated to corresponding options
/// to device compilation and sycl-post-link.
// RUN: %clang -### -fsycl -fsycl-esimd-force-stateless-mem \
// RUN: %s 2>&1 | FileCheck -check-prefix=CHECK-PASS-TO-COMPS %s
// CHECK-PASS-TO-COMPS: clang{{.*}} "-fsycl-esimd-force-stateless-mem"
// CHECK-PASS-TO-COMPS: sycl-post-link{{.*}} "-lower-esimd-force-stateless-mem"
// CHECK-PASS-TO-COMPS-NOT: clang{{.*}} "-fsycl-is-host" {{.*}}"-fsycl-esimd-force-stateless-mem"
// CHECK-PASS-TO-COMPS-NOT: clang{{.*}} "-fsycl-esimd-force-stateless-mem" {{.*}}"-fsycl-is-host"

/// Verify that stateless memory accesses mapping is not enforced by default
// RUN: %clang -### -fsycl %s 2>&1 | FileCheck -check-prefix=CHECK-DEFAULT %s
// CHECK-DEFAULT-NOT: clang{{.*}} "-fsycl-esimd-force-stateless-mem"
// CHECK-DEFAULT-NOT: sycl-post-link{{.*}} "-lower-esimd-force-stateless-mem"
11 changes: 11 additions & 0 deletions
clang/test/Preprocessor/sycl-esimd-force-stateless-mem.cpp
@@ -0,0 +1,11 @@
/// This test checks that the macro __ESIMD_FORCE_STATELESS_MEM is automatically
/// defined only if the option -fsycl-esimd-force-stateless-mem is used.

// RUN: %clang_cc1 %s -fsycl-is-device -fsycl-esimd-force-stateless-mem -E -dM | FileCheck --check-prefix=CHECK-OPT %s

// RUN: %clang_cc1 %s -E -dM | FileCheck --check-prefix=CHECK-NOOPT %s
// RUN: %clang_cc1 %s -fsycl-is-device -E -dM | FileCheck --check-prefix=CHECK-NOOPT %s
// RUN: %clang_cc1 %s -fsycl-is-host -E -dM | FileCheck --check-prefix=CHECK-NOOPT %s

// CHECK-OPT:#define __ESIMD_FORCE_STATELESS_MEM 1
// CHECK-NOOPT-NOT:#define __ESIMD_FORCE_STATELESS_MEM 1
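
Since the test above establishes that `__ESIMD_FORCE_STATELESS_MEM` is defined only when `-fsycl-esimd-force-stateless-mem` is passed to device compilation, user code could in principle branch on it. The following is a minimal sketch of that idea, not code from this PR; the function name `esimd_memory_mode` and the returned labels are hypothetical.

```cpp
#include <string>

// Hypothetical helper (not part of this PR): reports which memory-access
// mode the current translation unit was compiled under, based solely on
// the presence of the __ESIMD_FORCE_STATELESS_MEM macro.
inline std::string esimd_memory_mode() {
#ifdef __ESIMD_FORCE_STATELESS_MEM
  // Defined by the compiler when -fsycl-esimd-force-stateless-mem is used.
  return "stateless-enforced";
#else
  // Default behavior: stateless accesses are not enforced.
  return "default";
#endif
}
```

Compiled without the option (the default), the helper takes the `#else` branch, which mirrors the `CHECK-NOOPT` expectations in the test.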
@@ -290,6 +290,31 @@ and not recommended to use in production environment.

NOTE: This flag is currently only supported with the CUDA and HIP targets.


**`-f[no-]sycl-esimd-force-stateless-mem`** [EXPERIMENTAL]

Enforces stateless memory access and enables the automatic conversion of
"stateful" memory access via SYCL accessors to "stateless" within ESIMD
(Explicit SIMD) kernels.

-fsycl-esimd-force-stateless-mem disables the intrinsics and methods
accepting SYCL accessors or "surface-index" which cannot be automatically
converted to their "stateless" equivalents.

-fno-sycl-esimd-force-stateless-mem is used to tell the compiler not to
enforce usage of stateless memory accesses. This is the default behavior.

NOTE: "Stateful" access is the one that uses a SYCL accessor or a pair
of "surface-index" + 32-bit byte-offset and uses specific memory access
data port messages to read/write/fetch.
"Stateless" memory access uses a memory location represented by a virtual
memory address pointer such as a USM pointer.

"Stateless" memory access may be beneficial as it does not have the limit
of 4 GB per surface.
Also, some Intel GPUs or GPU run-times/drivers may support only
"stateless" memory accesses.

Review comment: Please add information about default value.
Reply: Added. Thank you.
# Example: SYCL device code compilation

To invoke SYCL device compiler set `-fsycl-device-only` flag.
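
The option description above contrasts a "stateful" access (surface index plus a 32-bit byte offset) with a "stateless" access (a virtual address such as a USM pointer). As a rough illustration of why the former is capped per surface and the latter is not, here is a toy model. It is only a sketch under stated assumptions: `StatefulRef`, `kMaxSurfaceBytes`, and `stateless_address` are hypothetical names, not ESIMD APIs, and the real conversion performed by the compiler is not shown here.

```cpp
#include <cstdint>

// Toy model only: none of these names exist in the ESIMD API.
// A "stateful" reference pairs a surface index with a 32-bit byte offset.
struct StatefulRef {
  std::uint32_t surface_index;  // identifies the surface (e.g. an accessor's buffer)
  std::uint32_t byte_offset;    // offset within that surface, limited to 32 bits
};

// A 32-bit byte offset can distinguish at most 2^32 byte positions,
// which is where a 4 GB per-surface limit comes from.
constexpr std::uint64_t kMaxSurfaceBytes = std::uint64_t{1} << 32;
static_assert(kMaxSurfaceBytes == 4294967296ULL, "2^32 bytes");

// A "stateless" access is modeled as a plain 64-bit virtual address:
// base pointer value plus a byte offset that is no longer limited to 32 bits.
constexpr std::uint64_t stateless_address(std::uint64_t base,
                                          std::uint64_t byte_offset) {
  return base + byte_offset;
}

// An offset at the 2^32 boundary no longer fits in StatefulRef::byte_offset,
// but is representable in the stateless model.
static_assert(stateless_address(0, kMaxSurfaceBytes) == 4294967296ULL,
              "offsets beyond 32 bits are representable");
```

In this model, any offset of 2^32 bytes or more cannot be stored in `StatefulRef::byte_offset`, while `stateless_address` handles it without special treatment.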