Commit 838163b

[SYCL] WG-shared global variables must have external linkage
Hierarchical parallelism semantics are currently handled by SYCL-specific code generation and the LowerWGScope pass. CodeGen creates WG-shared global variables for automatic variables in PFWG scope, and the LowerWGScope pass creates WG-shared shadow variables to broadcast a private value from the leader work item to the other work items. These global variables are currently created with internal linkage, which is incorrect and leads to wrong transformations in the LLVM middle end. For example:

  ...
  if (Leader work item)
    store %PrivateValue to @SharedGlobal   -> leader shares the value
  memory_barrier()
  load %PrivateValue from @SharedGlobal    -> all WIs load the shared value
  ...

The generated load/store operations must not be moved across the memory barrier, but barrier intrinsics such as @llvm.nvvm.barrier0() are treated as regular functions in the LLVM middle end. Because a global with internal linkage is considered non-escaping, alias analysis concludes that @llvm.nvvm.barrier0() cannot modify the global variable and only reads it. As a result, GVN performs the following transformation:

  ...
  crit_edge:
    load %PrivateValue from @SharedGlobal  -> all WIs load the shared value
  if (Leader work item)
    store %PrivateValue to @SharedGlobal   -> leader shares the value
  memory_barrier()
  ...

That is why all WG-shared variables should have external linkage.

Signed-off-by: Artur Gainullin <[email protected]>
1 parent 4436fa2 commit 838163b
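The reordering described in the commit message can be illustrated with a small host-side model. This is only a sketch under stated assumptions: `WorkGroup`, `broadcast_in_order`, and `broadcast_load_hoisted` are hypothetical names that do not appear in the patch, work items are simulated sequentially rather than run on a device, and the barrier is represented only by a comment. It shows why a load hoisted above the leader's store observes a stale value in the non-leader work items.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side model of one work group. Work items are simulated
// sequentially; `shared` stands in for the WG-shared global (@SharedGlobal).
struct WorkGroup {
  int shared = -1;                  // stale contents before the broadcast
  std::vector<int> private_values;  // one private copy per work item
};

// Intended order: the leader stores, the barrier completes, then every
// work item loads the shared value.
std::vector<int> broadcast_in_order(WorkGroup wg) {
  assert(!wg.private_values.empty());
  wg.shared = wg.private_values[0];  // leader (work item 0) stores
  // memory_barrier() would sit here on a device
  for (int &value : wg.private_values)
    value = wg.shared;               // all work items load after the barrier
  return wg.private_values;
}

// Order after the load is hoisted above the store and the barrier:
// non-leaders read the stale value, the leader keeps its forwarded value.
std::vector<int> broadcast_load_hoisted(WorkGroup wg) {
  assert(!wg.private_values.empty());
  std::vector<int> loaded(wg.private_values.size(), wg.shared);  // early load
  loaded[0] = wg.private_values[0];  // leader's own value is forwarded
  wg.shared = wg.private_values[0];  // the store now happens too late
  // memory_barrier() would sit here on a device
  return loaded;
}
```

With `private_values` set to `{42, 0, 0, 0}`, the in-order version returns 42 for every work item, while the hoisted version returns 42 for the leader and the stale -1 for everyone else.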

7 files changed: 7 additions and 13 deletions


clang/lib/CodeGen/CGSYCLRuntime.cpp

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ void CGSYCLRuntime::emitWorkGroupLocalVarDecl(CodeGenFunction &CGF,
 #endif // NDEBUG
   // generate global variable in the address space selected by the clang CodeGen
   // (should be local)
-  CGF.EmitStaticVarDecl(D, llvm::GlobalValue::InternalLinkage);
+  CGF.EmitStaticVarDecl(D, llvm::GlobalValue::ExternalLinkage);
 }
 
 bool CGSYCLRuntime::actOnAutoVarEmit(CodeGenFunction &CGF, const VarDecl &D,

clang/test/CodeGenSYCL/wg_scope_var.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
// RUN: %clang_cc1 -triple spir64-unknown-unknown-sycldevice -fsycl-is-device -disable-llvm-passes -I %S/Inputs -emit-llvm %s -o - | FileCheck %s
22

33
// Checked that local variables declared by the user in PWFG scope are turned into globals in the local address space.
4-
// CHECK: @{{.*myLocal.*}} = internal addrspace(3) global i32 0
4+
// CHECK: @{{.*myLocal.*}} = addrspace(3) global i32 0
55

66
#include "sycl.hpp"
77

llvm/lib/SYCLLowerIR/LowerWGScope.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -838,7 +838,7 @@ GlobalVariable *spirv::createWGLocalVariable(Module &M, Type *T,
838838
new GlobalVariable(M, // module
839839
T, // type
840840
false, // isConstant
841-
GlobalValue::InternalLinkage, // Linkage
841+
GlobalValue::ExternalLinkage, // Linkage
842842
UndefValue::get(T), // Initializer
843843
Name, // Name
844844
nullptr, // InsertBefore

llvm/test/SYCLLowerIR/byval_arg.ll

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66

77
%struct.baz = type { i64 }
88

9-
; CHECK: @[[SHADOW:[a-zA-Z0-9]+]] = internal unnamed_addr addrspace(3) global %struct.baz undef
9+
; CHECK: @[[SHADOW:[a-zA-Z0-9]+]] = unnamed_addr addrspace(3) global %struct.baz
1010

1111
define internal spir_func void @wibble(%struct.baz* byval(%struct.baz) %arg1) !work_group_scope !0 {
1212
; CHECK-LABEL: @wibble(

llvm/test/SYCLLowerIR/pfwg_and_pfwi.ll

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -13,9 +13,9 @@
1313
%struct.foo = type { %struct.barney }
1414
%struct.foo.0 = type { i8 }
1515

16-
; CHECK: @[[PFWG_SHADOW:.*]] = internal unnamed_addr addrspace(3) global %struct.bar addrspace(4)*
17-
; CHECK: @[[PFWI_SHADOW:.*]] = internal unnamed_addr addrspace(3) global %struct.foo.0
18-
; CHECK: @[[GROUP_SHADOW:.*]] = internal unnamed_addr addrspace(3) global %struct.zot
16+
; CHECK: @[[PFWG_SHADOW:.*]] = unnamed_addr addrspace(3) global %struct.bar addrspace(4)*
17+
; CHECK: @[[PFWI_SHADOW:.*]] = unnamed_addr addrspace(3) global %struct.foo.0
18+
; CHECK: @[[GROUP_SHADOW:.*]] = unnamed_addr addrspace(3) global %struct.zot
1919

2020
define internal spir_func void @wibble(%struct.bar addrspace(4)* %arg, %struct.zot* byval(%struct.zot) align 8 %arg1) align 2 !work_group_scope !0 {
2121
; CHECK-LABEL: @wibble(

sycl/test/hier_par/hier_par_basic.cpp

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,6 @@
1212
// RUN: %GPU_RUN_PLACEHOLDER %t.out
1313
// RUN: %ACC_RUN_PLACEHOLDER %t.out
1414

15-
// TODO: ptxas fatal : Unresolved extern function '__spirv_ControlBarrier'
16-
// XFAIL: cuda
17-
1815
// This test checks hierarchical parallelism invocation APIs, but without any
1916
// data or code with side-effects between the work group and work item scopes.
2017

sycl/test/hier_par/hier_par_wgscope.cpp

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -18,9 +18,6 @@
1818
// RUN: %GPU_RUN_PLACEHOLDER %t.out
1919
// RUN: %ACC_RUN_PLACEHOLDER %t.out
2020

21-
// TODO: ptxas fatal : Unresolved extern function '__spirv_ControlBarrier'
22-
// UNSUPPORTED: cuda
23-
2421
// This test checks correctness of hierarchical kernel execution when there is
2522
// code and data in the work group scope.
2623
