Skip to content

[SYCL][ESIMD] Lower global volatile stores to vstores #9088

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 19, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
58 changes: 51 additions & 7 deletions llvm/lib/SYCLLowerIR/ESIMD/LowerESIMD.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -131,8 +131,8 @@ enum class lsc_subopcode : uint8_t {
// /^_Z(\d+)__esimd_\w+/
static constexpr char ESIMD_INTRIN_PREF0[] = "_Z";
static constexpr char ESIMD_INTRIN_PREF1[] = "__esimd_";
static constexpr char ESIMD_INSERTED_VSTORE_FUNC_NAME[] = "_Z14__esimd_vstorev";
static constexpr char SPIRV_INTRIN_PREF[] = "__spirv_BuiltIn";

struct ESIMDIntrinDesc {
// Denotes argument translation rule kind.
enum GenXArgRuleKind {
Expand Down Expand Up @@ -714,10 +714,10 @@ static bool isDevicelibFunction(StringRef FunctionName) {
.Default(false);
}

// Mangle deviceLib function to make it pass through the regular workflow
// These functions are defined as extern "C" which Demangler that is used
// fails to handle properly.
static std::string mangleDevicelibFunction(StringRef FunctionName) {
static std::string mangleFunction(StringRef FunctionName) {
// Mangle deviceLib function to make it pass through the regular workflow
// These functions are defined as extern "C" which Demangler that is used
// fails to handle properly.
if (isDevicelibFunction(FunctionName)) {
if (FunctionName.startswith("__devicelib_ConvertFToBF16INTEL")) {
return (Twine("_Z31") + FunctionName + "RKf").str();
Expand All @@ -726,6 +726,11 @@ static std::string mangleDevicelibFunction(StringRef FunctionName) {
return (Twine("_Z31") + FunctionName + "RKt").str();
}
}
// Every inserted vstore gets its own function with the same name,
// so they are mangled with ".[0-9]+". Just use the
// raw name to pass through the demangler.
if (FunctionName.startswith(ESIMD_INSERTED_VSTORE_FUNC_NAME))
return ESIMD_INSERTED_VSTORE_FUNC_NAME;
return FunctionName.str();
}

Expand Down Expand Up @@ -1469,7 +1474,7 @@ static void translateESIMDIntrinsicCall(CallInst &CI) {
using Demangler = id::ManglingParser<SimpleAllocator>;
Function *F = CI.getCalledFunction();
llvm::esimd::assert_and_diag(F, "function to translate is invalid");
std::string MnglNameStr = mangleDevicelibFunction(F->getName());
std::string MnglNameStr = mangleFunction(F->getName());
StringRef MnglName = MnglNameStr;

Demangler Parser(MnglName.begin(), MnglName.end());
Expand Down Expand Up @@ -1718,6 +1723,45 @@ SmallPtrSet<Type *, 4> collectGenXVolatileTypes(Module &M) {
return GenXVolatileTypeSet;
}

// genx_volatile variables are special and require vstores instead of stores.
// In most cases, the vstores are called directly in the implementation
// of the simd object operations, but in some cases clang can implicitly
// insert stores, such as after a write in inline assembly. To handle that
// case, lower any stores of genx_volatiles into vstores.
void lowerGlobalStores(Module &M, const SmallPtrSet<Type *, 4> &GVTS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can pass SmallPtrSetImpl<Type *> to not carry 4.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GVTS variable is created and used elsewhere and already has type SmallPtrSet<Type *, 4>, and I don't think we are supposed to directly use Impl types, so I would prefer to leave this one as-is if that's okay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impl types are a common pattern in LLVM to pass the ADT as parameter to the function.
Please refer to the Note section on the bottom of this chapter in LLVM Programmer's Manual.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i didnt know of this use case, ill make a follow up pr to address this feedback, thx

SmallVector<Instruction *, 4> ToErase;
for (auto &F : M.functions()) {
for (Instruction &I : instructions(F)) {
auto SI = dyn_cast_or_null<StoreInst>(&I);
if (!SI)
continue;
if (GVTS.find(SI->getValueOperand()->getType()) == GVTS.end())
continue;
SmallVector<Type *, 2> ArgTypes;
IRBuilder<> Builder(SI);
ArgTypes.push_back(SI->getPointerOperand()->getType());
ArgTypes.push_back(SI->getValueOperand()->getType());
auto *NewFType = FunctionType::get(SI->getType(), ArgTypes, false);
auto *NewF =
Function::Create(NewFType, GlobalVariable::ExternalWeakLinkage,
ESIMD_INSERTED_VSTORE_FUNC_NAME, M);
NewF->addFnAttr(Attribute::NoUnwind);
NewF->addFnAttr(Attribute::Convergent);
NewF->setDSOLocal(true);
NewF->setCallingConv(CallingConv::SPIR_FUNC);
SmallVector<Value *, 2> Args;
Args.push_back(SI->getPointerOperand());
Args.push_back(SI->getValueOperand());
auto *NewCI = Builder.CreateCall(NewFType, NewF, Args);
NewCI->setDebugLoc(SI->getDebugLoc());
ToErase.push_back(SI);
}
}
for (auto *Inst : ToErase) {
Inst->eraseFromParent();
}
}

} // namespace

PreservedAnalyses SYCLLowerESIMDPass::run(Module &M, ModuleAnalysisManager &) {
Expand All @@ -1726,7 +1770,7 @@ PreservedAnalyses SYCLLowerESIMDPass::run(Module &M, ModuleAnalysisManager &) {
// uses the generated metadata:
size_t AmountOfESIMDIntrCalls = lowerSLMReservationCalls(M);
SmallPtrSet<Type *, 4> GVTS = collectGenXVolatileTypes(M);

lowerGlobalStores(M, GVTS);
for (auto &F : M.functions()) {
AmountOfESIMDIntrCalls += this->runOnFunction(F, GVTS);
}
Expand Down
24 changes: 24 additions & 0 deletions llvm/test/SYCLLowerIR/ESIMD/lower_global_stores.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
; This test checks whether global stores are converted to vstores
;
; RUN: opt < %s -passes=LowerESIMD -S | FileCheck %s

target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64"
target triple = "spir64-unknown-unknown"

%"class.sycl::_V1::ext::intel::esimd::simd" = type { %"class.sycl::_V1::ext::intel::esimd::detail::simd_obj_impl" }
%"class.sycl::_V1::ext::intel::esimd::detail::simd_obj_impl" = type { <16 x float> }

@va = dso_local global %"class.sycl::_V1::ext::intel::esimd::simd" zeroinitializer, align 64 #0
@vb = dso_local global %"class.sycl::_V1::ext::intel::esimd::simd" zeroinitializer, align 64 #0

define weak_odr dso_local spir_kernel void @foo() {
%1 = call <16 x float> asm "", "=rw"()
; CHECK: call void @llvm.genx.vstore.v16f32.p0v16f32(<16 x float> %1, <16 x float>* getelementptr inbounds (%"class.sycl::_V1::ext::intel::esimd::simd", %"class.sycl::_V1::ext::intel::esimd::simd"* @va, i64 0, i32 0, i32 0))
store <16 x float> %1, <16 x float>* getelementptr inbounds (%"class.sycl::_V1::ext::intel::esimd::simd", %"class.sycl::_V1::ext::intel::esimd::simd"* @va, i64 0, i32 0, i32 0)
; CHECK-NEXT: call void @llvm.genx.vstore.v16f32.p0v16f32(<16 x float> %1, <16 x float>* getelementptr inbounds (%"class.sycl::_V1::ext::intel::esimd::simd", %"class.sycl::_V1::ext::intel::esimd::simd"* @vb, i64 0, i32 0, i32 0))
store <16 x float> %1, <16 x float>* getelementptr inbounds (%"class.sycl::_V1::ext::intel::esimd::simd", %"class.sycl::_V1::ext::intel::esimd::simd"* @vb, i64 0, i32 0, i32 0)
ret void
}

attributes #0 = { "genx_byte_offset"="0" "genx_volatile" }
attributes #1 = { "" }
Original file line number Diff line number Diff line change
Expand Up @@ -819,7 +819,6 @@ The parameter and the return type in the ABI form will be `<8 x float>`.
Inline assembly is supported with ESIMD classes `simd`, `simd_mask` and `simd_view`. `simd_view` only supports read operations.
In order the access the raw underlying vector required for inline assembly, the `data` function can be used for read-only access and
the `data_ref` function can be used for write access. The `data_ref` function only exists for `simd` and `simd_mask`, and should only be used in inline assembly.
If the `simd` or `simd_mask` object is a private global variable, the `commit` function must be called after any write in inline assembly.

Example of inline GEN assembly:
```cpp
Expand All @@ -842,7 +841,6 @@ void calledFromKernel() {
__asm__("add (M1, 16) %0 %1 %2"
: "=rw"(vc.data_ref())
: "rw"(va.data()), "rw"(vb.data()));
vc.commit();
}
```

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -361,7 +361,9 @@ class simd_obj_impl {

/// Commit the current stored underlying raw vector to memory.
/// This is required when using inline assembly with private global variables.
void commit() { __esimd_vstore<RawTy, N>(&M_data, M_data); }
__SYCL_DEPRECATED(
"commit is deprecated and will be removed in a future release")
void commit() {}

/// @return Newly constructed (from the underlying data) object of the Derived
/// type.
Expand Down
1 change: 0 additions & 1 deletion sycl/test-e2e/ESIMD/InlineAsm/asm_glb.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,6 @@ int main(void) {
__asm__("add (M1, 16) %0 %1 %2"
: "=rw"(vc.data_ref())
: "rw"(va.data()), "rw"(vb.data()));
vc.commit();
#else
vc = va+vb;
#endif
Expand Down
3 changes: 0 additions & 3 deletions sycl/test/esimd/simd_inline_asm.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,4 @@ void test_error() SYCL_ESIMD_FUNCTION {
__asm__("%0" : "=rw"(mask.data()));

__asm__("%0" : "=rw"(mask.data_ref()));

s.commit();
mask.commit();
}