[SYCL-MLIR] Merge from intel/llvm sycl branch #8431

Merged 25 commits on Feb 23, 2023
Commits
629dbc9
[SYCL] Add alignment support to compile time properties (#7924)
Feb 17, 2023
4bd0876
[NFC] Refactor FPGA based offloading tests (#8398)
mdtoguchi Feb 18, 2023
e60c549
[SYCL][Fusion] Update SPIR-V version to 1.4 (#8358)
MrSidims Feb 18, 2023
b0bfdfa
[SYCL] Add imf simd emulation APIs to sycl_ext_intel_math (#8262)
jinge90 Feb 20, 2023
ec34869
[SYCL][Docs] Add design document for device aspect traits (#8182)
steffenlarsen Feb 20, 2023
7a29f61
[CI] Use ubuntu-20.04 image
bader Feb 20, 2023
8b55761
[OpenCL] Disable vector to scalar types coercion for OpenCL (#8160)
cdai2 Feb 20, 2023
429e7fa
[CI] Use ubuntu-20.04 image
bader Feb 21, 2023
b66236a
[SYCL][NFC] Use reducer-access helper function instead of deduction g…
steffenlarsen Feb 21, 2023
d0b25d4
[SYCL][CUDA] Support host-device memcpy2D (#8181)
abagusetty Feb 21, 2023
8f5000c
[SYCL][CUDA] Define __SYCL_CUDA_ARCH__ instead of __CUDA_ARCH__ for S…
GeorgeWeb Feb 21, 2023
6761f0e
[NFC][clang][SYCL] Refine the test check by adding `:` (#8416)
Fznamznon Feb 21, 2023
039b538
[ESIMD] Reduce number of bit-casts generated for lsc_block_load/store…
fineg74 Feb 21, 2023
84fe658
[SYCL] Add sub-group functions emulation for vector of doubles. (#8252)
maksimsab Feb 22, 2023
1b22544
[SYCL][PI][CUDA][HIP] Fix bugs that can cause events not to be waited…
t4c1 Feb 22, 2023
570dc5e
[SYCL][Fusion] Do not internalize stored argument pointers (#8376)
victor-eds Feb 22, 2023
928645a
[SYCL] Avoid optimizing out integer conversion (#8409)
steffenlarsen Feb 22, 2023
d6f5b35
[SYCL] Implement device_global host-side memory operations (#8022)
steffenlarsen Feb 22, 2023
b3b5985
[Doc] Remove unused SPV_INTEL_non_constant_addrspace_printf extension…
vmaksimo Feb 22, 2023
812cd8a
[SPIR-V] Cherry-pick of "Add SPIR-V 1.4 checks" (#7493)
MrSidims Feb 22, 2023
045f5ab
[SYCL][Doc] Update sycl_ext_oneapi_sub_group_mask (#8174)
AlexeySachkov Feb 22, 2023
60e97e7
[Driver][SYCL] Improve AOT option passing with intel_gpu targets (#8419)
mdtoguchi Feb 22, 2023
a1787de
[SYCL] Add missing marray binary operator overloads (#8276)
steffenlarsen Feb 22, 2023
e945d42
Revert "Revert "[SYCL][Reduction] Hide reducer non-standard members a…
whitneywhtsang Feb 22, 2023
d226c17
Merge remote-tracking branch 'upstream/sycl' into sycl-mlir
whitneywhtsang Feb 22, 2023
2 changes: 1 addition & 1 deletion .github/workflows/gh_pages.yml
@@ -13,7 +13,7 @@ on:

jobs:
build:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
if: github.repository == 'intel/llvm'
steps:
- uses: actions/checkout@v3
2 changes: 1 addition & 1 deletion .github/workflows/sycl_cleanup.yml
@@ -10,7 +10,7 @@ on:

jobs:
cleanup:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
steps:
- uses: actions/github-script@v6
with:
16 changes: 8 additions & 8 deletions .github/workflows/sycl_containers.yaml
@@ -25,7 +25,7 @@ jobs:
base_image_ubuntu2004:
if: github.repository == 'intel/llvm'
name: Base Ubuntu 20.04 Docker image
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
steps:
- name: Checkout
uses: actions/checkout@v3
@@ -44,7 +44,7 @@ jobs:
build_image_ubuntu2004:
if: github.repository == 'intel/llvm'
name: Build Ubuntu Docker image
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
steps:
- name: Checkout
uses: actions/checkout@v3
@@ -66,7 +66,7 @@ jobs:
drivers_image_ubuntu2004:
if: github.repository == 'intel/llvm'
name: Intel Drivers Ubuntu 20.04 Docker image
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
needs: base_image_ubuntu2004
steps:
- name: Checkout
@@ -105,7 +105,7 @@ jobs:
drivers_image_ubuntu2004_unstable:
if: github.repository == 'intel/llvm'
name: Intel Drivers (unstable) Ubuntu 20.04 Docker image
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
needs: base_image_ubuntu2004
steps:
- name: Checkout
@@ -136,7 +136,7 @@ jobs:
base_image_ubuntu2204:
if: github.repository == 'intel/llvm'
name: Base Ubuntu 22.04 Docker image
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
steps:
- name: Checkout
uses: actions/checkout@v3
@@ -155,7 +155,7 @@ jobs:
build_image_ubuntu2204:
if: github.repository == 'intel/llvm'
name: Build Ubuntu Docker image
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
steps:
- name: Checkout
uses: actions/checkout@v3
@@ -177,7 +177,7 @@ jobs:
drivers_image_ubuntu2204:
if: github.repository == 'intel/llvm'
name: Intel Drivers Ubuntu 22.04 Docker image
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
needs: base_image_ubuntu2204
steps:
- name: Checkout
@@ -215,7 +215,7 @@ jobs:
drivers_image_ubuntu2204_unstable:
if: github.repository == 'intel/llvm'
name: Intel Drivers (unstable) Ubuntu 22.04 Docker image
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
needs: base_image_ubuntu2204
steps:
- name: Checkout
2 changes: 1 addition & 1 deletion .github/workflows/sycl_gen_test_matrix.yml
@@ -46,7 +46,7 @@ on:
jobs:
test_matrix:
name: Generate Test Matrix
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
outputs:
lts_matrix: ${{ steps.work.outputs.lts_matrix }}
cts_matrix: ${{ steps.work.outputs.cts_matrix }}
2 changes: 1 addition & 1 deletion .github/workflows/sycl_stale_issues.yml
@@ -6,7 +6,7 @@ on:

jobs:
close-issues:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
steps:
- uses: actions/stale@v4
with:
1 change: 1 addition & 0 deletions clang/include/clang/Basic/LangOptions.def
@@ -350,6 +350,7 @@ LANGOPT(ObjCDisableDirectMethodsForTesting, 1, 0,
"Disable recognition of objc_direct methods")
LANGOPT(CFProtectionBranch , 1, 0, "Control-Flow Branch Protection enabled")
LANGOPT(FakeAddressSpaceMap , 1, 0, "OpenCL fake address space map")
+LANGOPT(OpenCLForceVectorABI, 1, 0, "OpenCL vector to scalar coercion disabling")
ENUM_LANGOPT(AddressSpaceMapMangling , AddrSpaceMapMangling, 2, ASMM_Target, "OpenCL address space map mangling mode")
LANGOPT(IncludeDefaultHeader, 1, 0, "Include default header file for OpenCL")
LANGOPT(DeclareOpenCLBuiltins, 1, 0, "Declare OpenCL builtin functions")
3 changes: 3 additions & 0 deletions clang/include/clang/Driver/Options.td
@@ -6474,6 +6474,9 @@ defm const_strings : BoolOption<"f", "const-strings",
def fno_bitfield_type_align : Flag<["-"], "fno-bitfield-type-align">,
HelpText<"Ignore bit-field types when aligning structures">,
MarshallingInfoFlag<LangOpts<"NoBitFieldTypeAlign">>;
+def fopencl_force_vector_abi : Flag<["-"], "fopencl-force-vector-abi">,
+  HelpText<"Disable vector to scalar coercion for OpenCL">,
+  MarshallingInfoFlag<LangOpts<"OpenCLForceVectorABI">>;
def ffake_address_space_map : Flag<["-"], "ffake-address-space-map">,
HelpText<"Use a fake address space map; OpenCL testing purposes only">,
MarshallingInfoFlag<LangOpts<"FakeAddressSpaceMap">>;
10 changes: 8 additions & 2 deletions clang/lib/Basic/Targets/NVPTX.cpp
@@ -169,7 +169,8 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
Builder.defineMacro("__PTX__");
Builder.defineMacro("__NVPTX__");
if (Opts.CUDAIsDevice || Opts.OpenMPIsDevice || Opts.SYCLIsDevice) {
-    // Set __CUDA_ARCH__ for the GPU specified.
+    // Set __CUDA_ARCH__ or __SYCL_CUDA_ARCH__ for the GPU specified.
+    // The SYCL-specific macro is used to distinguish the SYCL and CUDA APIs.
std::string CUDAArchCode = [this] {
switch (GPU) {
case CudaArch::GFX600:
@@ -260,7 +261,12 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
}
llvm_unreachable("unhandled CudaArch");
}();
-    Builder.defineMacro("__CUDA_ARCH__", CUDAArchCode);
+
+    if (Opts.SYCLIsDevice) {
+      Builder.defineMacro("__SYCL_CUDA_ARCH__", CUDAArchCode);
+    } else {
+      Builder.defineMacro("__CUDA_ARCH__", CUDAArchCode);
+    }
}
}
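
As an illustration of what the macro split above means for user code, the sketch below branches on whichever macro is visible. It assumes only what the hunk shows: a SYCL device pass for NVPTX defines `__SYCL_CUDA_ARCH__`, while other device passes keep `__CUDA_ARCH__`. The helper name `deviceArchCode` is hypothetical and not part of the PR.

```cpp
// Hypothetical helper (not from the PR): returns the NVPTX arch code that the
// current compilation pass exposes, or 0 when neither macro is defined.
constexpr int deviceArchCode() {
#if defined(__SYCL_CUDA_ARCH__)
  return __SYCL_CUDA_ARCH__; // SYCL device pass targeting NVPTX
#elif defined(__CUDA_ARCH__)
  return __CUDA_ARCH__;      // CUDA or OpenMP device pass
#else
  return 0;                  // host pass, or a target other than NVPTX
#endif
}
```

Code that previously tested `__CUDA_ARCH__` to detect a SYCL device pass on NVPTX would, under this change, need to test `__SYCL_CUDA_ARCH__` instead.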

46 changes: 46 additions & 0 deletions clang/lib/CodeGen/TargetInfo.cpp
@@ -100,6 +100,41 @@ Address ABIInfo::EmitMSVAArg(CodeGenFunction &CGF, Address VAListAddr,
return Address::invalid();
}

+static ABIArgInfo classifyOpenCL(QualType Ty, ASTContext &Context) {
+  if (Ty->isVoidType())
+    return ABIArgInfo::getIgnore();
+
+  if (const EnumType *EnumTy = Ty->getAs<EnumType>())
+    Ty = EnumTy->getDecl()->getIntegerType();
+
+  if (const RecordType *RT = Ty->getAs<RecordType>())
+    return ABIArgInfo::getIndirect(Context.getTypeAlignInChars(RT),
+                                   /*ByVal=*/false);
+
+  if (Context.isPromotableIntegerType(Ty))
+    return ABIArgInfo::getExtend(Ty);
+
+  return ABIArgInfo::getDirect();
+}
+
+static bool doOpenCLClassification(CGFunctionInfo &FI, ASTContext &Context) {
+  if (!Context.getLangOpts().OpenCL)
+    return false;
+  if (!Context.getLangOpts().OpenCLForceVectorABI)
+    return false;
+
+  // Use OpenCL classify to prevent coercing.
+  // Vector ABI must be enforced by enabling the corresponding option.
+  // Otherwise, vector types will be coerced to a matching integer
+  // type to conform with ABI, e.g.: <8 x i8> will be coerced to i64.
+  FI.getReturnInfo() = classifyOpenCL(FI.getReturnType(), Context);
+
+  for (auto &Arg : FI.arguments())
+    Arg.info = classifyOpenCL(Arg.type, Context);
+
+  return true;
+}
+
static llvm::Type *getVAListElementType(CodeGenFunction &CGF) {
return CGF.ConvertTypeForMem(
CGF.getContext().getBuiltinVaListType()->getPointeeType());
@@ -1984,6 +2019,10 @@ ABIArgInfo X86_32ABIInfo::classifyArgumentType(QualType Ty,
}

void X86_32ABIInfo::computeInfo(CGFunctionInfo &FI) const {
+  ASTContext &Context = getContext();
+  if (doOpenCLClassification(FI, Context))
+    return;
+
CCState State(FI);
if (IsMCUABI)
State.FreeRegs = 3;
@@ -3970,6 +4009,9 @@ X86_64ABIInfo::classifyRegCallStructType(QualType Ty, unsigned &NeededInt,
}

void X86_64ABIInfo::computeInfo(CGFunctionInfo &FI) const {
+  ASTContext &Context = getContext();
+  if (doOpenCLClassification(FI, Context))
+    return;

const unsigned CallingConv = FI.getCallingConvention();
// It is possible to force Win64 calling convention on any x86_64 target by
@@ -4427,6 +4469,10 @@ ABIArgInfo WinX86_64ABIInfo::classify(QualType Ty, unsigned &FreeSSERegs,
}

void WinX86_64ABIInfo::computeInfo(CGFunctionInfo &FI) const {
+  ASTContext &Context = getContext();
+  if (doOpenCLClassification(FI, Context))
+    return;
+
const unsigned CC = FI.getCallingConvention();
bool IsVectorCall = CC == llvm::CallingConv::X86_VectorCall;
bool IsRegCall = CC == llvm::CallingConv::X86_RegCall;
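
The decision order in `classifyOpenCL` and the gating in `doOpenCLClassification` can be modelled outside clang as a rough sketch. The enums and function names below are hypothetical stand-ins for `QualType` categories and `ABIArgInfo` kinds; they are not clang APIs and only mirror the sequence of checks in the hunks above.

```cpp
// Hypothetical stand-ins for the type categories the PR's code distinguishes.
enum class TypeKind { Void, Enum, Record, PromotableInt, Vector, Other };

// Hypothetical stand-ins for the ABIArgInfo kinds returned in the PR.
enum class ArgKind { Ignore, Indirect, Extend, Direct };

// Mirrors the order of checks in classifyOpenCL. For enums, the caller says
// whether the underlying integer type is promotable (an assumption of this model).
ArgKind classifyOpenCLModel(TypeKind Ty, bool EnumUnderlyingIsPromotable = false) {
  if (Ty == TypeKind::Void)
    return ArgKind::Ignore;
  if (Ty == TypeKind::Enum) // classify an enum by its underlying integer type
    Ty = EnumUnderlyingIsPromotable ? TypeKind::PromotableInt : TypeKind::Other;
  if (Ty == TypeKind::Record)
    return ArgKind::Indirect;
  if (Ty == TypeKind::PromotableInt)
    return ArgKind::Extend;
  return ArgKind::Direct; // vectors end up here, so they are passed uncoerced
}

// Mirrors the gating in doOpenCLClassification: both options must be set.
bool usesOpenCLClassification(bool OpenCL, bool OpenCLForceVectorABI) {
  return OpenCL && OpenCLForceVectorABI;
}
```

In this model a vector type always classifies as `Direct`, which corresponds to the stated goal of the flag: the target-specific `computeInfo` paths that would coerce the vector are skipped entirely.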
15 changes: 6 additions & 9 deletions clang/lib/Driver/ToolChains/SYCL.cpp
@@ -901,17 +901,14 @@ void SYCLToolChain::TranslateTargetOpt(const llvm::opt::ArgList &Args,
OptNoTriple = A->getOption().matches(Opt);
if (A->getOption().matches(Opt_EQ)) {
// Passing device args: -X<Opt>=<triple> -opt=val.
-      if (getDriver().MakeSYCLDeviceTriple(A->getValue()) != getTriple())
+      StringRef GenDevice = SYCL::gen::resolveGenDevice(A->getValue());
+      if (getDriver().MakeSYCLDeviceTriple(A->getValue()) != getTriple() &&
+          GenDevice.empty())
// Provided triple does not match current tool chain.
continue;
-      if (getTriple().isSPIR() &&
-          getTriple().getSubArch() == llvm::Triple::SPIRSubArch_gen) {
-        if (Device.empty() && StringRef(A->getValue()).startswith("intel_gpu"))
-          continue;
-        if (!Device.empty() &&
-            getDriver().MakeSYCLDeviceTriple(A->getValue()) == getTriple())
-          continue;
-      }
+      if (Device != GenDevice && getTriple().isSPIR() &&
+          getTriple().getSubArch() == llvm::Triple::SPIRSubArch_gen)
+        continue;
} else if (!OptNoTriple)
// Don't worry about any of the other args, we only want to pass what is
// passed in -X<Opt>
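
The rewritten filter for `-X<Opt>=<value>` arguments can be read as a single predicate. The sketch below is a simplified model, not driver code: `skipsDeviceArg` and its parameters are hypothetical, and it assumes that `resolveGenDevice` yields an empty string when the value does not name a gen device, as the `GenDevice.empty()` check suggests.

```cpp
#include <string>

// Hypothetical model of the new skip decision in TranslateTargetOpt.
//   TripleMatches: the argument value resolves to the current toolchain triple.
//   GenDevice:     device resolved from the argument value (empty if none).
//   Device:        device the current toolchain targets (empty if none).
//   IsSpirGen:     the current triple is SPIR with the gen sub-arch.
bool skipsDeviceArg(bool TripleMatches, const std::string &GenDevice,
                    const std::string &Device, bool IsSpirGen) {
  // Neither the triple nor a gen device matches: the arg is for another target.
  if (!TripleMatches && GenDevice.empty())
    return true;
  // For gen targets, the arg's device must equal the toolchain's device.
  if (Device != GenDevice && IsSpirGen)
    return true;
  return false;
}
```

Under this model, an argument that names a specific device is forwarded only to the toolchain building for that same device, and an argument given by plain triple is forwarded only when the toolchain has no specific device.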
2 changes: 2 additions & 0 deletions clang/lib/Frontend/CompilerInvocation.cpp
@@ -3928,6 +3928,8 @@ bool CompilerInvocation::ParseLangArgs(LangOptions &Opts, ArgList &Args,
}
}

+  Opts.OpenCLForceVectorABI = Args.hasArg(OPT_fopencl_force_vector_abi);
+
// Check if -fopenmp is specified and set default version to 5.0.
Opts.OpenMP = Args.hasArg(OPT_fopenmp) ? 50 : 0;
// Check if -fopenmp-simd is specified.
2 changes: 1 addition & 1 deletion clang/test/CodeGen/sycl-instrumentation-option.c
@@ -5,7 +5,7 @@
// RUN: %clang_cc1 -fsycl-instrument-device-code -triple spir64_gen-unknown-unknown %s -emit-llvm -o - 2>&1 | FileCheck %s
// RUN: %clang_cc1 -fsycl-instrument-device-code -triple spir64_fpga-unknown-unknown %s -emit-llvm -o - 2>&1 | FileCheck %s
// RUN: %clang_cc1 -fsycl-instrument-device-code -triple spir64_x86_64-unknown-unknown %s -emit-llvm -o - 2>&1 | FileCheck %s
-// CHECK-NOT: error
+// CHECK-NOT: error:

// RUN: not %clang_cc1 -fsycl-instrument-device-code -triple spirv32 -emit-llvm %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=CHECK-ERR
// RUN: not %clang_cc1 -fsycl-instrument-device-code -triple spirv64 -emit-llvm %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=CHECK-ERR
29 changes: 29 additions & 0 deletions clang/test/CodeGenOpenCL/vector-to-scalar-coercion.cl
@@ -0,0 +1,29 @@
// RUN: %clang_cc1 -x cl -triple i686-pc-win32-gnu -fopencl-force-vector-abi %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix NOCOER
// RUN: %clang_cc1 -x cl -triple x86_64-unknown-linux -fopencl-force-vector-abi %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix NOCOER
// RUN: %clang_cc1 -x cl -triple x86_64-pc-win32-gnu -fopencl-force-vector-abi %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix NOCOER

// RUN: %clang_cc1 -x cl -triple i686-pc-win32-gnu %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix COER32CL
// RUN: %clang_cc1 -x cl -triple x86_64-unknown-linux %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix COER64
// RUN: %clang_cc1 -x cl -triple x86_64-pc-win32-gnu %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix NOCOER

// RUN: %clang_cc1 -x c -triple i686-pc-win32-gnu %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix COER32
// RUN: %clang_cc1 -x c -triple x86_64-unknown-linux %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix COER64
// RUN: %clang_cc1 -x c -triple x86_64-pc-win32-gnu %s -O0 -emit-llvm -o - | FileCheck %s --check-prefix NOCOER-C-WIN

typedef unsigned short ushort;
typedef ushort ushort4 __attribute__((ext_vector_type(4)));

typedef unsigned long ulong;
typedef ulong ulong4 __attribute__((ext_vector_type(4)));

ulong4 __attribute__((const)) __attribute__((overloadable)) convert_ulong4_rte(ushort4 x)
{
return 1;
}

// NOCOER: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(<4 x i16> noundef %{{.*}})
// NOCOER-C-WIN: define {{.*}}<4 x i32> @_Z18convert_ulong4_rteDv4_t(<4 x i16> noundef %{{.*}})
// COER32CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(i64 noundef %{{.*}})
// COER32: define {{.*}}<4 x i32> @_Z18convert_ulong4_rteDv4_t(i64 noundef %{{.*}})
// FIXME: <4 x i16> should be coerced to i64 instead of double
// COER64: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(double noundef %{{.*}})
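
In the `COER*` checks above, the `ushort4` parameter reaches the function as a single scalar (`i64` on i686, `double` on x86_64 Linux), whereas the `NOCOER` checks keep it as `<4 x i16>`. The sketch below only illustrates what such a coercion means at the byte level: four 16-bit lanes occupy the same 8 bytes as one 64-bit scalar. The names `Lanes`, `packLanes` and `unpackLanes` are hypothetical, and this models the reinterpretation of bits, not clang's code generation.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Four 16-bit lanes, standing in for an OpenCL ushort4.
using Lanes = std::array<std::uint16_t, 4>;

// Reinterprets the four lanes as one 64-bit scalar, byte for byte.
std::uint64_t packLanes(const Lanes &V) {
  static_assert(sizeof(std::uint16_t) * 4 == sizeof(std::uint64_t),
                "four 16-bit lanes must fill exactly one 64-bit scalar");
  std::uint64_t Bits = 0;
  std::memcpy(&Bits, V.data(), sizeof(Bits));
  return Bits;
}

// Inverse of packLanes: recovers the four lanes from the 64-bit scalar.
Lanes unpackLanes(std::uint64_t Bits) {
  Lanes V{};
  std::memcpy(V.data(), &Bits, sizeof(Bits));
  return V;
}
```

The round trip is lossless, but caller and callee must agree on whether the value travels as a vector or as a scalar, which is the ABI mismatch `-fopencl-force-vector-abi` is meant to avoid.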