Add SPIRV support to HIPAMD toolchain #75357

Closed
28 changes: 28 additions & 0 deletions clang/docs/HIPSupport.rst
@@ -266,3 +266,31 @@ Example Usage
        Base* basePtr = &obj;
        basePtr->virtualFunction(); // Allowed since obj is constructed in device code
    }
+
+SPIRV Support on HIPAMD ToolChain
+=================================
+
+SPIRV is a target-neutral device executable format. The support for SPIRV in the ROCm and HIPAMD toolchain is under active development.
+
+Compilation Process
+-------------------
+
+When compiling HIP programs with the intent of utilizing SPIRV, the process diverges from the traditional compilation flow:
+
+Using ``--offload-arch=generic``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- **Target Triple**: The ``--offload-arch=generic`` flag instructs the compiler to use the target triple ``spirv64-unknown-unknown``. This approach does not generate ISA (Instruction Set Architecture) for a specific GPU architecture.
+
+- **LLVM IR Translation**: The program is compiled to LLVM Intermediate Representation (IR), which is subsequently translated into SPIRV.
+
+- **Clang Offload Bundler**: The resulting SPIRV is embedded in the Clang offload bundler with the bundle ID ``hipv4-amdgcn-amd-amdhsa--generic``.
+
+Mixed with Normal ``--offload-arch``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- **ISA Generation**: Alongside SPIRV, the compiler can also generate ISA for specific GPU architectures when normal ``--offload-arch`` options are used.
+
+- **Runtime Behavior**: The HIP runtime prioritizes the use of ISA for a specific GPU if available. In its absence, and if SPIRV is available, the runtime will JIT (Just-In-Time) compile SPIRV into ISA.
+
+This approach allows for greater flexibility and portability in HIP programming, particularly in environments where the specific GPU architecture may vary or be unknown at compile time. The ability to mix SPIRV with specific ISA generation also provides a balanced solution for optimizing performance while maintaining portability.
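
The runtime behavior described in the documentation above (prefer ISA for the specific GPU, otherwise JIT-compile SPIRV) can be sketched as a small lookup. This is only an illustration under stated assumptions: `BundleEntry`, `LoadKind`, `Selection`, and `selectEntry` are hypothetical names invented here, not HIP runtime types or APIs, and the real runtime's matching rules may well differ.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of one device entry in an offload bundle.
// Arch is a GPU name such as "gfx900", or "generic" for the SPIRV entry.
struct BundleEntry {
  std::string Arch;
};

// Hypothetical description of how the chosen entry would be loaded.
enum class LoadKind { NativeISA, JitFromSPIRV };

struct Selection {
  LoadKind Kind;
  std::size_t Index; // position of the chosen entry in the bundle
};

// Prefer an entry whose arch matches the device exactly. If none exists,
// fall back to the first "generic" (SPIRV) entry, which would need JIT
// compilation. Returns std::nullopt when neither is present.
std::optional<Selection> selectEntry(const std::vector<BundleEntry> &Entries,
                                     const std::string &DeviceArch) {
  std::optional<Selection> Fallback;
  for (std::size_t I = 0; I < Entries.size(); ++I) {
    if (Entries[I].Arch == DeviceArch)
      return Selection{LoadKind::NativeISA, I};
    if (Entries[I].Arch == "generic" && !Fallback)
      Fallback = Selection{LoadKind::JitFromSPIRV, I};
  }
  return Fallback;
}
```

With a bundle holding `generic` and `gfx900` entries, a `gfx900` device would select the native entry, while any other device would fall back to the SPIRV entry in this sketch.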
5 changes: 4 additions & 1 deletion clang/lib/Basic/TargetID.cpp
@@ -46,8 +46,11 @@ getAllPossibleTargetIDFeatures(const llvm::Triple &T,
 /// Returns canonical processor name or empty string if \p Processor is invalid.
 static llvm::StringRef getCanonicalProcessorName(const llvm::Triple &T,
                                                  llvm::StringRef Processor) {
-  if (T.isAMDGPU())
+  if (T.isAMDGPU()) {
+    if (Processor == "generic")
+      return Processor;
     return llvm::AMDGPU::getCanonicalArchName(T, Processor);
+  }
   return Processor;
 }

3 changes: 2 additions & 1 deletion clang/lib/Driver/Driver.cpp
@@ -3438,7 +3438,8 @@ class OffloadingActionBuilder final {
         // compiler phases, including backend and assemble phases.
         ActionList AL;
         Action *BackendAction = nullptr;
-        if (ToolChains.front()->getTriple().isSPIRV()) {
+        if (ToolChains.front()->getTriple().isSPIRV() ||
+            StringRef(GpuArchList[I]) == "generic") {
           // Emit LLVM bitcode for SPIR-V targets. SPIR-V device tool chain
           // (HIPSPVToolChain) runs post-link LLVM IR passes.
           types::ID Output = Args.hasArg(options::OPT_S)
4 changes: 4 additions & 0 deletions clang/lib/Driver/ToolChain.cpp
@@ -1008,6 +1008,10 @@ std::string ToolChain::ComputeLLVMTriple(const ArgList &Args,
     tools::arm::setFloatABIInTriple(getDriver(), Args, Triple);
     return Triple.getTriple();
   }
+  case llvm::Triple::amdgcn:
+    if (Args.getLastArgValue(options::OPT_mcpu_EQ) == "generic")
+      return "spirv64-unknown-unknown";
+    return getTripleString();
   }
 }

5 changes: 3 additions & 2 deletions clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -930,7 +930,7 @@ bool RocmInstallationDetector::checkCommonBitcodeLibs(
     D.Diag(diag::err_drv_no_rocm_device_lib) << 0;
     return false;
   }
-  if (LibDeviceFile.empty()) {
+  if (!GPUArch.empty() && LibDeviceFile.empty()) {
     D.Diag(diag::err_drv_no_rocm_device_lib) << 1 << GPUArch;
     return false;
   }
@@ -958,7 +958,8 @@ RocmInstallationDetector::getCommonBitcodeLibs(
   AddBCLib(getFiniteOnlyPath(FiniteOnly || FastRelaxedMath));
   AddBCLib(getCorrectlyRoundedSqrtPath(CorrectSqrt));
   AddBCLib(getWavefrontSize64Path(Wave64));
-  AddBCLib(LibDeviceFile);
+  if (!LibDeviceFile.empty())
+    AddBCLib(LibDeviceFile);
   auto ABIVerPath = getABIVersionPath(ABIVer);
   if (!ABIVerPath.empty())
     AddBCLib(ABIVerPath);
8 changes: 8 additions & 0 deletions clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -10,6 +10,7 @@
 #include "AMDGPU.h"
 #include "CommonArgs.h"
 #include "HIPUtility.h"
+#include "SPIRV.h"
 #include "clang/Basic/Cuda.h"
 #include "clang/Basic/TargetID.h"
 #include "clang/Driver/Compilation.h"
@@ -209,6 +210,13 @@ void AMDGCN::Linker::ConstructJob(Compilation &C, const JobAction &JA,
   if (JA.getType() == types::TY_LLVM_BC)
     return constructLlvmLinkCommand(C, JA, Inputs, Output, Args);

+  if (Args.getLastArgValue(options::OPT_mcpu_EQ) == "generic") {
+    llvm::opt::ArgStringList TrArgs{"--spirv-max-version=1.1",

Contributor:

I'm not sure we want to stick with 1.1 here, the Translator goes up to 1.4 at the moment - should we consider going to that instead?

Contributor:

I wonder if -mcpu is the correct way to encode this. Targeting SPIR-V is more like the triple than the architecture as far as I'm aware.

Collaborator (Author):

> I'm not sure we want to stick with 1.1 here, the Translator goes up to 1.4 at the moment - should we consider going to that instead?

Thanks for the reminder. I think we should be able to raise the version, since we will use ToT of the LLVM/SPIRV translator.

Collaborator (Author):

> I wonder if -mcpu is the correct way to encode this. Targeting SPIR-V is more like the triple than the architecture as far as I'm aware.

I will see whether I can use the triple instead.

+                                        "--spirv-ext=+all"};
+    return SPIRV::constructTranslateCommand(C, *this, JA, Output, Inputs[0],
+                                            TrArgs);
+  }
+
   return constructLldCommand(C, JA, Inputs, Output, Args);
 }

16 changes: 11 additions & 5 deletions clang/test/Driver/hip-phases.hip
@@ -11,10 +11,13 @@
 //
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s 2>&1 \
-// RUN: | FileCheck -check-prefixes=BIN,NRD,OLD %s
+// RUN: | FileCheck -check-prefixes=BIN,NRD,OLD,GFX803 %s
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --offload-new-driver --cuda-gpu-arch=gfx803 %s 2>&1 \
-// RUN: | FileCheck -check-prefixes=BIN,NRD,NEW %s
+// RUN: | FileCheck -check-prefixes=BIN,NRD,NEW,GFX803 %s
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --offload-arch=generic %s 2>&1 \
+// RUN: | FileCheck -check-prefixes=BIN,NRD,OLD,GENERIC %s
 //
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 -fgpu-rdc %s 2>&1 \
@@ -26,11 +29,14 @@
 // RDC-DAG: [[P12:[0-9]+]]: backend, {[[P2]]}, assembler, (host-[[T]])
 // RDC-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-[[T]])

-// BIN-DAG: [[P3:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T]], (device-[[T]], [[ARCH:gfx803]])
+// GFX803-DAG: [[P3:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T]], (device-[[T]], [[ARCH:gfx803]])
+// RDC-DAG: [[P3:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T]], (device-[[T]], [[ARCH:gfx803]])
+// GENERIC-DAG: [[P3:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T]], (device-[[T]], [[ARCH:generic]])
 // BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-[[T]], [[ARCH]])
-// NRD-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-[[T]], [[ARCH]])
-// NRD-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-[[T]], [[ARCH]])
+// GFX803-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-[[T]], [[ARCH]])
+// GFX803-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-[[T]], [[ARCH]])
+// GENERIC-DAG: [[P7:[0-9]+]]: backend, {[[P5]]}, ir, (device-[[T]], [[ARCH]])
 // RDC-DAG: [[P7:[0-9]+]]: backend, {[[P5]]}, ir, (device-[[T]], [[ARCH]])
 // BIN-DAG: [[P8:[0-9]+]]: linker, {[[P7]]}, image, (device-[[T]], [[ARCH]])
 // BIN-DAG: [[P9:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:[[ARCH]])" {[[P8]]}, image
17 changes: 17 additions & 0 deletions clang/test/Driver/hip-toolchain-no-rdc.hip
@@ -39,6 +39,11 @@
 // RUN: %t/a.o %t/b.o \
 // RUN: 2>&1 | FileCheck -check-prefixes=LKONLY %s

+// RUN: %clang -### --target=x86_64-linux-gnu \
+// RUN: --offload-arch=generic --offload-arch=gfx900 \
+// RUN: %s -nogpuinc -nogpulib \
+// RUN: 2>&1 | FileCheck -check-prefixes=GENERIC %s
+
 //
 // Compile device code in a.cu to code object for gfx803.
 //
@@ -180,3 +185,15 @@
 // LKONLY-NOT: {{".*/llc"}}
 // LKONLY: [[LD:".*ld.*"]] {{.*}} "{{.*/a.o}}" "{{.*/b.o}}"
 // LKONLY-NOT: "-T" "{{.*}}.lk"
+
+//
+// Check mixed SPIRV and GPU arch.
+//
+
+// GENERIC: "-cc1" "-triple" "spirv64-unknown-unknown" {{.*}}"-emit-llvm-bc" {{.*}} "-o" "[[GEN_BC:.*bc]]"
+// GENERIC: {{".*llvm-spirv"}} "--spirv-max-version=1.1" "--spirv-ext=+all" "[[GEN_BC]]" "-o" "[[GEN_SPV:.*out]]"
+// GENERIC: "-cc1" "-triple" "amdgcn-amd-amdhsa" {{.*}}"-emit-obj" {{.*}}"-target-cpu" "gfx900"{{.*}} "-o" "[[GFX900_OBJ:.*o]]"
+// GENERIC: {{".*lld"}} {{.*}}"-plugin-opt=mcpu=gfx900" {{.*}} "-o" "[[GFX900_CO:.*out]]" {{.*}}"[[GFX900_OBJ]]"
+// GENERIC: {{".*clang-offload-bundler"}} "-type=o"
+// GENERIC-SAME: "-targets={{.*}}hipv4-amdgcn-amd-amdhsa--generic,hipv4-amdgcn-amd-amdhsa--gfx900"
+// GENERIC-SAME: "-input=[[GEN_SPV]]" "-input=[[GFX900_CO]]"
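
The device part of the `-targets=` string checked by the `GENERIC-SAME` line above follows a simple pattern: one `hipv4-amdgcn-amd-amdhsa--<arch>` entry per `--offload-arch` value, joined by commas. A minimal sketch of that pattern follows, assuming a hypothetical helper `deviceBundleTargets` that is not part of Clang. It omits the leading host entry that the `{{.*}}` in the test matches, and the driver's actual construction logic is not shown in this diff.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a Clang API): joins one bundle entry ID per
// offload arch, using the prefix that appears in the test expectation.
std::string deviceBundleTargets(const std::vector<std::string> &Archs) {
  const std::string Prefix = "hipv4-amdgcn-amd-amdhsa--";
  std::string Out;
  for (const std::string &Arch : Archs) {
    if (!Out.empty())
      Out += ',';
    Out += Prefix + Arch;
  }
  return Out;
}
```

For the arch list `generic`, `gfx900` used in the test's RUN line, this sketch yields the same two comma-separated entries that the `GENERIC-SAME` check expects.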