Skip to content

[AMDGPU] Introduce Code Object V6 #76954

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions clang/include/clang/Basic/DiagnosticDriverKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,10 @@ def err_drv_hipspv_no_hip_path : Error<
"'--hip-path' must be specified when offloading to "
"SPIR-V%select{| unless %1 is given}0.">;

// TODO: Remove when COV6 is fully supported by ROCm.
def warn_drv_amdgpu_cov6: Warning<
"code object v6 is still in development and not ready for production use yet;"
" use at your own risk">;
def err_drv_undetermined_gpu_arch : Error<
"cannot determine %0 architecture: %1; consider passing it via "
"'%2'">;
Expand Down
4 changes: 2 additions & 2 deletions clang/include/clang/Driver/Options.td
Original file line number Diff line number Diff line change
Expand Up @@ -4801,9 +4801,9 @@ defm amdgpu_ieee : BoolOption<"m", "amdgpu-ieee",
def mcode_object_version_EQ : Joined<["-"], "mcode-object-version=">, Group<m_Group>,
HelpText<"Specify code object ABI version. Defaults to 5. (AMDGPU only)">,
Visibility<[ClangOption, FlangOption, CC1Option, FC1Option]>,
Values<"none,4,5">,
Values<"none,4,5,6">,
NormalizedValuesScope<"llvm::CodeObjectVersionKind">,
NormalizedValues<["COV_None", "COV_4", "COV_5"]>,
NormalizedValues<["COV_None", "COV_4", "COV_5", "COV_6"]>,
MarshallingInfoEnum<TargetOpts<"CodeObjectVersion">, "COV_5">;

defm cumode : SimpleMFlag<"cumode",
Expand Down
6 changes: 3 additions & 3 deletions clang/lib/CodeGen/CGBuiltin.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -17756,9 +17756,9 @@ Value *EmitAMDGPUImplicitArgPtr(CodeGenFunction &CGF) {
// \p Index is 0, 1, and 2 for x, y, and z dimension, respectively.
/// Emit code based on Code Object ABI version.
/// COV_4 : Emit code to use dispatch ptr
/// COV_5 : Emit code to use implicitarg ptr
/// COV_5+ : Emit code to use implicitarg ptr
/// COV_NONE : Emit code to load a global variable "__oclc_ABI_version"
/// and use its value for COV_4 or COV_5 approach. It is used for
/// and use its value for COV_4 or COV_5+ approach. It is used for
/// compiling device libraries in an ABI-agnostic way.
///
/// Note: "__oclc_ABI_version" is supposed to be emitted and intialized by
Expand Down Expand Up @@ -17801,7 +17801,7 @@ Value *EmitAMDGPUWorkGroupSize(CodeGenFunction &CGF, unsigned Index) {
Address(Result, CGF.Int16Ty, CharUnits::fromQuantity(2)));
} else {
Value *GEP = nullptr;
if (Cov == CodeObjectVersionKind::COV_5) {
if (Cov >= CodeObjectVersionKind::COV_5) {
// Indexing the implicit kernarg segment.
GEP = CGF.Builder.CreateConstGEP1_32(
CGF.Int8Ty, EmitAMDGPUImplicitArgPtr(CGF), 12 + Index * 2);
Expand Down
8 changes: 7 additions & 1 deletion clang/lib/Driver/ToolChains/CommonArgs.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2650,7 +2650,7 @@ getAMDGPUCodeObjectArgument(const Driver &D, const llvm::opt::ArgList &Args) {
void tools::checkAMDGPUCodeObjectVersion(const Driver &D,
const llvm::opt::ArgList &Args) {
const unsigned MinCodeObjVer = 4;
const unsigned MaxCodeObjVer = 5;
const unsigned MaxCodeObjVer = 6;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should print a warning when V6 is enabled (either here or in the backend) to note that it's in development and not ready yet? Something like "code object v6 is still experimental and not ready for production use"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a good idea.


if (auto *CodeObjArg = getAMDGPUCodeObjectArgument(D, Args)) {
if (CodeObjArg->getOption().getID() ==
Expand All @@ -2661,6 +2661,12 @@ void tools::checkAMDGPUCodeObjectVersion(const Driver &D,
if (Remnant || CodeObjVer < MinCodeObjVer || CodeObjVer > MaxCodeObjVer)
D.Diag(diag::err_drv_invalid_int_value)
<< CodeObjArg->getAsString(Args) << CodeObjArg->getValue();

// COV6 is only supported by LLVM at the time of writing this, and it's
// expected to take some time before all ROCm components fully
// support it. In the meantime, make sure users are aware of this.
if (CodeObjVer == 6)
D.Diag(diag::warn_drv_amdgpu_cov6);
}
}
}
Expand Down
37 changes: 37 additions & 0 deletions clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,9 @@
// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm-bc \
// RUN: -mcode-object-version=5 -DUSER -x hip -o %t_5.bc %s

// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm-bc \
// RUN: -mcode-object-version=6 -DUSER -x hip -o %t_6.bc %s

// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm-bc \
// RUN: -mcode-object-version=none -DDEVICELIB -x hip -o %t_0.bc %s

Expand All @@ -15,6 +18,10 @@
// RUN: %t_5.bc -mlink-builtin-bitcode %t_0.bc -o - |\
// RUN: FileCheck -check-prefix=LINKED5 %s

// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -O3 \
// RUN: %t_6.bc -mlink-builtin-bitcode %t_0.bc -o - |\
// RUN: FileCheck -check-prefix=LINKED6 %s

#include "Inputs/cuda.h"

// LINKED4: @__oclc_ABI_version = weak_odr hidden local_unnamed_addr addrspace(4) constant i32 400
Expand Down Expand Up @@ -77,6 +84,36 @@
// LINKED5: load i16, ptr addrspace(4) %{{.*}}, align 2, !range [[$WS_RANGE:![0-9]*]], !invariant.load{{.*}}, !noundef
// LINKED5: "amdgpu_code_object_version", i32 500

// LINKED6: __oclc_ABI_version = weak_odr hidden local_unnamed_addr addrspace(4) constant i32 600
// LINKED6-LABEL: bar
// LINKED6-NOT: load i32, ptr addrspacecast (ptr addrspace(4) @__oclc_ABI_version to ptr), align {{.*}}
// LINKED6-NOT: icmp sge i32 %{{.*}}, 500
// LINKED6: call align 8 dereferenceable(256) ptr addrspace(4) @llvm.amdgcn.implicitarg.ptr()
// LINKED6: [[GEP_5_X:%.*]] = getelementptr i8, ptr addrspace(4) %{{.*}}, i32 12
// LINKED6: call align 4 dereferenceable(64) ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
// LINKED6: [[GEP_4_X:%.*]] = getelementptr i8, ptr addrspace(4) %{{.*}}, i32 4
// LINKED6: select i1 true, ptr addrspace(4) [[GEP_5_X]], ptr addrspace(4) [[GEP_4_X]]
// LINKED6: load i16, ptr addrspace(4) %{{.*}}, align 2, !range [[$WS_RANGE:![0-9]*]], !invariant.load{{.*}}, !noundef

// LINKED6-NOT: load i32, ptr addrspacecast (ptr addrspace(4) @__oclc_ABI_version to ptr), align {{.*}}
// LINKED6-NOT: icmp sge i32 %{{.*}}, 500
// LINKED6: call align 8 dereferenceable(256) ptr addrspace(4) @llvm.amdgcn.implicitarg.ptr()
// LINKED6: [[GEP_5_Y:%.*]] = getelementptr i8, ptr addrspace(4) %{{.*}}, i32 14
// LINKED6: call align 4 dereferenceable(64) ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
// LINKED6: [[GEP_4_Y:%.*]] = getelementptr i8, ptr addrspace(4) %{{.*}}, i32 6
// LINKED6: select i1 true, ptr addrspace(4) [[GEP_5_Y]], ptr addrspace(4) [[GEP_4_Y]]
// LINKED6: load i16, ptr addrspace(4) %{{.*}}, align 2, !range [[$WS_RANGE:![0-9]*]], !invariant.load{{.*}}, !noundef

// LINKED6-NOT: load i32, ptr addrspacecast (ptr addrspace(4) @__oclc_ABI_version to ptr), align {{.*}}
// LINKED6-NOT: icmp sge i32 %{{.*}}, 500
// LINKED6: call align 8 dereferenceable(256) ptr addrspace(4) @llvm.amdgcn.implicitarg.ptr()
// LINKED6: [[GEP_5_Z:%.*]] = getelementptr i8, ptr addrspace(4) %{{.*}}, i32 16
// LINKED6: call align 4 dereferenceable(64) ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
// LINKED6: [[GEP_4_Z:%.*]] = getelementptr i8, ptr addrspace(4) %{{.*}}, i32 8
// LINKED6: select i1 true, ptr addrspace(4) [[GEP_5_Z]], ptr addrspace(4) [[GEP_4_Z]]
// LINKED6: load i16, ptr addrspace(4) %{{.*}}, align 2, !range [[$WS_RANGE:![0-9]*]], !invariant.load{{.*}}, !noundef
// LINKED6: "amdgpu_code_object_version", i32 600

#ifdef DEVICELIB
__device__ void bar(int *x, int *y, int *z)
{
Expand Down
4 changes: 4 additions & 0 deletions clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,9 @@
// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
// RUN: -mcode-object-version=5 -o - %s | FileCheck -check-prefix=V5 %s

// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
// RUN: -mcode-object-version=6 -o - %s | FileCheck -check-prefix=V6 %s

// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
// RUN: -mcode-object-version=none -o - %s | FileCheck %s -check-prefix=NONE

Expand All @@ -17,5 +20,6 @@

// V4: !{{.*}} = !{i32 1, !"amdgpu_code_object_version", i32 400}
// V5: !{{.*}} = !{i32 1, !"amdgpu_code_object_version", i32 500}
// V6: !{{.*}} = !{i32 1, !"amdgpu_code_object_version", i32 600}
// NONE-NOT: !{{.*}} = !{i32 1, !"amdgpu_code_object_version",
// INV: error: invalid value '4.1' in '-mcode-object-version=4.1'
4 changes: 4 additions & 0 deletions clang/test/CodeGenCUDA/amdgpu-workgroup-size.cu
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@
// RUN: -fcuda-is-device -emit-llvm -o - -x hip %s \
// RUN: | FileCheck -check-prefix=COV5 %s

// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa \
// RUN: -fcuda-is-device -mcode-object-version=6 -emit-llvm -o - -x hip %s \
// RUN: | FileCheck -check-prefix=COV5 %s

// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa \
// RUN: -fcuda-is-device -mcode-object-version=none -emit-llvm -o - -x hip %s \
// RUN: | FileCheck -check-prefix=COVNONE %s
Expand Down
Empty file.
13 changes: 13 additions & 0 deletions clang/test/Driver/hip-code-object-version.hip
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,19 @@
// V5: "-mllvm" "--amdhsa-code-object-version=5"
// V5: "-targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx906"

// Check bundle ID for code object version 6.

// RUN: not %clang -### --target=x86_64-linux-gnu \
// RUN: -mcode-object-version=6 \
// RUN: --offload-arch=gfx906 --rocm-path=%S/Inputs/rocm \
// RUN: %s 2>&1 | FileCheck -check-prefix=V6 %s

// V6: warning: code object v6 is still in development and not ready for production use yet; use at your own risk
// V6: "-mcode-object-version=6"
// V6: "-mllvm" "--amdhsa-code-object-version=6"
// V6: "-targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx906"


// Check bundle ID for code object version default

// RUN: %clang -### --target=x86_64-linux-gnu \
Expand Down
18 changes: 17 additions & 1 deletion clang/test/Driver/hip-device-libs.hip
Original file line number Diff line number Diff line change
Expand Up @@ -187,13 +187,26 @@
// RUN: --rocm-path=%S/Inputs/rocm %S/Inputs/hip_multiple_inputs/b.hip \
// RUN: 2>&1 | FileCheck %s --check-prefixes=ABI5

// Test -mcode-object-version=5 with old device library without abi_version_400.bc
// Test -mcode-object-version=5 with old device library without abi_version_500.bc
// RUN: not %clang -### --target=x86_64-linux-gnu --offload-arch=gfx900 \
// RUN: -mcode-object-version=5 \
// RUN: --hip-device-lib-path=%S/Inputs/rocm/amdgcn/bitcode-no-abi-ver \
// RUN: --rocm-path=%S/Inputs/rocm %S/Inputs/hip_multiple_inputs/b.hip \
// RUN: 2>&1 | FileCheck %s --check-prefixes=NOABI5

// Test -mcode-object-version=6
// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=gfx900 \
// RUN: -mcode-object-version=6 \
// RUN: --rocm-path=%S/Inputs/rocm %S/Inputs/hip_multiple_inputs/b.hip \
// RUN: 2>&1 | FileCheck %s --check-prefixes=ABI6

// Test -mcode-object-version=6 with old device library without abi_version_600.bc
// RUN: not %clang -### --target=x86_64-linux-gnu --offload-arch=gfx900 \
// RUN: -mcode-object-version=6 \
// RUN: --hip-device-lib-path=%S/Inputs/rocm/amdgcn/bitcode-no-abi-ver \
// RUN: --rocm-path=%S/Inputs/rocm %S/Inputs/hip_multiple_inputs/b.hip \
// RUN: 2>&1 | FileCheck %s --check-prefixes=NOABI6

// ALL-NOT: error:
// ALL: {{"[^"]*clang[^"]*"}}

Expand Down Expand Up @@ -237,7 +250,10 @@
// ABI4: "-mlink-builtin-bitcode" "{{.*}}oclc_abi_version_400.bc"
// ABI5-NOT: error:
// ABI5: "-mlink-builtin-bitcode" "{{.*}}oclc_abi_version_500.bc"
// ABI6-NOT: error:
// ABI6: "-mlink-builtin-bitcode" "{{.*}}oclc_abi_version_600.bc"
// NOABI4-NOT: error:
// NOABI4-NOT: "-mlink-builtin-bitcode" "{{.*}}oclc_abi_version_400.bc"
// NOABI4-NOT: "-mlink-builtin-bitcode" "{{.*}}oclc_abi_version_500.bc"
// NOABI5: error: cannot find ROCm device libraryfor ABI version 5; provide its path via '--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build without ROCm device library
// NOABI6: error: cannot find ROCm device libraryfor ABI version 6; provide its path via '--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build without ROCm device library
3 changes: 2 additions & 1 deletion clang/test/Misc/warning-flags.c
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ This test serves two purposes:

The list of warnings below should NEVER grow. It should gradually shrink to 0.

CHECK: Warnings without flags (65):
CHECK: Warnings without flags (66):

CHECK-NEXT: ext_expected_semi_decl_list
CHECK-NEXT: ext_explicit_specialization_storage_class
Expand All @@ -43,6 +43,7 @@ CHECK-NEXT: warn_collection_expr_type
CHECK-NEXT: warn_conflicting_variadic
CHECK-NEXT: warn_delete_array_type
CHECK-NEXT: warn_double_const_requires_fp64
CHECK-NEXT: warn_drv_amdgpu_cov6
CHECK-NEXT: warn_drv_assuming_mfloat_abi_is
CHECK-NEXT: warn_drv_clang_unsupported
CHECK-NEXT: warn_drv_pch_not_first_include
Expand Down
2 changes: 2 additions & 0 deletions flang/lib/Frontend/CompilerInvocation.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -284,6 +284,8 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
if (const llvm::opt::Arg *a = args.getLastArg(
clang::driver::options::OPT_mcode_object_version_EQ)) {
llvm::StringRef s = a->getValue();
if (s == "6")
opts.CodeObjectVersion = llvm::CodeObjectVersionKind::COV_6;
if (s == "5")
opts.CodeObjectVersion = llvm::CodeObjectVersionKind::COV_5;
if (s == "4")
Expand Down
3 changes: 2 additions & 1 deletion flang/test/Lower/AMD/code-object-version.f90
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,12 @@
!RUN: %flang_fc1 -emit-hlfir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -mcode-object-version=none %s -o - | FileCheck --check-prefix=COV_NONE %s
!RUN: %flang_fc1 -emit-hlfir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -mcode-object-version=4 %s -o - | FileCheck --check-prefix=COV_4 %s
!RUN: %flang_fc1 -emit-hlfir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -mcode-object-version=5 %s -o - | FileCheck --check-prefix=COV_5 %s
!RUN: %flang_fc1 -emit-hlfir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -mcode-object-version=6 %s -o - | FileCheck --check-prefix=COV_6 %s

!COV_DEFAULT: llvm.mlir.global weak_odr hidden local_unnamed_addr constant @__oclc_ABI_version(400 : i32) {addr_space = 4 : i32} : i32
!COV_NONE-NOT: llvm.mlir.global weak_odr hidden local_unnamed_addr constant @__oclc_ABI_version(400 : i32) {addr_space = 4 : i32} : i32
!COV_4: llvm.mlir.global weak_odr hidden local_unnamed_addr constant @__oclc_ABI_version(400 : i32) {addr_space = 4 : i32} : i32
!COV_5: llvm.mlir.global weak_odr hidden local_unnamed_addr constant @__oclc_ABI_version(500 : i32) {addr_space = 4 : i32} : i32
!COV_6: llvm.mlir.global weak_odr hidden local_unnamed_addr constant @__oclc_ABI_version(600 : i32) {addr_space = 4 : i32} : i32
subroutine target_simple
end subroutine target_simple

21 changes: 21 additions & 0 deletions lld/ELF/Arch/AMDGPU.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ class AMDGPU final : public TargetInfo {
private:
uint32_t calcEFlagsV3() const;
uint32_t calcEFlagsV4() const;
uint32_t calcEFlagsV6() const;

public:
AMDGPU();
Expand Down Expand Up @@ -106,6 +107,24 @@ uint32_t AMDGPU::calcEFlagsV4() const {
return retMach | retXnack | retSramEcc;
}

uint32_t AMDGPU::calcEFlagsV6() const {
uint32_t flags = calcEFlagsV4();

uint32_t genericVersion =
getEFlags(ctx.objectFiles[0]) & EF_AMDGPU_GENERIC_VERSION;

// Verify that all input files have compatible generic version.
for (InputFile *f : ArrayRef(ctx.objectFiles).slice(1)) {
if (genericVersion != (getEFlags(f) & EF_AMDGPU_GENERIC_VERSION)) {
error("incompatible generic version: " + toString(f));
return 0;
}
}

flags |= genericVersion;
return flags;
}

uint32_t AMDGPU::calcEFlags() const {
if (ctx.objectFiles.empty())
return 0;
Expand All @@ -121,6 +140,8 @@ uint32_t AMDGPU::calcEFlags() const {
case ELFABIVERSION_AMDGPU_HSA_V4:
case ELFABIVERSION_AMDGPU_HSA_V5:
return calcEFlagsV4();
case ELFABIVERSION_AMDGPU_HSA_V6:
return calcEFlagsV6();
default:
error("unknown abi version: " + Twine(abiVersion));
return 0;
Expand Down
16 changes: 16 additions & 0 deletions lld/test/ELF/amdgpu-tid.s
Original file line number Diff line number Diff line change
Expand Up @@ -43,3 +43,19 @@
# SRAMECC-OFF: EF_AMDGPU_FEATURE_SRAMECC_OFF_V4 (0x800)
# SRAMECC-ON: EF_AMDGPU_FEATURE_SRAMECC_ON_V4 (0xC00)
# SRAMECC-INCOMPATIBLE: incompatible sramecc:

# RUN: llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx900 --amdhsa-code-object-version=6 --amdgpu-force-generic-version=1 -filetype=obj %s -o %t-genericv1_0.o
# RUN: llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx900 --amdhsa-code-object-version=6 --amdgpu-force-generic-version=1 -filetype=obj %s -o %t-genericv1_1.o
# RUN: ld.lld -shared %t-genericv1_0.o %t-genericv1_1.o -o %t-genericv1_2.so
# RUN: llvm-readobj --file-headers %t-genericv1_2.so | FileCheck --check-prefix=GENERICV1 %s

# RUN: llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx900 --amdhsa-code-object-version=6 --amdgpu-force-generic-version=2 -filetype=obj %s -o %t-genericv2_0.o
# RUN: llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx900 --amdhsa-code-object-version=6 --amdgpu-force-generic-version=2 -filetype=obj %s -o %t-genericv2_1.o
# RUN: ld.lld -shared %t-genericv2_0.o %t-genericv2_1.o -o %t-genericv2_2.so
# RUN: llvm-readobj --file-headers %t-genericv2_2.so | FileCheck --check-prefix=GENERICV2 %s

# RUN: not ld.lld -shared %t-genericv1_0.o %t-genericv2_0.o -o /dev/null 2>&1 | FileCheck --check-prefix=GENERIC-INCOMPATIBLE %s

# GENERICV1: EF_AMDGPU_GENERIC_VERSION_V1 (0x1000000)
# GENERICV2: EF_AMDGPU_GENERIC_VERSION_V2 (0x2000000)
# GENERIC-INCOMPATIBLE: incompatible generic version
9 changes: 8 additions & 1 deletion llvm/include/llvm/BinaryFormat/ELF.h
Original file line number Diff line number Diff line change
Expand Up @@ -375,7 +375,8 @@ enum {
ELFABIVERSION_AMDGPU_HSA_V2 = 0,
ELFABIVERSION_AMDGPU_HSA_V3 = 1,
ELFABIVERSION_AMDGPU_HSA_V4 = 2,
ELFABIVERSION_AMDGPU_HSA_V5 = 3
ELFABIVERSION_AMDGPU_HSA_V5 = 3,
ELFABIVERSION_AMDGPU_HSA_V6 = 4,
};

#define ELF_RELOC(name, value) name = value,
Expand Down Expand Up @@ -842,6 +843,12 @@ enum : unsigned {
EF_AMDGPU_FEATURE_SRAMECC_OFF_V4 = 0x800,
// SRAMECC is on.
EF_AMDGPU_FEATURE_SRAMECC_ON_V4 = 0xc00,

// Generic target versioning. This is contained in the list byte of EFLAGS.
EF_AMDGPU_GENERIC_VERSION = 0xff000000,
EF_AMDGPU_GENERIC_VERSION_OFFSET = 24,
EF_AMDGPU_GENERIC_VERSION_MIN = 1,
EF_AMDGPU_GENERIC_VERSION_MAX = 0xff,
};

// ELF Relocation types for AMDGPU
Expand Down
7 changes: 7 additions & 0 deletions llvm/include/llvm/Support/AMDGPUMetadata.h
Original file line number Diff line number Diff line change
Expand Up @@ -44,8 +44,15 @@ constexpr uint32_t VersionMajorV5 = 1;
/// HSA metadata minor version for code object V5.
constexpr uint32_t VersionMinorV5 = 2;

/// HSA metadata major version for code object V6.
constexpr uint32_t VersionMajorV6 = 1;
/// HSA metadata minor version for code object V6.
constexpr uint32_t VersionMinorV6 = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did anything change in metadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, but I assume we'll want to bundle some changes to the MD with V6 so it's better to update the version now, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, there are no changes planned to the MD for v6

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we just increment this number when there's a breaking metadata change? How does it work?


/// Old HSA metadata beginning assembler directive for V2. This is only used for
/// diagnostics now.

/// HSA metadata beginning assembler directive.
constexpr char AssemblerDirectiveBegin[] = ".amd_amdgpu_hsa_metadata";

/// Access qualifiers.
Expand Down
4 changes: 2 additions & 2 deletions llvm/include/llvm/Support/ScopedPrinter.h
Original file line number Diff line number Diff line change
Expand Up @@ -160,8 +160,8 @@ class ScopedPrinter {
template <typename T, typename TFlag>
void printFlags(StringRef Label, T Value, ArrayRef<EnumEntry<TFlag>> Flags,
TFlag EnumMask1 = {}, TFlag EnumMask2 = {},
TFlag EnumMask3 = {}) {
SmallVector<FlagEntry, 10> SetFlags;
TFlag EnumMask3 = {}, ArrayRef<FlagEntry> ExtraFlags = {}) {
SmallVector<FlagEntry, 10> SetFlags(ExtraFlags.begin(), ExtraFlags.end());

for (const auto &Flag : Flags) {
if (Flag.Value == 0)
Expand Down
1 change: 1 addition & 0 deletions llvm/include/llvm/Target/TargetOptions.h
Original file line number Diff line number Diff line change
Expand Up @@ -129,6 +129,7 @@ namespace llvm {
COV_3 = 300, // Unsupported.
COV_4 = 400,
COV_5 = 500,
COV_6 = 600,
};

class TargetOptions {
Expand Down
9 changes: 9 additions & 0 deletions llvm/lib/ObjectYAML/ELFYAML.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -620,6 +620,15 @@ void ScalarBitSetTraits<ELFYAML::ELF_EF>::bitset(IO &IO,
BCase(EF_AMDGPU_FEATURE_XNACK_V3);
BCase(EF_AMDGPU_FEATURE_SRAMECC_V3);
break;
case ELF::ELFABIVERSION_AMDGPU_HSA_V6:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd need to add a test for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elf-headers.test already covers it

for (unsigned K = ELF::EF_AMDGPU_GENERIC_VERSION_MIN;
K <= ELF::EF_AMDGPU_GENERIC_VERSION_MAX; ++K) {
std::string Key = "EF_AMDGPU_GENERIC_VERSION_V" + std::to_string(K);
IO.maskedBitSetCase(Value, Key.c_str(),
K << ELF::EF_AMDGPU_GENERIC_VERSION_OFFSET,
ELF::EF_AMDGPU_GENERIC_VERSION);
}
[[fallthrough]];
case ELF::ELFABIVERSION_AMDGPU_HSA_V4:
case ELF::ELFABIVERSION_AMDGPU_HSA_V5:
BCaseMask(EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED_V4,
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -335,6 +335,9 @@ bool AMDGPUAsmPrinter::doInitialization(Module &M) {
case AMDGPU::AMDHSA_COV5:
HSAMetadataStream.reset(new HSAMD::MetadataStreamerMsgPackV5());
break;
case AMDGPU::AMDHSA_COV6:
HSAMetadataStream.reset(new HSAMD::MetadataStreamerMsgPackV6());
break;
default:
report_fatal_error("Unexpected code object version");
}
Expand Down
Loading