-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[X86] AMD Zen 5 Initial enablement #107964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-clang-driver Author: Ganesh (ganeshgit) ChangesThis patch enables the basic skeleton enablement of AMD next gen zen5 CPUs. Patch is 31.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107964.diff 30 Files Affected:
diff --git a/clang/lib/Basic/Targets/X86.cpp b/clang/lib/Basic/Targets/X86.cpp
index 62c382b67ad14a..5448bd841959f4 100644
--- a/clang/lib/Basic/Targets/X86.cpp
+++ b/clang/lib/Basic/Targets/X86.cpp
@@ -728,6 +728,9 @@ void X86TargetInfo::getTargetDefines(const LangOptions &Opts,
case CK_ZNVER4:
defineCPUMacros(Builder, "znver4");
break;
+ case CK_ZNVER5:
+ defineCPUMacros(Builder, "znver5");
+ break;
case CK_Geode:
defineCPUMacros(Builder, "geode");
break;
@@ -1626,6 +1629,7 @@ std::optional<unsigned> X86TargetInfo::getCPUCacheLineSize() const {
case CK_ZNVER2:
case CK_ZNVER3:
case CK_ZNVER4:
+ case CK_ZNVER5:
// Deprecated
case CK_x86_64:
case CK_x86_64_v2:
diff --git a/clang/test/CodeGen/target-builtin-noerror.c b/clang/test/CodeGen/target-builtin-noerror.c
index 14024e3953182c..2a05074d7c2b68 100644
--- a/clang/test/CodeGen/target-builtin-noerror.c
+++ b/clang/test/CodeGen/target-builtin-noerror.c
@@ -207,4 +207,5 @@ void verifycpustrings(void) {
(void)__builtin_cpu_is("znver2");
(void)__builtin_cpu_is("znver3");
(void)__builtin_cpu_is("znver4");
+ (void)__builtin_cpu_is("znver5");
}
diff --git a/clang/test/Driver/x86-march.c b/clang/test/Driver/x86-march.c
index cc993b53937c17..3bc2a82ae778d6 100644
--- a/clang/test/Driver/x86-march.c
+++ b/clang/test/Driver/x86-march.c
@@ -242,6 +242,10 @@
// RUN: %clang -target x86_64-unknown-unknown -c -### %s -march=znver4 2>&1 \
// RUN: | FileCheck %s -check-prefix=znver4
// znver4: "-target-cpu" "znver4"
+//
+// RUN: %clang -target x86_64-unknown-unknown -c -### %s -march=znver5 2>&1 \
+// RUN: | FileCheck %s -check-prefix=znver5
+// znver5: "-target-cpu" "znver5"
// RUN: %clang -target x86_64 -c -### %s -march=x86-64 2>&1 | FileCheck %s --check-prefix=x86-64
// x86-64: "-target-cpu" "x86-64"
diff --git a/clang/test/Frontend/x86-target-cpu.c b/clang/test/Frontend/x86-target-cpu.c
index 6c8502ac2c21ee..f2885a040c3701 100644
--- a/clang/test/Frontend/x86-target-cpu.c
+++ b/clang/test/Frontend/x86-target-cpu.c
@@ -38,5 +38,6 @@
// RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu znver2 -verify %s
// RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu znver3 -verify %s
// RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu znver4 -verify %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu znver5 -verify %s
//
// expected-no-diagnostics
diff --git a/clang/test/Misc/target-invalid-cpu-note/x86.c b/clang/test/Misc/target-invalid-cpu-note/x86.c
index 607192a5409ba8..7879676040af46 100644
--- a/clang/test/Misc/target-invalid-cpu-note/x86.c
+++ b/clang/test/Misc/target-invalid-cpu-note/x86.c
@@ -99,6 +99,7 @@
// X86-SAME: {{^}}, znver2
// X86-SAME: {{^}}, znver3
// X86-SAME: {{^}}, znver4
+// X86-SAME: {{^}}, znver5
// X86-SAME: {{^}}, x86-64
// X86-SAME: {{^}}, x86-64-v2
// X86-SAME: {{^}}, x86-64-v3
@@ -175,6 +176,7 @@
// X86_64-SAME: {{^}}, znver2
// X86_64-SAME: {{^}}, znver3
// X86_64-SAME: {{^}}, znver4
+// X86_64-SAME: {{^}}, znver5
// X86_64-SAME: {{^}}, x86-64
// X86_64-SAME: {{^}}, x86-64-v2
// X86_64-SAME: {{^}}, x86-64-v3
@@ -278,6 +280,7 @@
// TUNE_X86-SAME: {{^}}, znver2
// TUNE_X86-SAME: {{^}}, znver3
// TUNE_X86-SAME: {{^}}, znver4
+// TUNE_X86-SAME: {{^}}, znver5
// TUNE_X86-SAME: {{^}}, x86-64
// TUNE_X86-SAME: {{^}}, geode
// TUNE_X86-SAME: {{$}}
@@ -379,6 +382,7 @@
// TUNE_X86_64-SAME: {{^}}, znver2
// TUNE_X86_64-SAME: {{^}}, znver3
// TUNE_X86_64-SAME: {{^}}, znver4
+// TUNE_X86_64-SAME: {{^}}, znver5
// TUNE_X86_64-SAME: {{^}}, x86-64
// TUNE_X86_64-SAME: {{^}}, geode
// TUNE_X86_64-SAME: {{$}}
diff --git a/clang/test/Preprocessor/predefined-arch-macros.c b/clang/test/Preprocessor/predefined-arch-macros.c
index 49646d94d920c8..a149c69ee0cdb2 100644
--- a/clang/test/Preprocessor/predefined-arch-macros.c
+++ b/clang/test/Preprocessor/predefined-arch-macros.c
@@ -3923,6 +3923,148 @@
// CHECK_ZNVER4_M64: #define __znver4 1
// CHECK_ZNVER4_M64: #define __znver4__ 1
+// RUN: %clang -march=znver5 -m32 -E -dM %s -o - 2>&1 \
+// RUN: -target i386-unknown-linux \
+// RUN: | FileCheck -match-full-lines %s -check-prefix=CHECK_ZNVER5_M32
+// CHECK_ZNVER5_M32-NOT: #define __3dNOW_A__ 1
+// CHECK_ZNVER5_M32-NOT: #define __3dNOW__ 1
+// CHECK_ZNVER5_M32: #define __ADX__ 1
+// CHECK_ZNVER5_M32: #define __AES__ 1
+// CHECK_ZNVER5_M32: #define __AVX2__ 1
+// CHECK_ZNVER5_M32: #define __AVX512BF16__ 1
+// CHECK_ZNVER5_M32: #define __AVX512BITALG__ 1
+// CHECK_ZNVER5_M32: #define __AVX512BW__ 1
+// CHECK_ZNVER5_M32: #define __AVX512CD__ 1
+// CHECK_ZNVER5_M32: #define __AVX512DQ__ 1
+// CHECK_ZNVER5_M32: #define __AVX512F__ 1
+// CHECK_ZNVER5_M32: #define __AVX512IFMA__ 1
+// CHECK_ZNVER5_M32: #define __AVX512VBMI2__ 1
+// CHECK_ZNVER5_M32: #define __AVX512VBMI__ 1
+// CHECK_ZNVER5_M32: #define __AVX512VL__ 1
+// CHECK_ZNVER5_M32: #define __AVX512VNNI__ 1
+// CHECK_ZNVER5_M32: #define __AVX512VP2INTERSECT__ 1
+// CHECK_ZNVER5_M32: #define __AVX512VPOPCNTDQ__ 1
+// CHECK_ZNVER5_M32: #define __AVXVNNI__ 1
+// CHECK_ZNVER5_M32: #define __AVX__ 1
+// CHECK_ZNVER5_M32: #define __BMI2__ 1
+// CHECK_ZNVER5_M32: #define __BMI__ 1
+// CHECK_ZNVER5_M32: #define __CLFLUSHOPT__ 1
+// CHECK_ZNVER5_M32: #define __CLWB__ 1
+// CHECK_ZNVER5_M32: #define __CLZERO__ 1
+// CHECK_ZNVER5_M32: #define __F16C__ 1
+// CHECK_ZNVER5_M32-NOT: #define __FMA4__ 1
+// CHECK_ZNVER5_M32: #define __FMA__ 1
+// CHECK_ZNVER5_M32: #define __FSGSBASE__ 1
+// CHECK_ZNVER5_M32: #define __GFNI__ 1
+// CHECK_ZNVER5_M32: #define __LZCNT__ 1
+// CHECK_ZNVER5_M32: #define __MMX__ 1
+// CHECK_ZNVER5_M32: #define __MOVDIR64B__ 1
+// CHECK_ZNVER5_M32: #define __MOVDIRI__ 1
+// CHECK_ZNVER5_M32: #define __PCLMUL__ 1
+// CHECK_ZNVER5_M32: #define __PKU__ 1
+// CHECK_ZNVER5_M32: #define __POPCNT__ 1
+// CHECK_ZNVER5_M32: #define __PREFETCHI__ 1
+// CHECK_ZNVER5_M32: #define __PRFCHW__ 1
+// CHECK_ZNVER5_M32: #define __RDPID__ 1
+// CHECK_ZNVER5_M32: #define __RDPRU__ 1
+// CHECK_ZNVER5_M32: #define __RDRND__ 1
+// CHECK_ZNVER5_M32: #define __RDSEED__ 1
+// CHECK_ZNVER5_M32: #define __SHA__ 1
+// CHECK_ZNVER5_M32: #define __SSE2_MATH__ 1
+// CHECK_ZNVER5_M32: #define __SSE2__ 1
+// CHECK_ZNVER5_M32: #define __SSE3__ 1
+// CHECK_ZNVER5_M32: #define __SSE4A__ 1
+// CHECK_ZNVER5_M32: #define __SSE4_1__ 1
+// CHECK_ZNVER5_M32: #define __SSE4_2__ 1
+// CHECK_ZNVER5_M32: #define __SSE_MATH__ 1
+// CHECK_ZNVER5_M32: #define __SSE__ 1
+// CHECK_ZNVER5_M32: #define __SSSE3__ 1
+// CHECK_ZNVER5_M32-NOT: #define __TBM__ 1
+// CHECK_ZNVER5_M32: #define __WBNOINVD__ 1
+// CHECK_ZNVER5_M32-NOT: #define __XOP__ 1
+// CHECK_ZNVER5_M32: #define __XSAVEC__ 1
+// CHECK_ZNVER5_M32: #define __XSAVEOPT__ 1
+// CHECK_ZNVER5_M32: #define __XSAVES__ 1
+// CHECK_ZNVER5_M32: #define __XSAVE__ 1
+// CHECK_ZNVER5_M32: #define __i386 1
+// CHECK_ZNVER5_M32: #define __i386__ 1
+// CHECK_ZNVER5_M32: #define __tune_znver5__ 1
+// CHECK_ZNVER5_M32: #define __znver5 1
+// CHECK_ZNVER5_M32: #define __znver5__ 1
+
+// RUN: %clang -march=znver5 -m64 -E -dM %s -o - 2>&1 \
+// RUN: -target i386-unknown-linux \
+// RUN: | FileCheck -match-full-lines %s -check-prefix=CHECK_ZNVER5_M64
+// CHECK_ZNVER5_M64-NOT: #define __3dNOW_A__ 1
+// CHECK_ZNVER5_M64-NOT: #define __3dNOW__ 1
+// CHECK_ZNVER5_M64: #define __ADX__ 1
+// CHECK_ZNVER5_M64: #define __AES__ 1
+// CHECK_ZNVER5_M64: #define __AVX2__ 1
+// CHECK_ZNVER5_M64: #define __AVX512BF16__ 1
+// CHECK_ZNVER5_M64: #define __AVX512BITALG__ 1
+// CHECK_ZNVER5_M64: #define __AVX512BW__ 1
+// CHECK_ZNVER5_M64: #define __AVX512CD__ 1
+// CHECK_ZNVER5_M64: #define __AVX512DQ__ 1
+// CHECK_ZNVER5_M64: #define __AVX512F__ 1
+// CHECK_ZNVER5_M64: #define __AVX512IFMA__ 1
+// CHECK_ZNVER5_M64: #define __AVX512VBMI2__ 1
+// CHECK_ZNVER5_M64: #define __AVX512VBMI__ 1
+// CHECK_ZNVER5_M64: #define __AVX512VL__ 1
+// CHECK_ZNVER5_M64: #define __AVX512VNNI__ 1
+// CHECK_ZNVER5_M64: #define __AVX512VP2INTERSECT__ 1
+// CHECK_ZNVER5_M64: #define __AVX512VPOPCNTDQ__ 1
+// CHECK_ZNVER5_M64: #define __AVXVNNI__ 1
+// CHECK_ZNVER5_M64: #define __AVX__ 1
+// CHECK_ZNVER5_M64: #define __BMI2__ 1
+// CHECK_ZNVER5_M64: #define __BMI__ 1
+// CHECK_ZNVER5_M64: #define __CLFLUSHOPT__ 1
+// CHECK_ZNVER5_M64: #define __CLWB__ 1
+// CHECK_ZNVER5_M64: #define __CLZERO__ 1
+// CHECK_ZNVER5_M64: #define __F16C__ 1
+// CHECK_ZNVER5_M64-NOT: #define __FMA4__ 1
+// CHECK_ZNVER5_M64: #define __FMA__ 1
+// CHECK_ZNVER5_M64: #define __FSGSBASE__ 1
+// CHECK_ZNVER5_M64: #define __GFNI__ 1
+// CHECK_ZNVER5_M64: #define __LZCNT__ 1
+// CHECK_ZNVER5_M64: #define __MMX__ 1
+// CHECK_ZNVER5_M64: #define __MOVDIR64B__ 1
+// CHECK_ZNVER5_M64: #define __MOVDIRI__ 1
+// CHECK_ZNVER5_M64: #define __PCLMUL__ 1
+// CHECK_ZNVER5_M64: #define __PKU__ 1
+// CHECK_ZNVER5_M64: #define __POPCNT__ 1
+// CHECK_ZNVER5_M64: #define __PREFETCHI__ 1
+// CHECK_ZNVER5_M64: #define __PRFCHW__ 1
+// CHECK_ZNVER5_M64: #define __RDPID__ 1
+// CHECK_ZNVER5_M64: #define __RDPRU__ 1
+// CHECK_ZNVER5_M64: #define __RDRND__ 1
+// CHECK_ZNVER5_M64: #define __RDSEED__ 1
+// CHECK_ZNVER5_M64: #define __SHA__ 1
+// CHECK_ZNVER5_M64: #define __SSE2_MATH__ 1
+// CHECK_ZNVER5_M64: #define __SSE2__ 1
+// CHECK_ZNVER5_M64: #define __SSE3__ 1
+// CHECK_ZNVER5_M64: #define __SSE4A__ 1
+// CHECK_ZNVER5_M64: #define __SSE4_1__ 1
+// CHECK_ZNVER5_M64: #define __SSE4_2__ 1
+// CHECK_ZNVER5_M64: #define __SSE_MATH__ 1
+// CHECK_ZNVER5_M64: #define __SSE__ 1
+// CHECK_ZNVER5_M64: #define __SSSE3__ 1
+// CHECK_ZNVER5_M64-NOT: #define __TBM__ 1
+// CHECK_ZNVER5_M64: #define __VAES__ 1
+// CHECK_ZNVER5_M64: #define __VPCLMULQDQ__ 1
+// CHECK_ZNVER5_M64: #define __WBNOINVD__ 1
+// CHECK_ZNVER5_M64-NOT: #define __XOP__ 1
+// CHECK_ZNVER5_M64: #define __XSAVEC__ 1
+// CHECK_ZNVER5_M64: #define __XSAVEOPT__ 1
+// CHECK_ZNVER5_M64: #define __XSAVES__ 1
+// CHECK_ZNVER5_M64: #define __XSAVE__ 1
+// CHECK_ZNVER5_M64: #define __amd64 1
+// CHECK_ZNVER5_M64: #define __amd64__ 1
+// CHECK_ZNVER5_M64: #define __tune_znver5__ 1
+// CHECK_ZNVER5_M64: #define __x86_64 1
+// CHECK_ZNVER5_M64: #define __x86_64__ 1
+// CHECK_ZNVER5_M64: #define __znver5 1
+// CHECK_ZNVER5_M64: #define __znver5__ 1
+
// End X86/GCC/Linux tests ------------------
// Begin PPC/GCC/Linux tests ----------------
diff --git a/compiler-rt/lib/builtins/cpu_model/x86.c b/compiler-rt/lib/builtins/cpu_model/x86.c
index 069defc970190e..e5d74daf26d3de 100644
--- a/compiler-rt/lib/builtins/cpu_model/x86.c
+++ b/compiler-rt/lib/builtins/cpu_model/x86.c
@@ -63,6 +63,7 @@ enum ProcessorTypes {
INTEL_SIERRAFOREST,
INTEL_GRANDRIDGE,
INTEL_CLEARWATERFOREST,
+ AMDFAM1AH,
CPU_TYPE_MAX
};
@@ -101,6 +102,7 @@ enum ProcessorSubtypes {
INTEL_COREI7_ARROWLAKE,
INTEL_COREI7_ARROWLAKE_S,
INTEL_COREI7_PANTHERLAKE,
+ AMDFAM1AH_ZNVER5,
CPU_SUBTYPE_MAX
};
@@ -748,6 +750,24 @@ static const char *getAMDProcessorTypeAndSubtype(unsigned Family,
break; // "znver4"
}
break; // family 19h
+ case 26:
+ CPU = "znver5";
+ *Type = AMDFAM1AH;
+ if (Model <= 0x77) {
+ // Models 00h-0Fh (Breithorn).
+ // Models 10h-1Fh (Breithorn-Dense).
+ // Models 20h-2Fh (Strix 1).
+ // Models 30h-37h (Strix 2).
+ // Models 38h-3Fh (Strix 3).
+ // Models 40h-4Fh (Granite Ridge).
+ // Models 50h-5Fh (Weisshorn).
+ // Models 60h-6Fh (Krackan1).
+ // Models 70h-77h (Sarlak).
+ CPU = "znver5";
+ *Subtype = AMDFAM1AH_ZNVER5;
+ break; // "znver5"
+ }
+ break;
default:
break; // Unknown AMD CPU.
}
diff --git a/llvm/include/llvm/TargetParser/X86TargetParser.def b/llvm/include/llvm/TargetParser/X86TargetParser.def
index cd160f54e66705..e5bf196559ba63 100644
--- a/llvm/include/llvm/TargetParser/X86TargetParser.def
+++ b/llvm/include/llvm/TargetParser/X86TargetParser.def
@@ -49,11 +49,13 @@ X86_CPU_TYPE(ZHAOXIN_FAM7H, "zhaoxin_fam7h")
X86_CPU_TYPE(INTEL_SIERRAFOREST, "sierraforest")
X86_CPU_TYPE(INTEL_GRANDRIDGE, "grandridge")
X86_CPU_TYPE(INTEL_CLEARWATERFOREST, "clearwaterforest")
+X86_CPU_TYPE(AMDFAM1AH, "amdfam1ah")
// Alternate names supported by __builtin_cpu_is and target multiversioning.
X86_CPU_TYPE_ALIAS(INTEL_BONNELL, "atom")
X86_CPU_TYPE_ALIAS(AMDFAM10H, "amdfam10")
X86_CPU_TYPE_ALIAS(AMDFAM15H, "amdfam15")
+X86_CPU_TYPE_ALIAS(AMDFAM1AH, "amdfam1a")
X86_CPU_TYPE_ALIAS(INTEL_SILVERMONT, "slm")
#undef X86_CPU_TYPE_ALIAS
@@ -104,6 +106,7 @@ X86_CPU_SUBTYPE(INTEL_COREI7_GRANITERAPIDS_D,"graniterapids-d")
X86_CPU_SUBTYPE(INTEL_COREI7_ARROWLAKE, "arrowlake")
X86_CPU_SUBTYPE(INTEL_COREI7_ARROWLAKE_S, "arrowlake-s")
X86_CPU_SUBTYPE(INTEL_COREI7_PANTHERLAKE, "pantherlake")
+X86_CPU_SUBTYPE(AMDFAM1AH_ZNVER5, "znver5")
// Alternate names supported by __builtin_cpu_is and target multiversioning.
X86_CPU_SUBTYPE_ALIAS(INTEL_COREI7_ALDERLAKE, "raptorlake")
diff --git a/llvm/include/llvm/TargetParser/X86TargetParser.h b/llvm/include/llvm/TargetParser/X86TargetParser.h
index 2083e585af4ac8..0e17c4674719cf 100644
--- a/llvm/include/llvm/TargetParser/X86TargetParser.h
+++ b/llvm/include/llvm/TargetParser/X86TargetParser.h
@@ -142,6 +142,7 @@ enum CPUKind {
CK_ZNVER2,
CK_ZNVER3,
CK_ZNVER4,
+ CK_ZNVER5,
CK_x86_64,
CK_x86_64_v2,
CK_x86_64_v3,
diff --git a/llvm/lib/Target/X86/X86.td b/llvm/lib/Target/X86/X86.td
index 988966fa6a6c46..6cf37836f921d4 100644
--- a/llvm/lib/Target/X86/X86.td
+++ b/llvm/lib/Target/X86/X86.td
@@ -1549,6 +1549,19 @@ def ProcessorFeatures {
FeatureVPOPCNTDQ];
list<SubtargetFeature> ZN4Features =
!listconcat(ZN3Features, ZN4AdditionalFeatures);
+
+
+ list<SubtargetFeature> ZN5Tuning = ZN4Tuning;
+ list<SubtargetFeature> ZN5AdditionalFeatures = [FeatureVNNI,
+ FeatureMOVDIRI,
+ FeatureMOVDIR64B,
+ FeatureVP2INTERSECT,
+ FeaturePREFETCHI,
+ FeatureAVXVNNI
+ ];
+ list<SubtargetFeature> ZN5Features =
+ !listconcat(ZN4Features, ZN5AdditionalFeatures);
+
}
//===----------------------------------------------------------------------===//
@@ -1898,6 +1911,8 @@ def : ProcModel<"znver3", Znver3Model, ProcessorFeatures.ZN3Features,
ProcessorFeatures.ZN3Tuning>;
def : ProcModel<"znver4", Znver4Model, ProcessorFeatures.ZN4Features,
ProcessorFeatures.ZN4Tuning>;
+def : ProcModel<"znver5", Znver4Model, ProcessorFeatures.ZN5Features,
+ ProcessorFeatures.ZN5Tuning>;
def : Proc<"geode", [FeatureX87, FeatureCX8, FeatureMMX, FeaturePRFCHW],
[TuningSlowUAMem16, TuningInsertVZEROUPPER]>;
diff --git a/llvm/lib/Target/X86/X86PfmCounters.td b/llvm/lib/Target/X86/X86PfmCounters.td
index 2b1dac411c9927..c30e989cdc2af1 100644
--- a/llvm/lib/Target/X86/X86PfmCounters.td
+++ b/llvm/lib/Target/X86/X86PfmCounters.td
@@ -350,3 +350,4 @@ def ZnVer4PfmCounters : ProcPfmCounters {
let ValidationCounters = DefaultAMDPfmValidationCounters;
}
def : PfmCountersBinding<"znver4", ZnVer4PfmCounters>;
+def : PfmCountersBinding<"znver5", ZnVer4PfmCounters>;
diff --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
index 986b9a211ce6c1..a85d28d8308308 100644
--- a/llvm/lib/TargetParser/Host.cpp
+++ b/llvm/lib/TargetParser/Host.cpp
@@ -1151,6 +1151,25 @@ static const char *getAMDProcessorTypeAndSubtype(unsigned Family,
break; // "znver4"
}
break; // family 19h
+ case 26:
+ CPU = "znver5";
+ *Type = X86::AMDFAM1AH;
+ if (Model <= 0x77) {
+ // Models 00h-0Fh (Breithorn).
+ // Models 10h-1Fh (Breithorn-Dense).
+ // Models 20h-2Fh (Strix 1).
+ // Models 30h-37h (Strix 2).
+ // Models 38h-3Fh (Strix 3).
+ // Models 40h-4Fh (Granite Ridge).
+ // Models 50h-5Fh (Weisshorn).
+ // Models 60h-6Fh (Krackan1).
+ // Models 70h-77h (Sarlak).
+ CPU = "znver5";
+ *Subtype = X86::AMDFAM1AH_ZNVER5;
+ break; // "znver5"
+ }
+ break;
+
default:
break; // Unknown AMD CPU.
}
diff --git a/llvm/lib/TargetParser/X86TargetParser.cpp b/llvm/lib/TargetParser/X86TargetParser.cpp
index 57bda0651ea829..27feb9c78912a5 100644
--- a/llvm/lib/TargetParser/X86TargetParser.cpp
+++ b/llvm/lib/TargetParser/X86TargetParser.cpp
@@ -238,6 +238,10 @@ static constexpr FeatureBitset FeaturesZNVER4 =
FeatureAVX512BITALG | FeatureAVX512VPOPCNTDQ | FeatureAVX512BF16 |
FeatureGFNI | FeatureSHSTK;
+static constexpr FeatureBitset FeaturesZNVER5 =
+ FeaturesZNVER4 | FeatureAVXVNNI | FeatureMOVDIRI | FeatureMOVDIR64B |
+ FeatureAVX512VP2INTERSECT | FeaturePREFETCHI | FeatureAVXVNNI ;
+
// D151696 tranplanted Mangling and OnlyForCPUDispatchSpecific from
// X86TargetParser.def to here. They are assigned by following ways:
// 1. Copy the mangling from the original CPU_SPEICIFC MACROs. If no, assign
@@ -417,6 +421,7 @@ constexpr ProcInfo Processors[] = {
{ {"znver2"}, CK_ZNVER2, FEATURE_AVX2, FeaturesZNVER2, '\0', false },
{ {"znver3"}, CK_ZNVER3, FEATURE_AVX2, FeaturesZNVER3, '\0', false },
{ {"znver4"}, CK_ZNVER4, FEATURE_AVX512VBMI2, FeaturesZNVER4, '\0', false },
+ { {"znver5"}, CK_ZNVER5, FEATURE_AVX512VP2INTERSECT, FeaturesZNVER5, '\0', false },
// Generic 64-bit processor.
{ {"x86-64"}, CK_x86_64, FEATURE_SSE2 , FeaturesX86_64, '\0', false },
{ {"x86-64-v2"}, CK_x86_64_v2, FEATURE_SSE4_2 , FeaturesX86_64_V2, '\0', false },
diff --git a/llvm/test/CodeGen/X86/bypass-slow-division-64.ll b/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
index 6e0cfdd26a7866..b0ca0069a526b7 100644
--- a/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
+++ b/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
@@ -23,6 +23,7 @@
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver2 | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver3 | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver4 | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
+; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver5 | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
; Additional tests for 64-bit divide bypass
diff --git a/llvm/test/CodeGen/X86/cmp16.ll b/llvm/test/CodeGen/X86/cmp16.ll
index fa9e75ff16a5ca..8c14a78d9e1138 100644
--- a/llvm/test/CodeGen/X86/cmp16.ll
+++ b/llvm/test/CodeGen/X86/cmp16.ll
@@ -13,6 +13,7 @@
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver2 | FileCheck %s --check-prefixes=X64,X64-FAST
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver3 | FileCheck %s --check-prefixes=X64,X64-FAST
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver4 | FileCheck %s --check-prefixes=X64,X64-FAST
+; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver5 | FileCheck %s --check-prefixes=X64,X64-FAST
define i1 @cmp16_reg_eq_reg(i16 %a0, i16 %a1) {
; X86-GENERIC-LABEL: cmp16_reg_eq_reg:
diff --git a/llvm/test/CodeGen/X86/cpus-amd.ll b/llvm/test/CodeGen/X86/cpus-amd.ll
index 228a00428c4571..33b2cf37314788 100644
--- a/llvm/test/CodeGen/X86/cpus-amd.ll
+++ b/llvm/test/CodeGen/X86/cpus-amd.ll
@@ -29,6 +29,7 @@
; RUN: llc < %s -o /dev/null -mtriple=x86_64-unknown-unknown -mcpu=znver2 2>&1 | FileCheck %s --check-prefix=CHECK-NO-ERROR --allow-empty
; RUN: llc < %s -o /dev/null -mtriple=x86_64-unknown-unknown -mcpu=znver3 2>&1 | FileCheck %s --check-prefix=CHECK-NO-ERROR --allow-empty
; RUN: llc < %s -o /dev/null -mtriple=x86_64-unknown-unknown -mcpu=znver4 2>&1 | FileCheck %s --check-prefix=CHECK-NO-ERROR --allow-empty
+; RUN: llc < %s -o /dev/null -mtriple=x86_64-unknown-unknown -mcpu=znver5 2>&1 | FileCheck %s --check-prefix=CHECK-NO-ERROR --allow-empty
define void @foo() {
ret void
diff --git a/llvm/test/CodeGen/X86/rdpru.ll b/llvm/test/CodeGen/X86/rdpru.ll
index 7771f52653cb50..be79a4499a3389 100644
--- a/llvm/test/CodeGen/X86/rdpru.ll
+++ b/llvm/test/CodeGen/X86/rdpru.ll
@@ -6,6 +6,7 @@
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver2 | FileCheck %s --check-prefix=X64
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver3 -fast-isel | FileCheck %s --check-prefix=X64
; RUN: llc < %s -mtriple=x86_64-- -mcpu=znver4 -fast-isel | FileCheck ...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
@@ -1151,6 +1151,25 @@ static const char *getAMDProcessorTypeAndSubtype(unsigned Family, | |||
break; // "znver4" | |||
} | |||
break; // family 19h | |||
case 26: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you bump the equivalent code in compiler-rt
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I posted some patches a while ago to start unifying things so that there's a single canonical version that gets copied and pasted, but those have been stalled for a while on reviewers, so for now we're mostly stuck with the ad-hoc way of doing things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I posted some patches a while ago to start unifying things so that there's a single canonical version that gets copied and pasted, but those have been stalled for a while on reviewers, so for now we're mostly stuck with the ad-hoc way of doing things.
Can you point me to those pull requests.
BTW, I see that the compiler-rt change is in place. Do you find any issues there? I think the ordering with respect to libgcc is also okay.
b68bcc1#diff-f0c447f4acaa87cede210a02796f8a3b2f08f6d8acaf951ed4c65465bf37c967
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, looks like I missed it. Sorry about that!
There's #97856, #97861, and #97872 (still a draft).
I want to try and land some test infrastructure first (#101927) so I can ensure the refactorings don't break anything though. There are some more patches that I need to finish doing, but haven't gotten to the rest yet given the current ones are stalled.
@RKSimon Please post your comments. I have few subsequent patches for scheduler enablement, and some tuning patches lined up as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as a base patch (znver4 + extra isas) - we should hold off from cherry picking into 19.x until we see the scope of the follow up patches.
@ganeshgit Can you address the clang-format warnings please? |
Yes sure. I will correct them! |
@tru This patch at the very least needs to make it for 19.x but I was hoping we'd get some of the tuning improvements in as well - should we wait for those PRs or just get this committed and cherry picked straight away? |
Hi, This looks pretty safe, but feel free to think out loud about the risks of merging this. Technically it doesn't really fit the criteria of regression or important bugfix, I understand the needs of getting this into the release and I don't want to be a blocker for that. Tuning improvements could be riskier if I understand it correctly? |
@ganeshgit Ignore what I said earlier about waiting for the tuning patches :) Please can we get this committed to trunk, we'll let it brew for a few days and then cherry pick for 19.x - if you can create PRs for the tuning changes as soon as possible we can review them for 19.x on a case by case basis. |
Zen 5 support in GCC was upstreamed more than half a year ago -- why is the LLVM support being upstreamed only now, after missing the 19.x window? What steps are being taken to ensure this does not happen again? The change to X86TargetParser.h looks ABI breaking to me. |
I will upload the rebased code addressing the format errors shortly. Yes PRs will work for tuning changes. I think it will take some time. |
We have a dependency in libpfm for llvm which requires legal clearances. In future, we will make sure this gets addressed in advance and will try to upload our patches in sync with GCC patches. Apologies for the inconvenience. |
This seems unfortunate to me. But I don't think it would be good to insert the enum at the end of the list and changing the sorting order. How big of a problem would it be with a ABI break now? I know you have requested that we try to avoid those, even if it's strictly within the policy to still do that before the final release. |
It would be messy, but could we not place the CK_ZNVER5 enum entry at the end of the enum list just for 19.x and then fix the sorting in trunk? |
Seems better than an ABI break this late in the cycle, but I don't have super strong feelings. |
I would be fine with that. WDYT @nikic ? |
I will do the change placing the enum at the end of the list after CK_Geode and submit for this branch instead of rebasing then. |
I think it's better it land as it should be in |
@ganeshgit OK to commit? |
Yes I am okay. Can you please commit and close this PR. I will submit another PR for 19.x release. |
@ganeshgit or @RKSimon can one of you put up a PR against release/19.x with the abi compatible changes so that we have chance to look at it and approve it before the release tomorrow. |
I will submit the changes shortly. Probably I will create a branch with a base commit which is prior to this commit to main for easy integration. |
Create a branch from |
This patch enables the basic skeleton enablement of AMD next gen zen5 CPUs.