[Clang][BPF] Add __BPF_CPU_VERSION__ macro #71856
@llvm/pr-subscribers-clang
Author: None (yonghong-song)
Changes
Sometimes a BPF developer might want to write different code for particular CPU versions. For example, the branch target is 16 bits for cpu v1/v2/v3 and 32 bits for cpu v4, so cpu v4 allows more aggressive loop unrolling than cpu v1/v2/v3 (see [1] for a kernel selftest failure due to this). We would like to keep aggressive loop unrolling for cpu v4 while limiting loop unrolling for earlier CPU versions. As another example, signed divide is only available with cpu v4.
Adding CPU-specific macros is fairly common in LLVM. For example, x86 has macros like '__i486__', '__pentium_mmx__', etc. AArch64 has '__ARM_NEON', '__ARM_FEATURE_SVE', etc.
This patch adds the __bpf_cpu_version__ macro. Current possible values are 0/1/2/3/4. The following is the -mcpu=... to __bpf_cpu_version__ mapping:
  cpu               __bpf_cpu_version__
  no -mcpu=<...>    1
  -mcpu=v1          1
  -mcpu=v2          2
  -mcpu=v3          3
  -mcpu=v4          4
  -mcpu=generic     1
  -mcpu=probe       0
[1] https://lore.kernel.org/bpf/3e3a8a30-dde0-43a1-981e-2274962780ef@linux.dev/
Full diff: https://github.com/llvm/llvm-project/pull/71856.diff
2 Files Affected:
diff --git a/clang/lib/Basic/Targets/BPF.cpp b/clang/lib/Basic/Targets/BPF.cpp
index d6288d2e0d0e176..a61e279f395ea31 100644
--- a/clang/lib/Basic/Targets/BPF.cpp
+++ b/clang/lib/Basic/Targets/BPF.cpp
@@ -29,6 +29,14 @@ void BPFTargetInfo::getTargetDefines(const LangOptions &Opts,
                                      MacroBuilder &Builder) const {
   Builder.defineMacro("__bpf__");
   Builder.defineMacro("__BPF__");
+
+  std::string CPU = getTargetOpts().CPU;
+  if (CPU == "probe")
+    Builder.defineMacro("__bpf_cpu_version__", "0");
+  else if (CPU.empty() || CPU == "generic")
+    Builder.defineMacro("__bpf_cpu_version__", "1");
+  else
+    Builder.defineMacro("__bpf_cpu_version__", CPU.substr(1));
 }
static constexpr llvm::StringLiteral ValidCPUNames[] = {"generic", "v1", "v2",
diff --git a/clang/test/Preprocessor/bpf-predefined-macros.c b/clang/test/Preprocessor/bpf-predefined-macros.c
index bcb985f95426622..364f4cc3f4d0c3c 100644
--- a/clang/test/Preprocessor/bpf-predefined-macros.c
+++ b/clang/test/Preprocessor/bpf-predefined-macros.c
@@ -1,5 +1,11 @@
-// RUN: %clang -E -target bpfel -x c -o - %s | FileCheck %s
-// RUN: %clang -E -target bpfeb -x c -o - %s | FileCheck %s
+// RUN: %clang -E -target bpfel -x c -o - %s | FileCheck -check-prefix=CHECK -check-prefix=CPU_NO %s
+// RUN: %clang -E -target bpfeb -x c -o - %s | FileCheck -check-prefix=CHECK -check-prefix=CPU_NO %s
+// RUN: %clang -E -target bpfel -mcpu=v1 -x c -o - %s | FileCheck -check-prefix=CHECK -check-prefix=CPU_V1 %s
+// RUN: %clang -E -target bpfel -mcpu=v2 -x c -o - %s | FileCheck -check-prefix=CHECK -check-prefix=CPU_V2 %s
+// RUN: %clang -E -target bpfel -mcpu=v3 -x c -o - %s | FileCheck -check-prefix=CHECK -check-prefix=CPU_V3 %s
+// RUN: %clang -E -target bpfel -mcpu=v4 -x c -o - %s | FileCheck -check-prefix=CHECK -check-prefix=CPU_V4 %s
+// RUN: %clang -E -target bpfel -mcpu=generic -x c -o - %s | FileCheck -check-prefix=CHECK -check-prefix=CPU_GENERIC %s
+// RUN: %clang -E -target bpfel -mcpu=probe -x c -o - %s | FileCheck -check-prefix=CHECK -check-prefix=CPU_PROBE %s
#ifdef __bpf__
int b;
@@ -10,7 +16,34 @@ int c;
#ifdef bpf
int d;
#endif
+#ifdef __bpf_cpu_version__
+int e;
+#endif
+#if __bpf_cpu_version__ == 0
+int f;
+#endif
+#if __bpf_cpu_version__ == 1
+int g;
+#endif
+#if __bpf_cpu_version__ == 2
+int h;
+#endif
+#if __bpf_cpu_version__ == 3
+int i;
+#endif
+#if __bpf_cpu_version__ == 4
+int j;
+#endif
// CHECK: int b;
// CHECK: int c;
// CHECK-NOT: int d;
+// CHECK: int e;
+
+// CPU_NO: int g;
+// CPU_V1: int g;
+// CPU_V2: int h;
+// CPU_V3: int i;
+// CPU_V4: int j;
+// CPU_GENERIC: int g;
+// CPU_PROBE: int f;
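The mapping that the diff above implements can be restated as a small standalone C function. This is only an illustrative sketch for checking the expected values by hand: the name `bpf_cpu_version_from_name` and the `-1` return for unrecognized names are assumptions made here and are not part of the patch, which operates on names that clang has already validated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not in the patch): returns the value that
 * __bpf_cpu_version__ would take for a given -mcpu name.
 * Pass "" to model a command line without -mcpu. */
int bpf_cpu_version_from_name(const char *cpu)
{
    assert(cpu != NULL);

    if (strcmp(cpu, "probe") == 0)
        return 0;
    if (cpu[0] == '\0' || strcmp(cpu, "generic") == 0)
        return 1;
    /* The names exercised by the test are "v1".."v4": drop the 'v'. */
    if (cpu[0] == 'v' && cpu[1] >= '1' && cpu[1] <= '4' && cpu[2] == '\0')
        return cpu[1] - '0';
    /* Assumption: report anything else as an error. */
    return -1;
}
```

The return values line up with the CPU_NO, CPU_V1 to CPU_V4, CPU_GENERIC and CPU_PROBE check prefixes in the test file.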
cc @4ast @anakryiko
Hi Yonghong,
Thank you for working on this.
clang/lib/Basic/Targets/BPF.cpp
Outdated
  else if (CPU.empty() || CPU == "generic")
    Builder.defineMacro("__bpf_cpu_version__", "1");
  else
    Builder.defineMacro("__bpf_cpu_version__", CPU.substr(1));
That works and is somewhat similar to other archs, like amdgcn_processor.
I have a slight preference for upper case: BPF_CPU_VERSION.
Let's add macros for all the feature groups too:
HasJmpExt = true;
HasJmp32 = true;
HasAlu32 = true;
HasJmpExt = true;
HasJmp32 = true;
HasAlu32 = true;
HasLdsx = !Disable_ldsx;
HasMovsx = !Disable_movsx;
HasBswap = !Disable_bswap;
HasSdivSmod = !Disable_sdiv_smod;
HasGotol = !Disable_gotol;
arm does __ARM_FEATURE_xx.
We can do __BPF_FEATURE_ALU32, __BPF_FEATURE_GOTOL.
Some architectures prefer lower case (x86, ppc, loongarch, riscv) while others (arm, etc.) prefer upper case. Yes, we can use upper case for all the proposed macros.
For BPF_CPU_VERSION, I would like __BPF_CPU_VERSION, i.e. adding '__' as a prefix. What do you think?
Okay, I don't particularly like __BPF_CPU_VERSION either; I actually prefer a double underscore both before and after BPF_CPU_VERSION. Now I realize that this is what you meant: GitHub just renders the surrounding underscores as highlighting.
Sometimes a BPF developer might want to write different code for particular CPU versions. For example, the branch target is 16 bits for cpu v1/v2/v3 and 32 bits for cpu v4, so cpu v4 allows more aggressive loop unrolling than cpu v1/v2/v3 (see [1] for a kernel selftest failure due to this). We would like to keep aggressive loop unrolling for cpu v4 while limiting loop unrolling for earlier CPU versions. As another example, signed divide is only available with cpu v4.

Adding CPU-specific macros is fairly common in LLVM. For example, x86 has macros like '__i486__', '__pentium_mmx__', etc. AArch64 has '__ARM_NEON', '__ARM_FEATURE_SVE', etc.

This patch adds the __BPF_CPU_VERSION__ macro. Current possible values are 0/1/2/3/4. The following is the -mcpu=... to __BPF_CPU_VERSION__ mapping:
```
cpu               __BPF_CPU_VERSION__
no -mcpu=<...>    1
-mcpu=v1          1
-mcpu=v2          2
-mcpu=v3          3
-mcpu=v4          4
-mcpu=generic     1
-mcpu=probe       0
```

This patch also adds some macros for developers to identify CPU instruction features:
```
feature macro               enabled in which cpu
__BPF_FEATURE_JMP_EXT       >= v2
__BPF_FEATURE_JMP32         >= v3
__BPF_FEATURE_ALU32         >= v3
__BPF_FEATURE_LDSX          >= v4
__BPF_FEATURE_MOVSX         >= v4
__BPF_FEATURE_BSWAP         >= v4
__BPF_FEATURE_SDIV_SMOD     >= v4
__BPF_FEATURE_GOTOL         >= v4
__BPF_FEATURE_ST            >= v4
```

[1] https://lore.kernel.org/bpf/3e3a8a30-dde0-43a1-981e-2274962780ef@linux.dev/
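As a rough illustration of the two use cases named in the description (loop unrolling and signed divide), a BPF program might test the macros as sketched below. The names `LOOP_UNROLL`, `N_ITEMS`, `sum_items` and `bpf_sdiv`, the unroll count of 4, and the fallback division are all assumptions made for this example; none of them come from the patch. The `#if !defined(__clang__)` branch is only there so the snippet also builds with a host compiler.

```c
/* Pick an unrolling policy from the CPU version (illustrative values). */
#if !defined(__clang__)
#define LOOP_UNROLL /* no clang loop pragmas available */
#elif defined(__BPF_CPU_VERSION__) && __BPF_CPU_VERSION__ >= 4
#define LOOP_UNROLL _Pragma("clang loop unroll(full)")
#else
#define LOOP_UNROLL _Pragma("clang loop unroll_count(4)")
#endif

#define N_ITEMS 8

/* Sums a fixed-size array; the pragma above decides how far to unroll. */
int sum_items(const int *items)
{
    int total = 0;

    LOOP_UNROLL
    for (int i = 0; i < N_ITEMS; i++)
        total += items[i];
    return total;
}

/* Signed division; the caller must ensure b != 0. */
int bpf_sdiv(int a, int b)
{
#ifdef __BPF_FEATURE_SDIV_SMOD
    return a / b; /* signed divide is available (cpu v4 and later) */
#else
    /* Fallback sketch: divide magnitudes as unsigned, then restore the sign. */
    unsigned int ua = a < 0 ? 0u - (unsigned int)a : (unsigned int)a;
    unsigned int ub = b < 0 ? 0u - (unsigned int)b : (unsigned int)b;
    unsigned int q = ua / ub;

    return ((a < 0) != (b < 0)) ? (int)(0u - q) : (int)q;
#endif
}
```

Testing a feature macro such as `__BPF_FEATURE_SDIV_SMOD` rather than comparing `__BPF_CPU_VERSION__` keeps the code tied to the instruction it actually needs, which follows the `__ARM_FEATURE_xx` style suggested in the review.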