-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang] Limit alignment for emitted vectors #98629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The max aligment that ISel and therefore LLVM Verifier accepts is 2^14. The patch also forces indirect passing and returning of vectors bigger than this limit for generic and x86_64 targets.
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-clang Author: Mariya Podchishchaeva (Fznamznon) ChangesThe max aligment that ISel and therefore LLVM Verifier accepts is 2^14. The patch also forces indirect passing and returning of vectors bigger than this limit for generic and x86_64 targets. Full diff: https://github.com/llvm/llvm-project/pull/98629.diff 6 Files Affected:
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index a58fb5f979272..5f4e5c50bf7df 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -133,6 +133,7 @@ struct TransferrableTargetInfo {
unsigned short SuitableAlign;
unsigned short NewAlign;
unsigned MaxVectorAlign;
+ unsigned MaxPossibleAlign;
unsigned MaxTLSAlign;
const llvm::fltSemantics *HalfFormat, *BFloat16Format, *FloatFormat,
@@ -848,6 +849,8 @@ class TargetInfo : public TransferrableTargetInfo,
/// Return the maximum vector alignment supported for the given target.
unsigned getMaxVectorAlign() const { return MaxVectorAlign; }
+ unsigned getMaxPossibleAlign() const { return MaxPossibleAlign; }
+
unsigned getMaxOpenCLWorkGroupSize() const { return MaxOpenCLWorkGroupSize; }
/// Return the alignment (in bits) of the thrown exception object. This is
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 497579dcc56b6..303322a7f0032 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1985,6 +1985,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const {
VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
// Adjust the alignment for fixed-length RVV vectors.
Align = std::min<unsigned>(64, Width);
+ else if (Target->getMaxPossibleAlign() < Align)
+ Align = Target->getMaxPossibleAlign();
break;
}
diff --git a/clang/lib/Basic/TargetInfo.cpp b/clang/lib/Basic/TargetInfo.cpp
index 29f5cd14e46e1..5874df93d712b 100644
--- a/clang/lib/Basic/TargetInfo.cpp
+++ b/clang/lib/Basic/TargetInfo.cpp
@@ -121,6 +121,8 @@ TargetInfo::TargetInfo(const llvm::Triple &T) : Triple(T) {
LargeArrayAlign = 0;
MaxAtomicPromoteWidth = MaxAtomicInlineWidth = 0;
MaxVectorAlign = 0;
+ // LLVM can only process alignment up to 2^14 bytes.
+ MaxPossibleAlign = 8 << 14;
MaxTLSAlign = 0;
SizeType = UnsignedLong;
PtrDiffType = SignedLong;
diff --git a/clang/lib/CodeGen/ABIInfoImpl.cpp b/clang/lib/CodeGen/ABIInfoImpl.cpp
index e9a26abb77837..2b469d999438e 100644
--- a/clang/lib/CodeGen/ABIInfoImpl.cpp
+++ b/clang/lib/CodeGen/ABIInfoImpl.cpp
@@ -38,6 +38,10 @@ ABIArgInfo DefaultABIInfo::classifyArgumentType(QualType Ty) const {
: Context.LongLongTy))
return getNaturalAlignIndirect(Ty);
+ if (Ty->getAs<VectorType>() &&
+ Context.getTypeSize(Ty) > getTarget().getMaxPossibleAlign())
+ return getNaturalAlignIndirect(Ty);
+
return (isPromotableIntegerTypeForABI(Ty) ? ABIArgInfo::getExtend(Ty)
: ABIArgInfo::getDirect());
}
@@ -60,6 +64,10 @@ ABIArgInfo DefaultABIInfo::classifyReturnType(QualType RetTy) const {
: getContext().LongLongTy))
return getNaturalAlignIndirect(RetTy);
+ if (RetTy->getAs<VectorType>() &&
+ getContext().getTypeSize(RetTy) > getTarget().getMaxPossibleAlign())
+ return getNaturalAlignIndirect(RetTy);
+
return (isPromotableIntegerTypeForABI(RetTy) ? ABIArgInfo::getExtend(RetTy)
: ABIArgInfo::getDirect());
}
diff --git a/clang/lib/CodeGen/Targets/X86.cpp b/clang/lib/CodeGen/Targets/X86.cpp
index 1dc3172a6bdf9..de6f7580475fd 100644
--- a/clang/lib/CodeGen/Targets/X86.cpp
+++ b/clang/lib/CodeGen/Targets/X86.cpp
@@ -2175,6 +2175,10 @@ ABIArgInfo X86_64ABIInfo::getIndirectReturnResult(QualType Ty) const {
if (Ty->isBitIntType())
return getNaturalAlignIndirect(Ty);
+ if (Ty->getAs<VectorType>() &&
+ getContext().getTypeSize(Ty) > getTarget().getMaxPossibleAlign())
+ return getNaturalAlignIndirect(Ty);
+
return (isPromotableIntegerTypeForABI(Ty) ? ABIArgInfo::getExtend(Ty)
: ABIArgInfo::getDirect());
}
diff --git a/clang/test/CodeGen/big-vectors.c b/clang/test/CodeGen/big-vectors.c
new file mode 100644
index 0000000000000..dc44ca07ab8b0
--- /dev/null
+++ b/clang/test/CodeGen/big-vectors.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -O0 -triple x86_64 %s -emit-llvm -o - | FileCheck --check-prefixes=x86 %s
+// RUN: %clang_cc1 -O0 -triple spir64 %s -emit-llvm -o - | FileCheck --check-prefixes=SPIR %s
+
+typedef float fvec __attribute__((ext_vector_type(5120)));
+fvec foo(fvec a) {
+ fvec c;
+ return c;
+}
+// x86-DAG: define{{.*}}@foo{{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+// SPIR: define{{.*}}@foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+
+void bar() {
+ fvec a;
+ fvec c = foo(a);
+// x86-DAG: call void @foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+// SPIR: call spir_func void @foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+}
|
@llvm/pr-subscribers-clang-codegen Author: Mariya Podchishchaeva (Fznamznon) ChangesThe max aligment that ISel and therefore LLVM Verifier accepts is 2^14. The patch also forces indirect passing and returning of vectors bigger than this limit for generic and x86_64 targets. Full diff: https://github.com/llvm/llvm-project/pull/98629.diff 6 Files Affected:
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index a58fb5f979272..5f4e5c50bf7df 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -133,6 +133,7 @@ struct TransferrableTargetInfo {
unsigned short SuitableAlign;
unsigned short NewAlign;
unsigned MaxVectorAlign;
+ unsigned MaxPossibleAlign;
unsigned MaxTLSAlign;
const llvm::fltSemantics *HalfFormat, *BFloat16Format, *FloatFormat,
@@ -848,6 +849,8 @@ class TargetInfo : public TransferrableTargetInfo,
/// Return the maximum vector alignment supported for the given target.
unsigned getMaxVectorAlign() const { return MaxVectorAlign; }
+ unsigned getMaxPossibleAlign() const { return MaxPossibleAlign; }
+
unsigned getMaxOpenCLWorkGroupSize() const { return MaxOpenCLWorkGroupSize; }
/// Return the alignment (in bits) of the thrown exception object. This is
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 497579dcc56b6..303322a7f0032 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1985,6 +1985,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const {
VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
// Adjust the alignment for fixed-length RVV vectors.
Align = std::min<unsigned>(64, Width);
+ else if (Target->getMaxPossibleAlign() < Align)
+ Align = Target->getMaxPossibleAlign();
break;
}
diff --git a/clang/lib/Basic/TargetInfo.cpp b/clang/lib/Basic/TargetInfo.cpp
index 29f5cd14e46e1..5874df93d712b 100644
--- a/clang/lib/Basic/TargetInfo.cpp
+++ b/clang/lib/Basic/TargetInfo.cpp
@@ -121,6 +121,8 @@ TargetInfo::TargetInfo(const llvm::Triple &T) : Triple(T) {
LargeArrayAlign = 0;
MaxAtomicPromoteWidth = MaxAtomicInlineWidth = 0;
MaxVectorAlign = 0;
+ // LLVM can only process alignment up to 2^14 bytes.
+ MaxPossibleAlign = 8 << 14;
MaxTLSAlign = 0;
SizeType = UnsignedLong;
PtrDiffType = SignedLong;
diff --git a/clang/lib/CodeGen/ABIInfoImpl.cpp b/clang/lib/CodeGen/ABIInfoImpl.cpp
index e9a26abb77837..2b469d999438e 100644
--- a/clang/lib/CodeGen/ABIInfoImpl.cpp
+++ b/clang/lib/CodeGen/ABIInfoImpl.cpp
@@ -38,6 +38,10 @@ ABIArgInfo DefaultABIInfo::classifyArgumentType(QualType Ty) const {
: Context.LongLongTy))
return getNaturalAlignIndirect(Ty);
+ if (Ty->getAs<VectorType>() &&
+ Context.getTypeSize(Ty) > getTarget().getMaxPossibleAlign())
+ return getNaturalAlignIndirect(Ty);
+
return (isPromotableIntegerTypeForABI(Ty) ? ABIArgInfo::getExtend(Ty)
: ABIArgInfo::getDirect());
}
@@ -60,6 +64,10 @@ ABIArgInfo DefaultABIInfo::classifyReturnType(QualType RetTy) const {
: getContext().LongLongTy))
return getNaturalAlignIndirect(RetTy);
+ if (RetTy->getAs<VectorType>() &&
+ getContext().getTypeSize(RetTy) > getTarget().getMaxPossibleAlign())
+ return getNaturalAlignIndirect(RetTy);
+
return (isPromotableIntegerTypeForABI(RetTy) ? ABIArgInfo::getExtend(RetTy)
: ABIArgInfo::getDirect());
}
diff --git a/clang/lib/CodeGen/Targets/X86.cpp b/clang/lib/CodeGen/Targets/X86.cpp
index 1dc3172a6bdf9..de6f7580475fd 100644
--- a/clang/lib/CodeGen/Targets/X86.cpp
+++ b/clang/lib/CodeGen/Targets/X86.cpp
@@ -2175,6 +2175,10 @@ ABIArgInfo X86_64ABIInfo::getIndirectReturnResult(QualType Ty) const {
if (Ty->isBitIntType())
return getNaturalAlignIndirect(Ty);
+ if (Ty->getAs<VectorType>() &&
+ getContext().getTypeSize(Ty) > getTarget().getMaxPossibleAlign())
+ return getNaturalAlignIndirect(Ty);
+
return (isPromotableIntegerTypeForABI(Ty) ? ABIArgInfo::getExtend(Ty)
: ABIArgInfo::getDirect());
}
diff --git a/clang/test/CodeGen/big-vectors.c b/clang/test/CodeGen/big-vectors.c
new file mode 100644
index 0000000000000..dc44ca07ab8b0
--- /dev/null
+++ b/clang/test/CodeGen/big-vectors.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -O0 -triple x86_64 %s -emit-llvm -o - | FileCheck --check-prefixes=x86 %s
+// RUN: %clang_cc1 -O0 -triple spir64 %s -emit-llvm -o - | FileCheck --check-prefixes=SPIR %s
+
+typedef float fvec __attribute__((ext_vector_type(5120)));
+fvec foo(fvec a) {
+ fvec c;
+ return c;
+}
+// x86-DAG: define{{.*}}@foo{{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+// SPIR: define{{.*}}@foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+
+void bar() {
+ fvec a;
+ fvec c = foo(a);
+// x86-DAG: call void @foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+// SPIR: call spir_func void @foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+}
|
The verifier limit in most places is 2^32. It looks like 2^14 is specifically for ByVal. And there isn't really any good reason for that limit; it was arbitrarily chosen in 66011c1, nearly two decades ago, and could easily be changed if we want. Can we just fix the backend so ByVal has the same alignment limit as everything else? I really don't want the complexity of multiple different alignment limits here. |
(The relevant bitfield is was renamed to MemAlign, and is now in llvm/include/llvm/CodeGen/TargetCallingConv.h) |
I think the intention is to reduce the size of ArgFlagsTy. The super large alignment isn't meaningful to any targets. Maybe we can reduce the other limitation to 2^14 as well? |
@@ -133,6 +133,7 @@ struct TransferrableTargetInfo { | |||
unsigned short SuitableAlign; | |||
unsigned short NewAlign; | |||
unsigned MaxVectorAlign; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this used for? Can we limit it to 2^14?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used in the same ASTContext code that's modified by this patch. Probably fine to adjust it for x86.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no special limitaion on x86. We can change ArgFlagsTy if the size is not the concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MaxVectorAlign
is set for x86 targets. I can use it as a limit as well, but note that this patch also makes indirect return value if size of the argument exceeds the limit. So, some changes can be caused to the current behavior and test changes like in for exampe CodeGen/X86/x86-vec-i128.c
:
source/llvm-project/clang/test/CodeGen/X86/x86-vec-i128.c:36:19: error: MEM256ALIGN32: expected string not found in input
// MEM256ALIGN32: define{{.*}} <4 x i64> @test_v32u128(ptr noundef byval(<4 x i64>) align 32 %{{.*}}, ptr noundef byval(<2 x i128>) align 32 %{{.*}})
note: possible intended match here
define dso_local void @test_v32u128(ptr dead_on_unwind noalias writable sret(<4 x i64>) align 32 %agg.result, ptr noundef byval(<4 x i64>) align 32 %0, ptr noundef byval(<2 x i128>) align 32 %1)
If that is correct, I can use MaxVectorAlign
for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised the change of alignment affects ABI. Did you just expand
MaxVectorAlign =
hasFeature("avx512f") ? 512 : hasFeature("avx") ? 256 : 128;
to other targets in X86.h, or together with the change here?
I assumed the former won't change return type to set
, though align may need to update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because MaxVectorAlign
is set to a smaller value for some x86 targets -
llvm-project/clang/lib/Basic/Targets/X86.h
Line 954 in d8f0611
MaxVectorAlign = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that, but the value is set for Darwin, but the failure prefix is a non Darwin one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pasted log in a weird way, sorry, all failures in this particular test I see are for Darwin.
// MEM256ALIGN16: define{{.*}} <4 x i64> @test_v32u128(ptr noundef byval(<4 x i64>) align 16 %{{.*}}, ptr noundef byval(<2 x i128>) align 16 %{{.*}})
^
// MEM512ALIGN32: define{{.*}} <8 x i64> @test_v64u128(ptr noundef byval(<8 x i64>) align 32 %{{.*}}, ptr noundef byval(<4 x i128>) align 32 %{{.*}})
^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Passing/returning vector larger than target capacity (e.g. <4 x i64> without AVX) is ABI undefined behavior. The sret change matches with GCC, but may result in potential compatibility issue for Clang. Maybe we won't care it much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I updated the patch to use MaxVectorAlign
instead. Thanks
The size of ArgFlagsTy is, as far as I can tell, basically irrelevant: it's only used as part of computing the calling convention in isel. I don't think anyone will notice if it's a few bytes larger. The non-byval alignment limit was last adjusted in https://reviews.llvm.org/D110451 . |
BTW, I think the limitation I'm hitting is introduced by https://reviews.llvm.org/D121898 . |
The limitation existed before that; we just miscompiled instead of reporting an error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the relationship between the max vector alignment, and choosing whether to return a vector directly? As long as clang isn't emitting the byval attribute, I don't think it's the frontend's problem.
Oh, I see, there are two checks: one for the "align" attribute, and a separate check for the natural alignment of the type. I'm trying out changing the limit to see how hard it actually is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Closing since #99257 fixed the problem without side effects. Thank you very much @efriedma-quic |
The max aligment that ISel and therefore LLVM Verifier accepts is 2^14. The patch also forces indirect passing and returning of vectors bigger than this limit for generic and x86_64 targets.