Skip to content

[clang] Limit alignment for emitted vectors #98629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

Fznamznon
Copy link
Contributor

The max aligment that ISel and therefore LLVM Verifier accepts is 2^14. The patch also forces indirect passing and returning of vectors bigger than this limit for generic and x86_64 targets.

The max aligment that ISel and therefore LLVM Verifier accepts is 2^14.
The patch also forces indirect passing and returning of vectors bigger
than this limit for generic and x86_64 targets.
@Fznamznon Fznamznon requested a review from phoebewang July 12, 2024 13:21
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jul 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 12, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: Mariya Podchishchaeva (Fznamznon)

Changes

The max aligment that ISel and therefore LLVM Verifier accepts is 2^14. The patch also forces indirect passing and returning of vectors bigger than this limit for generic and x86_64 targets.


Full diff: https://github.com/llvm/llvm-project/pull/98629.diff

6 Files Affected:

  • (modified) clang/include/clang/Basic/TargetInfo.h (+3)
  • (modified) clang/lib/AST/ASTContext.cpp (+2)
  • (modified) clang/lib/Basic/TargetInfo.cpp (+2)
  • (modified) clang/lib/CodeGen/ABIInfoImpl.cpp (+8)
  • (modified) clang/lib/CodeGen/Targets/X86.cpp (+4)
  • (added) clang/test/CodeGen/big-vectors.c (+17)
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index a58fb5f979272..5f4e5c50bf7df 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -133,6 +133,7 @@ struct TransferrableTargetInfo {
   unsigned short SuitableAlign;
   unsigned short NewAlign;
   unsigned MaxVectorAlign;
+  unsigned MaxPossibleAlign;
   unsigned MaxTLSAlign;
 
   const llvm::fltSemantics *HalfFormat, *BFloat16Format, *FloatFormat,
@@ -848,6 +849,8 @@ class TargetInfo : public TransferrableTargetInfo,
   /// Return the maximum vector alignment supported for the given target.
   unsigned getMaxVectorAlign() const { return MaxVectorAlign; }
 
+  unsigned getMaxPossibleAlign() const { return MaxPossibleAlign; }
+
   unsigned getMaxOpenCLWorkGroupSize() const { return MaxOpenCLWorkGroupSize; }
 
   /// Return the alignment (in bits) of the thrown exception object. This is
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 497579dcc56b6..303322a7f0032 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1985,6 +1985,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const {
              VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
       // Adjust the alignment for fixed-length RVV vectors.
       Align = std::min<unsigned>(64, Width);
+    else if (Target->getMaxPossibleAlign() < Align)
+      Align = Target->getMaxPossibleAlign();
     break;
   }
 
diff --git a/clang/lib/Basic/TargetInfo.cpp b/clang/lib/Basic/TargetInfo.cpp
index 29f5cd14e46e1..5874df93d712b 100644
--- a/clang/lib/Basic/TargetInfo.cpp
+++ b/clang/lib/Basic/TargetInfo.cpp
@@ -121,6 +121,8 @@ TargetInfo::TargetInfo(const llvm::Triple &T) : Triple(T) {
   LargeArrayAlign = 0;
   MaxAtomicPromoteWidth = MaxAtomicInlineWidth = 0;
   MaxVectorAlign = 0;
+  // LLVM can only process alignment up to 2^14 bytes.
+  MaxPossibleAlign = 8 << 14;
   MaxTLSAlign = 0;
   SizeType = UnsignedLong;
   PtrDiffType = SignedLong;
diff --git a/clang/lib/CodeGen/ABIInfoImpl.cpp b/clang/lib/CodeGen/ABIInfoImpl.cpp
index e9a26abb77837..2b469d999438e 100644
--- a/clang/lib/CodeGen/ABIInfoImpl.cpp
+++ b/clang/lib/CodeGen/ABIInfoImpl.cpp
@@ -38,6 +38,10 @@ ABIArgInfo DefaultABIInfo::classifyArgumentType(QualType Ty) const {
                                 : Context.LongLongTy))
       return getNaturalAlignIndirect(Ty);
 
+  if (Ty->getAs<VectorType>() &&
+      Context.getTypeSize(Ty) > getTarget().getMaxPossibleAlign())
+    return getNaturalAlignIndirect(Ty);
+
   return (isPromotableIntegerTypeForABI(Ty) ? ABIArgInfo::getExtend(Ty)
                                             : ABIArgInfo::getDirect());
 }
@@ -60,6 +64,10 @@ ABIArgInfo DefaultABIInfo::classifyReturnType(QualType RetTy) const {
                                      : getContext().LongLongTy))
       return getNaturalAlignIndirect(RetTy);
 
+  if (RetTy->getAs<VectorType>() &&
+      getContext().getTypeSize(RetTy) > getTarget().getMaxPossibleAlign())
+    return getNaturalAlignIndirect(RetTy);
+
   return (isPromotableIntegerTypeForABI(RetTy) ? ABIArgInfo::getExtend(RetTy)
                                                : ABIArgInfo::getDirect());
 }
diff --git a/clang/lib/CodeGen/Targets/X86.cpp b/clang/lib/CodeGen/Targets/X86.cpp
index 1dc3172a6bdf9..de6f7580475fd 100644
--- a/clang/lib/CodeGen/Targets/X86.cpp
+++ b/clang/lib/CodeGen/Targets/X86.cpp
@@ -2175,6 +2175,10 @@ ABIArgInfo X86_64ABIInfo::getIndirectReturnResult(QualType Ty) const {
     if (Ty->isBitIntType())
       return getNaturalAlignIndirect(Ty);
 
+    if (Ty->getAs<VectorType>() &&
+        getContext().getTypeSize(Ty) > getTarget().getMaxPossibleAlign())
+      return getNaturalAlignIndirect(Ty);
+
     return (isPromotableIntegerTypeForABI(Ty) ? ABIArgInfo::getExtend(Ty)
                                               : ABIArgInfo::getDirect());
   }
diff --git a/clang/test/CodeGen/big-vectors.c b/clang/test/CodeGen/big-vectors.c
new file mode 100644
index 0000000000000..dc44ca07ab8b0
--- /dev/null
+++ b/clang/test/CodeGen/big-vectors.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -O0 -triple x86_64 %s -emit-llvm -o - | FileCheck --check-prefixes=x86 %s
+// RUN: %clang_cc1 -O0 -triple spir64 %s -emit-llvm -o - | FileCheck --check-prefixes=SPIR %s
+
+typedef float fvec __attribute__((ext_vector_type(5120)));
+fvec foo(fvec a) {
+  fvec c;
+  return c;
+}
+// x86-DAG: define{{.*}}@foo{{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+// SPIR: define{{.*}}@foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+
+void bar() {
+  fvec a;
+  fvec c = foo(a);
+// x86-DAG: call void @foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+// SPIR: call spir_func void @foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+}

@llvmbot
Copy link
Member

llvmbot commented Jul 12, 2024

@llvm/pr-subscribers-clang-codegen

Author: Mariya Podchishchaeva (Fznamznon)

Changes

The max aligment that ISel and therefore LLVM Verifier accepts is 2^14. The patch also forces indirect passing and returning of vectors bigger than this limit for generic and x86_64 targets.


Full diff: https://github.com/llvm/llvm-project/pull/98629.diff

6 Files Affected:

  • (modified) clang/include/clang/Basic/TargetInfo.h (+3)
  • (modified) clang/lib/AST/ASTContext.cpp (+2)
  • (modified) clang/lib/Basic/TargetInfo.cpp (+2)
  • (modified) clang/lib/CodeGen/ABIInfoImpl.cpp (+8)
  • (modified) clang/lib/CodeGen/Targets/X86.cpp (+4)
  • (added) clang/test/CodeGen/big-vectors.c (+17)
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index a58fb5f979272..5f4e5c50bf7df 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -133,6 +133,7 @@ struct TransferrableTargetInfo {
   unsigned short SuitableAlign;
   unsigned short NewAlign;
   unsigned MaxVectorAlign;
+  unsigned MaxPossibleAlign;
   unsigned MaxTLSAlign;
 
   const llvm::fltSemantics *HalfFormat, *BFloat16Format, *FloatFormat,
@@ -848,6 +849,8 @@ class TargetInfo : public TransferrableTargetInfo,
   /// Return the maximum vector alignment supported for the given target.
   unsigned getMaxVectorAlign() const { return MaxVectorAlign; }
 
+  unsigned getMaxPossibleAlign() const { return MaxPossibleAlign; }
+
   unsigned getMaxOpenCLWorkGroupSize() const { return MaxOpenCLWorkGroupSize; }
 
   /// Return the alignment (in bits) of the thrown exception object. This is
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 497579dcc56b6..303322a7f0032 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1985,6 +1985,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const {
              VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
       // Adjust the alignment for fixed-length RVV vectors.
       Align = std::min<unsigned>(64, Width);
+    else if (Target->getMaxPossibleAlign() < Align)
+      Align = Target->getMaxPossibleAlign();
     break;
   }
 
diff --git a/clang/lib/Basic/TargetInfo.cpp b/clang/lib/Basic/TargetInfo.cpp
index 29f5cd14e46e1..5874df93d712b 100644
--- a/clang/lib/Basic/TargetInfo.cpp
+++ b/clang/lib/Basic/TargetInfo.cpp
@@ -121,6 +121,8 @@ TargetInfo::TargetInfo(const llvm::Triple &T) : Triple(T) {
   LargeArrayAlign = 0;
   MaxAtomicPromoteWidth = MaxAtomicInlineWidth = 0;
   MaxVectorAlign = 0;
+  // LLVM can only process alignment up to 2^14 bytes.
+  MaxPossibleAlign = 8 << 14;
   MaxTLSAlign = 0;
   SizeType = UnsignedLong;
   PtrDiffType = SignedLong;
diff --git a/clang/lib/CodeGen/ABIInfoImpl.cpp b/clang/lib/CodeGen/ABIInfoImpl.cpp
index e9a26abb77837..2b469d999438e 100644
--- a/clang/lib/CodeGen/ABIInfoImpl.cpp
+++ b/clang/lib/CodeGen/ABIInfoImpl.cpp
@@ -38,6 +38,10 @@ ABIArgInfo DefaultABIInfo::classifyArgumentType(QualType Ty) const {
                                 : Context.LongLongTy))
       return getNaturalAlignIndirect(Ty);
 
+  if (Ty->getAs<VectorType>() &&
+      Context.getTypeSize(Ty) > getTarget().getMaxPossibleAlign())
+    return getNaturalAlignIndirect(Ty);
+
   return (isPromotableIntegerTypeForABI(Ty) ? ABIArgInfo::getExtend(Ty)
                                             : ABIArgInfo::getDirect());
 }
@@ -60,6 +64,10 @@ ABIArgInfo DefaultABIInfo::classifyReturnType(QualType RetTy) const {
                                      : getContext().LongLongTy))
       return getNaturalAlignIndirect(RetTy);
 
+  if (RetTy->getAs<VectorType>() &&
+      getContext().getTypeSize(RetTy) > getTarget().getMaxPossibleAlign())
+    return getNaturalAlignIndirect(RetTy);
+
   return (isPromotableIntegerTypeForABI(RetTy) ? ABIArgInfo::getExtend(RetTy)
                                                : ABIArgInfo::getDirect());
 }
diff --git a/clang/lib/CodeGen/Targets/X86.cpp b/clang/lib/CodeGen/Targets/X86.cpp
index 1dc3172a6bdf9..de6f7580475fd 100644
--- a/clang/lib/CodeGen/Targets/X86.cpp
+++ b/clang/lib/CodeGen/Targets/X86.cpp
@@ -2175,6 +2175,10 @@ ABIArgInfo X86_64ABIInfo::getIndirectReturnResult(QualType Ty) const {
     if (Ty->isBitIntType())
       return getNaturalAlignIndirect(Ty);
 
+    if (Ty->getAs<VectorType>() &&
+        getContext().getTypeSize(Ty) > getTarget().getMaxPossibleAlign())
+      return getNaturalAlignIndirect(Ty);
+
     return (isPromotableIntegerTypeForABI(Ty) ? ABIArgInfo::getExtend(Ty)
                                               : ABIArgInfo::getDirect());
   }
diff --git a/clang/test/CodeGen/big-vectors.c b/clang/test/CodeGen/big-vectors.c
new file mode 100644
index 0000000000000..dc44ca07ab8b0
--- /dev/null
+++ b/clang/test/CodeGen/big-vectors.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -O0 -triple x86_64 %s -emit-llvm -o - | FileCheck --check-prefixes=x86 %s
+// RUN: %clang_cc1 -O0 -triple spir64 %s -emit-llvm -o - | FileCheck --check-prefixes=SPIR %s
+
+typedef float fvec __attribute__((ext_vector_type(5120)));
+fvec foo(fvec a) {
+  fvec c;
+  return c;
+}
+// x86-DAG: define{{.*}}@foo{{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+// SPIR: define{{.*}}@foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+
+void bar() {
+  fvec a;
+  fvec c = foo(a);
+// x86-DAG: call void @foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+// SPIR: call spir_func void @foo({{.*}}sret(<5120 x float>) align 16384{{.*}}byval(<5120 x float>) align 16384
+}

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Jul 12, 2024

max aligment [...] LLVM Verifier accepts is 2^14

The verifier limit in most places is 2^32. It looks like 2^14 is specifically for ByVal. And there isn't really any good reason for that limit; it was arbitrarily chosen in 66011c1, nearly two decades ago, and could easily be changed if we want.

Can we just fix the backend so ByVal has the same alignment limit as everything else? I really don't want the complexity of multiple different alignment limits here.

@efriedma-quic
Copy link
Collaborator

(The relevant bitfield is was renamed to MemAlign, and is now in llvm/include/llvm/CodeGen/TargetCallingConv.h)

@phoebewang
Copy link
Contributor

max aligment [...] LLVM Verifier accepts is 2^14

The verifier limit in most places is 2^32. It looks like 2^14 is specifically for ByVal. And there isn't really any good reason for that limit; it was arbitrarily chosen in 66011c1, nearly two decades ago, and could easily be changed if we want.

Can we just fix the backend so ByVal has the same alignment limit as everything else? I really don't want the complexity of multiple different alignment limits here.

I think the intention is to reduce the size of ArgFlagsTy. The super large alignment isn't meaningful to any targets. Maybe we can reduce the other limitation to 2^14 as well?
Another alternative suggested by @nikic was to use the free bits in PointerAddrSpace. If we use e.g. 5bits, we can limit it to 2^31, but still smaller than 2^32.

@@ -133,6 +133,7 @@ struct TransferrableTargetInfo {
unsigned short SuitableAlign;
unsigned short NewAlign;
unsigned MaxVectorAlign;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this used for? Can we limit it to 2^14?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in the same ASTContext code that's modified by this patch. Probably fine to adjust it for x86.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no special limitaion on x86. We can change ArgFlagsTy if the size is not the concern.

Copy link
Contributor Author

@Fznamznon Fznamznon Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MaxVectorAlign is set for x86 targets. I can use it as a limit as well, but note that this patch also makes indirect return value if size of the argument exceeds the limit. So, some changes can be caused to the current behavior and test changes like in for exampe CodeGen/X86/x86-vec-i128.c:

source/llvm-project/clang/test/CodeGen/X86/x86-vec-i128.c:36:19: error: MEM256ALIGN32: expected string not found in input
// MEM256ALIGN32: define{{.*}} <4 x i64> @test_v32u128(ptr noundef byval(<4 x i64>) align 32 %{{.*}}, ptr noundef byval(<2 x i128>) align 32 %{{.*}})
note: possible intended match here
define dso_local void @test_v32u128(ptr dead_on_unwind noalias writable sret(<4 x i64>) align 32 %agg.result, ptr noundef byval(<4 x i64>) align 32 %0, ptr noundef byval(<2 x i128>) align 32 %1)

If that is correct, I can use MaxVectorAlign for sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised the change of alignment affects ABI. Did you just expand

    MaxVectorAlign =
        hasFeature("avx512f") ? 512 : hasFeature("avx") ? 256 : 128;

to other targets in X86.h, or together with the change here?

I assumed the former won't change return type to set, though align may need to update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because MaxVectorAlign is set to a smaller value for some x86 targets -

.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that, but the value is set for Darwin, but the failure prefix is a non Darwin one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pasted log in a weird way, sorry, all failures in this particular test I see are for Darwin.

// MEM256ALIGN16: define{{.*}} <4 x i64> @test_v32u128(ptr noundef byval(<4 x i64>) align 16 %{{.*}}, ptr noundef byval(<2 x i128>) align 16 %{{.*}})
                  ^
// MEM512ALIGN32: define{{.*}} <8 x i64> @test_v64u128(ptr noundef byval(<8 x i64>) align 32 %{{.*}}, ptr noundef byval(<4 x i128>) align 32 %{{.*}})
                  ^

Copy link
Contributor

@phoebewang phoebewang Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Passing/returning vector larger than target capacity (e.g. <4 x i64> without AVX) is ABI undefined behavior. The sret change matches with GCC, but may result in potential compatibility issue for Clang. Maybe we won't care it much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I updated the patch to use MaxVectorAlign instead. Thanks

@efriedma-quic
Copy link
Collaborator

The size of ArgFlagsTy is, as far as I can tell, basically irrelevant: it's only used as part of computing the calling convention in isel. I don't think anyone will notice if it's a few bytes larger.

The non-byval alignment limit was last adjusted in https://reviews.llvm.org/D110451 .

@Fznamznon
Copy link
Contributor Author

BTW, I think the limitation I'm hitting is introduced by https://reviews.llvm.org/D121898 .

@efriedma-quic
Copy link
Collaborator

BTW, I think the limitation I'm hitting is introduced by https://reviews.llvm.org/D121898 .

The limitation existed before that; we just miscompiled instead of reporting an error.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the relationship between the max vector alignment, and choosing whether to return a vector directly? As long as clang isn't emitting the byval attribute, I don't think it's the frontend's problem.

@efriedma-quic
Copy link
Collaborator

Oh, I see, there are two checks: one for the "align" attribute, and a separate check for the natural alignment of the type.

I'm trying out changing the limit to see how hard it actually is.

Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@Fznamznon
Copy link
Contributor Author

Closing since #99257 fixed the problem without side effects. Thank you very much @efriedma-quic

@Fznamznon Fznamznon closed this Jul 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants