-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[CIR] Upstream support for record packing and padding #136036
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change adds support for packing and padding record types in ClangIR and introduces some infrastructure needed for this computation. Although union support has not been upstreamed yet, there was no good way to report unions as NYI in the layout computation, so the code added here includes layout computation for unions. Unions will be added soon.
@llvm/pr-subscribers-clangir @llvm/pr-subscribers-clang Author: Andy Kaylor (andykaylor) ChangesThis change adds support for packing and padding record types in ClangIR and introduces some infrastructure needed for this computation. Although union support has not been upstreamed yet, there is no good way to report unions as NYI in the layout computation, so the code added here includes layout computation for unions. Unions will be added soon. Patch is 21.23 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136036.diff 9 Files Affected:
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td b/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
index 3680ded4afafe..792940857d60c 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
@@ -229,4 +229,32 @@ def ConstPtrAttr : CIR_Attr<"ConstPtr", "ptr", [TypedAttrInterface]> {
}];
}
+//===----------------------------------------------------------------------===//
+// RecordLayoutAttr
+//===----------------------------------------------------------------------===//
+
+// Used to decouple layout information from the record type. RecordType's
+// uses this attribute to cache that information.
+
+def RecordLayoutAttr : CIR_Attr<"RecordLayout", "record_layout"> {
+ let summary = "ABI specific information about a record layout";
+ let description = [{
+ Holds layout information often queried by !cir.record users
+ during lowering passes and optimizations.
+ }];
+
+ let parameters = (ins "unsigned":$size,
+ "unsigned":$alignment,
+ "bool":$padded,
+ "mlir::Type":$largest_member,
+ "mlir::ArrayAttr":$offsets);
+
+ let genVerifyDecl = 1;
+ let assemblyFormat = [{
+ `<`
+ struct($size, $alignment, $padded, $largest_member, $offsets)
+ `>`
+ }];
+}
+
#endif // LLVM_CLANG_CIR_DIALECT_IR_CIRATTRS_TD
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRDataLayout.h b/clang/include/clang/CIR/Dialect/IR/CIRDataLayout.h
new file mode 100644
index 0000000000000..62fc53f2362fb
--- /dev/null
+++ b/clang/include/clang/CIR/Dialect/IR/CIRDataLayout.h
@@ -0,0 +1,33 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+// Provides an LLVM-like API wrapper to DLTI and MLIR layout queries. This
+// makes it easier to port some of LLVM codegen layout logic to CIR.
+//===----------------------------------------------------------------------===//
+
+#ifndef CLANG_CIR_DIALECT_IR_CIRDATALAYOUT_H
+#define CLANG_CIR_DIALECT_IR_CIRDATALAYOUT_H
+
+#include "mlir/IR/BuiltinOps.h"
+
+namespace cir {
+
+// TODO(cir): This might be replaced by a CIRDataLayout interface which can
+// provide the same functionalities.
+class CIRDataLayout {
+ // This is starting with the minimum functionality needed for code that is
+ // being upstreamed. Additional methods and members will be added as needed.
+public:
+ mlir::DataLayout layout;
+
+ /// Constructs a DataLayout the module's data layout attribute.
+ CIRDataLayout(mlir::ModuleOp modOp) : layout{modOp} {}
+};
+
+} // namespace cir
+
+#endif // CLANG_CIR_DIALECT_IR_CIRDATALAYOUT_H
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRTypes.td b/clang/include/clang/CIR/Dialect/IR/CIRTypes.td
index 23e20755dca95..3cec3701ab9b3 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRTypes.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIRTypes.td
@@ -502,6 +502,15 @@ def CIR_RecordType : CIR_Type<"Record", "record",
void complete(llvm::ArrayRef<mlir::Type> members, bool packed,
bool isPadded);
+
+ // Utilities for lazily computing and cacheing data layout info.
+ // FIXME: currently opaque because there's a cycle if CIRTypes.types include
+ // from CIRAttrs.h. The implementation operates in terms of RecordLayoutAttr
+ // instead.
+ private:
+ mutable mlir::Attribute layoutInfo;
+ void computeSizeAndAlignment(const mlir::DataLayout &dataLayout) const;
+ public:
}];
let hasCustomAssemblyFormat = 1;
diff --git a/clang/include/clang/CIR/MissingFeatures.h b/clang/include/clang/CIR/MissingFeatures.h
index 5ba55c53dfc4d..798fcb0900a91 100644
--- a/clang/include/clang/CIR/MissingFeatures.h
+++ b/clang/include/clang/CIR/MissingFeatures.h
@@ -102,13 +102,10 @@ struct MissingFeatures {
static bool shouldReverseUnaryCondOnBoolExpr() { return false; }
// RecordType
- static bool recordTypeLayoutInfo() { return false; }
static bool recursiveRecordLayout() { return false; }
static bool skippedLayout() { return false; }
static bool astRecordDeclAttr() { return false; }
static bool cxxSupport() { return false; }
- static bool packedRecords() { return false; }
- static bool recordPadding() { return false; }
static bool recordZeroInit() { return false; }
static bool zeroSizeRecordMembers() { return false; }
diff --git a/clang/lib/CIR/CodeGen/CIRGenRecordLayoutBuilder.cpp b/clang/lib/CIR/CodeGen/CIRGenRecordLayoutBuilder.cpp
index 4e3adeaf50187..5e209b5b92503 100644
--- a/clang/lib/CIR/CodeGen/CIRGenRecordLayoutBuilder.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenRecordLayoutBuilder.cpp
@@ -19,6 +19,7 @@
#include "clang/AST/DeclCXX.h"
#include "clang/AST/RecordLayout.h"
#include "clang/CIR/Dialect/IR/CIRAttrs.h"
+#include "clang/CIR/Dialect/IR/CIRDataLayout.h"
#include "llvm/Support/Casting.h"
#include <memory>
@@ -57,8 +58,18 @@ struct CIRRecordLowering final {
CIRRecordLowering(CIRGenTypes &cirGenTypes, const RecordDecl *recordDecl,
bool isPacked);
+ /// Constructs a MemberInfo instance from an offset and mlir::Type.
+ MemberInfo makeStorageInfo(CharUnits offset, mlir::Type data) {
+ return MemberInfo(offset, MemberInfo::InfoKind::Field, data);
+ }
+
void lower();
+ /// Determines if we need a packed llvm struct.
+ void determinePacked();
+ /// Inserts padding everywhere it's needed.
+ void insertPadding();
+
void accumulateFields();
CharUnits bitsToCharUnits(uint64_t bitOffset) {
@@ -66,12 +77,33 @@ struct CIRRecordLowering final {
}
CharUnits getSize(mlir::Type Ty) {
- assert(!cir::MissingFeatures::recordTypeLayoutInfo());
- return CharUnits::One();
+ return CharUnits::fromQuantity(dataLayout.layout.getTypeSize(Ty));
}
CharUnits getAlignment(mlir::Type Ty) {
- assert(!cir::MissingFeatures::recordTypeLayoutInfo());
- return CharUnits::One();
+ return CharUnits::fromQuantity(dataLayout.layout.getTypeABIAlignment(Ty));
+ }
+
+ /// Wraps cir::IntType with some implicit arguments.
+ mlir::Type getUIntNType(uint64_t numBits) {
+ unsigned alignedBits = llvm::PowerOf2Ceil(numBits);
+ alignedBits = std::max(8u, alignedBits);
+ return cir::IntType::get(&cirGenTypes.getMLIRContext(), alignedBits,
+ /*isSigned=*/false);
+ }
+
+ mlir::Type getCharType() {
+ return cir::IntType::get(&cirGenTypes.getMLIRContext(),
+ astContext.getCharWidth(),
+ /*isSigned=*/false);
+ }
+
+ mlir::Type getByteArrayType(CharUnits numberOfChars) {
+ assert(!numberOfChars.isZero() && "Empty byte arrays aren't allowed.");
+ mlir::Type type = getCharType();
+ return numberOfChars == CharUnits::One()
+ ? type
+ : cir::ArrayType::get(type.getContext(), type,
+ numberOfChars.getQuantity());
}
mlir::Type getStorageType(const FieldDecl *fieldDecl) {
@@ -100,6 +132,7 @@ struct CIRRecordLowering final {
// Output fields, consumed by CIRGenTypes::computeRecordLayout
llvm::SmallVector<mlir::Type, 16> fieldTypes;
llvm::DenseMap<const FieldDecl *, unsigned> fields;
+ cir::CIRDataLayout dataLayout;
LLVM_PREFERRED_TYPE(bool)
unsigned zeroInitializable : 1;
@@ -121,6 +154,7 @@ CIRRecordLowering::CIRRecordLowering(CIRGenTypes &cirGenTypes,
astContext(cirGenTypes.getASTContext()), recordDecl(recordDecl),
astRecordLayout(
cirGenTypes.getASTContext().getASTRecordLayout(recordDecl)),
+ dataLayout(cirGenTypes.getCGModule().getModule()),
zeroInitializable(true), packed(isPacked), padded(false) {}
void CIRRecordLowering::lower() {
@@ -138,18 +172,20 @@ void CIRRecordLowering::lower() {
assert(!cir::MissingFeatures::cxxSupport());
+ CharUnits size = astRecordLayout.getSize();
+
accumulateFields();
llvm::stable_sort(members);
// TODO: implement clipTailPadding once bitfields are implemented
assert(!cir::MissingFeatures::bitfields());
- // TODO: implemented packed records
- assert(!cir::MissingFeatures::packedRecords());
- // TODO: implement padding
- assert(!cir::MissingFeatures::recordPadding());
- // TODO: support zeroInit
assert(!cir::MissingFeatures::recordZeroInit());
+ members.push_back(makeStorageInfo(size, getUIntNType(8)));
+ determinePacked();
+ insertPadding();
+ members.pop_back();
+
fillOutputFields();
}
@@ -186,6 +222,56 @@ void CIRRecordLowering::accumulateFields() {
}
}
+void CIRRecordLowering::determinePacked() {
+ if (packed)
+ return;
+ CharUnits alignment = CharUnits::One();
+
+ // TODO(cir): handle non-virtual base types
+ assert(!cir::MissingFeatures::cxxSupport());
+
+ for (const MemberInfo &member : members) {
+ if (!member.data)
+ continue;
+ // If any member falls at an offset that it not a multiple of its alignment,
+ // then the entire record must be packed.
+ if (member.offset % getAlignment(member.data))
+ packed = true;
+ alignment = std::max(alignment, getAlignment(member.data));
+ }
+ // If the size of the record (the capstone's offset) is not a multiple of the
+ // record's alignment, it must be packed.
+ if (members.back().offset % alignment)
+ packed = true;
+ // Update the alignment of the sentinel.
+ if (!packed)
+ members.back().data = getUIntNType(astContext.toBits(alignment));
+}
+
+void CIRRecordLowering::insertPadding() {
+ std::vector<std::pair<CharUnits, CharUnits>> padding;
+ CharUnits size = CharUnits::Zero();
+ for (const MemberInfo &member : members) {
+ if (!member.data)
+ continue;
+ CharUnits offset = member.offset;
+ assert(offset >= size);
+ // Insert padding if we need to.
+ if (offset !=
+ size.alignTo(packed ? CharUnits::One() : getAlignment(member.data)))
+ padding.push_back(std::make_pair(size, offset - size));
+ size = offset + getSize(member.data);
+ }
+ if (padding.empty())
+ return;
+ padded = true;
+ // Add the padding to the Members list and sort it.
+ for (const std::pair<CharUnits, CharUnits> &paddingPair : padding)
+ members.push_back(makeStorageInfo(paddingPair.first,
+ getByteArrayType(paddingPair.second)));
+ llvm::stable_sort(members);
+}
+
std::unique_ptr<CIRGenRecordLayout>
CIRGenTypes::computeRecordLayout(const RecordDecl *rd, cir::RecordType *ty) {
CIRRecordLowering lowering(*this, rd, /*packed=*/false);
diff --git a/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp b/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
index a8d9f6a0e6e9b..cc3baf1f348f9 100644
--- a/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
@@ -55,6 +55,22 @@ void CIRDialect::printAttribute(Attribute attr, DialectAsmPrinter &os) const {
llvm_unreachable("unexpected CIR type kind");
}
+//===----------------------------------------------------------------------===//
+// RecordLayoutAttr definitions
+//===----------------------------------------------------------------------===//
+
+LogicalResult RecordLayoutAttr::verify(
+ ::llvm::function_ref<::mlir::InFlightDiagnostic()> emitError, unsigned size,
+ unsigned alignment, bool padded, mlir::Type largest_member,
+ mlir::ArrayAttr offsets) {
+ if (not std::all_of(offsets.begin(), offsets.end(), [](mlir::Attribute attr) {
+ return mlir::isa<mlir::IntegerAttr>(attr);
+ })) {
+ return emitError() << "all index values must be integers";
+ }
+ return success();
+}
+
//===----------------------------------------------------------------------===//
// ConstPtrAttr definitions
//===----------------------------------------------------------------------===//
diff --git a/clang/lib/CIR/Dialect/IR/CIRTypes.cpp b/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
index 160732d9c3610..527607c4c6d72 100644
--- a/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
@@ -177,6 +177,12 @@ void RecordType::print(mlir::AsmPrinter &printer) const {
// Type not yet printed: continue printing the entire record.
printer << ' ';
+ if (getPacked())
+ printer << "packed ";
+
+ if (getPadded())
+ printer << "padded ";
+
if (isIncomplete()) {
printer << "incomplete";
} else {
@@ -210,6 +216,10 @@ mlir::StringAttr RecordType::getName() const { return getImpl()->name; }
bool RecordType::getIncomplete() const { return getImpl()->incomplete; }
+bool RecordType::getPacked() const { return getImpl()->packed; }
+
+bool RecordType::getPadded() const { return getImpl()->padded; }
+
cir::RecordType::RecordKind RecordType::getKind() const {
return getImpl()->kind;
}
@@ -225,17 +235,108 @@ void RecordType::complete(ArrayRef<Type> members, bool packed, bool padded) {
//===----------------------------------------------------------------------===//
llvm::TypeSize
-RecordType::getTypeSizeInBits(const ::mlir::DataLayout &dataLayout,
- ::mlir::DataLayoutEntryListRef params) const {
- assert(!cir::MissingFeatures::recordTypeLayoutInfo());
- return llvm::TypeSize::getFixed(8);
+RecordType::getTypeSizeInBits(const mlir::DataLayout &dataLayout,
+ mlir::DataLayoutEntryListRef params) const {
+ if (!layoutInfo)
+ computeSizeAndAlignment(dataLayout);
+ return llvm::TypeSize::getFixed(
+ mlir::cast<cir::RecordLayoutAttr>(layoutInfo).getSize() * 8);
}
uint64_t
RecordType::getABIAlignment(const ::mlir::DataLayout &dataLayout,
::mlir::DataLayoutEntryListRef params) const {
- assert(!cir::MissingFeatures::recordTypeLayoutInfo());
- return 4;
+ if (!layoutInfo)
+ computeSizeAndAlignment(dataLayout);
+ return mlir::cast<cir::RecordLayoutAttr>(layoutInfo).getAlignment();
+}
+
+void RecordType::computeSizeAndAlignment(
+ const mlir::DataLayout &dataLayout) const {
+ assert(isComplete() && "Cannot get layout of incomplete records");
+ // Do not recompute.
+ if (layoutInfo)
+ return;
+
+ // This is a similar algorithm to LLVM's StructLayout.
+ unsigned recordSize = 0;
+ llvm::Align recordAlignment{1};
+ bool isPadded = false;
+ unsigned numElements = getNumElements();
+ ArrayRef<mlir::Type> members = getMembers();
+ mlir::Type largestMember;
+ unsigned largestMemberSize = 0;
+ llvm::SmallVector<mlir::Attribute, 4> memberOffsets;
+
+ bool dontCountLastElt = isUnion() && getPadded();
+ if (dontCountLastElt)
+ numElements--;
+
+ // Loop over each of the elements, placing them in memory.
+ memberOffsets.reserve(numElements);
+
+ // We can't use a range-based for loop here because we might be ignoring the
+ // last element.
+ for (unsigned i = 0, e = numElements; i != e; ++i) {
+ mlir::Type ty = members[i];
+
+ // If we're processing a union, we need to find the largest member.
+ if (isUnion() && (!largestMember ||
+ dataLayout.getTypeABIAlignment(ty) >
+ dataLayout.getTypeABIAlignment(largestMember) ||
+ (dataLayout.getTypeABIAlignment(ty) ==
+ dataLayout.getTypeABIAlignment(largestMember) &&
+ dataLayout.getTypeSize(ty) > largestMemberSize))) {
+ largestMember = ty;
+ largestMemberSize = dataLayout.getTypeSize(largestMember);
+ }
+
+ // This matches LLVM since it uses the ABI instead of preferred alignment.
+ const llvm::Align tyAlign =
+ llvm::Align(getPacked() ? 1 : dataLayout.getTypeABIAlignment(ty));
+
+ // Add padding if necessary to align the data element properly.
+ if (!llvm::isAligned(tyAlign, recordSize)) {
+ isPadded = true;
+ recordSize = llvm::alignTo(recordSize, tyAlign);
+ }
+
+ // Keep track of maximum alignment constraint.
+ recordAlignment = std::max(tyAlign, recordAlignment);
+
+ // Record size up to each element is the element offset.
+ memberOffsets.push_back(mlir::IntegerAttr::get(
+ mlir::IntegerType::get(getContext(), 32), isUnion() ? 0 : recordSize));
+
+ // Consume space for this data item
+ recordSize += dataLayout.getTypeSize(ty);
+ }
+
+ // For unions, the size and aligment is that of the largest element.
+ if (isUnion()) {
+ recordSize = largestMemberSize;
+ if (getPadded()) {
+ memberOffsets.push_back(mlir::IntegerAttr::get(
+ mlir::IntegerType::get(getContext(), 32), recordSize));
+ mlir::Type ty = members[numElements];
+ recordSize += dataLayout.getTypeSize(ty);
+ isPadded = true;
+ } else {
+ isPadded = false;
+ }
+ } else {
+ // Add padding to the end of the record so that it could be put in an array
+ // and all array elements would be aligned correctly.
+ if (!llvm::isAligned(recordAlignment, recordSize)) {
+ isPadded = true;
+ recordSize = llvm::alignTo(recordSize, recordAlignment);
+ }
+ }
+
+ auto offsets = mlir::ArrayAttr::get(getContext(), memberOffsets);
+ layoutInfo = cir::RecordLayoutAttr::get(getContext(), recordSize,
+ recordAlignment.value(), isPadded,
+ largestMember, offsets);
}
//===----------------------------------------------------------------------===//
diff --git a/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp b/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
index cb318c88c09c5..8c4a67258df3f 100644
--- a/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+++ b/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
@@ -1356,12 +1356,11 @@ static void prepareTypeConverter(mlir::LLVMTypeConverter &converter,
if (type.getName()) {
llvmStruct = mlir::LLVM::LLVMStructType::getIdentified(
type.getContext(), type.getPrefixedName());
- assert(!cir::MissingFeatures::packedRecords());
- if (llvmStruct.setBody(llvmMembers, /*isPacked=*/true).failed())
+ if (llvmStruct.setBody(llvmMembers, type.getPacked()).failed())
llvm_unreachable("Failed to set body of record");
} else { // Record has no name: lower as literal record.
llvmStruct = mlir::LLVM::LLVMStructType::getLiteral(
- type.getContext(), llvmMembers, /*isPacked=*/true);
+ type.getContext(), llvmMembers, type.getPacked());
}
return llvmStruct;
diff --git a/clang/test/CIR/CodeGen/struct.c b/clang/test/CIR/CodeGen/struct.c
index e1b01e6a8e86c..55f316f27eeb1 100644
--- a/clang/test/CIR/CodeGen/struct.c
+++ b/clang/test/CIR/CodeGen/struct.c
@@ -5,9 +5,19 @@
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm %s -o %t.ll
// RUN: FileCheck --check-prefix=OGCG --input-file=%t.ll %s
+// For LLVM IR checks, the structs are defined before the variables, so these
+// checks are at the top.
+// LLVM: %struct.CompleteS = type { i32, i8 }
+// LLVM: %struct.PackedS = type <{ i32, i8 }>
+// LLVM: %struct.PackedAndPaddedS = type <{ i32, i8, i8 }>
+// OGCG: %struct.CompleteS = type { i32, i8 }
+// OGCG: %struct.PackedS = type <{ i32, i8 }>
+// OGCG: %struct.PackedAndPaddedS = type <{ i32, i8, i8 }>
+
struct IncompleteS *p;
-// CIR: cir.global external @p = #cir.ptr<null> : !cir.ptr<!cir.record<struct "IncompleteS" incomplete>>
+// CIR: cir.global external @p = #cir.ptr<null> : !cir.ptr<!cir.record<struct
+// CIR-SAME: "IncompleteS" incomplete>>
// LLVM: @p = dso_local global ptr null
// OGCG: @p = global ptr null, align 8
@@ -16,17 +26,44 @@ struct CompleteS {
char b;
} cs;
-// CIR: cir.global external @cs = #cir.zero : !cir.record<struct "CompleteS" {!s32i, !s8i}>
+// CIR: cir.global external @cs = #cir.zero : !cir.record<struct
+// CIR-SAME: "CompleteS" {!s32i, !s8i}>
// LLVM: @cs = dso_local global %struct.CompleteS zeroinitializer
// OGCG: @cs = global %struct.CompleteS zeroinitializer, align 4
+#pragma pack(push)
+#pragma pack(1)
+
+struct PackedS {
+ int a0;
+ char a1;
+} ps;
+
+// CIR: cir.global external @ps = #cir.zero : !cir.record<struct "PackedS"
+// CIR-SAME: packed {!s32i, !s8i}>
+// LLVM: @ps = dso_local global %struct.PackedS zeroinitializer
+// OGCG: @ps = global %struct.PackedS zeroinitialize...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I did a great job reviewing all the layout code, but didn't see anything of concern. I'm somewhat fearful that we're pulling in existing "scary" code from Classic-CodeGen that is a 'thar be dragons' though.
At the same time, re-engineering this at this time also seems scary as well, so I think we have to live with it for now? Anyway, please let others review this before merging.
let parameters = (ins "unsigned":$size, | ||
"unsigned":$alignment, | ||
"bool":$padded, | ||
"mlir::Type":$largest_member, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like largest_member
as it is only useful when lowering unions and at that particular location one can easily search for it in record? Why to hold additional member in each other struct?
The second use is here:
which for some reason diverges from CodeGen, that does not treat unions here anyhow:
llvm-project/clang/lib/CodeGen/CGCall.cpp
Line 1234 in 45f2716
llvm::Type *FirstElt = SrcSTy->getElementType(0); |
But similarly to @erichkeane I have no idea how this thing is intertwined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RecordLayoutAttr
is computed lazily, and the idea is that whenever we have to walk the elements we then compute all properties, because they all require a single pass. You might be right that it's probably not too bad for largest_member
to be done every time this query is needed, but (a) I haven't seen measurements indicating either way (and would pick the compile time defensive approach) and (b) largest_member
makes sense along side the other properties computed here IMO.
which for some reason diverges from CodeGen
Because we have slightly higher level unions than MLIR, this is the price to be paid. FWIW, @gitoleg has done a lot of work to make unions work, including often running csmith and fixing correctness issues.
// from CIRAttrs.h. The implementation operates in terms of RecordLayoutAttr | ||
// instead. | ||
private: | ||
mutable mlir::Attribute layoutInfo; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the RecordLayoutAttr
is not supposed to be considered in type uniquer/storage anyhow?
This just comes to me as really unconventional placement of the "cache". If it is not meant to be stored in the IR why is it even Attribute, this could have been just class cacheing desired information?
If the goal was to have it in the IR, I would expect some kay-value storage similar to DLTI_MapAttr
and store it at modul level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Investigating this more, I don't believe this does what it is intended for. Since MLIR is heavily value-oriented RecordType
will hold this only in "local scope" for specific instance of it. If you get to this type at different location, e.g. through dyn_cast
it won't retain any of the layoutInfo, as it will be default constructed.
That does not sound really useful to cache these just locally, I would suggest to have just extra methods to get specific layouInfo
parameterers in such case as they are recomputed at every RecordType
construction anyway.
Also it breaks the expected behavior that mlir values are lightweight so one can copy them freely (basically just tagged pointer to storage).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some extra context: we use an attribute here as an implementation detail / convenience, RecordLayoutAttr
was never intended to be parsed/printed, all we need here is cache functionality. We do want such cache to be tied to the IR (as you pointed out) because different passes might be able to reuse/explore the information.
If the goal was to have it in the IR, I would expect some kay-value storage similar to DLTI_MapAttr and store it at modul level?
Nice, that's probably more pristine MLIR usage, given it still keeps the intent I mentioned above.
it won't retain any of the layoutInfo, as it will be default constructed.
This is not a problem in practice: it will lazily computed again if what you get via dyn_cast
precisely because the value is default constructed (and therefore empty/invalid).
I would suggest to have just extra methods to get specific layouInfo parameterers in such case as they are recomputed at every RecordType construction anyway.
Honestly, with all the FUD / negativity regarding compile time we already have towards CIR, I'd love if we don't make the story worse just because we're trying to be "MLIR purists". My suggestion is to keep the approach as-is for this PR and as follow up either (a) prove somehow this isn't impacting compile time and change the impl to be eager or (b) change RecordLayoutAttr
to use DLTI_MapAttr
as you suggested. The same change should be done in the incubator to allow us to try evaluating the impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do want such cache to be tied to the IR (as you pointed out) because different passes might be able to reuse/explore the information.
But at the moment it is not tied to IR at all. It lives only for the duration of specific StructType
instance, so it cannot survive across passes.
With the current representation, there's no need to store this as a member of RecordType
. You could achieve the same result with a getLayoutInfo
function that computes the layout when needed. This would likely be more efficient in terms of both time and memory usage. For example returning just something like following class from the method, and probably better computing each element only when needed:
struct LayoutInfo {
unsigned size;
unsigned alignment;
bool padded;
// ...
};
By the way, while browsing the MLIR codebase, I didn't come across any extraClassDeclaration
that defines non-static data members. So having such a member here feels a bit unexpected to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also at the moment whenever computeSizeAndAlignment
that initializes StructLayoutAttr
is called only one parameter from the attribute is picked and the rest is basically thrown away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this isn't actually caching anything at the moment, I propose to remove RecordLayoutAttr from this PR and upstream an implementation that recomputes what it needs on each call, with a comment saying that we would like to cache the information. Then, as @bcardosolopes suggested, we can rework it in the incubator (though maybe not right away).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just put up a PR in the incubator (llvm/clangir#1569) to calculate the required information (and only the required information) on demand. I don't know if we want to make this change there or wait for a proper caching mechanism instead, but I thought it would be useful to have as a preview of a complete implementation following the path we've discussed here. All regression tests passed in my sandbox with that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, you guys are right!
I propose to remove RecordLayoutAttr from this PR and upstream an implementation that recomputes what it needs on each call
Sounds good!
const mlir::DataLayout &dataLayout) const { | ||
assert(isComplete() && "Cannot get layout of incomplete records"); | ||
// Do not recompute. | ||
if (layoutInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tested locally on clangir tests, this condition is never met, so it supports my previous claim that it does not really cache anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be right, but clangir tests are mostly reduced testcases, I have trouble accepting this as a fact unless you tell me you are running on top of significant C/C++ code. (It's kinda bad that it's not exercised in the tests, you do have a point tho)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that the condition isn't met here didn't surprise me, because all the places that call this function have also checked it and only call this function if layoutInfo is null. What did surprise me is that it's always null in the callers too. This is explained by what @xlauko said above about the cache being local to the RecordType instance.
I verified that even with a simple test case (clang/test/CIR/CodeGen/struct.cpp) we recompute the layout multiple times for the same type. This just isn't working the way it was intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verified that even with a simple test case (clang/test/CIR/CodeGen/struct.cpp) we recompute the layout multiple times for the same type. This just isn't working the way it was intended.
Thanks for double checking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once @xlauko is happy
} | ||
|
||
unsigned | ||
RecordType::computeStructSize(const mlir::DataLayout &dataLayout) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This matches what the LLVM dialect does for LLVMStructType::getTypeSizeInBits
// Add padding to the end of the record so that it could be put in an array | ||
// and all array elements would be aligned correctly. | ||
assert(llvm::isAligned(llvm::Align(recordAlignment), recordSize)); | ||
// if (!llvm::isAligned(recordAlignment, recordSize)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be here.
@xlauko Are you happy with this after the latest update? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This change adds support for packing and padding record types in ClangIR and introduces some infrastructure needed for this computation. Although union support has not been upstreamed yet, there is no good way to report unions as NYI in the layout computation, so the code added here includes layout computation for unions. Unions will be added soon.
This change adds support for packing and padding record types in ClangIR and introduces some infrastructure needed for this computation. Although union support has not been upstreamed yet, there is no good way to report unions as NYI in the layout computation, so the code added here includes layout computation for unions. Unions will be added soon.
This change adds support for packing and padding record types in ClangIR and introduces some infrastructure needed for this computation. Although union support has not been upstreamed yet, there is no good way to report unions as NYI in the layout computation, so the code added here includes layout computation for unions. Unions will be added soon.
This change adds support for packing and padding record types in ClangIR and introduces some infrastructure needed for this computation.
Although union support has not been upstreamed yet, there is no good way to report unions as NYI in the layout computation, so the code added here includes layout computation for unions. Unions will be added soon.