[CIR] Add initial support for bitfields in structs #142041

Merged: 6 commits, Jun 20, 2025
3 changes: 3 additions & 0 deletions clang/include/clang/CIR/MissingFeatures.h
@@ -149,6 +149,8 @@ struct MissingFeatures {
static bool cxxabiUseARMGuardVarABI() { return false; }
static bool cxxabiAppleARM64CXXABI() { return false; }
static bool cxxabiStructorImplicitParam() { return false; }
static bool isDiscreteBitFieldABI() { return false; }
static bool isBigEndian() { return false; }

// Address class
static bool addressOffset() { return false; }
@@ -239,6 +241,7 @@ struct MissingFeatures {
static bool builtinCall() { return false; }
static bool builtinCallF128() { return false; }
static bool builtinCallMathErrno() { return false; }
static bool nonFineGrainedBitfields() { return false; }

// Missing types
static bool dataMemberType() { return false; }
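
The three new flags follow the existing pattern in this header: each query returns false until the feature is implemented, and codegen paths that silently rely on the missing behavior can mark the spot with an assertion. The sketch below only illustrates that pattern. `MissingFeaturesSketch` and `bitfieldOffsetInStorage` are hypothetical names invented for this example and are not part of the patch.

```cpp
#include <cassert>

// Hypothetical miniature of MissingFeatures: every query returns false
// until the corresponding feature is implemented.
struct MissingFeaturesSketch {
  static bool isDiscreteBitFieldABI() { return false; }
  static bool isBigEndian() { return false; }
  static bool nonFineGrainedBitfields() { return false; }
};

// Hypothetical helper (an assumption for illustration): returns the bit
// offset of a field inside its storage unit for a little-endian target.
inline unsigned bitfieldOffsetInStorage(unsigned offsetInBits) {
  // Marker for unimplemented behavior: a big-endian target would need a
  // different offset. The assertion documents the gap and is easy to grep.
  assert(!MissingFeaturesSketch::isBigEndian());
  return offsetInBits;
}
```

When a feature is eventually implemented, the flag and every assertion that mentions it can be removed together, which is what makes the markers useful as a to-do list.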
114 changes: 114 additions & 0 deletions clang/lib/CIR/CodeGen/CIRGenRecordLayout.h
@@ -14,6 +14,106 @@

namespace clang::CIRGen {

/// Record with information about how a bitfield should be accessed. This is
/// very similar to what LLVM codegen does; once CIR evolves, it's possible we
/// can use a higher-level representation.
///
/// Often we lay out a sequence of bitfields as a contiguous sequence of bits.
/// When the AST record layout does this, we represent it in CIR as a
/// `!cir.record` type, which directly reflects the structure's layout,
/// including bitfield packing and padding, using CIR types such as
/// `!cir.bool`, `!s8i`, `!u16i`.
///
/// To access a particular bitfield in CIR, we use the operations
/// `cir.get_bitfield` (`GetBitfieldOp`) or `cir.set_bitfield`
/// (`SetBitfieldOp`). These operations rely on the `bitfield_info`
/// attribute, which provides detailed metadata required for access,
/// such as the size and offset of the bitfield, the type and size of
/// the underlying storage, and whether the value is signed.
/// The CIRGenRecordLayout also has a bitFields map which encodes which
/// byte-sequence this bitfield falls within. Let's assume the following C
/// struct:
///
/// struct S {
/// char a, b, c;
/// unsigned bits : 3;
/// unsigned more_bits : 4;
/// unsigned still_more_bits : 7;
/// };
///
/// This will end up as the following cir.record. The bitfield members are
/// represented by one !u16i value, and the array provides padding to align the
/// struct to a 4-byte alignment.
///
/// !rec_S = !cir.record<struct "S" padded {!s8i, !s8i, !s8i, !u16i,
/// !cir.array<!u8i x 3>}>
///
/// When generating code to access more_bits, we'll generate something
/// essentially like this:
///
/// #bfi_more_bits = #cir.bitfield_info<name = "more_bits", storage_type =
/// !u16i, size = 4, offset = 3, is_signed = false>
///
/// cir.func @store_field() {
/// %0 = cir.alloca !rec_S, !cir.ptr<!rec_S>, ["s"] {alignment = 4 : i64}
/// %1 = cir.const #cir.int<2> : !s32i
/// %2 = cir.cast(integral, %1 : !s32i), !u32i
/// %3 = cir.get_member %0[3] {name = "more_bits"} : !cir.ptr<!rec_S> ->
/// !cir.ptr<!u16i>
[Review comment, Contributor] This is also stale, but differently so. I believe we generate a !u8i for this case.

[Review comment, Contributor Author] Sorry, when I was writing this, I was running some tests and forgot to revert the struct back to match what the comment was describing. Thanks for catching that.

[Review comment, Contributor Author] With the new algorithm, this is now a !u16i.

/// %4 = cir.set_bitfield(#bfi_more_bits, %3 :
/// !cir.ptr<!u16i>, %2 : !u32i) -> !u32i
/// cir.return
/// }
///
struct CIRGenBitFieldInfo {
/// The offset within a contiguous run of bitfields that are represented as
/// a single "field" within the cir.record type. This offset is in bits.
unsigned offset : 16;

/// The total size of the bit-field, in bits.
unsigned size : 15;

/// Whether the bit-field is signed.
unsigned isSigned : 1;

/// The storage size in bits which should be used when accessing this
/// bitfield.
unsigned storageSize;

/// The offset of the bitfield storage from the start of the record.
clang::CharUnits storageOffset;

/// The offset within a contiguous run of bitfields that are represented as a
/// single "field" within the cir.record type, taking into account the AAPCS
/// rules for volatile bitfields. This offset is in bits.
unsigned volatileOffset : 16;

/// The storage size in bits which should be used when accessing this
/// bitfield.
unsigned volatileStorageSize;

/// The offset of the bitfield storage from the start of the record.
clang::CharUnits volatileStorageOffset;

/// The name of a bitfield
llvm::StringRef name;

// The actual storage type for the bitfield
mlir::Type storageType;

CIRGenBitFieldInfo()
: offset(), size(), isSigned(), storageSize(), volatileOffset(),
volatileStorageSize() {}

CIRGenBitFieldInfo(unsigned offset, unsigned size, bool isSigned,
unsigned storageSize, clang::CharUnits storageOffset)
: offset(offset), size(size), isSigned(isSigned),
storageSize(storageSize), storageOffset(storageOffset) {}

void print(llvm::raw_ostream &os) const;
LLVM_DUMP_METHOD void dump() const;
};
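
To make the roles of `offset`, `size`, `isSigned` and `storageSize` concrete, here is a minimal sketch of the shift-and-mask arithmetic that a bitfield read or write implies. It is not the lowering used by `cir.get_bitfield` or `cir.set_bitfield`. `BitFieldSketch`, `lowMask`, `getBitfield` and `setBitfield` are hypothetical names, and the sketch assumes a little-endian target (offset counted from the least significant bit), a storage unit of at most 64 bits, and a non-volatile access.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the non-volatile fields of
// CIRGenBitFieldInfo. Not the real struct.
struct BitFieldSketch {
  unsigned offset;      // bit offset inside the storage unit
  unsigned size;        // width of the bitfield in bits
  bool isSigned;        // whether a read sign-extends
  unsigned storageSize; // width of the storage unit in bits (<= 64 here)
};

// Mask with the low n bits set, valid for 0 < n <= 64.
inline uint64_t lowMask(unsigned n) {
  return n >= 64 ? ~uint64_t{0} : ((uint64_t{1} << n) - 1);
}

// Read: shift the field down to bit 0, mask it, sign-extend if required.
inline int64_t getBitfield(const BitFieldSketch &info, uint64_t storage) {
  assert(info.size > 0 && info.offset + info.size <= info.storageSize);
  uint64_t raw = (storage >> info.offset) & lowMask(info.size);
  if (info.isSigned && (raw >> (info.size - 1)) != 0)
    raw |= ~lowMask(info.size); // replicate the sign bit upward
  return static_cast<int64_t>(raw);
}

// Write: clear the field's bits in the storage unit, then OR in the new
// value. Bits belonging to neighboring bitfields are left untouched.
inline uint64_t setBitfield(const BitFieldSketch &info, uint64_t storage,
                            uint64_t value) {
  assert(info.size > 0 && info.offset + info.size <= info.storageSize);
  uint64_t fieldMask = lowMask(info.size) << info.offset;
  uint64_t updated =
      (storage & ~fieldMask) | ((value << info.offset) & fieldMask);
  return updated & lowMask(info.storageSize);
}
```

With the `more_bits` values from the comment above (`offset = 3`, `size = 4`, unsigned, 16-bit storage), storing 2 into a storage unit that is all ones changes only bits 3 through 6, and reading the field back yields 2.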

/// This class handles record and union layout info while lowering AST types
/// to CIR types.
///
@@ -41,6 +141,10 @@ class CIRGenRecordLayout {
// for both virtual and non-virtual bases.
llvm::DenseMap<const clang::CXXRecordDecl *, unsigned> nonVirtualBases;

/// Map from (bit-field) record field to the corresponding CIR record type
/// field no. This info is populated by record builder.
llvm::DenseMap<const clang::FieldDecl *, CIRGenBitFieldInfo> bitFields;

/// False if any direct or indirect subobject of this class, when considered
/// as a complete object, requires a non-zero bitpattern when
/// zero-initialized.
@@ -83,6 +187,16 @@ class CIRGenRecordLayout {
/// Check whether this struct can be C++ zero-initialized
/// with a zeroinitializer when considered as a base subobject.
bool isZeroInitializableAsBase() const { return zeroInitializableAsBase; }

/// Return the BitFieldInfo that corresponds to the field FD.
const CIRGenBitFieldInfo &getBitFieldInfo(const clang::FieldDecl *fd) const {
fd = fd->getCanonicalDecl();
assert(fd->isBitField() && "Invalid call for non-bit-field decl!");
llvm::DenseMap<const clang::FieldDecl *, CIRGenBitFieldInfo>::const_iterator
it = bitFields.find(fd);
assert(it != bitFields.end() && "Unable to find bitfield info");
return it->second;
}
};

} // namespace clang::CIRGen
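
The lookup in `getBitFieldInfo` canonicalizes the declaration before searching the map, so any redeclaration of a field resolves to the same entry. The sketch below mirrors that pattern with standard-library types only. `FieldDeclSketch`, `BitFieldInfoSketch`, `RecordLayoutSketch` and `addBitField` are hypothetical stand-ins invented for this example. The real code uses `clang::FieldDecl`, `CIRGenBitFieldInfo` and `llvm::DenseMap`.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for clang::FieldDecl: each declaration points at
// its canonical declaration (possibly itself).
struct FieldDeclSketch {
  bool isBitField;
  const FieldDeclSketch *canonical;
};

// Hypothetical stand-in for CIRGenBitFieldInfo, reduced to two fields.
struct BitFieldInfoSketch {
  unsigned offset;
  unsigned size;
};

class RecordLayoutSketch {
  // Keyed by the canonical declaration, as in the patch's bitFields map.
  std::unordered_map<const FieldDeclSketch *, BitFieldInfoSketch> bitFields;

public:
  // Hypothetical population hook. In the patch the record builder fills
  // the map.
  void addBitField(const FieldDeclSketch *fd, BitFieldInfoSketch info) {
    bitFields.emplace(fd->canonical, info);
  }

  // Same shape as getBitFieldInfo: canonicalize, check, look up, check.
  const BitFieldInfoSketch &getBitFieldInfo(const FieldDeclSketch *fd) const {
    fd = fd->canonical;
    assert(fd->isBitField && "Invalid call for non-bit-field decl!");
    auto it = bitFields.find(fd);
    assert(it != bitFields.end() && "Unable to find bitfield info");
    return it->second;
  }
};
```

Returning a const reference avoids copying the info record on every access. Both assertions are programmer-error checks, so a release build performs a plain map lookup.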