Skip to content

[clang][dataflow] Correctly treat empty initializer lists for unions. #82986

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -723,9 +723,12 @@ RecordStorageLocation *getImplicitObjectLocation(const CXXMemberCallExpr &MCE,
RecordStorageLocation *getBaseObjectLocation(const MemberExpr &ME,
const Environment &Env);

/// Returns the fields of `RD` that are initialized by an `InitListExpr`, in the
/// order in which they appear in `InitListExpr::inits()`.
std::vector<FieldDecl *> getFieldsForInitListExpr(const RecordDecl *RD);
/// Returns the fields of a `RecordDecl` that are initialized by an
/// `InitListExpr`, in the order in which they appear in
/// `InitListExpr::inits()`.
/// `Init->getType()` must be a record type.
std::vector<const FieldDecl *>
getFieldsForInitListExpr(const InitListExpr *InitList);

/// Associates a new `RecordValue` with `Loc` and returns the new value.
RecordValue &refreshRecordValue(RecordStorageLocation &Loc, Environment &Env);
Expand Down
25 changes: 20 additions & 5 deletions clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -361,8 +361,8 @@ getFieldsGlobalsAndFuncs(const Stmt &S, FieldSet &Fields,
if (const auto *FD = dyn_cast<FieldDecl>(VD))
Fields.insert(FD);
} else if (auto *InitList = dyn_cast<InitListExpr>(&S)) {
if (RecordDecl *RD = InitList->getType()->getAsRecordDecl())
for (const auto *FD : getFieldsForInitListExpr(RD))
if (InitList->getType()->isRecordType())
for (const auto *FD : getFieldsForInitListExpr(InitList))
Fields.insert(FD);
}
}
Expand Down Expand Up @@ -983,7 +983,7 @@ StorageLocation &Environment::createObjectInternal(const ValueDecl *D,
}

Value *Val = nullptr;
if (InitExpr)
if (InitExpr) {
// In the (few) cases where an expression is intentionally
// "uninterpreted", `InitExpr` is not associated with a value. There are
// two ways to handle this situation: propagate the status, so that
Expand All @@ -998,6 +998,11 @@ StorageLocation &Environment::createObjectInternal(const ValueDecl *D,
// default value (assuming we don't update the environment API to return
// references).
Val = getValue(*InitExpr);

if (!Val && isa<ImplicitValueInitExpr>(InitExpr) &&
InitExpr->getType()->isPointerType())
Val = &getOrCreateNullPointerValue(InitExpr->getType()->getPointeeType());
}
if (!Val)
Val = createValue(Ty);

Expand Down Expand Up @@ -1104,12 +1109,22 @@ RecordStorageLocation *getBaseObjectLocation(const MemberExpr &ME,
return Env.get<RecordStorageLocation>(*Base);
}

std::vector<FieldDecl *> getFieldsForInitListExpr(const RecordDecl *RD) {
std::vector<const FieldDecl *>
getFieldsForInitListExpr(const InitListExpr *InitList) {
const RecordDecl *RD = InitList->getType()->getAsRecordDecl();
assert(RD != nullptr);

std::vector<const FieldDecl *> Fields;

if (InitList->getType()->isUnionType()) {
Fields.push_back(InitList->getInitializedFieldInUnion());
return Fields;
}

// Unnamed bitfields are only used for padding and do not appear in
// `InitListExpr`'s inits. However, those fields do appear in `RecordDecl`'s
// field list, and we thus need to remove them before mapping inits to
// fields to avoid mapping inits to the wrongs fields.
std::vector<FieldDecl *> Fields;
llvm::copy_if(
RD->fields(), std::back_inserter(Fields),
[](const FieldDecl *Field) { return !Field->isUnnamedBitfield(); });
Expand Down
40 changes: 28 additions & 12 deletions clang/lib/Analysis/FlowSensitive/Transfer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -663,14 +663,7 @@ class TransferVisitor : public ConstStmtVisitor<TransferVisitor> {
void VisitInitListExpr(const InitListExpr *S) {
QualType Type = S->getType();

if (Type->isUnionType()) {
// FIXME: Initialize unions properly.
if (auto *Val = Env.createValue(Type))
Env.setValue(*S, *Val);
return;
}

if (!Type->isStructureOrClassType()) {
if (!Type->isRecordType()) {
// Until array initialization is implemented, we skip arrays and don't
// need to care about cases where `getNumInits() > 1`.
if (!Type->isArrayType() && S->getNumInits() == 1)
Expand All @@ -688,14 +681,26 @@ class TransferVisitor : public ConstStmtVisitor<TransferVisitor> {
llvm::DenseMap<const ValueDecl *, StorageLocation *> FieldLocs;

// This only contains the direct fields for the given type.
std::vector<FieldDecl *> FieldsForInit =
getFieldsForInitListExpr(Type->getAsRecordDecl());
std::vector<const FieldDecl *> FieldsForInit = getFieldsForInitListExpr(S);

// `S->inits()` contains all the initializer epressions, including the
// `S->inits()` contains all the initializer expressions, including the
// ones for direct base classes.
auto Inits = S->inits();
ArrayRef<Expr *> Inits = S->inits();
size_t InitIdx = 0;

// Unions initialized with an empty initializer list need special treatment.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering how does CodeGen deal with this? If it happens to have similar extra logic, we could simplify both by changing the AST.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that CodeGen also appears to special-case this like we do:

if (NumInitElements) {

And if we changed the AST so that we had an ImplicitValueInitExpr here, it looks as if the general case in CodeGen would do the right thing.

I'll make a note of this for myself.

// For structs/classes initialized with an empty initializer list, Clang
// puts `ImplicitValueInitExpr`s in `InitListExpr::inits()`, but for unions,
// it doesn't do this -- so we create an `ImplicitValueInitExpr` ourselves.
std::optional<ImplicitValueInitExpr> ImplicitValueInitForUnion;
SmallVector<Expr *> InitsForUnion;
if (S->getType()->isUnionType() && Inits.empty()) {
assert(FieldsForInit.size() == 1);
ImplicitValueInitForUnion.emplace(FieldsForInit.front()->getType());
InitsForUnion.push_back(&*ImplicitValueInitForUnion);
Inits = InitsForUnion;
}

// Initialize base classes.
if (auto* R = S->getType()->getAsCXXRecordDecl()) {
assert(FieldsForInit.size() + R->getNumBases() == Inits.size());
Expand Down Expand Up @@ -731,6 +736,17 @@ class TransferVisitor : public ConstStmtVisitor<TransferVisitor> {
FieldLocs.insert({Field, &Loc});
}

// In the case of a union, we don't in general have initializers for all
// of the fields. Create storage locations for the remaining fields (but
// don't associate them with values).
if (Type->isUnionType()) {
for (const FieldDecl *Field :
Env.getDataflowAnalysisContext().getModeledFields(Type)) {
if (auto [it, inserted] = FieldLocs.insert({Field, nullptr}); inserted)
it->second = &Env.createStorageLocation(Field->getType());
}
}

// Check that we satisfy the invariant that a `RecordStorageLoation`
// contains exactly the set of modeled fields for that type.
// `ModeledFields` includes fields from all the bases, but only the
Expand Down
19 changes: 19 additions & 0 deletions clang/unittests/Analysis/FlowSensitive/TestingSupport.h
Original file line number Diff line number Diff line change
Expand Up @@ -432,6 +432,8 @@ llvm::Error checkDataflowWithNoopAnalysis(
{});

/// Returns the `ValueDecl` for the given identifier.
/// The returned pointer is guaranteed to be non-null; the function asserts if
/// no `ValueDecl` with the given name is found.
///
/// Requirements:
///
Expand Down Expand Up @@ -475,6 +477,15 @@ ValueT &getValueForDecl(ASTContext &ASTCtx, const Environment &Env,
return *cast<ValueT>(Env.getValue(*VD));
}

/// Returns the storage location for the field called `Name` of `Loc`.
/// Optionally casts the field storage location to `T`.
template <typename T = StorageLocation>
std::enable_if_t<std::is_base_of_v<StorageLocation, T>, T &>
getFieldLoc(const RecordStorageLocation &Loc, llvm::StringRef Name,
ASTContext &ASTCtx) {
return *cast<T>(Loc.getChild(*findValueDecl(ASTCtx, Name)));
}

/// Returns the value of a `Field` on the record referenced by `Loc.`
/// Returns null if `Loc` is null.
inline Value *getFieldValue(const RecordStorageLocation *Loc,
Expand All @@ -487,6 +498,14 @@ inline Value *getFieldValue(const RecordStorageLocation *Loc,
return Env.getValue(*FieldLoc);
}

/// Returns the value of a `Field` on the record referenced by `Loc.`
/// Returns null if `Loc` is null.
inline Value *getFieldValue(const RecordStorageLocation *Loc,
llvm::StringRef Name, ASTContext &ASTCtx,
const Environment &Env) {
return getFieldValue(Loc, *findValueDecl(ASTCtx, Name), Env);
}

/// Creates and owns constraints which are boolean values.
class ConstraintContext {
unsigned NextAtom = 0;
Expand Down
82 changes: 80 additions & 2 deletions clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2392,14 +2392,92 @@ TEST(TransferTest, InitListExprAsUnion) {
} F;

public:
constexpr target() : F{nullptr} {}
constexpr target() : F{nullptr} {
int *null = nullptr;
F.b; // Make sure we reference 'b' so it is modeled.
// [[p]]
}
};
)cc";
runDataflow(
Code,
[](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
ASTContext &ASTCtx) {
// Just verify that it doesn't crash.
const Environment &Env = getEnvironmentAtAnnotation(Results, "p");

auto &FLoc = getFieldLoc<RecordStorageLocation>(
*Env.getThisPointeeStorageLocation(), "F", ASTCtx);
auto *AVal = cast<PointerValue>(getFieldValue(&FLoc, "a", ASTCtx, Env));
EXPECT_EQ(AVal, &getValueForDecl<PointerValue>(ASTCtx, Env, "null"));
EXPECT_EQ(getFieldValue(&FLoc, "b", ASTCtx, Env), nullptr);
});
}

TEST(TransferTest, EmptyInitListExprForUnion) {
// This is a crash repro.
std::string Code = R"cc(
class target {
union {
int *a;
bool *b;
} F;

public:
// Empty initializer list means that `F` is aggregate-initialized.
// For a union, this has the effect that the first member of the union
// is copy-initialized from an empty initializer list; in this specific
// case, this has the effect of initializing `a` with null.
constexpr target() : F{} {
int *null = nullptr;
F.b; // Make sure we reference 'b' so it is modeled.
// [[p]]
}
};
)cc";
runDataflow(
Code,
[](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
ASTContext &ASTCtx) {
const Environment &Env = getEnvironmentAtAnnotation(Results, "p");

auto &FLoc = getFieldLoc<RecordStorageLocation>(
*Env.getThisPointeeStorageLocation(), "F", ASTCtx);
auto *AVal = cast<PointerValue>(getFieldValue(&FLoc, "a", ASTCtx, Env));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is a set when the initializer is empty? I'm guessing it's because line 699 in Transfer.cpp uses front() and so initializes the first field of the union. If so, perhaps say this explicitly in the comment preceding those lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment explaining what the language rules are that govern the behavior we expect here.

EXPECT_EQ(AVal, &getValueForDecl<PointerValue>(ASTCtx, Env, "null"));
EXPECT_EQ(getFieldValue(&FLoc, "b", ASTCtx, Env), nullptr);
});
}

TEST(TransferTest, EmptyInitListExprForStruct) {
std::string Code = R"cc(
class target {
struct {
int *a;
bool *b;
} F;

public:
constexpr target() : F{} {
int *NullIntPtr = nullptr;
bool *NullBoolPtr = nullptr;
// [[p]]
}
};
)cc";
runDataflow(
Code,
[](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
ASTContext &ASTCtx) {
const Environment &Env = getEnvironmentAtAnnotation(Results, "p");

auto &FLoc = getFieldLoc<RecordStorageLocation>(
*Env.getThisPointeeStorageLocation(), "F", ASTCtx);
auto *AVal = cast<PointerValue>(getFieldValue(&FLoc, "a", ASTCtx, Env));
EXPECT_EQ(AVal,
&getValueForDecl<PointerValue>(ASTCtx, Env, "NullIntPtr"));
auto *BVal = cast<PointerValue>(getFieldValue(&FLoc, "b", ASTCtx, Env));
EXPECT_EQ(BVal,
&getValueForDecl<PointerValue>(ASTCtx, Env, "NullBoolPtr"));
});
}

Expand Down