-
Notifications
You must be signed in to change notification settings - Fork 14.2k
[clang][dataflow] Add synthetic fields to RecordStorageLocation
#73860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
a713828
44270dc
c49199a
cea4a72
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -58,6 +58,10 @@ using FieldSet = llvm::SmallSetVector<const FieldDecl *, 4>; | |
/// Returns the set of all fields in the type. | ||
FieldSet getObjectFields(QualType Type); | ||
|
||
/// Returns whether `Fields` and `FieldLocs` contain the same fields. | ||
bool containsSameFields(const FieldSet &Fields, | ||
const RecordStorageLocation::FieldToLoc &FieldLocs); | ||
|
||
struct ContextSensitiveOptions { | ||
/// The maximum depth to analyze. A value of zero is equivalent to disabling | ||
/// context-sensitive analysis entirely. | ||
|
@@ -92,11 +96,39 @@ class DataflowAnalysisContext { | |
/*Logger=*/nullptr}); | ||
~DataflowAnalysisContext(); | ||
|
||
/// Sets a callback that returns the names and types of the synthetic fields | ||
/// to add to a `RecordStorageLocation` of a given type. | ||
/// Typically, this is called from the constructor of a `DataflowAnalysis` | ||
/// | ||
/// To maintain the invariant that all `RecordStorageLocation`s of a given | ||
/// type have the same fields: | ||
/// * The callback must always return the same result for a given type | ||
/// * `setSyntheticFieldCallback()` must be called before any | ||
// `RecordStorageLocation`s are created. | ||
void setSyntheticFieldCallback( | ||
std::function<llvm::StringMap<QualType>(QualType)> CB) { | ||
assert(!RecordStorageLocationCreated); | ||
SyntheticFieldCallback = CB; | ||
} | ||
|
||
/// Returns a new storage location appropriate for `Type`. | ||
/// | ||
/// A null `Type` is interpreted as the pointee type of `std::nullptr_t`. | ||
StorageLocation &createStorageLocation(QualType Type); | ||
|
||
/// Creates a `RecordStorageLocation` for the given type and with the given | ||
/// fields. | ||
/// | ||
/// Requirements: | ||
/// | ||
/// `FieldLocs` must contain exactly the fields returned by | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With requirements like this, I was wondering if this should be a private API. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree this would be nice (this is not the API that most callers should call), but this API is not just called from In the end, I don't think it's worth the effort of "locking down" this API -- the documentation makes it clear that the function has pretty exacting preconditions (which we assert within the implementation), which makes the function unattractive to a potential caller anyway. |
||
/// `getModeledFields(Type)`. | ||
/// `SyntheticFields` must contain exactly the fields returned by | ||
/// `getSyntheticFields(Type)`. | ||
RecordStorageLocation &createRecordStorageLocation( | ||
QualType Type, RecordStorageLocation::FieldToLoc FieldLocs, | ||
RecordStorageLocation::SyntheticFieldMap SyntheticFields); | ||
|
||
/// Returns a stable storage location for `D`. | ||
StorageLocation &getStableStorageLocation(const ValueDecl &D); | ||
|
||
|
@@ -169,6 +201,15 @@ class DataflowAnalysisContext { | |
/// context. | ||
FieldSet getModeledFields(QualType Type); | ||
|
||
/// Returns the names and types of the synthetic fields for the given record | ||
/// type. | ||
llvm::StringMap<QualType> getSyntheticFields(QualType Type) { | ||
assert(Type->isRecordType()); | ||
if (SyntheticFieldCallback) | ||
return SyntheticFieldCallback(Type); | ||
return {}; | ||
} | ||
|
||
private: | ||
friend class Environment; | ||
|
||
|
@@ -250,6 +291,11 @@ class DataflowAnalysisContext { | |
FieldSet ModeledFields; | ||
|
||
std::unique_ptr<Logger> LogOwner; // If created via flags. | ||
|
||
std::function<llvm::StringMap<QualType>(QualType)> SyntheticFieldCallback; | ||
|
||
/// Has any `RecordStorageLocation` been created yet? | ||
bool RecordStorageLocationCreated = false; | ||
}; | ||
|
||
} // namespace dataflow | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -73,7 +73,13 @@ class ScalarStorageLocation final : public StorageLocation { | |
/// | ||
/// Contains storage locations for all modeled fields of the record (also | ||
/// referred to as "children"). The child map is flat, so accessible members of | ||
/// the base class are directly accesible as children of this location. | ||
/// the base class are directly accessible as children of this location. | ||
/// | ||
/// Record storage locations may also contain so-called synthetic fields. These | ||
/// are typically used to model the internal state of a class (e.g. the value | ||
/// stored in a `std::optional`) without having to depend on that class's | ||
/// implementation details. All `RecordStorageLocation`s of a given type should | ||
/// have the same synthetic fields. | ||
/// | ||
/// The storage location for a field of reference type may be null. This | ||
/// typically occurs in one of two situations: | ||
|
@@ -88,12 +94,12 @@ class ScalarStorageLocation final : public StorageLocation { | |
class RecordStorageLocation final : public StorageLocation { | ||
public: | ||
using FieldToLoc = llvm::DenseMap<const ValueDecl *, StorageLocation *>; | ||
using SyntheticFieldMap = llvm::StringMap<StorageLocation *>; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was wondering, if we want this map to not be indexed by strings. An alternative would be to somehow access the That being said, I am not insisting on the change, and even if we want to do it, it is probably OK in a follow-up PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is an interesting idea, but I'm hesitant about mutating the I suppose we could always create our own mapping from strings to integers or pointers of some sort, which could then serve as the keys for the |
||
|
||
explicit RecordStorageLocation(QualType Type) | ||
: RecordStorageLocation(Type, FieldToLoc()) {} | ||
|
||
RecordStorageLocation(QualType Type, FieldToLoc TheChildren) | ||
: StorageLocation(Kind::Record, Type), Children(std::move(TheChildren)) { | ||
RecordStorageLocation(QualType Type, FieldToLoc TheChildren, | ||
SyntheticFieldMap TheSyntheticFields) | ||
: StorageLocation(Kind::Record, Type), Children(std::move(TheChildren)), | ||
SyntheticFields(std::move(TheSyntheticFields)) { | ||
assert(!Type.isNull()); | ||
assert(Type->isRecordType()); | ||
assert([this] { | ||
|
@@ -133,6 +139,19 @@ class RecordStorageLocation final : public StorageLocation { | |
return It->second; | ||
} | ||
|
||
/// Returns the storage location for the synthetic field `Name`. | ||
/// The synthetic field must exist. | ||
StorageLocation &getSyntheticField(llvm::StringRef Name) const { | ||
StorageLocation *Loc = SyntheticFields.lookup(Name); | ||
assert(Loc != nullptr); | ||
return *Loc; | ||
} | ||
|
||
llvm::iterator_range<SyntheticFieldMap::const_iterator> | ||
synthetic_fields() const { | ||
return {SyntheticFields.begin(), SyntheticFields.end()}; | ||
} | ||
|
||
/// Changes the child storage location for a field `D` of reference type. | ||
/// All other fields cannot change their storage location and always retain | ||
/// the storage location passed to the `RecordStorageLocation` constructor. | ||
|
@@ -151,6 +170,7 @@ class RecordStorageLocation final : public StorageLocation { | |
|
||
private: | ||
FieldToLoc Children; | ||
SyntheticFieldMap SyntheticFields; | ||
}; | ||
|
||
} // namespace dataflow | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -68,11 +68,38 @@ StorageLocation &DataflowAnalysisContext::createStorageLocation(QualType Type) { | |
else | ||
FieldLocs.insert({Field, &createStorageLocation( | ||
Field->getType().getNonReferenceType())}); | ||
return arena().create<RecordStorageLocation>(Type, std::move(FieldLocs)); | ||
|
||
RecordStorageLocation::SyntheticFieldMap SyntheticFields; | ||
for (const auto &Entry : getSyntheticFields(Type)) | ||
SyntheticFields.insert( | ||
{Entry.getKey(), | ||
&createStorageLocation(Entry.getValue().getNonReferenceType())}); | ||
|
||
return createRecordStorageLocation(Type, std::move(FieldLocs), | ||
std::move(SyntheticFields)); | ||
} | ||
return arena().create<ScalarStorageLocation>(Type); | ||
} | ||
|
||
// Returns the keys for a given `StringMap`. | ||
// Can't use `StringSet` as the return type as it doesn't support `operator==`. | ||
template <typename T> | ||
static llvm::DenseSet<llvm::StringRef> getKeys(const llvm::StringMap<T> &Map) { | ||
return llvm::DenseSet<llvm::StringRef>(Map.keys().begin(), Map.keys().end()); | ||
} | ||
|
||
RecordStorageLocation &DataflowAnalysisContext::createRecordStorageLocation( | ||
QualType Type, RecordStorageLocation::FieldToLoc FieldLocs, | ||
RecordStorageLocation::SyntheticFieldMap SyntheticFields) { | ||
assert(Type->isRecordType()); | ||
assert(containsSameFields(getModeledFields(Type), FieldLocs)); | ||
assert(getKeys(getSyntheticFields(Type)) == getKeys(SyntheticFields)); | ||
|
||
RecordStorageLocationCreated = true; | ||
return arena().create<RecordStorageLocation>(Type, std::move(FieldLocs), | ||
std::move(SyntheticFields)); | ||
} | ||
|
||
StorageLocation & | ||
DataflowAnalysisContext::getStableStorageLocation(const ValueDecl &D) { | ||
if (auto *Loc = DeclToLoc.lookup(&D)) | ||
|
@@ -367,3 +394,14 @@ clang::dataflow::FieldSet clang::dataflow::getObjectFields(QualType Type) { | |
getFieldsFromClassHierarchy(Type, Fields); | ||
return Fields; | ||
} | ||
|
||
bool clang::dataflow::containsSameFields( | ||
const clang::dataflow::FieldSet &Fields, | ||
const clang::dataflow::RecordStorageLocation::FieldToLoc &FieldLocs) { | ||
if (Fields.size() != FieldLocs.size()) | ||
return false; | ||
for ([[maybe_unused]] auto [Field, Loc] : FieldLocs) | ||
if (!Fields.contains(cast_or_null<FieldDecl>(Field))) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What are the cases when we expect to see a null pointer here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This can happen for fields of reference type in the cases discussed here. |
||
return false; | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be integrated into the constructor? Could be in a followup patch (to minimize further churn in this one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if having a single
SyntheticFieldCallback
will be too restrictive in the future. What if an ananlysis wants to use modeling both for optionals and some other data structure, both having its own callbacks?Although this is mostly food for thought for the future, no need to address in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, that doesn't work because
setSyntheticFieldCallback()
needs to be called by theDataflowAnalysis
, and theDataflowAnalysis
doesn't create theDataflowAnalysisContext
(i.e. it can't pass anything to its constructor).I think we could consider changing this though: I don't think there's anything that would prevent the
DataflowAnalysis
itself from creating theDataflowAnalysisContext
, and then the synthetic field callback could simply be a constructor parameter. This would require changing existing analyses, however, so I don't think this is something to tackle as part of this patch.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A single synthetic field callback can handle any number of different types, so the analysis wouldn't need to set more than one callback.
Maybe, however, you're thinking of a scenario where we might want to run several analyses at the same time? I think this is something we've discussed in the past, and in this scenario each analysis would of course potentially want to be able to set its own synthetic field callback. As we currently don't have the ability to run more than one analysis at the same time, I think it is currently sufficient to only support a single callback as well.