Skip to content

[analyzer][NFC] Add some docs for LazyCompoundValue #97407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 25, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
46 changes: 46 additions & 0 deletions clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
Original file line number Diff line number Diff line change
Expand Up @@ -326,6 +326,10 @@ class LocAsInteger : public NonLoc {
static bool classof(SVal V) { return V.getKind() == LocAsIntegerKind; }
};

/// The simplest example of a concrete compound value is nonloc::CompoundVal,
/// which represents a concrete r-value of an initializer-list or a string.
/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
/// literal.
class CompoundVal : public NonLoc {
friend class ento::SValBuilder;

Expand All @@ -346,6 +350,36 @@ class CompoundVal : public NonLoc {
static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
};

/// While nonloc::CompoundVal covers a few simple use cases,
/// nonloc::LazyCompoundVal is a more performant and flexible way to represent
/// an rvalue of record type, so it shows up much more frequently during
/// analysis. This value is an r-value that represents a snapshot of any
/// structure "as a whole" at a given moment during the analysis. Such value is
/// already quite far from being referred to as "concrete", as many fields
/// inside it would be unknown or symbolic. nonloc::LazyCompoundVal operates by
/// storing two things:
/// * a reference to the TypedValueRegion being snapshotted (yes, it is always
/// typed), and also
/// * a reference to the whole Store object, obtained from the ProgramState in
/// which the nonloc::LazyCompoundVal was created.
///
/// Note that the old ProgramState and its Store is kept alive during the
/// analysis because these are immutable functional data structures and each new
/// Store value is represented as "earlier Store" + "additional binding".
///
/// Essentially, nonloc::LazyCompoundVal is a performance optimization for the
/// analyzer. Because Store is immutable, creating a nonloc::LazyCompoundVal is
/// a very cheap operation. Note that the Store contains all region bindings in
/// the program state, not only related to the region. Later, if necessary, such
/// value can be unpacked -- eg. when it is assigned to another variable.
///
/// If you ever need to inspect the contents of the LazyCompoundVal, you can use
/// StoreManager::iterBindings(). It'll iterate through all values in the Store,
/// but you're only interested in the ones that belong to
/// LazyCompoundVal::getRegion(); other bindings are immaterial.
///
/// NOTE: LazyCompoundVal::getRegion() itself is also immaterial (see the actual
/// method docs for details).
class LazyCompoundVal : public NonLoc {
friend class ento::SValBuilder;

Expand All @@ -363,6 +397,18 @@ class LazyCompoundVal : public NonLoc {
/// It might return null.
const void *getStore() const;

/// This function itself is immaterial. It is only an implementation detail.
/// LazyCompoundVal represents only the rvalue, the data (known or unknown)
/// that *was* stored in that region *at some point in the past*. The region
/// should not be used for any purpose other than figuring out what part of
/// the frozen Store you're interested in. The value does not represent the
/// *current* value of that region. Sometimes it may, but this should not be
/// relied upon. Instead, if you want to figure out what region it represents,
/// you typically need to see where you got it from in the first place. The
/// region is absolutely not analogous to the C++ "this" pointer. It is also
/// not a valid way to "materialize" the prvalue into a glvalue in C++,
/// because the region represents the *old* storage (sometimes very old), not
/// the *future* storage.
Comment on lines +400 to +411
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, I don't really understand this.

To me, this region is the region that one would need to query from the associated Store to get the value this LazyCompoundValue actually boils down to.
So, a region, like every other region, only defines a memory location (an l-value), and there is not much one could do with such a region. But I find this true for any other MemRegion anywhere in the engine.

Probably I miss something, as I don't usually deal with LCVs, and it's been a while I last touched them.
So, to summarize, I don't really understand what immaterial means in this context or understand why we mention it for LCV::getRegion() instead of mention this for a MemRegion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haoNoQ I can't really counter this, can you help us out a bit? :)

LLVM_ATTRIBUTE_RETURNS_NONNULL
const TypedValueRegion *getRegion() const;

Expand Down
Loading