Skip to content

[analyzer][NFC] Add some docs for LazyCompoundValue #97407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 25, 2024

Conversation

Szelethus
Copy link
Contributor

Yes, I basically copy-pasted some posts from discord and Artem's book, but these make for a rather decent docs.

Yes, I basically copy-pasted some posts from discord and Artem's book,
but these make for a rather decent docs.
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Jul 2, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 2, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Kristóf Umann (Szelethus)

Changes

Yes, I basically copy-pasted some posts from discord and Artem's book, but these make for a rather decent docs.


Full diff: https://github.com/llvm/llvm-project/pull/97407.diff

1 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h (+39)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
index 3a4b087257149..e44cda50ef21d 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
@@ -326,6 +326,12 @@ class LocAsInteger : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == LocAsIntegerKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// Source: https://github.com/haoNoQ/clang-analyzer-guide, v0.1, section 5.3.2.
 class CompoundVal : public NonLoc {
   friend class ento::SValBuilder;
 
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is always
+///     typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in
+///     which it was created.
+///
+/// Essentially, nonloc::LazyCompoundVal is a performance optimization for the
+/// analyzer. Because Store is immutable, creating a nonloc::LazyCompoundVal is
+/// a very cheap operation. Note that the Store contains all region bindings in
+/// the program state, not only related to the region. Later, if necessary, such
+/// value can be unpacked -- eg. when it is assigned to another variable.
+///
+/// Source: https://github.com/haoNoQ/clang-analyzer-guide, v0.1, section 5.3.2.
+///
+/// If you ever need to see if a LazyCompoundVal is fully or partially
+/// (un)initialized, you can iterBindings(). This is non-typechecked lookup
+/// (i.e., you cannot figure out which specific sub-region is initialized by the
+/// value you look at, you only get a byte offset). You can also improve
+/// iterBindings() to make it possible to restrict the iteration to a single
+/// cluster, because within the LazyCompoundVal’s Store only the cluster that
+/// corresponds to the LazyCompoundVal’s parent region is relevant.
+///
+/// Source: https://discourse.llvm.org/t/analyzer-for-the-undefined-value-of-array-element-the-tracking-information-is-incomplete/49372/2
 class LazyCompoundVal : public NonLoc {
   friend class ento::SValBuilder;
 

Copy link

github-actions bot commented Jul 2, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really happy that you decided to document these 😄

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. It's good to see some love for the docs.

Copy link
Collaborator

@haoNoQ haoNoQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!! Some comments are very much overdue here.

@Szelethus
Copy link
Contributor Author

I'd also love to see some docs for Loc and NonLoc, because I recall some angry looks from years ago when I touched those on their own, but I can barely recall them :( As well as some docs for the various ProgramState::getSVal methods...

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's already in a good shape. I found barely any problems with the wording.

Comment on lines +400 to +411
/// This function itself is immaterial. It is only an implementation detail.
/// LazyCompoundVal represents only the rvalue, the data (known or unknown)
/// that *was* stored in that region *at some point in the past*. The region
/// should not be used for any purpose other than figuring out what part of
/// the frozen Store you're interested in. The value does not represent the
/// *current* value of that region. Sometimes it may, but this should not be
/// relied upon. Instead, if you want to figure out what region it represents,
/// you typically need to see where you got it from in the first place. The
/// region is absolutely not analogous to the C++ "this" pointer. It is also
/// not a valid way to "materialize" the prvalue into a glvalue in C++,
/// because the region represents the *old* storage (sometimes very old), not
/// the *future* storage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, I don't really understand this.

To me, this region is the region that one would need to query from the associated Store to get the value this LazyCompoundValue actually boils down to.
So, a region, like every other region, only defines a memory location (an l-value), and there is not much one could do with such a region. But I find this true for any other MemRegion anywhere in the engine.

Probably I miss something, as I don't usually deal with LCVs, and it's been a while I last touched them.
So, to summarize, I don't really understand what immaterial means in this context or understand why we mention it for LCV::getRegion() instead of mention this for a MemRegion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haoNoQ I can't really counter this, can you help us out a bit? :)

@Szelethus Szelethus merged commit f8006a5 into llvm:main Jul 25, 2024
7 checks passed
@Szelethus
Copy link
Contributor Author

I decided to merge this. We are still miles ahead with these docs, and I don't wanna accidentally forget the PR.

yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
Yes, I basically copy-pasted some posts from discord and Artem's book,
but these make for a rather decent docs.

---------

Co-authored-by: Artem Dergachev <[email protected]>
Co-authored-by: Donát Nagy <[email protected]>
Co-authored-by: Balazs Benics <[email protected]>

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60250770
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants