Skip to content

[clang][dataflow] Tighten checking for existence of a function body. #78163

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 16, 2024

Conversation

martinboehme
Copy link
Contributor

@martinboehme martinboehme commented Jan 15, 2024

In various places, we would previously call FunctionDecl::hasBody() (which
checks whether any redeclaration of the function has a body, not necessarily the
one on which hasBody() is being called).

This is bug-prone, as a recent bug in Crubit's nullability checker has shown
(fix, fix for the fix).

Instead, we now use FunctionDecl::doesThisDeclarationHaveABody() which, as the
name implies, checks whether the specific redeclaration it is being called on
has a body.

Alternatively, I considered being more lenient and "canonicalizing" to the
FunctionDecl that has the body if the FunctionDecl being passed is a
different redeclaration. However, this also risks hiding bugs: A caller might
inadverently perform the analysis for all redeclarations of a function and end
up duplicating work without realizing it. By accepting only the redeclaration
that contains the body, we prevent this.

I've checked, and all clients that I'm aware of do currently pass in the
redeclaration that contains the function body. Typically this is because they
use the ast_matchers::hasBody() matcher which, unlike
FunctionDecl::hasBody(), only matches for the redeclaration containing the
body.

In various places, we would previously call `FunctionDecl::hasBody()` (which
checks whether any redeclaration of the function has a body, not necessarily the
one on which `hasBody()` is being called).

This is bug-prone, as a recent bug in Crubit's nullability checker has shown
([fix](google/crubit@4b01ed0),
[fix for the
fix](google/crubit@e0c5d8d)).

Instead, we now use `FunctionDecl::doesThisDeclarationHaveABody()` which, as the
name implies, checks whether the specific redeclaration it is being called on
has a body.

Alternatively, I considered being more lenient and "canonicalizing" to the
`FunctionDecl` that has the body if the `FunctionDecl` being passed is a
different redeclaration. However, this also risks hiding bugs: A caller might
inadverently perform the analysis for all redeclarations of a function and end
up duplicating work without realizing it. By accepting only the redeclaration
that contains the body, we prevent this.

I've checked, and all clients that I'm aware of do currently pass in the
redeclaration that contains the function body. Typically this is because they
use the `ast_matchers::hasBody()` matcher which, unlike
`FunctionDecl::hasBody()`, only matches for the redeclaration containing the
body.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Jan 15, 2024
@llvmbot
Copy link
Member

llvmbot commented Jan 15, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-analysis

Author: None (martinboehme)

Changes

In various places, we would previously call FunctionDecl::hasBody() (which
checks whether any redeclaration of the function has a body, not necessarily the
one on which hasBody() is being called).

This is bug-prone, as a recent bug in Crubit's nullability checker has shown
(fix,
fix for the
fix
).

Instead, we now use FunctionDecl::doesThisDeclarationHaveABody() which, as the
name implies, checks whether the specific redeclaration it is being called on
has a body.

Alternatively, I considered being more lenient and "canonicalizing" to the
FunctionDecl that has the body if the FunctionDecl being passed is a
different redeclaration. However, this also risks hiding bugs: A caller might
inadverently perform the analysis for all redeclarations of a function and end
up duplicating work without realizing it. By accepting only the redeclaration
that contains the body, we prevent this.

I've checked, and all clients that I'm aware of do currently pass in the
redeclaration that contains the function body. Typically this is because they
use the ast_matchers::hasBody() matcher which, unlike
FunctionDecl::hasBody(), only matches for the redeclaration containing the
body.


Full diff: https://github.com/llvm/llvm-project/pull/78163.diff

5 Files Affected:

  • (modified) clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h (+2-1)
  • (modified) clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h (+2-1)
  • (modified) clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp (+1-1)
  • (modified) clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp (+1-1)
  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+2-2)
diff --git a/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h b/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
index 768387a121b920..405e93287a05d3 100644
--- a/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
+++ b/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
@@ -32,7 +32,8 @@ namespace dataflow {
 class ControlFlowContext {
 public:
   /// Builds a ControlFlowContext from a `FunctionDecl`.
-  /// `Func.hasBody()` must be true, and `Func.isTemplated()` must be false.
+  /// `Func.doesThisDeclarationHaveABody()` must be true, and
+  /// `Func.isTemplated()` must be false.
   static llvm::Expected<ControlFlowContext> build(const FunctionDecl &Func);
 
   /// Builds a ControlFlowContext from an AST node. `D` is the function in which
diff --git a/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h b/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
index e8c27d6c12038b..1543f900e401d6 100644
--- a/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
+++ b/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
@@ -172,7 +172,8 @@ class Environment {
   ///
   /// Requirements:
   ///
-  ///  The function must have a body.
+  ///  The function must have a body, i.e.
+  ///  `FunctionDecl::doesThisDecalarationHaveABody()` must be true.
   void initialize();
 
   /// Returns a new environment that is a copy of this one.
diff --git a/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp b/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
index 56246066e4aa13..c9ebffe6f37801 100644
--- a/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
+++ b/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
@@ -69,7 +69,7 @@ static llvm::BitVector findReachableBlocks(const CFG &Cfg) {
 
 llvm::Expected<ControlFlowContext>
 ControlFlowContext::build(const FunctionDecl &Func) {
-  if (!Func.hasBody())
+  if (!Func.doesThisDeclarationHaveABody())
     return llvm::createStringError(
         std::make_error_code(std::errc::invalid_argument),
         "Cannot analyze function without a body");
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp b/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
index fa114979c8e326..03670b1821e374 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
@@ -292,7 +292,7 @@ DataflowAnalysisContext::getControlFlowContext(const FunctionDecl *F) {
   if (It != FunctionContexts.end())
     return &It->second;
 
-  if (F->hasBody()) {
+  if (F->doesThisDeclarationHaveABody()) {
     auto CFCtx = ControlFlowContext::build(*F);
     // FIXME: Handle errors.
     assert(CFCtx);
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index 96fe6df88dbb9f..a50ee57a3c11b4 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -386,7 +386,7 @@ void Environment::initialize() {
     return;
 
   if (const auto *FuncDecl = dyn_cast<FunctionDecl>(DeclCtx)) {
-    assert(FuncDecl->getBody() != nullptr);
+    assert(FuncDecl->doesThisDeclarationHaveABody());
 
     initFieldsGlobalsAndFuncs(FuncDecl);
 
@@ -426,7 +426,7 @@ void Environment::initialize() {
 // FIXME: Add support for resetting globals after function calls to enable
 // the implementation of sound analyses.
 void Environment::initFieldsGlobalsAndFuncs(const FunctionDecl *FuncDecl) {
-  assert(FuncDecl->getBody() != nullptr);
+  assert(FuncDecl->doesThisDeclarationHaveABody());
 
   FieldSet Fields;
   llvm::DenseSet<const VarDecl *> Vars;

@martinboehme
Copy link
Contributor Author

Build failure is merely a clang-format failure in a file not touched by this PR (clang/docs/HLSL/FunctionCalls.rst).

@martinboehme martinboehme merged commit c19cacf into llvm:main Jan 16, 2024
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…lvm#78163)

In various places, we would previously call `FunctionDecl::hasBody()`
(which
checks whether any redeclaration of the function has a body, not
necessarily the
one on which `hasBody()` is being called).

This is bug-prone, as a recent bug in Crubit's nullability checker has
shown

([fix](google/crubit@4b01ed0),
[fix for the
fix](google/crubit@e0c5d8d)).

Instead, we now use `FunctionDecl::doesThisDeclarationHaveABody()`
which, as the
name implies, checks whether the specific redeclaration it is being
called on
has a body.

Alternatively, I considered being more lenient and "canonicalizing" to
the
`FunctionDecl` that has the body if the `FunctionDecl` being passed is a
different redeclaration. However, this also risks hiding bugs: A caller
might
inadverently perform the analysis for all redeclarations of a function
and end
up duplicating work without realizing it. By accepting only the
redeclaration
that contains the body, we prevent this.

I've checked, and all clients that I'm aware of do currently pass in the
redeclaration that contains the function body. Typically this is because
they
use the `ast_matchers::hasBody()` matcher which, unlike
`FunctionDecl::hasBody()`, only matches for the redeclaration containing
the
body.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants