Skip to content

[analyzer] use invalidateRegions() in VisitGCCAsmStmt #109838

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 4, 2024

Conversation

pskrgag
Copy link
Contributor

@pskrgag pskrgag commented Sep 24, 2024

Since b1e4656 it's possible to pass Stmt into invalidateRegions(). Use it in VisitGCCAsmStmt to invalidate regions passed as in and out arguments for inline assembly.

This also fixes #109836 issue with FP uninitialized read, since invalidateRegions() invalidates super region as well, so there won't be problems with binding only to first element of the array.

Closes #109836

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Sep 24, 2024
@@ -40,7 +42,19 @@ void testInlineAsmMemcpyUninit(void)
{
int a[10], b[10] = {}, c;
MyMemcpy(&a[1], &b[1], sizeof(b) - sizeof(b[1]));
c = a[0]; // expected-warning{{Assigned value is garbage or undefined}}
c = a[0]; // FIXME: should be warning about uninitialized value, but invalidateRegions() also
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunate, but not really sure how to avoid it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can set the invalidation traint to preserve super region.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think it would work, since then testInlineAsmMemcpyUninitLoop would also fail, since only the first element of the array will be invalidated.

Maybe I am missing some API to invalidate part of the region?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, I think, this behavior maybe even better in terms of FP. Like maybe asm block will do smth like container_of or such.

So this might be an acceptable damage

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it all boils down to what the assembly block does. I don't think there is anything fundamentally preventing us refining what gets invalidated other than actually parsing and understanding the asm block.

Invalidating everything is a good default choice; however, frequently we can do better than that.
IDK how the MyMemcpy works in your test, but it could model the invalidation similar to how the StreamChecker goes smart about it in tryToInvalidateFReadBufferByElements here.

@llvmbot
Copy link
Member

llvmbot commented Sep 24, 2024

@llvm/pr-subscribers-clang

Author: Pavel Skripkin (pskrgag)

Changes

Since b1e4656 it's possible to pass Stmt into invalidateRegions(). Use it in VisitGCCAsmStmt to invalidate regions passed as in and out arguments for inline assembly.

This also fixes #109836 issue with FP uninitialized read, since invalidateRegions() invalidates super region as well, so there won't be problems with binding only to first element of the array.

Closes #109836


Full diff: https://github.com/llvm/llvm-project/pull/109838.diff

2 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+6-2)
  • (modified) clang/test/Analysis/asm.cpp (+18-4)
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index fdabba46992b08..a1a015715b4c8b 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -3810,7 +3810,9 @@ void ExprEngine::VisitGCCAsmStmt(const GCCAsmStmt *A, ExplodedNode *Pred,
     assert(!isa<NonLoc>(X)); // Should be an Lval, or unknown, undef.
 
     if (std::optional<Loc> LV = X.getAs<Loc>())
-      state = state->bindLoc(*LV, UnknownVal(), Pred->getLocationContext());
+      state = state->invalidateRegions(*LV, A, currBldrCtx->blockCount(),
+                                       Pred->getLocationContext(),
+                                       /*CausedByPointerEscape=*/true);
   }
 
   // Do not reason about locations passed inside inline assembly.
@@ -3818,7 +3820,9 @@ void ExprEngine::VisitGCCAsmStmt(const GCCAsmStmt *A, ExplodedNode *Pred,
     SVal X = state->getSVal(I, Pred->getLocationContext());
 
     if (std::optional<Loc> LV = X.getAs<Loc>())
-      state = state->bindLoc(*LV, UnknownVal(), Pred->getLocationContext());
+      state = state->invalidateRegions(*LV, A, currBldrCtx->blockCount(),
+                                       Pred->getLocationContext(),
+                                       /*CausedByPointerEscape=*/true);
   }
 
   Bldr.generateNode(A, Pred, state);
diff --git a/clang/test/Analysis/asm.cpp b/clang/test/Analysis/asm.cpp
index e0691dc4d794f5..b77038b2e83db7 100644
--- a/clang/test/Analysis/asm.cpp
+++ b/clang/test/Analysis/asm.cpp
@@ -10,8 +10,10 @@ void testRValueOutput() {
   int &ref = global;
   ref = 1;
   __asm__("" : "=r"(((int)(global))));  // don't crash on rvalue output operand
-  clang_analyzer_eval(global == 1); // expected-warning{{UNKNOWN}}
-  clang_analyzer_eval(ref == 1);    // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(global == 1); // expected-warning{{FALSE}}
+                                    // expected-warning@-1{{TRUE}}
+  clang_analyzer_eval(ref == 1);    // expected-warning{{FALSE}}
+                                    // expected-warning@-1{{TRUE}}
 }
 
 void *MyMemcpy(void *d, const void *s, const int n) {
@@ -40,7 +42,19 @@ void testInlineAsmMemcpyUninit(void)
 {
     int a[10], b[10] = {}, c;
     MyMemcpy(&a[1], &b[1], sizeof(b) - sizeof(b[1]));
-    c = a[0]; // expected-warning{{Assigned value is garbage or undefined}}
+    c = a[0]; // FIXME: should be warning about uninitialized value, but invalidateRegions() also
+              // invalidates super region.
+}
+
+void testInlineAsmMemcpyUninitLoop(const void *src, unsigned long len)
+{
+    int a[10], c;
+    unsigned long toCopy = sizeof(a) < len ? sizeof(a) : len;
+
+    MyMemcpy(&a, src, toCopy);
+
+    for (unsigned long i = 0; i < toCopy; ++i)
+      c = a[i]; // no-warning
 }
 
 void testAsmWithVoidPtrArgument()
@@ -49,6 +63,6 @@ void testAsmWithVoidPtrArgument()
   clang_analyzer_dump(*(int *)globalVoidPtr); // expected-warning-re {{reg_${{[0-9]+}}<int Element{SymRegion{reg_${{[0-9]+}}<void * globalVoidPtr>},0 S64b,int}>}}
   clang_analyzer_dump_ptr(globalVoidPtr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}<void * globalVoidPtr>}}}
   asm ("" : : "a"(globalVoidPtr)); // no crash
-  clang_analyzer_dump(*(int *)globalVoidPtr); // expected-warning {{Unknown}}
+  clang_analyzer_dump(*(int *)globalVoidPtr); // expected-warning {{derived_$3{conj_$2{int, LC1, S2385, #1},Element{SymRegion{reg_$0<void * globalVoidPtr>},0 S64b,int}}}}
   clang_analyzer_dump_ptr(globalVoidPtr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}<void * globalVoidPtr>}}}
 }

@llvmbot
Copy link
Member

llvmbot commented Sep 24, 2024

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Pavel Skripkin (pskrgag)

Changes

Since b1e4656 it's possible to pass Stmt into invalidateRegions(). Use it in VisitGCCAsmStmt to invalidate regions passed as in and out arguments for inline assembly.

This also fixes #109836 issue with FP uninitialized read, since invalidateRegions() invalidates super region as well, so there won't be problems with binding only to first element of the array.

Closes #109836


Full diff: https://github.com/llvm/llvm-project/pull/109838.diff

2 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+6-2)
  • (modified) clang/test/Analysis/asm.cpp (+18-4)
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index fdabba46992b08..a1a015715b4c8b 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -3810,7 +3810,9 @@ void ExprEngine::VisitGCCAsmStmt(const GCCAsmStmt *A, ExplodedNode *Pred,
     assert(!isa<NonLoc>(X)); // Should be an Lval, or unknown, undef.
 
     if (std::optional<Loc> LV = X.getAs<Loc>())
-      state = state->bindLoc(*LV, UnknownVal(), Pred->getLocationContext());
+      state = state->invalidateRegions(*LV, A, currBldrCtx->blockCount(),
+                                       Pred->getLocationContext(),
+                                       /*CausedByPointerEscape=*/true);
   }
 
   // Do not reason about locations passed inside inline assembly.
@@ -3818,7 +3820,9 @@ void ExprEngine::VisitGCCAsmStmt(const GCCAsmStmt *A, ExplodedNode *Pred,
     SVal X = state->getSVal(I, Pred->getLocationContext());
 
     if (std::optional<Loc> LV = X.getAs<Loc>())
-      state = state->bindLoc(*LV, UnknownVal(), Pred->getLocationContext());
+      state = state->invalidateRegions(*LV, A, currBldrCtx->blockCount(),
+                                       Pred->getLocationContext(),
+                                       /*CausedByPointerEscape=*/true);
   }
 
   Bldr.generateNode(A, Pred, state);
diff --git a/clang/test/Analysis/asm.cpp b/clang/test/Analysis/asm.cpp
index e0691dc4d794f5..b77038b2e83db7 100644
--- a/clang/test/Analysis/asm.cpp
+++ b/clang/test/Analysis/asm.cpp
@@ -10,8 +10,10 @@ void testRValueOutput() {
   int &ref = global;
   ref = 1;
   __asm__("" : "=r"(((int)(global))));  // don't crash on rvalue output operand
-  clang_analyzer_eval(global == 1); // expected-warning{{UNKNOWN}}
-  clang_analyzer_eval(ref == 1);    // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(global == 1); // expected-warning{{FALSE}}
+                                    // expected-warning@-1{{TRUE}}
+  clang_analyzer_eval(ref == 1);    // expected-warning{{FALSE}}
+                                    // expected-warning@-1{{TRUE}}
 }
 
 void *MyMemcpy(void *d, const void *s, const int n) {
@@ -40,7 +42,19 @@ void testInlineAsmMemcpyUninit(void)
 {
     int a[10], b[10] = {}, c;
     MyMemcpy(&a[1], &b[1], sizeof(b) - sizeof(b[1]));
-    c = a[0]; // expected-warning{{Assigned value is garbage or undefined}}
+    c = a[0]; // FIXME: should be warning about uninitialized value, but invalidateRegions() also
+              // invalidates super region.
+}
+
+void testInlineAsmMemcpyUninitLoop(const void *src, unsigned long len)
+{
+    int a[10], c;
+    unsigned long toCopy = sizeof(a) < len ? sizeof(a) : len;
+
+    MyMemcpy(&a, src, toCopy);
+
+    for (unsigned long i = 0; i < toCopy; ++i)
+      c = a[i]; // no-warning
 }
 
 void testAsmWithVoidPtrArgument()
@@ -49,6 +63,6 @@ void testAsmWithVoidPtrArgument()
   clang_analyzer_dump(*(int *)globalVoidPtr); // expected-warning-re {{reg_${{[0-9]+}}<int Element{SymRegion{reg_${{[0-9]+}}<void * globalVoidPtr>},0 S64b,int}>}}
   clang_analyzer_dump_ptr(globalVoidPtr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}<void * globalVoidPtr>}}}
   asm ("" : : "a"(globalVoidPtr)); // no crash
-  clang_analyzer_dump(*(int *)globalVoidPtr); // expected-warning {{Unknown}}
+  clang_analyzer_dump(*(int *)globalVoidPtr); // expected-warning {{derived_$3{conj_$2{int, LC1, S2385, #1},Element{SymRegion{reg_$0<void * globalVoidPtr>},0 S64b,int}}}}
   clang_analyzer_dump_ptr(globalVoidPtr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}<void * globalVoidPtr>}}}
 }

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks correct to me.
Btw, do you think the invalidation should cause pointer escape? I have no opinion. I rarely use this api. Wdyt?

@pskrgag
Copy link
Contributor Author

pskrgag commented Sep 24, 2024

Hm, in this particular case, I think, escaping does make sense, since we don't know what's going on inside inline assembly block.

@pskrgag
Copy link
Contributor Author

pskrgag commented Sep 26, 2024

Looks correct to me.
Btw, do you think the invalidation should cause pointer escape? I have no opinion. I rarely use this api. Wdyt?

Maybe @NagyDonat or @Szelethus have an opinion on that

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A direct bind can be always swapped out for an invalidation, so this PR is correct.
I wonder if you have plans for interpreting the asm block to see if we could do something better - e.g. because we could infer that it will only write a specific input element instead of any reachable memory regions.

@@ -40,7 +42,19 @@ void testInlineAsmMemcpyUninit(void)
{
int a[10], b[10] = {}, c;
MyMemcpy(&a[1], &b[1], sizeof(b) - sizeof(b[1]));
c = a[0]; // expected-warning{{Assigned value is garbage or undefined}}
c = a[0]; // FIXME: should be warning about uninitialized value, but invalidateRegions() also
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it all boils down to what the assembly block does. I don't think there is anything fundamentally preventing us refining what gets invalidated other than actually parsing and understanding the asm block.

Invalidating everything is a good default choice; however, frequently we can do better than that.
IDK how the MyMemcpy works in your test, but it could model the invalidation similar to how the StreamChecker goes smart about it in tryToInvalidateFReadBufferByElements here.

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM.

@pskrgag pskrgag merged commit 8d661fd into llvm:main Oct 4, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[analyzer] FP "uninitialized read" with hand-written memcpy
3 participants