-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Thread Safety Analysis: Support warning on taking address of guarded variables #123063
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-analysis @llvm/pr-subscribers-clang Author: Marco Elver (melver) ChangesAdd the optional ability, via This is required to avoid false negatives in large C codebases, where data structures are typically implemented through helpers that take pointers to instances of a data structure. We also argue that, while obtaining the address itself does not yet constitute a potential race (in the presence of missing locking), placing the requirement on the pointer-recipient to obtain locks to access the pointed-to data is most likely poor style. This is analogous to passing C++ references to guarded variables, which produces warnings by default. Given that existing codebases using Full diff: https://github.com/llvm/llvm-project/pull/123063.diff 9 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 794943b24a003c..abcf0ef236cc2e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -788,6 +788,9 @@ Improvements to Clang's diagnostics
require(scope); // Warning! Requires mu1.
}
+- The :doc:`ThreadSafetyAnalysis` now supports ``-Wthread-safety-addressof``,
+ which enables warning if the address of guarded variables is obtained.
+
Improvements to Clang's time-trace
----------------------------------
diff --git a/clang/docs/ThreadSafetyAnalysis.rst b/clang/docs/ThreadSafetyAnalysis.rst
index 9c1c32e46989bc..c506844e7c94dc 100644
--- a/clang/docs/ThreadSafetyAnalysis.rst
+++ b/clang/docs/ThreadSafetyAnalysis.rst
@@ -515,8 +515,16 @@ Warning flags
+ ``-Wthread-safety-analysis``: The core analysis.
+ ``-Wthread-safety-precise``: Requires that mutex expressions match precisely.
This warning can be disabled for code which has a lot of aliases.
- + ``-Wthread-safety-reference``: Checks when guarded members are passed by reference.
-
+ + ``-Wthread-safety-reference``: Checks when guarded variables are passed by reference.
+
+* ``-Wthread-safety-addressof``: Warn when the address of guarded variables is
+ obtained (``&var``). Since obtaining the address of a variable doesn't
+ necessarily imply a read or write, the warning is off by default. In
+ codebases that prefer passing pointers rather than references (for C++
+ codebases), or passing pointers is ubiquitous (for C codebases), enabling
+ this warning will result in fewer false negatives; for example, where the
+ manipulation of common data structures is done via functions that take
+ pointers to instances of the data structure.
:ref:`negative` are an experimental feature, which are enabled with:
diff --git a/clang/include/clang/Analysis/Analyses/ThreadSafety.h b/clang/include/clang/Analysis/Analyses/ThreadSafety.h
index 0fcf5bed1285a4..a61c60cbf531e9 100644
--- a/clang/include/clang/Analysis/Analyses/ThreadSafety.h
+++ b/clang/include/clang/Analysis/Analyses/ThreadSafety.h
@@ -54,6 +54,9 @@ enum ProtectedOperationKind {
/// Returning a pt-guarded variable by reference.
POK_PtReturnByRef,
+
+ /// Obtaining address of a variable (e.g. &x).
+ POK_AddressOf,
};
/// This enum distinguishes between different kinds of lock actions. For
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index 594e99a19b64d6..8bd5d043cefa80 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -1126,6 +1126,7 @@ def ThreadSafetyReferenceReturn : DiagGroup<"thread-safety-reference-return">;
def ThreadSafetyReference : DiagGroup<"thread-safety-reference",
[ThreadSafetyReferenceReturn]>;
def ThreadSafetyNegative : DiagGroup<"thread-safety-negative">;
+def ThreadSafetyAddressof : DiagGroup<"thread-safety-addressof">;
def ThreadSafety : DiagGroup<"thread-safety",
[ThreadSafetyAttributes,
ThreadSafetyAnalysis,
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 8be4f946dce1cc..8e980dc31e3003 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -4140,6 +4140,14 @@ def warn_thread_safety_verbose : Warning<"thread safety verbose warning">,
def note_thread_warning_in_fun : Note<"thread warning in function %0">;
def note_guarded_by_declared_here : Note<"guarded_by declared here">;
+// Thread safety warnings on addressof
+def warn_addressof_requires_any_lock : Warning<
+ "obtaining address of variable %0 requires holding any mutex">,
+ InGroup<ThreadSafetyAddressof>, DefaultIgnore;
+def warn_addressof_requires_lock : Warning<
+ "obtaining address of variable %1 requires holding %0 '%2'">,
+ InGroup<ThreadSafetyAddressof>, DefaultIgnore;
+
// Dummy warning that will trigger "beta" warnings from the analysis if enabled.
def warn_thread_safety_beta : Warning<"thread safety beta warning">,
InGroup<ThreadSafetyBeta>, DefaultIgnore;
diff --git a/clang/lib/Analysis/ThreadSafety.cpp b/clang/lib/Analysis/ThreadSafety.cpp
index bfaf1a0e7c7ffc..3f06b40a623475 100644
--- a/clang/lib/Analysis/ThreadSafety.cpp
+++ b/clang/lib/Analysis/ThreadSafety.cpp
@@ -2080,6 +2080,9 @@ void BuildLockset::VisitUnaryOperator(const UnaryOperator *UO) {
case UO_PreInc:
checkAccess(UO->getSubExpr(), AK_Written);
break;
+ case UO_AddrOf:
+ checkAccess(UO->getSubExpr(), AK_Read, POK_AddressOf);
+ break;
default:
break;
}
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp b/clang/lib/Sema/AnalysisBasedWarnings.cpp
index 589869d0186575..bb009de228ee4f 100644
--- a/clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -1983,11 +1983,21 @@ class ThreadSafetyReporter : public clang::threadSafety::ThreadSafetyHandler {
void handleNoMutexHeld(const NamedDecl *D, ProtectedOperationKind POK,
AccessKind AK, SourceLocation Loc) override {
- assert((POK == POK_VarAccess || POK == POK_VarDereference) &&
- "Only works for variables");
- unsigned DiagID = POK == POK_VarAccess?
- diag::warn_variable_requires_any_lock:
- diag::warn_var_deref_requires_any_lock;
+ unsigned DiagID = 0;
+ switch (POK) {
+ case POK_VarAccess:
+ DiagID = diag::warn_variable_requires_any_lock;
+ break;
+ case POK_VarDereference:
+ DiagID = diag::warn_var_deref_requires_any_lock;
+ break;
+ case POK_AddressOf:
+ DiagID = diag::warn_addressof_requires_any_lock;
+ break;
+ default:
+ assert(false && "Only works for variables");
+ break;
+ }
PartialDiagnosticAt Warning(Loc, S.PDiag(DiagID)
<< D << getLockKindFromAccessKind(AK));
Warnings.emplace_back(std::move(Warning), getNotes());
@@ -2006,6 +2016,9 @@ class ThreadSafetyReporter : public clang::threadSafety::ThreadSafetyHandler {
case POK_VarDereference:
DiagID = diag::warn_var_deref_requires_lock_precise;
break;
+ case POK_AddressOf:
+ DiagID = diag::warn_addressof_requires_lock;
+ break;
case POK_FunctionCall:
DiagID = diag::warn_fun_requires_lock_precise;
break;
@@ -2042,6 +2055,9 @@ class ThreadSafetyReporter : public clang::threadSafety::ThreadSafetyHandler {
case POK_VarDereference:
DiagID = diag::warn_var_deref_requires_lock;
break;
+ case POK_AddressOf:
+ DiagID = diag::warn_addressof_requires_lock;
+ break;
case POK_FunctionCall:
DiagID = diag::warn_fun_requires_lock;
break;
diff --git a/clang/test/Sema/warn-thread-safety-analysis.c b/clang/test/Sema/warn-thread-safety-analysis.c
index 73b4e4798f6443..cc6af79aadb22b 100644
--- a/clang/test/Sema/warn-thread-safety-analysis.c
+++ b/clang/test/Sema/warn-thread-safety-analysis.c
@@ -1,5 +1,6 @@
// RUN: %clang_cc1 -fsyntax-only -verify -Wthread-safety -Wthread-safety-beta %s
// RUN: %clang_cc1 -fsyntax-only -verify -Wthread-safety -Wthread-safety-beta -fexperimental-late-parse-attributes -DLATE_PARSING %s
+// RUN: %clang_cc1 -fsyntax-only -verify -Wthread-safety -Wthread-safety-beta -Wthread-safety-addressof -fexperimental-late-parse-attributes -DLATE_PARSING -DCHECK_ADDRESSOF %s
#define LOCKABLE __attribute__ ((lockable))
#define SCOPED_LOCKABLE __attribute__ ((scoped_lockable))
@@ -133,7 +134,12 @@ int main(void) {
Foo_func3(5);
+#ifdef CHECK_ADDRESSOF
+ set_value(&a_, 0); // expected-warning{{calling function 'set_value' requires holding mutex 'foo_.mu_' exclusively}} \
+ expected-warning{{obtaining address of variable 'a_' requires holding mutex 'foo_.mu_'}}
+#else
set_value(&a_, 0); // expected-warning{{calling function 'set_value' requires holding mutex 'foo_.mu_' exclusively}}
+#endif
get_value(b_); // expected-warning{{calling function 'get_value' requires holding mutex 'foo_.mu_'}}
mutex_exclusive_lock(foo_.mu_);
set_value(&a_, 1);
@@ -180,6 +186,11 @@ int main(void) {
#ifdef LATE_PARSING
late_parsing.a_value_defined_before = 1; // expected-warning{{writing variable 'a_value_defined_before' requires holding mutex 'a_mutex_defined_late' exclusively}}
late_parsing.a_ptr_defined_before = 0;
+# ifdef CHECK_ADDRESSOF
+ (void)&late_parsing.a_value_defined_before; // expected-warning{{obtaining address of variable 'a_value_defined_before' requires holding mutex 'a_mutex_defined_late'}}
+# else
+ (void)&late_parsing.a_value_defined_before; // no warning
+# endif
mutex_exclusive_lock(late_parsing.a_mutex_defined_late);
mutex_exclusive_lock(late_parsing.a_mutex_defined_early); // expected-warning{{mutex 'a_mutex_defined_early' must be acquired before 'a_mutex_defined_late'}}
mutex_exclusive_unlock(late_parsing.a_mutex_defined_early);
diff --git a/clang/test/SemaCXX/warn-thread-safety-analysis.cpp b/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
index 018d6d1bb258b5..ce5c08d32328c2 100644
--- a/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
+++ b/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
@@ -1,7 +1,9 @@
// RUN: %clang_cc1 -fsyntax-only -verify -std=c++11 -Wthread-safety -Wthread-safety-beta -Wno-thread-safety-negative -fcxx-exceptions -DUSE_CAPABILITY=0 %s
// RUN: %clang_cc1 -fsyntax-only -verify -std=c++11 -Wthread-safety -Wthread-safety-beta -Wno-thread-safety-negative -fcxx-exceptions -DUSE_CAPABILITY=1 %s
+// RUN: %clang_cc1 -fsyntax-only -verify -std=c++11 -Wthread-safety -Wthread-safety-beta -Wno-thread-safety-negative -Wthread-safety-addressof -fcxx-exceptions -DUSE_CAPABILITY=1 -DCHECK_ADDRESSOF %s
// RUN: %clang_cc1 -fsyntax-only -verify -std=c++17 -Wthread-safety -Wthread-safety-beta -Wno-thread-safety-negative -fcxx-exceptions -DUSE_CAPABILITY=0 %s
// RUN: %clang_cc1 -fsyntax-only -verify -std=c++17 -Wthread-safety -Wthread-safety-beta -Wno-thread-safety-negative -fcxx-exceptions -DUSE_CAPABILITY=1 %s
+// RUN: %clang_cc1 -fsyntax-only -verify -std=c++17 -Wthread-safety -Wthread-safety-beta -Wno-thread-safety-negative -Wthread-safety-addressof -fcxx-exceptions -DUSE_CAPABILITY=1 -DCHECK_ADDRESSOF %s
// FIXME: should also run %clang_cc1 -fsyntax-only -verify -Wthread-safety -std=c++11 -Wc++98-compat %s
// FIXME: should also run %clang_cc1 -fsyntax-only -verify -Wthread-safety %s
@@ -4863,6 +4865,11 @@ class Data {
int dat;
};
+class DataWithAddrOf : public Data {
+public:
+ void* operator&() const;
+};
+
class DataCell {
public:
@@ -4900,10 +4907,15 @@ class Foo {
a = (*datap2_).getValue(); // \
// expected-warning {{reading the value pointed to by 'datap2_' requires holding mutex 'mu_'}}
+ // Calls operator&, and does not obtain the address.
+ (void)&data_ao_; // expected-warning {{reading variable 'data_ao_' requires holding mutex 'mu_'}}
+ (void)__builtin_addressof(data_ao_); // expected-warning {{passing variable 'data_ao_' by reference requires holding mutex 'mu_'}}
+
mu_.Lock();
data_.setValue(1);
datap1_->setValue(1);
datap2_->setValue(1);
+ (void)&data_ao_;
mu_.Unlock();
mu_.ReaderLock();
@@ -4911,6 +4923,7 @@ class Foo {
datap1_->setValue(0); // reads datap1_, writes *datap1_
a = datap1_->getValue();
a = datap2_->getValue();
+ (void)&data_ao_;
mu_.Unlock();
}
@@ -4939,11 +4952,29 @@ class Foo {
data_++; // expected-warning {{writing variable 'data_' requires holding mutex 'mu_' exclusively}}
--data_; // expected-warning {{writing variable 'data_' requires holding mutex 'mu_' exclusively}}
data_--; // expected-warning {{writing variable 'data_' requires holding mutex 'mu_' exclusively}}
+#ifdef CHECK_ADDRESSOF
+ (void)&data_; // expected-warning {{obtaining address of variable 'data_' requires holding mutex 'mu_'}}
+ (void)&datap1_; // expected-warning {{obtaining address of variable 'datap1_' requires holding mutex 'mu_'}}
+#else
+ (void)&data_; // no warning
+ (void)&datap1_; // no warning
+#endif
+ (void)(&*datap1_); // expected-warning {{reading variable 'datap1_' requires holding mutex 'mu_'}}
data_[0] = 0; // expected-warning {{reading variable 'data_' requires holding mutex 'mu_'}}
(*datap2_)[0] = 0; // expected-warning {{reading the value pointed to by 'datap2_' requires holding mutex 'mu_'}}
+ (void)&datap2_; // no warning
+ (void)(&*datap2_); // expected-warning {{reading the value pointed to by 'datap2_' requires holding mutex 'mu_'}}
data_(); // expected-warning {{reading variable 'data_' requires holding mutex 'mu_'}}
+
+ mu_.Lock();
+ (void)&data_;
+ mu_.Unlock();
+
+ mu_.ReaderLock();
+ (void)&data_;
+ mu_.Unlock();
}
// const operator tests
@@ -4962,6 +4993,7 @@ class Foo {
Data data_ GUARDED_BY(mu_);
Data* datap1_ GUARDED_BY(mu_);
Data* datap2_ PT_GUARDED_BY(mu_);
+ DataWithAddrOf data_ao_ GUARDED_BY(mu_);
};
} // end namespace GuardedNonPrimitiveTypeTest
@@ -5870,7 +5902,11 @@ class Foo {
}
void ptr_test() {
- int *b = &a;
+#ifdef CHECK_ADDRESSOF
+ int *b = &a; // expected-warning {{obtaining address of variable 'a' requires holding mutex 'mu'}}
+#else
+ int *b = &a; // no warning
+#endif
*b = 0; // no expected warning yet
}
@@ -6089,7 +6125,11 @@ class Return {
}
Foo *returns_ptr() {
- return &foo; // FIXME -- Do we want to warn on this ?
+#ifdef CHECK_ADDRESSOF
+ return &foo; // expected-warning {{obtaining address of variable 'foo' requires holding mutex 'mu'}}
+#else
+ return &foo; // no warning
+#endif
}
Foo &returns_ref2() {
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
…ed variables Add the optional ability, via `-Wthread-safety-addressof`, to warn when obtaining the address of guarded variables. This is required to avoid false negatives in large C codebases, where data structures are typically implemented through helpers that take pointers to instances of a data structure. We also argue that, while obtaining the address itself does not yet constitute a potential race (in the presence of missing locking), placing the requirement on the pointer-recipient to obtain locks to access the pointed-to data is most likely poor style. This is analogous to passing C++ references to guarded variables, which produces warnings by default. Given that existing codebases using `-Wthread-safety` likely have cases where obtaining the pointer to a guarded variable is benign, the feature is not enabled by default but requires explicit opt-in.
60e3b1d
to
6727047
Compare
For additional background: I'm trying to work out how to bring -Wthread-safety to the Linux kernel. Since -fexperimental-late-parse-attributes made the feature practical for complex C structs, I started to look at a strategy to enable the feature. The current strategy is to create an opt-in mechanism (per subsystem), and since the constraints imposed by Wthread-safety already require subtly restructuring most code that would opt-in, I think having Wthread-safety-addressof part of the initial constraints imposed by Wthread-safety would be very helpful. In converting one subsystem I maintain, I was disappointed that things like |
We had a discussion on https://reviews.llvm.org/D52395 that might be relevant here. To quote @delesley:
Another aspect is that we don't check if the lock is still held when we dereference the pointer, so there are also false negatives: Mutex mu;
int x GUARDED_BY(mu);
void f() {
mu.Lock();
int *p = &x;
mu.Unlock();
*p = 0;
} You use the analogy of C++ references. Interestingly, we don't warn on "taking a reference", partly because no such thing exists. (You can only initialize something of reference type with an expression.) We warn on passing a variable by reference to a function, or in other words, initializing a parameter of reference type with a guarded variable. (Since https://reviews.llvm.org/D153131, we also warn on returning a reference when the lock is no longer held after return. Note that the initialization of the reference might still happen under lock!) However, we also track references in a way that pointers are not tracked. The reason is probably that references are immutable (that is, void g() {
mu.Lock();
int &ref = x;
mu.Unlock();
ref = 0; // warning: writing variable 'x' requires holding mutex 'mu' exclusively
} If you "take the reference" outside of the lock and do the assignment inside, there is no warning. Because the assignment is what needs the lock, not taking the address. However, with functions it's somewhat reasonable to assume that the function accesses through the pointer, and that the pointer doesn't escape. (This is of course not always true, but often enough.) So I wonder whether we should restrict this to pointers passed as function arguments. But if the number of false positives is manageable, we can also leave it like you implemented it. Did you try it out on some code base that uses thread safety annotations? |
Right. Though in the absence of better pointer tracking / alias analysis, I believe there's no way to avoid resulting false positives with pointers. It's something that a user of Wthread-safety-addressof would opt-in explicitly - documentation needs elaboration on this perhaps.
Good point - though I'd prefer few false negatives + false positives over no checks at all. It's a tradeoff, and therefore we definitely shouldn't enable Wthread-safety-addressof by default.
Good points, and I missed that reference handling is much better in this regard.
That's a reasonable option to make it more conservative, although I'm not sure it's necessary (yet).
I'm working on bringing Wthread-safety to the Linux kernel. The strategy chosen is to convert one subsystem (or single TU) at a time, based on an opt-in basis. Since Linux already has a rather special dialect of C, those that would choose to opt in their TUs would opt into another variant of Linux's C dialect (one with Wthread-safety enabled). My experience in converting a subsystem I maintain is that applying Wthread-safety already requires small refactors to avoid false positives, and I'm not opposed to add additional constraints via Wthread-safety-addressof. Without additional checking of "obtaining address of guarded variables", I find that Wthread-safety barely provides benefits, as passing pointers to datastructures is so common. Resolving this will improve the usefulness, and ultimately the chances of us succeeding to upstream this to the Linux kernel. I don't think we will ever convert 100% of the kernel to use Wthread-safety, but the plan is that willing subsystem maintainers opt in and deal with warnings produced by Wthread-safety. Having Wthread-safety-addressof will add additional coverage. For cases where it produces too many false positives, I intend to add a simple option that can enable/disable addressof checking for individual TUs. Or if almost all places where obtaining the address of a particular lock-guarded variable is not actually under a lock (but still correct), leaving off |
Update documentation wording to be clearer about false positives and negatives of Wthread-safety-addressof.
The wording "take address of" or "taking address of" is more common across the codebase: % git grep 'tak\(e\|ing\) address of' | wc 107 1513 16080 Whereas the wording "obtaining address of" has no occurrence. Use the more common wording.
120ef92
to
0c4d02e
Compare
FWIW, the Linux kernel integration (draft, WIP) currently lives here: https://github.com/google/kernel-sanitizers/tree/cap-analysis And I want to re-iterate that without -Wthread-safety-addressof, the feature's coverage is significantly reduced, and I predict it to be one of the first complaints in later review. Kindly take another look. cc @bvanassche |
Looks like the kernel patches use |
Yeah, no way around it if we want to have guarded_by in structs work properly. If the problem is cost, still planning to allow it to be disabled via Kconfig option. |
Can any of the members in the structs be reorganized to put the mutex member declaration BEFORE the members they guard? Probably not always, but perhaps that's possible for most structures? |
It's an option I considered, but I can already hear "what is this crap ... NACK". In many cases it might be possible, but where data layout or cacheline organization is important for performance, definitely not an option. |
Doesn't Sure, I can imagine where there might be pushback, but perhaps its worthwhile to see how far you can get with either marking structs that don't need Also, does @bvanassche have a list of trophies from this? Would be cool to track. |
It does, AFAIK.
Fair point, in the worst case individual TUs could still add the flag if needed. |
But counted_by isn't using |
Our current |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (we've discussed minor nits in the comments offline, Marco fixed them)
…ty analysis WIP A Clang version that supports -Wthread-safety-addressof is recommended, but not required: llvm/llvm-project#123063 Signed-off-by: Marco Elver <[email protected]>
…ty analysis WIP A Clang version that supports -Wthread-safety-addressof is recommended, but not required: llvm/llvm-project#123063 Signed-off-by: Marco Elver <[email protected]>
RFC posted to LKML: https://lore.kernel.org/lkml/[email protected] Depending on how things go, I intend to commit this within a few weeks unless there are objections, so that folks that want to already use this, can do so by just grabbing latest Clang. Thanks! |
…ty analysis Capability analysis is a C language extension, which enables statically checking that user-definable "capabilities" are acquired and released where required. An obvious application is lock-safety checking for the kernel's various synchronization primitives (each of which represents a "capability"), and checking that locking rules are not violated. Clang originally called the feature "Thread Safety Analysis" [1], with some terminology still using the thread-safety-analysis-only names. This was later changed and the feature became more flexible, gaining the ability to define custom "capabilities". Its foundations can be found in "capability systems", used to specify the permissibility of operations to depend on some capability being held (or not held). [1] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html [2] https://www.cs.cornell.edu/talc/papers/capabilities.pdf Because the feature is not just able to express capabilities related to synchronization primitives, the naming chosen for the kernel departs from Clang's initial "Thread Safety" nomenclature and refers to the feature as "Capability Analysis" to avoid confusion. The implementation still makes references to the older terminology in some places, such as `-Wthread-safety` being the warning enabled option that also still appears in diagnostic messages. See more details in the kernel-doc documentation added in this and the subsequent changes. [ RFC Note: A Clang version that supports -Wthread-safety-addressof is recommended, but not required: llvm/llvm-project#123063 Should this patch series reach non-RFC stage, it is planned to be committed to Clang before. ] Signed-off-by: Marco Elver <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken a brief look at the Linux kernel changes. I didn't check how many variables are affected by __rcu_guarded
, but otherwise there seem to be maybe two dozen guarded variables. If you really didn't encounter any false positives that's not bad.
However, I'd still prefer if we could mark this as experimental, because it's a bit coarse:
- The address-of operation isn't interesting by itself, it merely happens to be required to do pass-by-reference in C.
- The equivalent of passing a
pt_guarded_by
variable by value doesn't seem to warn. - The actual access itself is not checked, whether explicit or assumed behind a function call.
In the long term, a better approach might be:
- Extend alias tracking to pointers, perhaps restricted to those that don't change value.
- Warn when passing pointers to guarded variables into other functions.
…ty analysis Capability analysis is a C language extension, which enables statically checking that user-definable "capabilities" are acquired and released where required. An obvious application is lock-safety checking for the kernel's various synchronization primitives (each of which represents a "capability"), and checking that locking rules are not violated. Clang originally called the feature "Thread Safety Analysis" [1], with some terminology still using the thread-safety-analysis-only names. This was later changed and the feature became more flexible, gaining the ability to define custom "capabilities". Its foundations can be found in "capability systems", used to specify the permissibility of operations to depend on some capability being held (or not held). [1] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html [2] https://www.cs.cornell.edu/talc/papers/capabilities.pdf Because the feature is not just able to express capabilities related to synchronization primitives, the naming chosen for the kernel departs from Clang's initial "Thread Safety" nomenclature and refers to the feature as "Capability Analysis" to avoid confusion. The implementation still makes references to the older terminology in some places, such as `-Wthread-safety` being the warning enabled option that also still appears in diagnostic messages. See more details in the kernel-doc documentation added in this and the subsequent changes. A Clang version that supports -Wthread-safety-addressof is recommended, but not required: llvm/llvm-project#123063 Signed-off-by: Marco Elver <[email protected]>
…ty analysis Capability analysis is a C language extension, which enables statically checking that user-definable "capabilities" are acquired and released where required. An obvious application is lock-safety checking for the kernel's various synchronization primitives (each of which represents a "capability"), and checking that locking rules are not violated. Clang originally called the feature "Thread Safety Analysis" [1], with some terminology still using the thread-safety-analysis-only names. This was later changed and the feature became more flexible, gaining the ability to define custom "capabilities". Its foundations can be found in "capability systems", used to specify the permissibility of operations to depend on some capability being held (or not held). [1] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html [2] https://www.cs.cornell.edu/talc/papers/capabilities.pdf Because the feature is not just able to express capabilities related to synchronization primitives, the naming chosen for the kernel departs from Clang's initial "Thread Safety" nomenclature and refers to the feature as "Capability Analysis" to avoid confusion. The implementation still makes references to the older terminology in some places, such as `-Wthread-safety` being the warning enabled option that also still appears in diagnostic messages. See more details in the kernel-doc documentation added in this and the subsequent changes. A Clang version that supports -Wthread-safety-addressof is recommended, but not required: llvm/llvm-project#123063 Signed-off-by: Marco Elver <[email protected]>
This inconsistency is probably my largest concern. If you have int x GUARDED_BY(mu);
int *p PT_GUARDED_BY(mu); then That's why I think we should rather check function arguments, and warn on passing (1) the address of a guarded variable, (2) a pointee-guarded variable. There is already functionality that makes sure (1) and (2) work together: if (Qt->isReferenceType())
checkAccess(*Arg, AK_Read, POK_PassByRef); Here we might simply add a branch for pointer types with the appropriate This still doesn't cover local pointer variables, but here I guess we need a better understanding of the patterns and how much alias analysis we'd need to unravel them. Local pointer variables as obfuscation device are hopefully not common, so there is likely some more complex logic that would be intractable for us anyway. So I'd be curious to see some interesting examples. |
…f guarded variables
Addressed comments so far.
Right, PT_GUARDED_BY() coverage is lacking in this case.
I agree this might be more conservative.
The Linux kernel has odd idioms like this one:
When passed a guarded variable (e.g. My intent with Wthread-safety-addressof was to blindly warn on "addressof", so we get this coverage as well. But I suppose if we switch it to just cover cases when passing to functions, I'd propose:
I still wouldn't mind if Wthread-safety-addressof existed (for the plethora of weird macros the Linux kernel has), but I understand that maintaining more features might not be what we want. Preferences? |
Something like void BuildLockset::VisitCastExpr(const CastExpr *CE) {
if (CE->getCastKind() != CK_LValueToRValue)
return;
checkAccess(CE->getSubExpr(), AK_Read);
} Then if (const auto *UO = dyn_cast<UnaryOperator>(Exp)) {
// For dereferences
if (UO->getOpcode() == UO_Deref)
checkPtAccess(FSet, UO->getSubExpr(), AK, POK);
return;
} Then we only need to make sure that |
[...]
Thanks for the suggestions! I implemented both -- please see #127396. |
I'm very excited about this, as I have wanted it for many years for my C codebase, and TSA is not super useful in C without this! One thought --- you could consider an attribute that could be put on pointer arguments to functions that says "yes, I dereference this and read or write it". In a codebase that otherwise would have many false positives, you could annotate at least core data structures without having to turn it on for all address-of operations. E.g In the C codebase I desperately want this for, an annotation like that sprinkled in a couple key places would get you 80% of the benefit with much lower risk of false positives. |
This PR is being superseded by #127396 (implementation changed completely) - we agreed to go with the more conservative approach, and at the same time also properly handle pt_guarded_by pointers. Wthread-safety-pointer is meant to be to pointer passing what Wthread-safety-reference is to C++ reference passing. I'm leaving this PR up to keep the history.
Perhaps if there are false positives this can be added. But initially I'd like to avoid it if at all possible. A similar precedent at least for passing references to functions exists with FWIW, something similar to explicit marking can be achieved with something like this (but yes, it's ugly):
Although we might need this patch too, to handle things like |
Ah great, I will follow along there. I might patch in the early version of this and give it a spin this week (I don't want to wait for it be merged :D)
Yup, this is what I currently am playing with, but it's ugly (and requires double-evaluating the macro argument which screws up side effects).
|
GCC has such an attribute, independent of Thread Safety Analysis. But I don't think we have an equivalent. However, as I wrote above, it's probably the rule and not the exception that pointers passed into a function are dereferenced at some point. So we're probably fine if we always warn. And as @melver pointed out, the experience with reference passing in C++ has been pretty good, and it's probably transferable to pointers. |
Note: This PR has been superseded by #127396, which takes a different approach, but coverage should be comparable with less risk of false positives.
Add the optional ability, via
-Wthread-safety-addressof
, to warn when taking the address of guarded variables.The motivation is to avoid false negatives in large C codebases, where data structures are typically implemented through helpers that take pointers to instances of a data structure. While taking the address itself does not yet constitute a race (in the presence of missing locking), placing the requirement on the pointer-recipient to obtain locks to access the pointed-to data is most likely poor style.
When passing pointers to functions, the additional warning is analogous to warning about passing reference to functions. However, unlike the existing analysis for references, the analysis is unable to track pointers precisely. Given the trade-offs, the user needs to explicitly decide if the remaining false positives and negatives are an acceptable cost for detecting potentially unsafe passing of pointers to guarded variables. Given that existing codebases using
-Wthread-safety
likely have cases where taking the pointer to a guarded variable is benign, the feature is not enabled by default but requires explicit opt-in.