-
Notifications
You must be signed in to change notification settings - Fork 341
[-Wunsafe-buffer-usage][BoundsSafety] support __null_terminated in the interop mode #10060
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[-Wunsafe-buffer-usage][BoundsSafety] support __null_terminated in the interop mode #10060
Conversation
@swift-ci test |
CC: @dtarditi |
0be9793
to
6a7f6c7
Compare
@swift-ci test |
…e interop mode For a __null_terminated pointer 'p', only a __null_terminated pointer, a string literal or a call to `std::string::c_str` can be assigned to (/passed to/used to initialize) 'p'. Otherwise, in the interop mode, an -Wunsafe-buffer-usage warning will be reported. Similar to how __counted_by is handled, incompatibility in nested __null_terminated pointers (e.g., input/output pointer type arguments) is reported as Bounds Safety type checking errors. (rdar://138798346)
6a7f6c7
to
1e578f1
Compare
@swift-ci test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some suggestions.
|
||
#include <ptrcheck.h> | ||
typedef unsigned size_t; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of these new tests look unrelated to this change. Please consider splitting them out to a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure. These are things left un-upstreamed.
"initialization of a pointer of|" | ||
"assignment to a pointer of|" | ||
"casting to}2 '__null_terminated' type; " | ||
"only '__null_terminated' pointers, string literals, and 'std::string::c_str' calls are compatible with '__null_terminated' pointers">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be simpler to say that. the source pointer may not point to null-terminated data. I could imagine that we eventually we will increase the number of cases that we allow. I don't have super strong opinions about this - it is nice to tell people what is allowed, but it doesn't tell them what the error is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Yeah, I should explain what the error is. The convention for warning messages (at least for -Wunsafe-buffer-usage) is that we have a short but clear enough sentence to just say "something is unsafe", then put necessary explanation in notes. My message here might already be lengthy. I will explain what the error is in a note and also move "only '__null_terminated' pointers, ..."
(sort of the suggestion) to a note too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I maybe misunderstood part of your comment here.
You were suggesting to simply saying "source pointer may not point to null-terminated data." instead of breaking into cases. I'm not against the idea of making the warning message simpler. My only concern is that warning messages under -Wunsafe-buffer-usage
all have the forms "unsafe [some operation]" or "[some operation] is unsafe". I'd like this message to follow that.
Different cases are determined by AssignmentAction
, which is handled by the existing framework in Sema
. I assume it is stable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe "unsafe conversion to __null_terminated type; source pointer may not point to null-terminated data."
? If "conversion" is a appropriate umbrella term for these operations.
"only '__null_terminated' pointers, string literals, and 'std::string::c_str' calls are compatible with '__null_terminated' pointers">, | ||
InGroup<UnsafeBufferUsage>, DefaultIgnore; | ||
def warn_unsafe_named_cast_to_null_terminated_pointer : Warning< | ||
"C++ named cast to '__null_terminated' type is unsafe">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be overly specific: any cast to __null_terminated
is unsafe. If I got this error message I'd try doing a C-style cast instead, which would not help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, looking at the test cases it seems like C-style casts are deliberately allowed. What's the rationale?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original idea is that it should be rare for C++ programmers to cast null_terminated pointers with named casts. And, Bounds Safety's type checking for null_terminated can only be re-used for C-Style casting. So why not force C++ programmers to just use C-Style casts for null_terminated pointers?
But now I changed my mind. I think it is too strict. Moreover, it will be difficult to relax the restriction once this warning is added. On the other hand, we are allowed to have some false negatives for some rare use cases and make warnings for them later. So I think I will remove this warning. @rapidsna @patrykstefanski WDYT?
Thanks for bringing this up!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree that it's easier to add warnings later than relaxing later, since relaxing the rules doesn't disrupt existing adopters while new warnings do. But that said, if there's no reason for preferring one cast over the other it seems reasonable to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where's the test case to allow C-Style casts?
|
||
#ifndef INTEROP_MODE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the code isn't actually shared for INTEROP_MODE vs !INTEROP_MODE. Shouldn't it be just a separate test file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure. These are things left un-upstreamed.
clang/lib/Sema/SemaCast.cpp
Outdated
@@ -238,6 +238,17 @@ namespace { | |||
const auto *DstPointerType = DestType->getAs<ValueTerminatedType>(); | |||
IsDstNullTerm = | |||
DstPointerType->getTerminatorValue(Self.Context).isZero(); | |||
if (Self.LangOpts.isBoundsSafetyAttributeOnlyMode() && | |||
Self.LangOpts.CPlusPlus && | |||
!Self.Diags.isIgnored( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it controlled by diag::warn_unsafe_assign_to_null_terminated_pointer
, not by the UnsafeBufferUsage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually, we use diag::warn_unsafe_buffer_operation
to indicate whether -Wunsafe-buffer-usage
is on for a source location. (For example: here.)
The diag::warn_unsafe_assign_to_null_terminated_pointer
is under the UnsafeBufferUsage warning group and is only controlled by the -Wunsafe-buffer-usage
flag. So functionally it is equivalent to using diag::warn_unsafe_buffer_operation
. I originally thought it is more specific but now I feel like using diag::warn_unsafe_buffer_operation
is more consistent and probably more readable. I will change it to diag::warn_unsafe_buffer_operation
, if this is your concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, consistently using diag::warn_unsafe_buffer_operation
sounds a more robust implementation rather than relying on different warnings for each individual set of warnings.
Nit: You might also want to have a helper function like IsSafeBufferMode
and use that instead to avoid mistakes and duplication.
clang/lib/Sema/SemaCast.cpp
Outdated
// In C++ Safe Buffers/Bounds Safety interop mode, the compiler | ||
// reports a warning under -Wunsafe-buffer-usage: | ||
DiagKind = diag::warn_unsafe_assign_to_null_terminated_pointer; | ||
isInvalid = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be isInvalid = false
because this is a warning. Then, it's controlled by -Werror if the compiler still wants to fail the build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it is a mistake!
clang/lib/Sema/SemaCast.cpp
Outdated
if (DestType->getAs<ValueTerminatedType>()) { | ||
Diag(E->getBeginLoc(), | ||
diag::warn_unsafe_named_cast_to_null_terminated_pointer); | ||
return ExprError(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, it's weird that it returns ExprError()
on a warning. It should either an error or just pass through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove it.
clang/lib/Sema/SemaCast.cpp
Outdated
@@ -576,6 +587,17 @@ Sema::BuildCXXNamedCast(SourceLocation OpLoc, tok::TokenKind Kind, | |||
|
|||
Op.checkQualifiedDestType(); | |||
|
|||
// In C++ Safe Buffers/Bounds Safety interop mode, forbidding for now all | |||
// named cast to __null_terminated pointer type. | |||
if (DestType->isPointerType() && LangOpts.hasBoundsSafetyAttributes() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean Self.LangOpts.isBoundsSafetyAttributeOnlyMode()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My general idea is that the interoperation work should be guarded by LangOpts.hasBoundsSafetyAttributes
combined with the testing of whether-Wunsafe-buffer-usage
is enabled for most places. We only use LangOpts.isBoundsSafetyAttributeOnlyMode()
to guard against the work that otherwise can affect -fbounds-safety
. Let me know if this is not the best way of thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LangOpts.hasBoundsSafetyAttributes
is enabled in both the annotation only mode and the full firebloom mode. I was asking because you may want an error and may be a different diagnostics in the firebloom mode, which we didn't design yet. The current way of doing it is still fine for this patch and we can adjust it later for C++ firebloom mode.
clang/lib/Sema/SemaCast.cpp
Outdated
// In C++ Safe Buffers/Bounds Safety interop mode, forbidding for now all | ||
// named cast to __null_terminated pointer type. | ||
if (DestType->isPointerType() && LangOpts.hasBoundsSafetyAttributes() && | ||
!Diags.isIgnored(diag::warn_unsafe_named_cast_to_null_terminated_pointer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, why it's guarded by this specific diag, not the unsafe buffer usage flag?
@@ -13603,6 +13603,19 @@ def note_unsafe_count_attributed_pointer_argument : Note< | |||
def warn_unsafe_single_pointer_argument : Warning< | |||
"unsafe assignment to function parameter of __single pointer type">, | |||
InGroup<UnsafeBufferUsage>, DefaultIgnore; | |||
def warn_unsafe_assign_to_null_terminated_pointer : Warning< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably almost a duplicate of an existing one warn_bounds_safety_incompatible_non_terminated_by_to_terminated_by
. I guess you introduced a new one because you want something in the UsafeBufferUsage
group? Okay, I see the diagnostic message at the end is different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warn_bounds_safety_incompatible_non_terminated_by_to_terminated_by
mentions "indexable" and "unsafe_forge", which do not exist in the interop mode. This is the main reason for having another diagnostic message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could move those "indexable" and "unsafe_forge" to %select
in warn_bounds_safety_incompatible_non_terminated_by_to_terminated_by
and not print them in interop.
- Make the diagnostic message simply saying "unsafe [conversion operation] of '__null_terminated' type"; further explanation is put in notes. - Use `warn_unsafe_buffer_operation` to consistently test if CXX Safe Buffer is enabled. Add an helper function for the test. - Remove the warning on using C++ named cast to obtain __null_terminated pointers. - Added tests for good cases of C-Style casts. - Consistently use isBoundsSafetyAttributeOnlyMode for code involving bounds attributes but not shared with full Bounds Safety mode.
17a8a92
to
453073f
Compare
clang/lib/Sema/SemaCast.cpp
Outdated
@@ -205,7 +205,7 @@ namespace { | |||
/// int *__bidi_indexable p1; | |||
/// int *__indexable p2 = (int *__indexable)p1; | |||
/// \endcode | |||
if (Self.getLangOpts().BoundsSafety) { | |||
if (Self.getLangOpts().hasBoundsSafetyAttributes()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We want to remove hasBoundsSafetyAttributes()
, since it is not needed anymore. Self.getLangOpts().hasBoundsSafetyAttributes()
-> Self.getLangOpts().BoundsSafetyAttributes
.
@@ -13603,6 +13603,19 @@ def note_unsafe_count_attributed_pointer_argument : Note< | |||
def warn_unsafe_single_pointer_argument : Warning< | |||
"unsafe assignment to function parameter of __single pointer type">, | |||
InGroup<UnsafeBufferUsage>, DefaultIgnore; | |||
def warn_unsafe_assign_to_null_terminated_pointer : Warning< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could move those "indexable" and "unsafe_forge" to %select
in warn_bounds_safety_incompatible_non_terminated_by_to_terminated_by
and not print them in interop.
clang/include/clang/Sema/Sema.h
Outdated
@@ -1560,6 +1560,11 @@ class Sema final : public SemaBase { | |||
/// be controlled with \#pragma clang abi_ptr_attr set(*ATTR*). Defaults to | |||
/// __single in user code and __unsafe_indexable in system headers. | |||
BoundsSafetyPointerAttributes CurPointerAbi; | |||
|
|||
/// \returns true iff `-Wunsafe-buffer-usage` is enabled for `Loc`. | |||
bool isCXXSafeBuffersEnabledAt(SourceLocation Loc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can this method be const
?
// match fails: | ||
ObjTypeII = RecordDecl->getIdentifier(); | ||
if (MethodDeclII && ObjTypeII && | ||
MethodDeclII->getName() == "c_str" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there a reason we support std::string::c_str()
but not std::string::data()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string::data()
is not null-termination guaranteed as I remember.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://en.cppreference.com/w/cpp/string/basic_string/c_str
c_str() and data() perform the same function. (since C++11)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, it's changed since C++11.
if (MethodDeclII && ObjTypeII && | ||
MethodDeclII->getName() == "c_str" && | ||
ObjTypeII->getName() == "basic_string") | ||
return Sema::Compatible; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work if LHS is terminated by a value different than 0, e.g. const char *__terminated_by(42)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Need to check it is 0-terminator.
clang/lib/Sema/SemaCast.cpp
Outdated
DiagKind = diag::warn_unsafe_assign_to_null_terminated_pointer; | ||
isInvalid = false; | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we don't check if the terminator is 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed it to re-use warn_bounds_safety_incompatible_non_terminated_by_to_terminated_by
. We do not need to check terminator there now.
- Change LangOpts.hasBoundsSafetyAttributes() to LangOpts.BoundsSafetyAttributes - Correct that std::string::c_str() should be not compatible with (non-0)-terminated pointers
Reuse the existing warn_bounds_safety_incompatible_non_terminated_by_to_terminated_by
I have addressed comments by:
@patrykstefanski @rapidsna @hnrklssn @dtarditi please take another look |
@swift-ci test |
clang/lib/Sema/SemaExpr.cpp
Outdated
@@ -20694,6 +20720,9 @@ bool Sema::DiagnoseAssignmentResult(AssignConvertType ConvTy, | |||
AssignedVarFixIts; | |||
VarDecl *BoundsSafetyIncompatNestedPtrLocalVarToFix = nullptr; | |||
FixItHint BoundsSafetyIncompatNestedPtrFixIt; | |||
bool InCXXBoundsSafetyMode = LangOpts.CPlusPlus && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This seems to be duplicated too in another Sema file. So it's another candidate to get a helper function that tales Loc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌
/// BoundsSafety: Such case should be handled as BoundsSafetyPointerCast. | ||
/// \code | ||
/// int *__bidi_indexable p1; | ||
/// int *__indexable p2 = (int *__indexable)p1; | ||
/// \endcode | ||
if (Self.getLangOpts().BoundsSafety) { | ||
if (Self.getLangOpts().BoundsSafetyAttributes) { | ||
unsigned DiagKind = 0; | ||
bool isInvalid = false; | ||
// The type error may be nested, so any pointer can result in VT errors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to assert that Self.getLangOpts().BoundsSafety || ConvertTy == Sema::Compatible || (ConvertTy == Sema::IncompatibleNonValueTerminatedToValueTerminatedPointer && InCXXBoundsSafetyMode)
? Especially since CheckValueTerminatedAssignmentConstraints
used to only run under BoundsSafety
, but now it runs under BoundsSafetyAttributes
even for C and in C++ without safe buffers mode. We really don't want it to accidentally start returning errors in these modes. Otherwise LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. But maybe the following is enough?
Self.getLangOpts().BoundsSafety || Self.isCXXSafeBuffersBoundsSafetyInteropEnabledAt(SrcExpr.get()->getBeginLoc())
,
where isCXXSafeBuffersBoundsSafetyInteropEnabledAt
is defined by
bool isCXXSafeBuffersBoundsSafetyInteropEnabledAt(SourceLocation Loc) const {
return LangOpts.CPlusPlus && LangOpts.isBoundsSafetyAttributeOnlyMode() &&
isCXXSafeBuffersEnabledAt(Loc);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works for me 👍
- Refactor interop mode check to a helper function - Guard the type checking for C-style cast so that it runs only in full bounds safety or interop only mode
…e interop mode (swiftlang#10060) For a __null_terminated pointer 'p', only a __null_terminated pointer, a string literal or a call to `std::string::c_str/data` can be assigned to (/passed to/used to initialize, etc.) 'p'. Otherwise, an error will be emitted. The implementation reuses Bounds Safety's diagnostics. (rdar://138798346)
…e interop mode (#10060) (#10103) For a __null_terminated pointer 'p', only a __null_terminated pointer, a string literal or a call to `std::string::c_str/data` can be assigned to (/passed to/used to initialize, etc.) 'p'. Otherwise, an error will be emitted. The implementation reuses Bounds Safety's diagnostics. (rdar://138798346)
For a __null_terminated pointer 'p', only a __null_terminated pointer, a string literal or a call to
std::string::c_str
can be assigned to (/passed to/used to initialize) 'p'. Otherwise, in the interop mode, an -Wunsafe-buffer-usage warning will be reported.Similar to how __counted_by is handled, incompatibility in nested __null_terminated pointers (e.g., input/output pointer type arguments) is reported as Bounds Safety type checking errors.
(rdar://138798346)