Skip to content

Avoid losing the high bits of bridged constant tagged pointers #31110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 23, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 49 additions & 17 deletions stdlib/public/core/StringBridge.swift
Original file line number Diff line number Diff line change
Expand Up @@ -413,10 +413,42 @@ private func _getCocoaStringPointer(
return .none
}

#if arch(arm64)
//11000000..payload..111
private var constantTagMask:UInt {
0b1111_1111_1000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0111
}
private var expectedConstantTagValue:UInt {
0b1100_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0111
}
#endif

@inline(__always)
private func formConstantTaggedCocoaString(
untaggedCocoa: _CocoaString
) -> AnyObject? {
#if !arch(arm64)
return nil
#else

let constantPtr:UnsafeRawPointer = Builtin.reinterpretCast(untaggedCocoa)

// Check if what we're pointing to is actually a valid tagged constant
guard _swift_stdlib_dyld_is_objc_constant_string(constantPtr) == 1 else {
return nil
}

let retaggedPointer = UInt(bitPattern: constantPtr) | expectedConstantTagValue

return unsafeBitCast(retaggedPointer, to: AnyObject.self)
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still seems a little sketchy without at least a defer { _fixLifetime(untaggedCocoa) }. Make sure to check with @atrick or @eeckstein.

In practice, would ref counting always be disabled here? How are the retain/release counts balanced?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These objects are in immutable memory, so can't be refcounted

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}

@inline(__always)
private func getConstantTaggedCocoaContents(_ cocoaString: _CocoaString) ->
(utf16Length: Int, asciiContentsPointer: UnsafePointer<UInt8>?)? {
(utf16Length: Int,
asciiContentsPointer: UnsafePointer<UInt8>,
untaggedCocoa: _CocoaString)? {
#if !arch(arm64)
return nil
#else
Expand All @@ -427,34 +459,33 @@ private func getConstantTaggedCocoaContents(_ cocoaString: _CocoaString) ->

let taggedValue = unsafeBitCast(cocoaString, to: UInt.self)

//11000000..payload..111
let tagMask:UInt =
0b1111_1111_1000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0111
let expectedValue:UInt =
0b1100_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0111

guard taggedValue & tagMask == expectedValue else {
return nil
}

guard _swift_stdlib_dyld_is_objc_constant_string(
cocoaString as! UnsafeRawPointer
) == 1 else {
guard taggedValue & constantTagMask == expectedConstantTagValue else {
return nil
}

let payloadMask = ~tagMask
let payloadMask = ~constantTagMask
let payload = taggedValue & payloadMask
let ivarPointer = UnsafePointer<_swift_shims_builtin_CFString>(
bitPattern: payload
)!

guard _swift_stdlib_dyld_is_objc_constant_string(
unsafeBitCast(ivarPointer, to: UnsafeRawPointer.self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if going through ObjectIdentifier is more sanctioned. @lorentey , thoughts?

) == 1 else {
return nil
}

let length = ivarPointer.pointee.length
let isUTF16Mask:UInt = 0x0000_0000_0000_0004 //CFStringFlags bit 4: isUnicode
let isASCII = ivarPointer.pointee.flags & isUTF16Mask == 0
precondition(isASCII) // we don't currently support non-ASCII here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should be the behavior if this check fails? Should we trap in the stdlib?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. It's pretty much "assert the world is as we expected"

let contentsPtr = ivarPointer.pointee.str
return (
utf16Length: Int(length),
asciiContentsPointer: isASCII ? contentsPtr : nil
asciiContentsPointer: contentsPtr,
untaggedCocoa: Builtin.reinterpretCast(ivarPointer)
)
#endif
}
Expand All @@ -476,9 +507,9 @@ internal func _bridgeCocoaString(_ cocoaString: _CocoaString) -> _StringGuts {
case .constantTagged:
let taggedContents = getConstantTaggedCocoaContents(cocoaString)!
return _StringGuts(
cocoa: cocoaString,
cocoa: taggedContents.untaggedCocoa,
providesFastUTF8: false, //TODO: if contentsPtr is UTF8 compatible, use it
isASCII: taggedContents.asciiContentsPointer != nil,
isASCII: true,
length: taggedContents.utf16Length
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what we want, or do we want to bring it in as an immortal string? Have you tested that ref counting and message sending Swift-side works on that untagged object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to, but I'm trying to be conservative about the changes initially both to unblock people and to reduce risk. This limits the changes to just the bridging paths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and yes, that's been tested, it's an ordinary NSCFConstantString)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They'd also have be null-terminated.

)
#endif
Expand Down Expand Up @@ -590,7 +621,8 @@ extension String {

_internalInvariant(_guts._object.hasObjCBridgeableObject,
"Unknown non-bridgeable object case")
return _guts._object.objCBridgeableObject
let result = _guts._object.objCBridgeableObject
return formConstantTaggedCocoaString(untaggedCocoa: result) ?? result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we'd be checking a spare perf flag on the second word instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think that's ultimately where we want to end up.

}
}

Expand Down
1 change: 1 addition & 0 deletions stdlib/public/stubs/FoundationHelpers.mm
Original file line number Diff line number Diff line change
Expand Up @@ -112,6 +112,7 @@ typedef __swift_uint8_t (*getCStringImplPtr)(id,

__swift_uint8_t
swift::_swift_stdlib_dyld_is_objc_constant_string(const void *addr) {
initializeBridgingFunctions();
if (!dyld_is_objc_constant) return false;
return dyld_is_objc_constant(dyld_objc_string_kind, addr) ? 1 : 0;
}
Expand Down