Skip to content

Added basic implementation for NSString contentsOfURL:usedEncoding: #893

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 16, 2017

Conversation

mohitathwani
Copy link
Contributor

@mohitathwani mohitathwani commented Feb 23, 2017

In NSString, for contentsOfURL:usedEncoding: checking the BOM of the received data and determining whether encoding to be used should be UTF 16 BE or UTF 16 LE

enc?.pointee = String.Encoding.utf8.rawValue
}

guard let cf = CFStringCreateWithBytes(kCFAllocatorDefault, bytePtr, readResult.length, CFStringConvertNSStringEncodingToEncoding(enc!.pointee), true) else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has a guard here, but uses enc! in the dereferencing of the pointer. It would make sense to have let enc = enc, cf = ... enc, so that if enc isn't passed then we don't blow up the process.

@@ -66,7 +66,8 @@ class TestNSString : XCTestCase {
("test_rangeOfCharacterFromSet", test_rangeOfCharacterFromSet ),
("test_CFStringCreateMutableCopy", test_CFStringCreateMutableCopy),
("test_FromContentsOfURL",test_FromContentsOfURL),
("test_FromContentOfFile",test_FromContentOfFile),
("test_FromContentsOfURLUsedEncodingUTF16BE", test_FromContentsOfURLUsedEncodingUTF16BE),
("test_FromContentOfFile",test_FromContentOfFile),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation looks off by one character on this line.

@parkera
Copy link
Contributor

parkera commented Feb 28, 2017

@swift-ci please test

In NSString, for contentsOfURL:usedEncoding: checking the BOM of the received data and determinig whether encoding to be used should be UTF 16 BE

Fixed guard statement to handle nil case

Fixed indentation in test case

Updated build.py file to include test input file

Fixed Indentation in build.py
@mohitathwani
Copy link
Contributor Author

@swift-ci please test

1 similar comment
@parkera
Copy link
Contributor

parkera commented Mar 5, 2017

@swift-ci please test

@zmeyc
Copy link

zmeyc commented Mar 5, 2017

 +        let bytePtr = readResult.bytes.bindMemory(to: UInt8.self, capacity:readResult.length)
 +        if bytePtr[0] == 254 && bytePtr[1] == 255 {
 +          enc?.pointee = String.Encoding.utf16BigEndian.rawValue
 +        }

If readResult.length is less than 2, unitialized memory will be referenced.

And a possible memory leak: cf is created by CFStringCreateWithBytes, but not released when exitting on errors. AFAIK ARC doesn't handle this?

@mohitathwani
Copy link
Contributor Author

@zmeyc I thought about that too but there is an else which sets the default encoding to UTF8 and proceed to try decoding the file which could be less than 2 bytes with UTF 8 encoding.

@zmeyc
Copy link

zmeyc commented Mar 5, 2017

@mohitathwani Referencing unitialized memory is undefined behavior, it will trigger tools like valgrind, unitialized memory can contain BOM with wrong byte order etc. I propose checking the length before analyzing these two bytes for BOM.

@mohitathwani
Copy link
Contributor Author

@zmeyc Updated the code with your suggestion.
@parkera could you please trigger a test?

@parkera
Copy link
Contributor

parkera commented Mar 15, 2017

@swift-ci please test

@mohitathwani
Copy link
Contributor Author

Thanks @parkera ! Do you think this is merge worthy?

@parkera parkera merged commit 6ec4054 into swiftlang:master Mar 16, 2017
@parkera
Copy link
Contributor

parkera commented Mar 16, 2017

Looks good, thanks!

mohitathwani pushed a commit to mohitathwani/swift-corelibs-foundation that referenced this pull request Apr 2, 2017
…L:usedEncoding:

This pull request is in continuation to PR swiftlang#893
mohitathwani pushed a commit to mohitathwani/swift-corelibs-foundation that referenced this pull request Apr 9, 2017
…L:usedEncoding:

This pull request is in continuation to PR swiftlang#893
mohitathwani pushed a commit to mohitathwani/swift-corelibs-foundation that referenced this pull request Apr 9, 2017
…L:usedEncoding:

This pull request is in continuation to PR swiftlang#893
mohitathwani pushed a commit to mohitathwani/swift-corelibs-foundation that referenced this pull request Apr 9, 2017
…L:usedEncoding:

This pull request is in continuation to PR swiftlang#893
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants