Unicode × Linux #2368

Closed
wants to merge 8 commits into from
31 changes: 30 additions & 1 deletion Fixtures/Miscellaneous/Unicode/Package.swift
@@ -34,11 +34,40 @@ let package = Package(
.target(
name: complicatedString,
dependencies: [.product(name: "UnicodeDependency‐\(complicatedString)")]),
.target(
name: "C" + complicatedString),
.target(
name: complicatedString + "‐tool",
dependencies: [.target(name: complicatedString)]),
.testTarget(
name: complicatedString + "Tests",
- dependencies: [.target(name: complicatedString)]),
+ dependencies: [
+ .target(name: complicatedString),
+ .target(name: "C" + complicatedString)
+ ]),
]
)

// This section is separate on purpose.
// If the directory turns out to be illegal on a platform (Windows?),
// it can easily be removed with “#if !os(...)” and the rest of the test will still work.
let equivalentToASCII = "\u{037E}" // ερωτηματικό (greek question mark)
let ascii = "\u{3B}" // semicolon
// What follows is a nasty hack that requires sandboxing to be disabled. (--disable-sandbox)
// The target it creates can exist in this form on Linux and other platforms,
// but as soon as it is checked out on macOS, the macOS filesystem obliterates the distinction,
// leaving the test meaningless.
// Since much development of the SwiftPM repository occurs on macOS,
// maintaining the integrity of the test fixture requires regenerating this part of it each time.
import Foundation
let manifestURL = URL(fileURLWithPath: #file)
let packageRoot = manifestURL.deletingLastPathComponent()
let targetURL = packageRoot
.appendingPathComponent("Sources")
.appendingPathComponent(equivalentToASCII)
let sourceURL = targetURL.appendingPathComponent("Source.swift")
try FileManager.default.createDirectory(at: targetURL, withIntermediateDirectories: true)
try Data().write(to: targetURL.appendingPathComponent("\(equivalentToASCII).swift"))
package.targets.append(.target(name: ascii))
let tests = package.targets.first(where: { $0.type == .test })!
tests.dependencies.append(.target(name: ascii))
2 changes: 2 additions & 0 deletions Fixtures/Miscellaneous/Unicode/README.md
@@ -3,3 +3,5 @@
This fixture makes extensive use of exotic Unicode. While deliberately trying to break as many common false assumptions as possible, *this is a valid package*, and clients are encouraged to test their functionality with it. A tool that successfully handles this package is unlikely to encounter problems with any real‐world package in any human language.

The neighbouring package `UnicodeDependency‐πשּׁµ𝄞🇺🇳🇮🇱x̱̱̱̱̱̄̄̄̄̄` must be placed next this package and tagged with version 1.0.0. (This is necessary to use Unicode in dependency URLs in this package’s manifest.)

Sandboxing likely needs to be disabled to load this package (`--disable-sandbox`). The latter part of the manifest writes into the `Sources` directory. See the end of the manifest for an explanation. (To temporarily make the fixture compatible with a sandbox for some reason, the latter part of the manifest can be commented out.)
@@ -0,0 +1,4 @@
module Cπשּׁµ𝄞🇺🇳🇮🇱x̱̱̱̱̱̄̄̄̄̄ {
header "Cπשּׁµ𝄞🇺🇳🇮🇱x̱̱̱̱̱̄̄̄̄̄.h"
export *
}
3 changes: 2 additions & 1 deletion Sources/Build/ToolProtocol.swift
@@ -169,7 +169,8 @@ struct SwiftCompilerTool: ToolProtocol {
stream <<< " other-args: "
<<< Format.asJSON(target.compileArguments()) <<< "\n"
stream <<< " sources: "
- <<< Format.asJSON(target.target.sources.paths.map{$0.pathString}) <<< "\n"
+ <<< Format.asJSON(target.target.sources.paths
+ .map{ localFileSystem.resolveUnicode($0).pathString }) <<< "\n"
stream <<< " is-library: "
<<< Format.asJSON(target.target.type == .library || target.target.type == .test) <<< "\n"
stream <<< " enable-whole-module-optimization: "
17 changes: 11 additions & 6 deletions Sources/Build/llbuild.swift
@@ -181,17 +181,19 @@ public final class LLBuildManifestGenerator {

for package in plan.graph.rootPackages {
let directoryStructureInputs = package.targets.map {
- $0.sources.root.pathString + "/"
+ localFileSystem.resolveUnicode($0.sources.root).pathString + "/"
}.sorted()
self.nodes += directoryStructureInputs.map{ Node(value: $0, isDirectoryStructure: true) }

inputs = directoryStructureInputs

// FIXME: Need to handle version-specific manifests.
- inputs += [package.manifest.path.pathString]
+ inputs += [localFileSystem.resolveUnicode(package.manifest.path).pathString]

// FIXME: This won't be the location of Package.resolved for multiroot packages.
- inputs += [package.path.appending(component: "Package.resolved").pathString]
+ inputs += [
+ localFileSystem.resolveUnicode(package.path.appending(component: "Package.resolved")).pathString
+ ]

// FIXME: Add config file as an input
}
@@ -240,7 +242,7 @@ public final class LLBuildManifestGenerator {
private func createSwiftCompileTarget(_ target: SwiftTargetBuildDescription) -> Target {
// Compute inital inputs.
var inputs = SortedArray<String>()
- inputs += target.target.sources.paths.map({ $0.pathString })
+ inputs += target.target.sources.paths.map({ localFileSystem.resolveUnicode($0).pathString })

func addStaticTargetInputs(_ target: ResolvedTarget) {
// Ignore C Modules.
@@ -363,10 +365,13 @@ public final class LLBuildManifestGenerator {
}
}

- args += ["-c", path.source.pathString, "-o", path.object.pathString]
+ args += [
+ "-c", localFileSystem.resolveUnicode(path.source).pathString,
+ "-o", path.object.pathString
+ ]
let clang = ClangTool(
desc: "Compiling \(target.target.name) \(path.filename)",
- inputs: externalDependencies + [path.source.pathString],
+ inputs: externalDependencies + [localFileSystem.resolveUnicode(path.source).pathString],
outputs: [path.object.pathString],
args: [try plan.buildParameters.toolchain.getClangCompiler().pathString] + args,
deps: path.deps.pathString)
91 changes: 74 additions & 17 deletions Sources/TSCBasic/FileSystem.swift
@@ -184,6 +184,16 @@ public protocol FileSystem: class {
///
/// The method throws if the underlying stat call fails.
func getFileInfo(_ path: AbsolutePath) throws -> FileInfo

/// Returns the path in the normalization form (or lack thereof) actually present.
/// Any path components without an extant match are left unaltered.
///
/// On file systems that natively consider Unicode‐equivalent paths to be equal,
/// the returned value need not represent the actual scalars present.
/// It is sufficient if whatever scalars are returned
/// *will be considered equal to the real ones by the file system*
/// when future file system calls are made.*
func resolveUnicode(_ path: AbsolutePath) -> AbsolutePath
Member

Ideally, a "unicode resolved" path would be reflected in the type system as such. Have you thought about ways to present this distinction?

Otherwise, be sure to liberally assert anywhere assuming a resolved path is in fact resolved (if it comes up). I like as many checks as possible to guard against the omission of an API call, especially as code evolves over multiple contributors.

Example:

extension FileSystem {
  internal func _assertUnicodeResolved(_ path: AbsolutePath) {
#if os(macOS)
    return
#else
    assert(path == self.resolveUnicode(path))
#endif
  }
}

which can be sprinkled about as needed. If the idea is that all AbsolutePath-receiving APIs on this protocol have to be resolved, then a separate (or phantom) type becomes more appealing. Another option is to have the protocol requirements be something like uncheckedGetFileInfo(_:) and have getFileInfo(_:) be in an extension that runs this assert before calling uncheckedGetFileInfo(_:).
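
The "unchecked requirement plus checked wrapper" idea can be sketched with a toy, in-memory file system. Everything named here (`ToyFileSystem`, `InMemoryFileSystem`, `uncheckedExists`, `isResolved`) is hypothetical and only stands in for SwiftPM's real types. One assumption worth flagging: the check compares Unicode scalars rather than using `==`, because Swift's `String` equality treats canonically equivalent spellings as equal and so would never detect an unresolved name.

```swift
// Hypothetical stand-ins for illustration; not SwiftPM's actual API.
protocol ToyFileSystem {
    /// Maps a name to the spelling actually stored, if any.
    func resolveUnicode(_ name: String) -> String
    /// Scalar-exact lookup; assumes the caller already resolved the name.
    func uncheckedExists(_ name: String) -> Bool
}

extension ToyFileSystem {
    /// True when `name` already uses the stored scalars.
    func isResolved(_ name: String) -> Bool {
        return name.unicodeScalars.elementsEqual(resolveUnicode(name).unicodeScalars)
    }

    /// Checked wrapper: asserts in debug builds, then forwards.
    func exists(_ name: String) -> Bool {
        assert(isResolved(name), "caller skipped resolveUnicode(_:)")
        return uncheckedExists(name)
    }
}

struct InMemoryFileSystem: ToyFileSystem {
    let entries: [String]

    func resolveUnicode(_ name: String) -> String {
        // String == uses canonical equivalence, so any spelling matches.
        return entries.first(where: { $0 == name }) ?? name
    }

    func uncheckedExists(_ name: String) -> Bool {
        return entries.contains(where: {
            $0.unicodeScalars.elementsEqual(name.unicodeScalars)
        })
    }
}

let toyFS = InMemoryFileSystem(entries: ["e\u{301}"]) // stored decomposed
let resolvedName = toyFS.resolveUnicode("\u{E9}")      // requested precomposed
print(toyFS.isResolved("\u{E9}"))   // false
print(toyFS.exists(resolvedName))   // true
```

The sketch also makes the cost visible: `isResolved` has to repeat the whole lookup in order to check it.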

Member

Also, if this is about putting the path into the conformer's preferred normal form, then this might be better called normalizeUnicode or something, unless normalization is the only kind of resolution imaginable for file systems (it might be).

Contributor Author

@SDGGiesbrecht SDGGiesbrecht Oct 7, 2019

Thanks for taking a look, @milseman. All very good questions.

> Is canonical equivalence what you need for all platforms, or do you need compatibility equivalence? If you're trying to define a new model for SPM, then I agree with canonical equivalence. If you're trying to match an existing system's (legacy) model, then you might need compatibility equivalence.

I’m not aiming at supporting old code pages. (If someone else wants to extend it for that, I leave it up to them.) The aim here is about resolving the discrepancy between path components entered in the manifest, which use Swift String’s Unicode equivalence, and the file paths stored by the platform, which outside of macOS do not attempt to handle Unicode equivalence. A manifest should be equally loadable on every platform. Right now you get “file not found” errors when you go back and forth between platforms.
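
A minimal illustration of that discrepancy, using only the standard library: `String` comparison follows canonical equivalence, while the bytes that a Unicode‐naïve file system stores and compares do not. The variable names are illustrative; the last line uses the same scalar pair as the fixture manifest.

```swift
// Two spellings of "é": precomposed (NFC) and decomposed (NFD).
let precomposed = "\u{E9}"
let decomposed = "e\u{301}"

// Swift compares strings by canonical equivalence...
print(precomposed == decomposed)                          // true
// ...but the UTF-8 bytes a Unicode-naive file system sees differ.
print(Array(precomposed.utf8) == Array(decomposed.utf8))  // false

// The fixture's pair: Greek question mark vs. ASCII semicolon.
print("\u{037E}" == "\u{3B}")                             // true
```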

> Tangent: what is the story for unpaired surrogates in file names, which can happen on Windows?

I have not thought about them and I have done nothing special to handle them. Whatever String does ought to be fine, at least for now. In contrast to what this aims to fix, unpaired surrogates would indicate that the string itself is already “broken” to begin with.

> Ideally, a "unicode resolved" path would be reflected in the type system as such. Have you thought about ways to present this distinction?

Initially I wanted to do it that way. Unfortunately, I realized that is itself unsafe, because the “resolvedness” of a path can be invalidated over time as file operations occur, some of which are hard for SwiftPM to track (such as Git). Since resolution must be redone before each operation anyway, the type‐system doesn’t actually help us much.

> Otherwise, be sure to liberally assert anywhere assuming a resolved path is in fact resolved (if it comes up).

The assertion would have to do all the same work to check whether the resolution is valid. (And we cannot just record a flag in AbsolutePath for the same reason a separate type would not work.) Calling resolveUnicode(_:) at the start of each LocalFileSystem method effectively accomplishes the same safety otherwise afforded by assert.

> Another option is to have the protocol requirements be something like uncheckedGetFileInfo(_:) and have getFileInfo(_:) be in an extension that runs this assert before calling uncheckedGetFileInfo(_:).

The two private underscored methods are basically what you mean by unchecked. They only exist because they need a variant callable from within resolveUnicode(_:) that doesn’t wind up circular. Nothing else should be using them.

> Also, if this is about putting the path into the conformer's preferred normal form, then this might be better called normalizeUnicode or something, unless normalization is the only kind of resolution imaginable for file systems (it might be).

Normalization‐related stuff is the only resolution I am imagining here. However, it is actually not normalization, but the reverse. On platforms besides macOS, the file on disk may have a name that is neither NFD nor NFC, and we still need to be able to find it.
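
The "reverse of normalization" can be sketched as a lookup over the names that actually exist, returning whichever stored spelling is canonically equivalent to the requested one. This is a simplified, in-memory approximation: `resolveComponent` is a hypothetical helper, and the entry list stands in for a real directory listing.

```swift
/// Returns the stored spelling of `requested` if a canonically equivalent
/// entry exists; otherwise returns `requested` unchanged.
/// (Hypothetical helper, for illustration only.)
func resolveComponent(_ requested: String, among entries: [String]) -> String {
    return entries.first(where: { $0 == requested }) ?? requested
}

let storedName = "e\u{301}\u{E9}x\u{304}\u{331}"   // neither NFC nor NFD
let requestedName = "\u{E9}\u{E9}x\u{331}\u{304}"  // NFC, as typed in a manifest
let found = resolveComponent(requestedName, among: ["unrelated", storedName])

// The result carries the scalars that are actually stored.
print(found.unicodeScalars.elementsEqual(storedName.unicodeScalars))     // true
print(found.unicodeScalars.elementsEqual(requestedName.unicodeScalars))  // false
```

Normalizing the request to NFC or NFD would not help here, since the stored name is in neither form.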

}

/// Convenience implementations (default arguments aren't permitted in protocol
@@ -224,41 +234,78 @@ public extension FileSystem {
func getFileInfo(_ path: AbsolutePath) throws -> FileInfo {
Member

Should the default implementations above assert that the path is resolved?

Contributor Author

See the other comment about types and assertions.

fatalError("This file system currently doesn't support this method")
}

func resolveUnicode(_ path: AbsolutePath) -> AbsolutePath {
return path
}
}

/// Concrete FileSystem implementation which communicates with the local file system.
private class LocalFileSystem: FileSystem {

func resolveUnicode(_ path: AbsolutePath) -> AbsolutePath {
#if os(macOS)
// The macOS file system enforces NFD.
// It can find everything without any help.
return path
Member

Is this contingent on APFS?

Contributor Author

I have no idea. File systems largely aren’t my area of expertise. Based on this note on the Wikipedia article for APFS, I suspect you are at least generally right that my #if os(macOS) shortcut here uses the wrong check and is not always valid. I have absolutely no idea if there is a more accurate check, or if the shortcut needs to be dropped.

#else
Contributor

Hmm, this will tank the performance on non-macOS platforms. Can you check how corelibs and llvm::path handle these? I wonder if there's an API in corelibs that we can leverage here.

Contributor Author

I think corelibs is just as broken in this regard, since most uses here are immediately followed by calls to FileManager, whose failures clued me into this in the first place.

I don’t find anything relevant in corelibs when searching for “Unicode”, or when perusing the FileManager and URL sources. Maybe someone from that team would know better? @millenomi?

I find nothing related in llvm::sys::path or llvm::sys::unicode either.

Those lower levels may simply not care. The package manager is in an unusual situation, in that it looks up paths that it neither received from the file system nor placed there itself (and also in that it cannot defer this responsibility to a higher‐level client). This is the gap in which lookup becomes necessary. SwiftPM’s unique case is compounded by the fact that the same string literal in a manifest may need to match differing byte representations when checked out on different devices, so even telling users to be explicit with their literals ("\u{...}\u{...}\u{...}") cannot work around it.

Contributor

The intent for corelibs is to mimic Darwin. On Darwin, normalization occurs at the filesystem access level.

These should likely be contributions to FileManager to supplement its .fileSystemRepresentation… method. FileManager should accept any path as input and normalize it internally.

Contributor

i.e. this is the wrong place; we should make FileManager better.

Contributor

Yeah, I agree that this should be solved at the FileManager level.

Contributor Author

@SDGGiesbrecht SDGGiesbrecht Oct 7, 2019

> we should make FileManager better.

This is good to hear.

Unfortunately, my Linux machine is not large enough to develop corelibs-foundation. I can just barely manage to test SwiftPM because it works in isolation, without any of the other swift-... repositories. So that probably means someone else will have to be the one to sink it from here.

fileSystemRepresentation(withPath:) looks like it intends to do the same thing, but at the moment it appears to basically just forward to CFStringGetFileSystemRepresentation, which in turn currently does nothing outside of macOS. I don’t know if that is the right place for it to live. On macOS it basically only needs to decompose the string, but on other platforms it must actually look things up in the file system.

Contributor

I haven't checked, but maybe it's possible to develop corelibs-foundation as a standalone project if you install a toolchain on Linux?

if _exists(path, followSymlink: true) {
return path
} else {
// Search for any neighbours with Unicode‐equivalent names
// in order to be resilient against Unicode‐naïveté in the file system.
let parent = resolveUnicode(path.parentDirectory)
let neighbours = (try? _getDirectoryContents(parent)) ?? []
if let equivalent = neighbours.first(where: { $0 == path.basename }) {
// Return the extant equivalent.
return parent.appending(component: equivalent)
} else {
// Nothing found; return unaltered (but still using resolved parent).
return parent.appending(component: path.basename)
}
}
#endif
}

func isExecutableFile(_ path: AbsolutePath) -> Bool {
let path = resolveUnicode(path)
// Our semantics doesn't consider directories.
return (self.isFile(path) || self.isSymlink(path)) && FileManager.default.isExecutableFile(atPath: path.pathString)
}

- func exists(_ path: AbsolutePath, followSymlink: Bool) -> Bool {
+ /// This method is Unicode‐naïve.
Member

Good indication that an assert would be nice; at the very least it is documentation about what you mean by "naive", as in it assumes the caller handled such details.

Contributor Author

See the other comment about types and assertions.

It is private. The documentation was a brief note to other contributors about why it is distinct from the neighbouring method. I could elaborate it with more of what I said in the other comment if any of you think it would be helpful.

private func _exists(_ path: AbsolutePath, followSymlink: Bool) -> Bool {
if followSymlink {
return FileManager.default.fileExists(atPath: path.pathString)
}
return (try? FileManager.default.attributesOfItem(atPath: path.pathString)) != nil
}
func exists(_ path: AbsolutePath, followSymlink: Bool) -> Bool {
let path = resolveUnicode(path)
return _exists(path, followSymlink: followSymlink)
}

func isDirectory(_ path: AbsolutePath) -> Bool {
let path = resolveUnicode(path)
var isDirectory: ObjCBool = false
let exists: Bool = FileManager.default.fileExists(atPath: path.pathString, isDirectory: &isDirectory)
return exists && isDirectory.boolValue
}

func isFile(_ path: AbsolutePath) -> Bool {
- let path = resolveSymlinks(path)
+ let path = resolveUnicode(resolveSymlinks(resolveUnicode(path)))
let attrs = try? FileManager.default.attributesOfItem(atPath: path.pathString)
return attrs?[.type] as? FileAttributeType == .typeRegular
}

func isSymlink(_ path: AbsolutePath) -> Bool {
let path = resolveUnicode(path)
let attrs = try? FileManager.default.attributesOfItem(atPath: path.pathString)
return attrs?[.type] as? FileAttributeType == .typeSymbolicLink
}

func getFileInfo(_ path: AbsolutePath) throws -> FileInfo {
let path = resolveUnicode(path)
let attrs = try FileManager.default.attributesOfItem(atPath: path.pathString)
return FileInfo(attrs)
}
@@ -272,32 +319,38 @@ private class LocalFileSystem: FileSystem {
return AbsolutePath(NSHomeDirectory())
}

/// This method is Unicode‐naïve.
func _getDirectoryContents(_ path: AbsolutePath) throws -> [String] {
#if canImport(Darwin)
return try FileManager.default.contentsOfDirectory(atPath: path.pathString)
#else
do {
return try FileManager.default.contentsOfDirectory(atPath: path.pathString)
} catch let error as NSError {
// Fixup error from corelibs-foundation.
if error.code == CocoaError.fileReadNoSuchFile.rawValue, !error.userInfo.keys.contains(NSLocalizedDescriptionKey) {
var userInfo = error.userInfo
userInfo[NSLocalizedDescriptionKey] = "The folder “\(path.basename)” doesn’t exist."
throw NSError(domain: error.domain, code: error.code, userInfo: userInfo)
}
throw error
}
#endif
}
func getDirectoryContents(_ path: AbsolutePath) throws -> [String] {
- #if canImport(Darwin)
- return try FileManager.default.contentsOfDirectory(atPath: path.pathString)
- #else
- do {
- return try FileManager.default.contentsOfDirectory(atPath: path.pathString)
- } catch let error as NSError {
- // Fixup error from corelibs-foundation.
- if error.code == CocoaError.fileReadNoSuchFile.rawValue, !error.userInfo.keys.contains(NSLocalizedDescriptionKey) {
- var userInfo = error.userInfo
- userInfo[NSLocalizedDescriptionKey] = "The folder “\(path.basename)” doesn’t exist."
- throw NSError(domain: error.domain, code: error.code, userInfo: userInfo)
- }
- throw error
- }
- #endif
+ return try _getDirectoryContents(resolveUnicode(path))
}

func createDirectory(_ path: AbsolutePath, recursive: Bool) throws {
let path = resolveUnicode(path)
// Don't fail if path is already a directory.
if isDirectory(path) { return }

try FileManager.default.createDirectory(atPath: path.pathString, withIntermediateDirectories: recursive, attributes: [:])
}

func readFileContents(_ path: AbsolutePath) throws -> ByteString {
let path = resolveUnicode(path)
// Open the file.
let fp = fopen(path.pathString, "rb")
if fp == nil {
@@ -327,6 +380,7 @@ private class LocalFileSystem: FileSystem {
}

func writeFileContents(_ path: AbsolutePath, bytes: ByteString) throws {
let path = resolveUnicode(path)
// Open the file.
let fp = fopen(path.pathString, "wb")
if fp == nil {
@@ -350,6 +404,7 @@ private class LocalFileSystem: FileSystem {
}

func writeFileContents(_ path: AbsolutePath, bytes: ByteString, atomically: Bool) throws {
let path = resolveUnicode(path)
// Perform non-atomic writes using the fast path.
if !atomically {
return try writeFileContents(path, bytes: bytes)
@@ -369,12 +424,14 @@ private class LocalFileSystem: FileSystem {
}

func removeFileTree(_ path: AbsolutePath) throws {
let path = resolveUnicode(path)
if self.exists(path, followSymlink: false) {
try FileManager.default.removeItem(atPath: path.pathString)
}
}

func chmod(_ mode: FileMode, path: AbsolutePath, options: Set<FileMode.Option>) throws {
let path = resolveUnicode(path)
guard exists(path) else { return }
func setMode(path: String) throws {
let attrs = try FileManager.default.attributesOfItem(atPath: path)
12 changes: 7 additions & 5 deletions Tests/FunctionalTests/MiscellaneousTests.swift
@@ -439,7 +439,6 @@ class MiscellaneousTestCase: XCTestCase {
}

func testUnicode() {
- #if !os(Linux) // TODO: - Linux has trouble with this and needs investigation.
fixture(name: "Miscellaneous/Unicode") { prefix in
// See the fixture manifest for an explanation of this string.
let complicatedString = "πשּׁµ𝄞🇺🇳🇮🇱x̱̱̱̱̱̄̄̄̄̄"
@@ -453,7 +452,8 @@ class MiscellaneousTestCase: XCTestCase {
let dependencyOrigin = AbsolutePath(#file).parentDirectory.parentDirectory.parentDirectory
.appending(component: "Fixtures")
.appending(component: "Miscellaneous")
- .appending(component: dependencyName)
+ // Fixture originates from macOS; directory name is in NFD:
+ .appending(component: dependencyName.decomposedStringWithCanonicalMapping)
let dependencyDestination = prefix.parentDirectory.appending(component: dependencyName)
try? FileManager.default.removeItem(atPath: dependencyDestination.pathString)
defer { try? FileManager.default.removeItem(atPath: dependencyDestination.pathString) }
@@ -468,9 +468,11 @@ class MiscellaneousTestCase: XCTestCase {
// •••••

// Attempt several operations.
- try SwiftPMProduct.SwiftTest.execute([], packagePath: prefix)
- try SwiftPMProduct.SwiftRun.execute([complicatedString + "‐tool"], packagePath: prefix)
+ // (An explanation for the sandbox disabling can be found at the bottom of the fixture manifest.)
+ try SwiftPMProduct.SwiftTest.execute(["--disable-sandbox"], packagePath: prefix)
+ try SwiftPMProduct.SwiftRun.execute([
+ "--disable-sandbox", complicatedString + "‐tool"
+ ], packagePath: prefix)
}
- #endif
}
}
34 changes: 33 additions & 1 deletion Tests/TSCBasicTests/FileSystemTests.swift
@@ -73,13 +73,45 @@ class FileSystemTests: XCTestCase {
_ = try fs.getDirectoryContents(AbsolutePath("/does-not-exist"))
XCTFail("Unexpected success")
} catch {
- XCTAssertEqual(error.localizedDescription, "The folder “does-not-exist” doesn’t exist.")
+ XCTAssertEqual(error._domain, "NSCocoaErrorDomain")
+ XCTAssertEqual(error._code, 260)
+ XCTAssert(error.localizedDescription.contains("does-not-exist"))
+ // English description: “The folder “does-not-exist” doesn’t exist.”
}

let thisDirectoryContents = try! fs.getDirectoryContents(AbsolutePath(#file).parentDirectory)
XCTAssertTrue(!thisDirectoryContents.contains(where: { $0 == "." }))
XCTAssertTrue(!thisDirectoryContents.contains(where: { $0 == ".." }))
XCTAssertTrue(thisDirectoryContents.contains(where: { $0 == AbsolutePath(#file).basename }))

// Unicode
let willyNillyString = "e\u{301}\u{E9}x\u{304}\u{331}" // Neither NFC nor NFD.
let nfdString = "e\u{301}e\u{301}x\u{331}\u{304}"
let nfcString = "\u{E9}\u{E9}x\u{331}\u{304}"
#if !os(Linux) // TODO: Linux cannot normalize?!?
XCTAssertEqual(
Array(willyNillyString.decomposedStringWithCanonicalMapping.unicodeScalars),
Array(nfdString.unicodeScalars))
XCTAssertEqual(
Array(willyNillyString.precomposedStringWithCanonicalMapping.unicodeScalars),
Array(nfcString.unicodeScalars))
#endif
let willyNilly = tempDirPath.appending(component: willyNillyString)
let nfd = tempDirPath.appending(component: nfdString)
let nfc = tempDirPath.appending(component: nfcString)
let variants = [willyNilly, nfd, nfc]
for directory in variants {
do {
try fs.createDirectory(directory)
defer { try? fs.removeFileTree(directory) }
for other in variants {
_ = try fs.getDirectoryContents(other)
}
} catch {
let directoryName = directory.basename.unicodeScalars.map({ "U+\($0.value)" })
XCTFail("Unable to handle directory named “\(directoryName)”: \(error)")
}
}
}
}
}
Member

If you're looking for a more complete programmatic data set to drive this, the stdlib uses Unicode's published normalization tests at https://raw.githubusercontent.com/apple/swift/master/test/stdlib/Inputs/NormalizationTest.txt

Contributor Author

This lightweight test is really only there to check that Unicode equivalence is being considered at all. String does all the heavy lifting when it comes to the definition of equivalence. (The test fixture we are trying to get working on Linux is already much meaner too.)

Member

You can also try U+0F73 which is FCD, has a CCC of 0, and yet expands into two different scalars, neither of which have a CCC of 0.
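
A quick way to inspect that scalar with the standard library alone is sketched below. It assumes a toolchain where `Unicode.Scalar.Properties.canonicalCombiningClass` is available; no expected output is given for the combining classes because they come from the Unicode data bundled with the toolchain. The names `composite` and `expansion` are illustrative.

```swift
let composite: Unicode.Scalar = "\u{0F73}"  // TIBETAN VOWEL SIGN II
let expansion = "\u{0F71}\u{0F72}"          // its canonical decomposition

// Canonical combining class of the composite scalar...
print(composite.properties.canonicalCombiningClass.rawValue)
// ...and of each scalar it expands into.
for scalar in expansion.unicodeScalars {
    print(scalar.properties.canonicalCombiningClass.rawValue)
}

// String equality should treat the two spellings as equivalent.
print(String(composite) == expansion)
```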
