Commit 33c4a54

Merge pull request #6939 from ole/string-manifesto-fixes
[docs] Minor String Manifesto fixes
2 parents 9bdd3d2 + aebf80b commit 33c4a54

1 file changed, +28 -28 lines changed

docs/StringManifesto.md

Lines changed: 28 additions & 28 deletions
@@ -29,12 +29,12 @@ work that could be done in the Swift 4 timeframe.
 It's worth noting that ergonomics and correctness are mutually-reinforcing. An
 API that is easy to use—but incorrectly—cannot be considered an ergonomic
 success. Conversely, an API that's simply hard to use is also hard to use
-correctly. Acheiving optimal performance without compromising ergonomics or
+correctly. Achieving optimal performance without compromising ergonomics or
 correctness is a greater challenge.
 
 Consistency with the Swift language and idioms is also important for
 ergonomics. There are several places both in the standard library and in the
-foundation additions to `String` where patterns and practices found elsewhere
+Foundation additions to `String` where patterns and practices found elsewhere
 could be applied to improve usability and familiarity.
 
 ### API Surface Area
@@ -123,8 +123,8 @@ The first step in improving this situation is to regularize all localized
 operations as invocations of normal string operations with extra
 parameters. Among other things, this means:
 
-1. Doing away with `localizedXXX` methods
-2. Providing a terse way to name the current locale as a parameter
+1. Doing away with `localizedXXX` methods.
+2. Providing a terse way to name the current locale as a parameter.
 3. Automatically [adjusting defaults](#operations-with-options) for options such
 as case sensitivity based on whether the operation is localized.
 4. Removing correctness traps like `localizedCaseInsensitiveCompare` (see
@@ -233,16 +233,16 @@ What Unicode says about collation—which is used in `<`, `==`, and hashing— t
 out to be quite interesting, once you pick it apart. The full Unicode Collation
 Algorithm (UCA) works like this:
 
-1. Fully normalize both strings
-2. Convert each string to a sequence of numeric triples to form a collation key
+1. Fully normalize both strings.
+2. Convert each string to a sequence of numeric triples to form a collation key.
 3. “Flatten” the key by concatenating the sequence of first elements to the
-sequence of second elements to the sequence of third elements
-4. Lexicographically compare the flattened keys
+sequence of second elements to the sequence of third elements.
+4. Lexicographically compare the flattened keys.
 
 While step 1 can usually
 be [done quickly](http://unicode.org/reports/tr15/#Description_Norm) and
 incrementally, step 2 uses a collation table that maps matching *sequences* of
-unicode scalars in the normalized string to *sequences* of triples, which get
+Unicode scalars in the normalized string to *sequences* of triples, which get
 accumulated into a collation key. Predictably, this is where the real costs
 lie.
 
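As an illustration only (this is not part of the commit), steps 3 and 4 of the list in the hunk above can be sketched in Swift. The `CollationElement` type, the function names, and every weight value below are hypothetical; a real implementation would take its triples from the Unicode collation table in step 2. The sketch also assumes a `0` separator between levels, as UTS #10 sort keys use, and it does not drop zero weights, which the full algorithm does.

```swift
// Hypothetical triple produced by step 2: one weight per comparison level.
struct CollationElement {
    let primary: UInt16
    let secondary: UInt16
    let tertiary: UInt16
}

// Step 3: concatenate all first elements, then all second elements,
// then all third elements. A 0 separator between levels (an assumption
// borrowed from UTS #10 sort keys) keeps a shorter string ordered
// before a longer string that it prefixes.
func flattenedKey(_ elements: [CollationElement]) -> [UInt16] {
    var key: [UInt16] = []
    key.append(contentsOf: elements.map { $0.primary })
    key.append(0)
    key.append(contentsOf: elements.map { $0.secondary })
    key.append(0)
    key.append(contentsOf: elements.map { $0.tertiary })
    return key
}

// Step 4: lexicographically compare the flattened keys.
func collatesBefore(_ lhs: [CollationElement], _ rhs: [CollationElement]) -> Bool {
    return flattenedKey(lhs).lexicographicallyPrecedes(flattenedKey(rhs))
}

// Made-up weights, for demonstration only.
let a = CollationElement(primary: 1, secondary: 5, tertiary: 5)
let b = CollationElement(primary: 2, secondary: 5, tertiary: 5)
let key = flattenedKey([a, b])          // [1, 2, 0, 5, 5, 0, 5, 5]
let shorterFirst = collatesBefore([a], [a, b])
```

Because every primary weight precedes every secondary weight in the flattened key, a secondary or tertiary difference can only decide the comparison when all primary weights tie.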
@@ -383,7 +383,7 @@ The benefits of restoring `Collection` conformance are substantial:
 from whole-string ordering comparison, equality comparison, and
 case-conversion, respectively. `reverse` operates correctly on graphemes,
 keeping diacritics moored to their base characters and leaving emoji intact.
-Other methods such as `indexOf` and `contains` make obvious sense. A few
+Other methods such as `index(of:)` and `contains` make obvious sense. A few
 `Collection` methods, like `min` and `max`, may not be particularly useful
 on `String`, but we don't consider that to be a problem worth solving, in
 the same way that we wouldn't try to suppress `min` and `max` on a
@@ -412,9 +412,9 @@ The benefits of restoring `Collection` conformance are substantial:
 this:
 
 ```swift
-extension String : BidirectionalCollection {
-subscript(i: Index) -> Character { return characters[i] }
-}
+extension String : BidirectionalCollection {
+subscript(i: Index) -> Character { return characters[i] }
+}
 ```
 
 It would be much better to legitimize the conformance to `Collection` and
@@ -439,8 +439,8 @@ do any introspection, including interoperation with ASCII. To fix this, we shou
 that contain 0 or 2+ graphemes).
 - (Lower priority) expose some operations, such as `func uppercase() ->
 String`, `var isASCII: Bool`, and, to the extent they can be sensibly
-generalized, queries of unicode properties that should also be exposed on
-`UnicodeScalar` such as `isAlphabetic` and `isGraphemeBase` .
+generalized, queries of Unicode properties that should also be exposed on
+`UnicodeScalar` such as `isAlphabetic` and `isGraphemeBase`.
 
 Despite its name, `CharacterSet` currently operates on the Swift `UnicodeScalar`
 type. This means it is usable on `String`, but only by going through the unicode
@@ -451,21 +451,21 @@ grapheme clusters. <sup id="a5">[5](#f5)</sup>
 
 ### Unification of Slicing Operations
 
-Creating substrings is a basic part of String processing, but the slicing
+Creating substrings is a basic part of string processing, but the slicing
 operations that we have in Swift are inconsistent in both their spelling and
-their naming:
+their naming:
 
 * Slices with two explicit endpoints are done with subscript, and support
 in-place mutation:
 
 ```swift
-s[i..<j].mutate()
+s[i..<j].mutate()
 ```
 
 * Slicing from an index to the end, or from the start to an index, is done
 with a method and does not support in-place mutation:
 ```swift
-s.prefix(upTo: i).readOnly()
+s.prefix(upTo: i).readOnly()
 ```
 
 Prefix and suffix operations should be migrated to be subscripting operations
@@ -658,7 +658,7 @@ consistent rule that could be applied in the general case for detecting when a
 substring is truly being stored long-term.
 
 To avoid the cost of copying substrings under "same type, copied storage", the
-optimizer could be enhanced to to reduce the impact of some of those copies.
+optimizer could be enhanced to reduce the impact of some of those copies.
 For example, this code could be optimized to pull the invariant substring out
 of the loop:
 
@@ -811,7 +811,7 @@ protocols in protocols.
 
 #### Low-Level Textual Analysis
 
-We should provide convenient APIs processing strings by character. For example,
+We should provide convenient APIs for processing strings by character. For example,
 it should be easy to cleanly express, “if this string starts with `"f"`, process
 the rest of the string as follows” Swift is well-suited to expressing this
 common pattern beautifully, but we need to add the APIs. Here are two examples
@@ -835,10 +835,10 @@ point is to make sure matching-and-consuming jobs are well-supported.
 Many of the current methods that do matching are overloaded to do the same
 logical operations in different ways, with the following axes:
 
-- Logical Operation: `find`, `split`, `replace`, match at start
-- Kind of pattern: `CharacterSet`, `String`, a regex, a closure
+- Logical Operation: `find`, `split`, `replace`, match at start.
+- Kind of pattern: `CharacterSet`, `String`, a regex, a closure.
 - Options, e.g. case/diacritic sensitivity, locale. Sometimes a part of
-the method name, and sometimes an argument
+the method name, and sometimes an argument.
 - Whole string or subrange.
 
 We should represent these aspects as orthogonal, composable components,
@@ -878,7 +878,7 @@ forces.replace(oneOrMore([Float.nan]), [0.0])
 #### Regular Expressions
 
 Addressing regular expressions is out of scope for this proposal.
-That said, it is important that to note the pattern matching protocol mentioned
+That said, it is important to note that the pattern matching protocol mentioned
 above provides a suitable foundation for regular expressions, and types such as
 `NSRegularExpression` can easily be retrofitted to conform to it. In the
 future, support for regular expression literals in the compiler could allow for
@@ -982,7 +982,7 @@ development.
 
 #### Printf-Style Formatting
 
-`String.format` is designed on the `printf` model: it takes a format string with
+`String(format:)` is designed on the `printf` model: it takes a format string with
 textual placeholders for substitution, and an arbitrary list of other arguments.
 The syntax and meaning of these placeholders has a long history in
 C, but for anyone who doesn't use them regularly they are cryptic and complex,
@@ -1011,7 +1011,7 @@ design pattern demands more from users than it should:
 
 These may seem like small issues, but the experience of Apple localization
 experts is that the total drag of these factors on programmers is such that they
-tend to reach for `String.format` instead.
+tend to reach for `String(format:)` instead.
 
 #### String Interpolation
 
@@ -1229,7 +1229,7 @@ This area will require some design work.
 
 ### `StaticString`
 
-`StaticString` was added as a byproduct of standard library developed and kept
+`StaticString` was added as a byproduct of standard library development and kept
 around because it seemed useful, but it was never truly *designed* for client
 programmers. We need to decide what happens with it. Presumably *something*
 should fill its role, and that should conform to `Unicode`.
