Commit b8e0e09

jensmaurer authored and tkoeppe committed
[lex] Replace \term with \placeholder or \defn as appropriate (#1067)
Partially addresses #329.
1 parent bf3327a commit b8e0e09

1 file changed

+65
-71
lines changed

source/lex.tex

Lines changed: 65 additions & 71 deletions
@@ -20,14 +20,16 @@
 \indextext{pointer literal|see{literal, pointer}}
 \indextext{user-defined literal|see{literal, user-defined}}
 \indextext{file, source|see{source file}}
+\indextext{null character|see{character, null}}
+\indextext{null wide character|see{wide-character, null}}
 
 \rSec1[lex.separate]{Separate translation}
 
 \pnum
 \indextext{conventions!lexical|(}%
 \indextext{compilation!separate|(}%
 The text of the program is kept in units called
-\indextext{source file}\term{source files} in this International
+\defnx{source files}{source file} in this International
 Standard. A source file together with all the headers~(\ref{headers})
 and source files included~(\ref{cpp.include}) via the preprocessing
 directive \tcode{\#include}, less any source lines skipped by any of the
@@ -56,7 +58,6 @@
 occur, although in practice different phases might be folded together.}
 
 \begin{enumerate}
-\indextext{source file}%
 \indextext{character!source file}%
 \indextext{character set!basic source}%
 \item Physical source file characters are mapped, in an
@@ -174,8 +175,7 @@
 
 \pnum
 \indextext{character set|(}%
-\indextext{character set!basic source}%
-The \term{basic source character set} consists of 96 characters: the space character,
+The \defnx{basic source character set}{character set!basic source} consists of 96 characters: the space character,
 the control characters representing horizontal tab, vertical tab, form feed, and
 new-line, plus the following 91 graphical characters:\footnote{The glyphs for
 the members of the basic source character set are intended to
@@ -229,17 +229,18 @@
 \grammarterm{universal-character-name}.}
 
 \pnum
-The \term{basic execution character set} and the \term{basic
-execution wide-character set} shall each contain all the members of the
+The \defnx{basic execution character set}{character set!basic execution} and the
+\defnx{basic execution wide-character set}{wide-character set!basic execution}
+shall each contain all the members of the
 basic source character set, plus control characters representing alert,
-backspace, and carriage return, plus a \term{null character}
-(respectively, \term{null wide character}), whose value is 0.
+backspace, and carriage return, plus a \defnx{null character}{character!null}
+(respectively, \defnx{null wide character}{wide-character!null}), whose value is 0.
 For each basic execution character set, the values of the
 members shall be non-negative and distinct from one another. In both the
 source and execution basic character sets, the value of each character
 after \tcode{0} in the above list of decimal digits shall be one greater
-than the value of the previous. The \term{execution character set}
-and the \term{execution wide-character set} are
+than the value of the previous. The \defnx{execution character set}{character set!execution}
+and the \defnx{execution wide-character set}{wide-character set!execution} are
 \impldef{execution character set and execution wide-character set}
 supersets of the
 basic execution character set and the basic execution wide-character
@@ -930,26 +931,22 @@
 \pnum
 \indextext{literal!\idxcode{unsigned}}%
 \indextext{literal!\idxcode{long}}%
-\indextext{literal!integer}%
-\indextext{literal!binary}%
-\indextext{literal!octal}%
-\indextext{literal!decimal}%
-\indextext{literal!hexadecimal}%
 \indextext{literal!base~of integer}%
-An \term{integer literal} is a sequence of digits that has no period
+An \defnx{integer literal}{literal!integer} is a sequence of digits that has no period
 or exponent part, with optional separating single quotes that are ignored
 when determining its value. An integer literal may have a prefix that specifies
 its base and a suffix that specifies its type. The lexically first digit
 of the sequence of digits is the most significant.
-A \term{binary} integer literal (base two) begins with
+A \defnx{binary integer literal}{literal!binary} (base two) begins with
 \tcode{0b} or \tcode{0B} and consists of a sequence of binary digits.
-An \term{octal} integer
-literal (base eight) begins with the digit \tcode{0} and consists of a
+An \defnx{octal integer literal}{literal!octal}
+(base eight) begins with the digit \tcode{0} and consists of a
 sequence of octal digits.\footnote{The digits \tcode{8} and \tcode{9} are not octal digits. }
-A \term{decimal}
-integer literal (base ten) begins with a digit other than \tcode{0} and
+A \defnx{decimal integer literal}{literal!decimal}
+(base ten) begins with a digit other than \tcode{0} and
 consists of a sequence of decimal digits.
-A \term{hexadecimal} integer literal (base sixteen) begins with
+A \defnx{hexadecimal integer literal}{literal!hexadecimal}
+(base sixteen) begins with
 \tcode{0x} or \tcode{0X} and consists of a sequence of hexadecimal
 digits, which include the decimal digits and the letters \tcode{a}
 through \tcode{f} and \tcode{A} through \tcode{F} with decimal values
@@ -1358,10 +1355,8 @@
 The integer and fraction parts both consist of
 a sequence of decimal (base ten) digits if there is no prefix, or
 hexadecimal (base sixteen) digits if the prefix is \tcode{0x} or \tcode{0X}.
-\indextext{literal!decimal floating}%
-The literal is a \term{decimal floating literal} in the former case and
-\indextext{literal!hexadecimal floating}%
-a \term{hexadecimal floating literal} in the latter case.
+The literal is a \defnx{decimal floating literal}{literal!decimal floating} in the former case and
+a \defnx{hexadecimal floating literal}{literal!hexadecimal floating} in the latter case.
 Optional separating single quotes in
 a \grammarterm{digit-sequence} or \grammarterm{hexadecimal-digit-sequence}
 are ignored when determining its value.
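
As a rough illustration of the wording touched by this hunk (not part of the commit), the C++ sketch below contrasts a decimal floating literal with a hexadecimal one and checks that separating single quotes do not change the value. The constant names are hypothetical, and the hexadecimal forms assume a compiler in C++17 mode or later.

```cpp
#include <cassert>

// Hypothetical names, chosen only for this illustration.

// Decimal floating literal: no prefix, base-ten digits.
constexpr double decimal_value = 1'000.5;   // same value as 1000.5

// Hexadecimal floating literal: 0x prefix, base-sixteen digits,
// and a binary exponent introduced by p (assumes C++17).
constexpr double hex_value = 0x1.8p1;       // (1 + 8/16) * 2^1
constexpr double hex_with_quote = 0x1'0p0;  // 0x10 * 2^0

static_assert(decimal_value == 1000.5, "quotes are ignored");
static_assert(hex_value == 3.0, "hexadecimal digits, binary exponent");
static_assert(hex_with_quote == 16.0, "quotes are ignored in hex digits too");
```

The checks are written as static_assert so that a conforming compiler verifies them at translation time.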
@@ -1558,7 +1553,7 @@
 also referred to as narrow
 string literals. A narrow string literal has type
 \indextext{literal!string!type~of}%
-``array of \term{n} \tcode{const char}'', where \term{n} is the size of
+``array of \placeholder{n} \tcode{const char}'', where \placeholder{n} is the size of
 the string as defined below, and has static storage
 duration~(\ref{basic.stc}).
 
@@ -1573,7 +1568,7 @@
 \indextext{prefix!\idxcode{u}}%
 such as \tcode{u"asdf"}, is
 a \tcode{char16_t} string literal. A \tcode{char16_t} string literal has
-type ``array of \term{n} \tcode{const char16_t}'', where \term{n} is the
+type ``array of \placeholder{n} \tcode{const char16_t}'', where \placeholder{n} is the
 size of the string as defined below; it
 is initialized with the given characters. A single \grammarterm{c-char} may
 produce more than one \tcode{char16_t} character in the form of
@@ -1585,7 +1580,7 @@
 \indextext{prefix!\idxcode{U}}%
 such as \tcode{U"asdf"}, is
 a \tcode{char32_t} string literal. A \tcode{char32_t} string literal has
-type ``array of \term{n} \tcode{const char32_t}'', where \term{n} is the
+type ``array of \placeholder{n} \tcode{const char32_t}'', where \placeholder{n} is the
 size of the string as defined below; it
 is initialized with the given characters.
 
@@ -1598,8 +1593,8 @@
 \indextext{\idxcode{wchar_t}}%
 \indextext{literal!string!wide}%
 \indextext{prefix!\idxcode{L}}%
-A wide string literal has type ``array of \term{n} \tcode{const
-wchar_t}'', where \term{n} is the size of the string as defined below; it
+A wide string literal has type ``array of \placeholder{n} \tcode{const
+wchar_t}'', where \placeholder{n} is the size of the string as defined below; it
 is initialized with the given characters.
 
 \pnum
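
To make the placeholder n in the string literal hunks above concrete, here is a small C++ sketch (an illustration only, not part of the commit). It checks that each of the four literal kinds has type "array of n const T", with n counting the terminating null. The helper array_size is a hypothetical name introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: yields n for an "array of n T".
template <class T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) { return N; }

// A string literal is an lvalue, so decltype gives a reference to the array type.
// "asdf" holds four characters plus the terminating null, so n is 5.
static_assert(std::is_same<decltype("asdf"), const char (&)[5]>::value,
              "narrow string literal");
static_assert(std::is_same<decltype(u"asdf"), const char16_t (&)[5]>::value,
              "char16_t string literal");
static_assert(std::is_same<decltype(U"asdf"), const char32_t (&)[5]>::value,
              "char32_t string literal");
static_assert(std::is_same<decltype(L"asdf"), const wchar_t (&)[5]>::value,
              "wide string literal");
```

For instance, array_size("asdf") evaluates to 5, and an empty literal such as U"" still has size 1 because of the terminator.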
@@ -1654,13 +1649,12 @@
 \pnum
 \indextext{\idxcode{0}|seealso{zero,~null}}%
 \indextext{\idxcode{0}!string terminator}%
-\indextext{\idxcode{0}!null~character}%
+\indextext{\idxcode{0}!null~character|see {character, null}}%
 After any necessary concatenation, in translation phase
 7~(\ref{lex.phases}), \tcode{'\textbackslash 0'} is appended to every
 string literal so that programs that scan a string can find its end.
 
 \pnum
-\indextext{encoding!multibyte}%
 Escape sequences and \grammarterm{universal-character-name}{s} in non-raw string literals
 have the same meaning as in character literals~(\ref{lex.ccon}), except that
 the single quote \tcode{'} is representable either by itself or by the escape sequence
@@ -1670,7 +1664,7 @@
 \tcode{char16_t} string literal may yield a surrogate pair.
 \indextext{string!\idxcode{sizeof}}%
 In a narrow string literal, a \grammarterm{universal-character-name} may map to more
-than one \tcode{char} element due to \term{multibyte encoding}. The
+than one \tcode{char} element due to \defnx{multibyte encoding}{encoding!multibyte}. The
 size of a \tcode{char32_t} or wide string literal is the total number of
 escape sequences, \grammarterm{universal-character-name}{s}, and other characters, plus
 one for the terminating \tcode{U'\textbackslash 0'} or
@@ -1786,93 +1780,93 @@
 \pnum
 A \grammarterm{user-defined-literal} is treated as a call to a literal operator or
 literal operator template~(\ref{over.literal}). To determine the form of this call for a
-given \grammarterm{user-defined-literal} \term{L} with \grammarterm{ud-suffix} \term{X},
-the \grammarterm{literal-operator-id} whose literal suffix identifier is \term{X} is
-looked up in the context of \term{L} using the rules for unqualified name
-lookup~(\ref{basic.lookup.unqual}). Let \term{S} be the set of declarations found by
-this lookup. \term{S} shall not be empty.
+given \grammarterm{user-defined-literal} \placeholder{L} with \grammarterm{ud-suffix} \placeholder{X},
+the \grammarterm{literal-operator-id} whose literal suffix identifier is \placeholder{X} is
+looked up in the context of \placeholder{L} using the rules for unqualified name
+lookup~(\ref{basic.lookup.unqual}). Let \placeholder{S} be the set of declarations found by
+this lookup. \placeholder{S} shall not be empty.
 
 \pnum
-If \term{L} is a \grammarterm{user-defined-integer-literal}, let \term{n} be the literal
-without its \grammarterm{ud-suffix}. If \term{S} contains a literal operator with
-parameter type \tcode{unsigned long long}, the literal \term{L} is treated as a call of
+If \placeholder{L} is a \grammarterm{user-defined-integer-literal}, let \placeholder{n} be the literal
+without its \grammarterm{ud-suffix}. If \placeholder{S} contains a literal operator with
+parameter type \tcode{unsigned long long}, the literal \placeholder{L} is treated as a call of
 the form
 
 \begin{codeblock}
-operator "" @\term{X}@(@\term{n}@ULL)
+operator "" @\placeholder{X}@(@\placeholder{n}@ULL)
 \end{codeblock}
 
-Otherwise, \term{S} shall contain a raw literal operator or a literal operator
-template~(\ref{over.literal}) but not both. If \term{S} contains a raw literal operator,
-the literal \term{L} is treated as a call of the form
+Otherwise, \placeholder{S} shall contain a raw literal operator or a literal operator
+template~(\ref{over.literal}) but not both. If \placeholder{S} contains a raw literal operator,
+the literal \placeholder{L} is treated as a call of the form
 
 \begin{codeblock}
-operator "" @\term{X}@(@"\term{n}{"}@)
+operator "" @\placeholder{X}@(@"\placeholder{n}{"}@)
 \end{codeblock}
 
-Otherwise (\term{S} contains a literal operator template), \term{L} is treated as a call
+Otherwise (\placeholder{S} contains a literal operator template), \placeholder{L} is treated as a call
 of the form
 
 
 \begin{codeblock}
-operator "" @\term{X}@<'@$c_1$@', '@$c_2$@', ... '@$c_k$@'>()
+operator "" @\placeholder{X}@<'@$c_1$@', '@$c_2$@', ... '@$c_k$@'>()
 \end{codeblock}
 
-where \term{n} is the source character sequence $c_1c_2...c_k$. \begin{note} The sequence
+where \placeholder{n} is the source character sequence $c_1c_2...c_k$. \begin{note} The sequence
 $c_1c_2...c_k$ can only contain characters from the basic source character set.
 \end{note}
 
 \pnum
-If \term{L} is a \grammarterm{user-defined-floating-literal}, let \term{f} be the
-literal without its \grammarterm{ud-suffix}. If \term{S} contains a literal operator
-with parameter type \tcode{long double}, the literal \term{L} is treated as a call of
+If \placeholder{L} is a \grammarterm{user-defined-floating-literal}, let \placeholder{f} be the
+literal without its \grammarterm{ud-suffix}. If \placeholder{S} contains a literal operator
+with parameter type \tcode{long double}, the literal \placeholder{L} is treated as a call of
 the form
 
 \begin{codeblock}
-operator "" @\term{X}@(@\term{f}@L)
+operator "" @\placeholder{X}@(@\placeholder{f}@L)
 \end{codeblock}
 
-Otherwise, \term{S} shall contain a raw literal operator or a literal operator
-template~(\ref{over.literal}) but not both. If \term{S} contains a raw literal operator,
-the \term{literal} \term{L} is treated as a call of the form
+Otherwise, \placeholder{S} shall contain a raw literal operator or a literal operator
+template~(\ref{over.literal}) but not both. If \placeholder{S} contains a raw literal operator,
+the \grammarterm{literal} \placeholder{L} is treated as a call of the form
 
 \begin{codeblock}
-operator "" @\term{X}@(@"\term{f}{"}@)
+operator "" @\placeholder{X}@(@"\placeholder{f}{"}@)
 \end{codeblock}
 
-Otherwise (\term{S} contains a literal operator template), \term{L} is treated as a call
+Otherwise (\placeholder{S} contains a literal operator template), \placeholder{L} is treated as a call
 of the form
 
 \begin{codeblock}
-operator "" @\term{X}@<'@$c_1$@', '@$c_2$@', ... '@$c_k$@'>()
+operator "" @\placeholder{X}@<'@$c_1$@', '@$c_2$@', ... '@$c_k$@'>()
 \end{codeblock}
 
-where \term{f} is the source character sequence $c_1c_2...c_k$. \begin{note} The sequence
+where \placeholder{f} is the source character sequence $c_1c_2...c_k$. \begin{note} The sequence
 $c_1c_2...c_k$ can only contain characters from the basic source character set.
 \end{note}
 
 \pnum
-If \term{L} is a \grammarterm{user-defined-string-literal}, let \term{str} be the
-literal without its \grammarterm{ud-suffix} and let \term{len} be
+If \placeholder{L} is a \grammarterm{user-defined-string-literal}, let \placeholder{str} be the
+literal without its \grammarterm{ud-suffix} and let \placeholder{len} be
 the number of
-code units in \term{str} (i.e., its length excluding the terminating
+code units in \placeholder{str} (i.e., its length excluding the terminating
 null character).
-The literal \term{L} is treated as a call of the form
+The literal \placeholder{L} is treated as a call of the form
 
 \begin{codeblock}
-operator "" @\term{X}@(@\term{str}{}@, @\term{len}{}@)
+operator "" @\placeholder{X}@(@\placeholder{str}{}@, @\placeholder{len}{}@)
 \end{codeblock}
 
 \pnum
-If \term{L} is a \grammarterm{user-defined-character-literal}, let \term{ch} be the
+If \placeholder{L} is a \grammarterm{user-defined-character-literal}, let \placeholder{ch} be the
 literal without its \grammarterm{ud-suffix}.
-\term{S} shall contain a literal operator~(\ref{over.literal}) whose only parameter has
-the type of \term{ch} and the
-literal \term{L} is treated as a call
+\placeholder{S} shall contain a literal operator~(\ref{over.literal}) whose only parameter has
+the type of \placeholder{ch} and the
+literal \placeholder{L} is treated as a call
 of the form
 
 \begin{codeblock}
-operator "" @\term{X}@(@\term{ch}{}@)
+operator "" @\placeholder{X}@(@\placeholder{ch}{}@)
 \end{codeblock}
 
 \pnum
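
The call forms described in the last hunk can be sketched in C++ as follows. This is an illustration under stated assumptions rather than anything taken from the commit: every suffix name (_ull, _rawlen, _count, _half, _len, _code) is hypothetical, and the loop inside a constexpr function assumes C++14 or later. Each suffix has exactly one declaration, so the lookup set S contains a single candidate in every case.

```cpp
#include <cassert>
#include <cstddef>

// All suffix names are hypothetical, chosen only for this illustration.

// Literal operator taking unsigned long long: 42_ull -> operator""_ull(42ULL).
constexpr unsigned long long operator""_ull(unsigned long long n) { return n; }

// Raw literal operator: 0x1F_rawlen -> operator""_rawlen("0x1F").
// Returns the number of source characters in the literal (assumes C++14).
constexpr std::size_t operator""_rawlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// Literal operator template: 123_count -> operator""_count<'1', '2', '3'>().
template <char... Cs>
constexpr std::size_t operator""_count() { return sizeof...(Cs); }

// Literal operator taking long double: 3.0_half -> operator""_half(3.0L).
constexpr long double operator""_half(long double f) { return f / 2; }

// String form: "asdf"_len -> operator""_len("asdf", 4).
constexpr std::size_t operator""_len(const char*, std::size_t len) { return len; }

// Character form: 'A'_code -> operator""_code('A').
constexpr int operator""_code(char ch) { return ch; }

static_assert(42_ull == 42ULL, "cooked integer form");
static_assert(0x1F_rawlen == 4, "raw form sees the spelling 0x1F");
static_assert(123_count == 3, "template form, integer literal");
static_assert(1.5_count == 3, "template form, floating literal: 1 . 5");
static_assert(3.0_half == 1.5L, "cooked floating form");
static_assert("asdf"_len == 4, "len excludes the terminating null");
static_assert('A'_code == 'A', "character form");
```

Using a distinct suffix for each operator keeps the example clear of the "raw literal operator or literal operator template, but not both" requirement quoted above.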
