Commit 86070a8

[libc++][hardening] Finish documenting hardening. (#92021)
1 parent 21f5ee0 commit 86070a8

4 files changed

+410
-13
lines changed

libcxx/docs/Hardening.rst

Lines changed: 363 additions & 10 deletions
@@ -1,4 +1,4 @@
1-
.. _hardening-modes:
1+
.. _hardening:
22

33
===============
44
Hardening Modes
@@ -29,8 +29,11 @@ modes are:
2929
rigour impacts performance more than fast mode: we recommend benchmarking to
3030
determine if that is acceptable for your program.
3131
- **Debug mode**, which enables all the available checks in the library,
32-
including internal assertions, some of which might be very expensive. This
33-
mode is intended to be used for testing, not in production.
32+
including heuristic checks that might have significant performance overhead as
33+
well as internal library assertions. This mode should be used in
34+
non-production environments (such as test suites, CI, or local development).
35+
We don’t commit to a particular level of performance in this mode and it’s
36+
*not* intended to be used in production.
3437

3538
.. note::
3639

@@ -72,17 +75,367 @@ to control the level by passing **one** of the following options to the compiler
7275
Notes for vendors
7376
-----------------
7477

75-
Vendors can set the default hardening mode by providing ``LIBCXX_HARDENING_MODE``
76-
as a configuration option, with the possible values of ``none``, ``fast``,
77-
``extensive`` and ``debug``. The default value is ``none`` which doesn't enable
78-
any hardening checks (this mode is sometimes called the ``unchecked`` mode).
78+
Vendors can set the default hardening mode by providing
79+
``LIBCXX_HARDENING_MODE`` as a configuration option, with the possible values of
80+
``none``, ``fast``, ``extensive`` and ``debug``. The default value is ``none``
81+
which doesn't enable any hardening checks (this mode is sometimes called the
82+
``unchecked`` mode).
7983

8084
This option controls both the hardening mode that the precompiled library is
8185
built with and the default hardening mode that users will build with. If set to
8286
``none``, the precompiled library will not contain any assertions, and user code
8387
will default to building without assertions.
8488

85-
Iterator bounds checking
86-
------------------------
89+
Vendors can also override the way the program is terminated when an assertion
90+
fails by :ref:`providing a custom header <override-assertion-handler>`.
8791

88-
TODO(hardening)
92+
Assertion categories
93+
====================
94+
95+
Inside the library, individual assertions are grouped into different
96+
*categories*. Each hardening mode enables a different set of assertion
97+
categories; categories provide an additional layer of abstraction that makes it
98+
easier to reason about the high-level semantics of a hardening mode.
99+
100+
.. note::
101+
102+
Users are not intended to interact with these categories directly -- the
103+
categories are considered internal to the library and subject to change.
104+
105+
- ``valid-element-access`` -- checks that any attempts to access a container
106+
element, whether through the container object or through an iterator, are
107+
valid and do not attempt to go out of bounds or otherwise access
108+
a non-existent element. This also includes operations that set up an imminent
109+
invalid access (e.g. incrementing an end iterator). For iterator checks to
110+
work, bounded iterators must be enabled in the ABI. Types like
111+
``std::optional`` and ``std::function`` are considered containers (with at
112+
most one element) for the purposes of this check.
113+
114+
- ``valid-input-range`` -- checks that ranges (whether expressed as an iterator
115+
pair, an iterator and a sentinel, an iterator and a count, or
116+
a ``std::ranges::range``) given as input to library functions are valid:
117+
- the sentinel is reachable from the begin iterator;
118+
- TODO(hardening): both iterators refer to the same container.
119+
120+
("input" here refers to "an input given to an algorithm", not to an iterator
121+
category)
122+
123+
Violating assertions in this category leads to an out-of-bounds access.
124+
125+
- ``non-null`` -- checks that the pointer being dereferenced is not null. On
126+
most modern platforms, the zero address does not refer to an actual location
127+
in memory, so a null pointer dereference would not compromise the memory
128+
security of a program (however, it is still undefined behavior that can result
129+
in strange errors due to compiler optimizations).
130+
131+
- ``non-overlapping-ranges`` -- for functions that take several ranges as
132+
arguments, checks that those ranges do not overlap.
133+
134+
- ``valid-deallocation`` -- checks that an attempt to deallocate memory is valid
135+
(e.g. the given object was allocated by the given allocator). Violating this
136+
category typically results in a memory leak.
137+
138+
- ``valid-external-api-call`` -- checks that a call to an external API doesn't
139+
fail in an unexpected manner. This includes triggering documented cases of
140+
undefined behavior in an external library (like attempting to unlock an
141+
unlocked mutex in pthreads). Any API external to the library falls under this
142+
category (from system calls to compiler intrinsics). We generally don't expect
143+
these failures to compromise memory safety or otherwise create an immediate
144+
security issue.
145+
146+
- ``compatible-allocator`` -- checks any operations that exchange nodes between
147+
containers to make sure the containers have compatible allocators.
148+
149+
- ``argument-within-domain`` -- checks that the given argument is within the
150+
domain of valid arguments for the function. Violating this typically produces
151+
an incorrect result (e.g. ``std::clamp`` returns the original value without
152+
clamping it due to incorrect functors) or puts an object into an invalid state
153+
(e.g. a string view where only a subset of elements is accessible). This
154+
category is for assertions whose violation doesn't cause any immediate issues
155+
in the library -- whatever the consequences are, they will happen in the user
156+
code.
157+
158+
- ``pedantic`` -- checks preconditions that are imposed by the Standard, but
159+
whose violation happens to be benign in libc++.
160+
161+
- ``semantic-requirement`` -- checks that the given argument satisfies the
162+
semantic requirements imposed by the Standard. Typically, there is no simple
163+
way to completely prove that a semantic requirement is satisfied; thus, this
164+
would often be a heuristic check and it might be quite expensive.
165+
166+
- ``internal`` -- checks that internal invariants of the library hold. These
167+
assertions don't depend on user input.
168+
169+
- ``uncategorized`` -- for assertions that haven't been properly classified yet.
170+
This category is an escape hatch used for some existing assertions in the
171+
library; all new code should have its assertions properly classified.
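
To make two of the categories above more concrete, the following is a minimal sketch of the kind of precondition that ``valid-input-range`` and ``non-overlapping-ranges`` describe, restricted to raw pointer ranges. The helper names ``is_valid_pointer_range`` and ``pointer_ranges_overlap`` are hypothetical and are not part of libc++; the real checks live inside the library and may be implemented differently.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper (not a libc++ API). For two pointers into the same
// array, the sentinel `last` is reachable from `first` exactly when `last`
// does not come before `first`. std::less gives a total order on pointers.
template <class T>
bool is_valid_pointer_range(const T* first, const T* last) {
  return !std::less<const T*>()(last, first);
}

// Hypothetical helper (not a libc++ API). Two half-open pointer ranges
// [first1, last1) and [first2, last2) overlap when each one starts before
// the other one ends.
template <class T>
bool pointer_ranges_overlap(const T* first1, const T* last1,
                            const T* first2, const T* last2) {
  std::less<const T*> before;
  return before(first1, last2) && before(first2, last1);
}
```

A hardened function taking such ranges would assert ``is_valid_pointer_range(...)`` under ``valid-input-range`` and ``!pointer_ranges_overlap(...)`` under ``non-overlapping-ranges``, assuming it used helpers of this shape.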
172+
173+
Mapping between the hardening modes and the assertion categories
174+
================================================================
175+
176+
.. list-table::
177+
:header-rows: 1
178+
:widths: auto
179+
180+
* - Category name
181+
- ``fast``
182+
- ``extensive``
183+
- ``debug``
184+
* - ``valid-element-access``
185+
- ✅
186+
- ✅
187+
- ✅
188+
* - ``valid-input-range``
189+
- ✅
190+
- ✅
191+
- ✅
192+
* - ``non-null``
193+
- ❌
194+
- ✅
195+
- ✅
196+
* - ``non-overlapping-ranges``
197+
- ❌
198+
- ✅
199+
- ✅
200+
* - ``valid-deallocation``
201+
- ❌
202+
- ✅
203+
- ✅
204+
* - ``valid-external-api-call``
205+
- ❌
206+
- ✅
207+
- ✅
208+
* - ``compatible-allocator``
209+
- ❌
210+
- ✅
211+
- ✅
212+
* - ``argument-within-domain``
213+
- ❌
214+
- ✅
215+
- ✅
216+
* - ``pedantic``
217+
- ❌
218+
- ✅
219+
- ✅
220+
* - ``semantic-requirement``
221+
- ❌
222+
- ❌
223+
- ✅
224+
* - ``internal``
225+
- ❌
226+
- ❌
227+
- ✅
228+
* - ``uncategorized``
229+
- ❌
230+
- ✅
231+
- ✅
232+
233+
.. note::
234+
235+
At the moment, each subsequent hardening mode is a strict superset of the
236+
previous one (in other words, each subsequent mode only enables additional
237+
assertion categories without disabling any), but this won't necessarily be
238+
true for any hardening modes that might be added in the future.
239+
240+
.. note::
241+
242+
The categories enabled by each mode are subject to change and users should not
243+
rely on the precise assertions enabled by a mode at a given point in time.
244+
However, the library does guarantee to keep the hardening modes stable and
245+
to fulfill the semantics documented here.
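
The table above can also be read as "each category has a weakest mode that enables it". The sketch below encodes that reading; the ``Mode`` and ``Category`` enums and both functions are hypothetical names used for illustration only, and the encoding relies on the current property that each mode is a strict superset of the previous one, which the notes above say may change.

```cpp
#include <cassert>

// Hypothetical names; a snapshot of the table, not a libc++ interface.
enum class Mode { None = 0, Fast = 1, Extensive = 2, Debug = 3 };

enum class Category {
  ValidElementAccess, ValidInputRange, NonNull, NonOverlappingRanges,
  ValidDeallocation, ValidExternalApiCall, CompatibleAllocator,
  ArgumentWithinDomain, Pedantic, SemanticRequirement, Internal,
  Uncategorized
};

// Weakest hardening mode in which the category is enabled, per the table.
constexpr Mode weakest_enabling_mode(Category c) {
  switch (c) {
    case Category::ValidElementAccess:
    case Category::ValidInputRange:
      return Mode::Fast;
    case Category::SemanticRequirement:
    case Category::Internal:
      return Mode::Debug;
    default:  // every remaining row of the table first appears in extensive
      return Mode::Extensive;
  }
}

// Valid only while the modes form a strict superset chain.
constexpr bool is_enabled(Mode m, Category c) {
  return static_cast<int>(m) >= static_cast<int>(weakest_enabling_mode(c));
}

static_assert(is_enabled(Mode::Fast, Category::ValidElementAccess), "");
static_assert(!is_enabled(Mode::Extensive, Category::Internal), "");
```

Because the categories are internal to the library, code like this is useful only as a reading aid for the table, not as something to depend on.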
246+
247+
Hardening assertion failure
248+
===========================
249+
250+
In production modes (``fast`` and ``extensive``), a hardening assertion failure
251+
immediately `traps <https://llvm.org/docs/LangRef.html#llvm-trap-intrinsic>`_
252+
the program. This is the safest approach that also minimizes the code size
253+
penalty as the failure handler maps to a single instruction. The downside is
254+
that the failure provides no additional details other than the stack trace
255+
(which might also be affected by optimizations).
256+
257+
TODO(hardening): describe ``__builtin_verbose_trap`` once we can use it.
258+
259+
In the ``debug`` mode, an assertion failure terminates the program in an
260+
unspecified manner and also outputs the associated error message to the error
261+
output. This is less secure and increases the size of the binary (among other
262+
things, it has to store the error message strings) but makes the failure easier
263+
to debug. It also allows testing the error messages in our test suite.
264+
265+
.. _override-assertion-handler:
266+
267+
Overriding the assertion failure handler
268+
----------------------------------------
269+
270+
Vendors can override the default assertion handler mechanism by following these
271+
steps:
272+
273+
- create a header file that provides a definition of a macro called
274+
``_LIBCPP_ASSERTION_HANDLER``. The macro will be invoked when a hardening
275+
assertion fails, with a single parameter containing a null-terminated string
276+
with the error message.
277+
- when configuring the library, provide the path to the custom header (relative to
278+
the root of the repository) via the CMake variable
279+
``LIBCXX_ASSERTION_HANDLER_FILE``.
280+
281+
Note that almost all libc++ headers include the assertion handler header which
282+
means it should not include anything non-trivial from the standard library to
283+
avoid creating circular dependencies.
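
As an illustration of the first step, a vendor-provided header might look roughly like the sketch below. Only the macro name ``_LIBCPP_ASSERTION_HANDLER`` and its single string parameter come from the text above; the file name, ``vendor_message_buffer`` and ``vendor_record_message`` are hypothetical. The sketch saves the message into a fixed buffer (for example, so a crash reporter could find it) and then traps using the GCC/Clang builtin ``__builtin_trap()``. In line with the note above, it includes no standard library headers.

```cpp
// vendor_assertion_handler.h (hypothetical file name).
// Intentionally includes nothing from the standard library.

// Hypothetical storage for the most recent failure message.
inline char* vendor_message_buffer() {
  static char buffer[256] = {};
  return buffer;
}

// Hypothetical helper: copies the null-terminated message into the buffer,
// truncating it if it does not fit.
inline void vendor_record_message(const char* message) noexcept {
  char* out = vendor_message_buffer();
  unsigned i = 0;
  for (; i + 1 < 256 && message[i] != '\0'; ++i)
    out[i] = message[i];
  out[i] = '\0';
}

// Invoked by the library with the error message when a hardening assertion
// fails. The handler must not return, so it traps after recording.
#define _LIBCPP_ASSERTION_HANDLER(message) \
  (::vendor_record_message(message), __builtin_trap())
```

The path to such a file would then be passed via ``LIBCXX_ASSERTION_HANDLER_FILE`` when configuring the library, as described in the second step.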
284+
285+
There is no existing mechanism for users to override the assertion handler
286+
because the ability to do the override other than at configure-time carries an
287+
unavoidable code size penalty that would otherwise be imposed on all users,
288+
whether they require such customization or not. Instead, we let vendors decide
289+
what's right on their platform for their users -- a vendor who wishes to provide
290+
this capability is free to do so, e.g. by declaring the assertion handler as an
291+
overridable function.
292+
293+
ABI
294+
===
295+
296+
Setting a hardening mode does **not** affect the ABI. Each mode uses the subset
297+
of checks available in the current ABI configuration which is determined by the
298+
platform.
299+
300+
It is important to stress that whether a particular check is enabled depends on
301+
the combination of the selected hardening mode and the hardening-related ABI
302+
options. Some checks require changing the ABI from the "default" to store
303+
additional information in the library classes -- e.g. checking whether an
304+
iterator is valid upon dereference generally requires storing data about bounds
305+
inside the iterator object. Using ``std::span`` as an example, setting the
306+
hardening mode to ``fast`` will always enable the ``valid-element-access``
307+
checks when accessing elements via a ``std::span`` object, but whether
308+
dereferencing a ``std::span`` iterator does the equivalent check depends on the
309+
ABI configuration.
310+
311+
ABI options
312+
-----------
313+
314+
Vendors can use the following ABI options to enable additional hardening checks:
315+
316+
- ``_LIBCPP_ABI_BOUNDED_ITERATORS`` -- changes the iterator type of select
317+
containers (see below) to a bounded iterator that keeps track of whether it's
318+
within the bounds of the original container and asserts valid bounds on every
319+
dereference.
320+
321+
ABI impact: changes the iterator type of the relevant containers.
322+
323+
Supported containers:
324+
325+
- ``span``;
326+
- ``string_view``.
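
To show why bounded iterators are an ABI matter, here is a deliberately simplified sketch of an iterator that carries its bounds and checks them on every dereference. ``BoundedIter`` is a hypothetical type, not libc++'s actual implementation, and the standard ``assert`` macro stands in for the library's hardening assertion (so the check disappears when ``NDEBUG`` is defined).

```cpp
#include <cassert>

// Hypothetical, simplified bounded iterator over a contiguous sequence.
template <class T>
struct BoundedIter {
  T* current;  // position the iterator refers to
  T* begin;    // start of the original sequence
  T* end;      // one past the last element of the original sequence

  // True when dereferencing would access an existing element.
  bool dereferenceable() const { return begin <= current && current < end; }

  T& operator*() const {
    assert(dereferenceable() && "dereferencing an out-of-bounds iterator");
    return *current;
  }

  BoundedIter& operator++() {
    ++current;
    return *this;
  }
};

// The extra bounds make the iterator larger than a raw pointer, which is why
// switching to it changes the ABI of the affected containers.
static_assert(sizeof(BoundedIter<int>) > sizeof(int*),
              "bounds are stored inside the iterator object");
```

A raw-pointer iterator has no way to perform the same check, because it does not know where its sequence begins or ends.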
327+
328+
ABI tags
329+
--------
330+
331+
We use ABI tags to allow translation units built with different hardening modes
332+
to interact with each other without causing ODR violations. Knowing how
333+
hardening modes are encoded into the ABI tags might be useful when examining
334+
a binary and determine whether it was built with hardening enabled.
335+
336+
.. warning::
337+
We don't commit to the encoding scheme used by the ABI tags being stable
338+
between different releases of libc++. The tags themselves are never stable, by
339+
design -- new releases increase the version number. The following describes
340+
the state of the latest release and is for informational purposes only.
341+
342+
The first character of an ABI tag encodes the hardening mode:
343+
344+
- ``f`` -- [f]ast mode;
345+
- ``s`` -- extensive ("[s]afe") mode;
346+
- ``d`` -- [d]ebug mode;
347+
- ``n`` -- [n]one mode.
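
A small sketch of decoding that first character follows. The ``TagMode`` enum and ``mode_from_abi_tag`` are hypothetical names, the sample tag strings are made up, and only the first character is interpreted because that is all the list above documents. As the warning above says, the encoding is not guaranteed to stay stable between releases.

```cpp
#include <cassert>

// Hypothetical names, for illustration only.
enum class TagMode { None, Fast, Extensive, Debug, Unknown };

// Decodes the hardening mode from the first character of an ABI tag.
// Assumes `tag` is a non-null, null-terminated string.
constexpr TagMode mode_from_abi_tag(const char* tag) {
  switch (tag[0]) {
    case 'f': return TagMode::Fast;
    case 's': return TagMode::Extensive;  // "[s]afe"
    case 'd': return TagMode::Debug;
    case 'n': return TagMode::None;
    default:  return TagMode::Unknown;    // includes the empty string
  }
}

// Made-up tag text after the first character; only the first one matters.
static_assert(mode_from_abi_tag("s-rest-of-tag") == TagMode::Extensive, "");
```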
348+
349+
Hardened containers status
350+
==========================
351+
352+
.. list-table::
353+
:header-rows: 1
354+
:widths: auto
355+
356+
* - Name
357+
- Member functions
358+
- Iterators (ABI-dependent)
359+
* - ``span``
360+
- ✅
361+
- ✅
362+
* - ``string_view``
363+
- ✅
364+
- ✅
365+
* - ``array``
366+
- ✅
367+
- ❌
368+
* - ``vector``
369+
- ✅
370+
- ❌
371+
* - ``string``
372+
- ✅
373+
- ❌
374+
* - ``list``
375+
- ✅
376+
- ❌
377+
* - ``forward_list``
378+
- ❌
379+
- ❌
380+
* - ``deque``
381+
- ✅
382+
- ❌
383+
* - ``map``
384+
- ❌
385+
- ❌
386+
* - ``set``
387+
- ❌
388+
- ❌
389+
* - ``multimap``
390+
- ❌
391+
- ❌
392+
* - ``multiset``
393+
- ❌
394+
- ❌
395+
* - ``unordered_map``
396+
- Partial
397+
- Partial
398+
* - ``unordered_set``
399+
- Partial
400+
- Partial
401+
* - ``unordered_multimap``
402+
- Partial
403+
- Partial
404+
* - ``unordered_multiset``
405+
- Partial
406+
- Partial
407+
* - ``mdspan``
408+
- ✅
409+
- ❌
410+
* - ``optional``
411+
- ✅
412+
- N/A
413+
* - ``function``
414+
- ❌
415+
- N/A
416+
* - ``variant``
417+
- N/A
418+
- N/A
419+
* - ``any``
420+
- N/A
421+
- N/A
422+
* - ``expected``
423+
- ✅
424+
- N/A
425+
* - ``valarray``
426+
- Partial
427+
- N/A
428+
* - ``bitset``
429+
- ❌
430+
- N/A
431+
432+
Testing
433+
=======
434+
435+
Please see :ref:`Testing documentation <testing-hardening-assertions>`.
436+
437+
Further reading
438+
===============
439+
440+
- `Hardening RFC <https://discourse.llvm.org/t/rfc-hardening-in-libc/73925>`_:
441+
contains some of the design rationale.
