Commit 76e5488

Docs on how to make backwards-compatible changes using LLVM bitstream
1 parent b20e8e9 commit 76e5488

2 files changed

+78
-14
lines changed

docs/StableBitcode.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# Making Backwards-Compatible Changes in the LLVM Bitstream Format

Swift uses the [LLVM bitstream][] format for some of its serialization logic. This format was invented as a container for LLVM IR. It is a binary format supporting two basic structures: *blocks,* which define regions of the file, and *records,* which contain data fields that can be up to 64 bits. It has a few nice properties that make it a useful container format for us as well:

- It is easy to skip over an entire block, because the block's length is recorded at its start.

- It is possible to jump to specific offsets *within* a block without having to reparse from the start of the block.

- A format change doesn't immediately invalidate existing bitstream files, because the stream includes layout information for each record.

However, it has some disadvantages as well:

- Each record can only contain one variable-sized entry (either an array or a "blob" of bytes).

- Higher-level features like cross-references or lookup by key have to be built on top of the format, usually in a way that the existing tooling doesn't understand.

You can view the contents of any LLVM bitstream using the `llvm-bcanalyzer` tool's `-dump` option.

[LLVM bitstream]: http://llvm.org/docs/BitCodeFormat.html


## Backwards-compatibility

For a format change to be backwards-compatible, we need the v5 tools to be able to read a file generated by the v6 tools. At a high level, this means that whatever data v6 introduces must not interfere with what v5 is looking for.

(We also care about *forwards*-compatibility, which says that the v6 tools are able to read a file generated by the v5 tools. This is usually easier to maintain, because the v5 format is already known.)

In practice, there are a few ways to accomplish this with LLVM bitstreams:

- If the deserialization logic is set to skip over any blocks it doesn't understand, a new format can always add new blocks.

- If the deserialization logic is set to skip over any *records* it doesn't understand, a new format can always add new *records.* Be careful, though, of records that are expected to appear immediately after another record---if you put a new record between them, you may break the expectations of older compilers.

- If the deserialization logic always looks for a possible blob entry in records (i.e. passing a StringRef out-parameter to BitstreamCursor's `readRecord`), a new format can add blob data to an existing record that does not have it.

- If the deserialization logic always checks for a minimum number of fields in a record before extracting those fields, or if the only field in a record is blob data, a new format can add new fields to an existing record, as long as they come after any existing non-blob fields.

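As a rough illustration of the last rule in the list above, here is a minimal sketch of a reader that checks for a *minimum* number of fields and ignores anything after them. It deliberately does not use the real LLVM API: `Record`, `DECL_INFO`, `DeclInfo`, and `readDeclInfo` are hypothetical names made up for this example, and a `std::vector` stands in for the fields a bitstream reader would hand back.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a decoded record: a record code plus its
// non-blob fields. Real code would get these from the bitstream reader.
struct Record {
  unsigned code;
  std::vector<uint64_t> fields;
};

// Hypothetical record code, for illustration only.
constexpr unsigned DECL_INFO = 1;

struct DeclInfo {
  uint64_t nameID;
  uint64_t flags;
};

// Reads a DECL_INFO record. Only the first two fields are required, so a
// newer format may append more fields without breaking this reader.
std::optional<DeclInfo> readDeclInfo(const Record &rec) {
  if (rec.code != DECL_INFO)
    return std::nullopt; // Unknown record: the caller skips it.
  if (rec.fields.size() < 2)
    return std::nullopt; // Fewer fields than this reader needs.
  return DeclInfo{rec.fields[0], rec.fields[1]};
}
```

With this shape, a record carrying three fields still reads as `{nameID, flags}`; the third field is simply never looked at.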

Note that the BCRecordLayout DSL expects the number of fields to **match exactly**. If you want to use BCRecordLayout's `readRecord` method, the deserialization logic will have to check that the deserialized data has the correct number of fields ahead of time. If it has more fields, you can make an ArrayRef that slices off the extra ones; if it has fewer, you're reading from an old format and will need to use a different BCRecordLayout, or just read them manually.

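The "slice off the extra fields" idea might look something like the sketch below. Everything here is an assumption for illustration: `FieldsRef` is a tiny stand-in for `llvm::ArrayRef<uint64_t>`, `strictRead` only models a reader that insists on an exact field count (it is not BCRecordLayout's real interface), and `Extent` with its three fields is invented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Minimal non-owning view over record fields (stand-in for llvm::ArrayRef).
struct FieldsRef {
  const uint64_t *data;
  size_t size;

  // Returns a view of the first n fields, dropping the rest.
  FieldsRef takeFront(size_t n) const {
    assert(n <= size && "cannot take more fields than exist");
    return {data, n};
  }
};

// Hypothetical three-field record payload.
struct Extent {
  uint64_t kindID;
  uint64_t offset;
  uint64_t length;
};

// Models a reader that requires exactly three fields.
std::optional<Extent> strictRead(FieldsRef f) {
  if (f.size != 3)
    return std::nullopt;
  return Extent{f.data[0], f.data[1], f.data[2]};
}

// Checks the count up front, then slices off any fields added by newer
// formats before handing the data to the strict reader.
std::optional<Extent> tolerantRead(FieldsRef f) {
  if (f.size < 3)
    return std::nullopt; // Old format: needs a different layout or manual read.
  return strictRead(f.takeFront(3));
}
```

The point of keeping `strictRead` separate is that the exact-count check stays in one place, while the compatibility policy lives in the wrapper.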

(We could also add more API to BCRecordLayout to make this easier. It's part of LLVM, but it's a part of LLVM originally contributed by Swift folks.)

Note also that it's still okay to use BCRecordLayout for *serialization.* It's only deserialization where we have to be careful about multiple formats.

Remember that any new data will be *ignored* by the old tools. If it's something that *should* affect how old tools read the file, it must be encoded in an existing field; if that's impossible, you have a backwards-incompatible change and should bump the major version number of the file.

If the existing deserialization logic is already checking for the exact size of a record (and therefore preventing new fields from being added), one trick is to put a second record after the first, and check for its presence in the new version of the tools. As long as the old logic is set up to skip unknown records, this shouldn't cause any problems.
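
A minimal sketch of the "second record" trick, under the assumption that records arrive as a flat sequence: the old reader demands exactly two fields in `COMMENT` and skips everything else, while the new reader additionally picks up a `COMMENT_EXTRA` record when it directly follows a `COMMENT`. All of the names here (`Record`, `Comment`, `COMMENT`, `COMMENT_EXTRA`, `readOld`, `readNew`) are hypothetical and not taken from the Swift sources.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical decoded record: a code plus its non-blob fields.
struct Record {
  unsigned code;
  std::vector<uint64_t> fields;
};

constexpr unsigned COMMENT = 1;       // Existing record, exact size checked.
constexpr unsigned COMMENT_EXTRA = 2; // New companion record.

struct Comment {
  uint64_t id;
  uint64_t length;
  std::optional<uint64_t> extra; // Only filled in by the new reader.
};

// Old tools: exact-size check on COMMENT, unknown records are skipped.
std::vector<Comment> readOld(const std::vector<Record> &stream) {
  std::vector<Comment> out;
  for (const Record &r : stream) {
    if (r.code == COMMENT && r.fields.size() == 2)
      out.push_back(Comment{r.fields[0], r.fields[1], std::nullopt});
  }
  return out;
}

// New tools: same COMMENT handling, plus an optional follow-up record.
std::vector<Comment> readNew(const std::vector<Record> &stream) {
  std::vector<Comment> out;
  bool previousWasComment = false;
  for (const Record &r : stream) {
    if (r.code == COMMENT && r.fields.size() == 2) {
      out.push_back(Comment{r.fields[0], r.fields[1], std::nullopt});
      previousWasComment = true;
      continue;
    }
    if (r.code == COMMENT_EXTRA && previousWasComment && !r.fields.empty()) {
      assert(!out.empty());
      out.back().extra = r.fields[0];
    }
    previousWasComment = false;
  }
  return out;
}
```

Both readers produce the same number of comments for a stream containing the new record; only `readNew` sees the extra value, which is the behavior the trick relies on.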

lib/Serialization/DocFormat.h

Lines changed: 32 additions & 14 deletions
@@ -35,29 +35,47 @@ const unsigned char SWIFTDOC_SIGNATURE[] = { 0xE2, 0x9C, 0xA8, 0x07 };
 
 /// Serialized swiftdoc format major version number.
 ///
-/// Increment this value when making a backwards-incompatible change, which
-/// should be rare. When incrementing this value, reset SWIFTDOC_VERSION_MINOR
-/// to 0.
+/// Increment this value when making a backwards-incompatible change, i.e. where
+/// an \e old compiler will \e not be able to read the new format. This should
+/// be rare. When incrementing this value, reset SWIFTDOC_VERSION_MINOR to 0.
+///
+/// See docs/StableBitcode.md for information on how to make
+/// backwards-compatible changes using the LLVM bitcode format.
 const uint16_t SWIFTDOC_VERSION_MAJOR = 1;
 
 /// Serialized swiftdoc format minor version number.
 ///
-/// Increment this value when making a backwards-compatible change that might
-/// be interesting to test for. However, if old swiftdoc files are fully
-/// compatible with the new change, you do not need to increment this.
+/// Increment this value when making a backwards-compatible change that might be
+/// interesting to test for. A backwards-compatible change is one where an \e
+/// old compiler can read the new format without any problems (usually by
+/// ignoring new information).
+///
+/// If the \e new compiler can treat the new and old format identically, or if
+/// the presence of a new record, block, or field is sufficient to indicate that
+/// the swiftdoc file is using a new format, it is okay not to increment this
+/// value. However, it may be interesting for a new compiler to treat the \e
+/// absence of information differently for the old and new formats; in this
+/// case, the difference in minor version number can distinguish the two.
+///
+/// The minor version number does not need to be changed simply to track which
+/// compiler generated a swiftdoc file; the full compiler version is already
+/// stored as text and can be checked by running the \c strings command-line
+/// tool on a swiftdoc file.
 ///
-/// To ensure that two separate changes don't silently get merged into one
-/// in source control, you should also update the comment to briefly
-/// describe what change you made. The content of this comment isn't important;
-/// it just ensures a conflict if two people change the module format.
-/// Don't worry about adhering to the 80-column limit for this line.
+/// To ensure that two separate changes don't silently get merged into one in
+/// source control, you should also update the comment to briefly describe what
+/// change you made. The content of this comment isn't important; it just
+/// ensures a conflict if two people change the module format. Don't worry about
+/// adhering to the 80-column limit for this line.
 const uint16_t SWIFTDOC_VERSION_MINOR = 1; // Last change: skipping 0 for testing purposes
 
 /// The record types within the comment block.
 ///
-/// Be very careful when changing this block; it must remain stable. Adding new
-/// records is okay---they will be ignored---but modifying existing ones must be
-/// done carefully. You may need to update the version when you do so.
+/// Be very careful when changing this block; it must remain
+/// backwards-compatible. Adding new records is okay---they will be ignored---
+/// but modifying existing ones must be done carefully. You may need to update
+/// the version when you do so. See docs/StableBitcode.md for information on how
+/// to make backwards-compatible changes using the LLVM bitcode format.
 ///
 /// \sa COMMENT_BLOCK_ID
 namespace comment_block {
