Commit ce68d74

Merge pull request #73276 from Snowy1803/doc-and-fixes
[DebugInfo] Improve documentation & Fix discovered bugs
2 parents f7c9966 + 16c57ae commit ce68d74

File tree

10 files changed

+343
-36
lines changed

docs/HowToUpdateDebugInfo.md

Lines changed: 241 additions & 26 deletions
@@ -1,50 +1,265 @@
1-
## How to update Debug Info in the Swift Compiler
1+
# How to update Debug Info in the Swift Compiler
22

3-
### Introduction
3+
## Introduction
44

55
This document describes how debug info works at the SIL level and how to
66
correctly update debug info in SIL optimization passes. This document is
77
inspired by its LLVM analog, [How to Update Debug Info: A Guide for LLVM Pass
88
Authors](https://llvm.org/docs/HowToUpdateDebugInfo.html), which is recommended
99
reading, since all of the concepts discussed there also apply to SIL.
1010

11-
### Source Locations
11+
## Source Locations
1212

1313
Contrary to LLVM IR, SIL makes source locations and lexical scopes mandatory on
1414
all instructions. SIL transformations should follow the LLVM guide for when to
1515
merge, drop, and copy locations, since all the same considerations apply. Helpers
1616
like `SILBuilderWithScope` make it easy to copy source locations when expanding
1717
SIL instructions.
1818

19-
### Variables, Variable Locations
19+
## Variables
20+
21+
Each `debug_value` (and variable-carrying instruction) defines an update point
22+
for the location of (part of) that source variable. A variable location is an
23+
SSA value, modified by a debug expression that can transform that value,
24+
yielding the value of that variable. Optimizations like SROA may split a source
25+
variable into multiple smaller fragments; other optimizations such as Mem2Reg
26+
may split a debug value describing an address into multiple debug values
27+
describing different SSA values. Each variable (fragment) location is valid
28+
until the end of the current basic block, or until another `debug_value`
29+
describes another location for a variable fragment for the same unique variable
30+
that overlaps with that (fragment of the) variable.
31+
32+
### Debug variable-carrying instructions
2033

2134
Source variables are represented by `debug_value` instructions, and may also be
22-
described in variable-carrying instructions (`alloc_stack`, `alloc_box`). There
23-
is no semantic difference between describing a variable in an allocation
35+
described in debug variable-carrying instructions (`alloc_stack`, `alloc_box`).
36+
There is no semantic difference between describing a variable in an allocation
2437
instruction directly or describing it in a `debug_value` following the
25-
allocation instruction. Variables are uniquely identified via their lexical
26-
scope, which also includes inline information, and their name and binding kind.
38+
allocation instruction.
2739

28-
Each `debug_value` (and variable-carrying instruction) defines an update point
29-
for the location of (part of) that source variable. A variable location is an
30-
SSA value or constant, modified by a debug expression that can transform that
31-
value, yielding the value of that variable. The debug expressions get lowered
32-
into LLVM [DIExpressions](https://llvm.org/docs/LangRef.html#diexpression) which
33-
get lowered into [DWARF](https://dwarfstd.org) expressions. Optimizations like
34-
SROA may split a source variable into multiple smaller fragments. An
35-
`op_fragment` is used to denote a location of a partial variable. Each variable
36-
(fragment) location is valid until the end of the current basic block, or until
37-
another `debug_value` describes another location for a variable fragment for the
38-
same unique variable that overlaps with that (fragment of the) variable.
39-
Variables may be undefined, in which case the SSA value is `undef`.
40-
41-
### Rules of thumb
40+
The following two forms are equivalent and should be optimized similarly:
41+
```
42+
%0 = alloc_stack $T, var, name "value", loc "a.swift":4:2, scope 1
43+
// equivalent to:
44+
%0 = alloc_stack $T, loc "a.swift":4:2, scope 1
45+
debug_value %0 : $*T, var, name "value", expr op_deref, loc "a.swift":4:2, scope 1
46+
```
47+
48+
> [!Note]
49+
> In the future, we may want to remove the debug variable from the `alloc_stack`
50+
> to only use the second form, in order to simplify SIL. Additionally, we could
51+
> then move the `debug_value` instruction to the point where the variable is
52+
> initialized to avoid showing uninitialized memory in the debugger. This would
53+
> be a change in SILGen, which should not affect the optimizer.
54+
55+
For now, the `DebugVarCarryingInst` type can be used to handle both cases.
56+
57+
### Variable identity, location and scope
58+
59+
Variables are uniquely identified via their debug scope, their location, and
60+
their name.
61+
62+
The debug scope is the range in which the variable is declared and available.
63+
More information about debug scopes is available on
64+
[the Swift blog](https://www.swift.org/blog/whats-new-swift-debugging-5.9/#fine-grained-scope-information).
65+
For arguments, this will be the function's scope; otherwise, this will be a
66+
subscope within a function. When a function is inlined, a new scope is created,
67+
including information about the inlined function, and in which function it was
68+
inlined (inlined_at).
69+
70+
The location of the variable is the source location where the variable was
71+
declared.
72+
73+
If the location and scope of a debug variable aren't set, it will use the scope
74+
and location of the instruction, which is correct in most cases. However, if a
75+
`debug_value` describes a modification of a variable, the instruction should
76+
have the location of the update point, and the variable must keep the location
77+
of the variable declaration:
78+
79+
```
80+
%0 = integer_literal $Int, 2
81+
debug_value %0 : $Int, var, name "a", loc "a.swift":2:5, scope 2
82+
%2 = integer_literal $Int, 3
83+
debug_value %2 : $Int, var, (name "a", loc "a.swift":2:5, scope 2), loc "a.swift":3:3, scope 2
84+
```
85+
For this code:
86+
```swift
87+
var a = 2
88+
a = 3
89+
```
90+
91+
### Variable types
92+
93+
By default the type of the variable will be the object type of the SSA value.
94+
If this is not the correct type, a type must be attached to the debug variable
95+
to override it.
96+
97+
Example:
98+
99+
```
100+
debug_value %0 : $*T, let, name "address", type $UnsafeRawPointer
101+
```
102+
103+
The variable will usually have an associated expression yielding the correct
104+
type.
105+
106+
### Variable expressions
107+
108+
A variable can have an associated expression if the value needs computation.
109+
This can be for dereferencing a pointer, arithmetic, or for splitting structs.
110+
An expression is a sequence of operations to be executed left to right. Debug
111+
expressions get lowered into LLVM
112+
[DIExpressions](https://llvm.org/docs/LangRef.html#diexpression) which get
113+
lowered into [DWARF](https://dwarfstd.org) expressions.
114+
115+
#### Address types and op_deref
116+
117+
A variable's expression may include an `op_deref`, usually at the beginning, in
118+
which case the SSA value is a pointer that must be dereferenced to access the
119+
value of the variable.
120+
121+
In this example, the value returned by the `alloc_stack` is an address that must
122+
be dereferenced.
123+
```
124+
%0 = alloc_stack $T
125+
debug_value %0 : $*T, var, name "value", expr op_deref
126+
```
127+
128+
SILGen can use `SILBuilder::createDebugValue` and
129+
`SILBuilder::createDebugValueAddr` to create debug values, respectively without
130+
and with an op_deref, or use `SILBuilder::emitDebugDescription` which will
131+
automatically choose the correct one depending on the type of the SSA value. As
132+
there are no pointers in Swift, this should always do the right thing.
133+
134+
> [!Warning]
135+
> At the optimizer level, Swift `Unsafe*Pointer` types can be simplified
136+
> to address types. As such, a `debug_value` with an address type without an
137+
> `op_deref` can be valid. SIL passes must not assume that `op_deref` and
138+
> address types correlate.
139+
140+
Even if `op_deref` is usually at the beginning, it doesn't have to be:
141+
```
142+
debug_value %0 : $*UInt8, let, name "hello", expr op_constu:3:op_plus:op_deref
143+
```
144+
This will add `3` to the pointer contained in `%0`, then dereference the result.
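
One way to read such an expression is as a small stack machine: the SSA value is pushed first, then each operator runs left to right. The C++ sketch below is a toy model of that reading, not code from the Swift compiler. `Op`, `Element`, `Memory`, and `evaluate` are hypothetical names introduced for illustration, and memory is modeled as a plain map from address to value.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Toy operators named after the SIL operators op_constu, op_plus,
// op_minus, and op_deref. Hypothetical; not the compiler's types.
enum class Op { ConstU, Plus, Minus, Deref };

struct Element {
  Op op;
  uint64_t operand; // only meaningful for Op::ConstU
  Element(Op op, uint64_t operand = 0) : op(op), operand(operand) {}
};

// Hypothetical flat memory: address -> value stored at that address.
using Memory = std::map<uint64_t, uint64_t>;

// Evaluates `expr` left to right, starting with the SSA value on the stack.
uint64_t evaluate(uint64_t ssaValue, const std::vector<Element> &expr,
                  const Memory &memory) {
  std::vector<uint64_t> stack{ssaValue};
  for (const Element &e : expr) {
    switch (e.op) {
    case Op::ConstU:
      // Push a constant onto the stack.
      stack.push_back(e.operand);
      break;
    case Op::Plus:
    case Op::Minus: {
      // Pop two values, push their sum or difference.
      assert(stack.size() >= 2 && "binary operator needs two operands");
      uint64_t rhs = stack.back();
      stack.pop_back();
      uint64_t lhs = stack.back();
      stack.pop_back();
      stack.push_back(e.op == Op::Plus ? lhs + rhs : lhs - rhs);
      break;
    }
    case Op::Deref: {
      // Pop an address, push the value stored there.
      assert(!stack.empty() && "op_deref needs an address");
      uint64_t address = stack.back();
      stack.pop_back();
      stack.push_back(memory.at(address));
      break;
    }
    }
  }
  assert(stack.size() == 1 && "expression must leave exactly one value");
  return stack.back();
}
```

In this model, with `42` stored at address `0x1003`, evaluating `op_constu:3:op_plus:op_deref` on the value `0x1000` yields `42`: the constant is added to the pointer before the dereference.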
145+
146+
#### Fragments
147+
148+
If a variable is partially updated, a fragment can be used to specify that this
149+
update refers to an element of an aggregate type.
150+
151+
> [!Tip]
152+
> When using fragments, always specify the type of the variable, as it will be
153+
> different from the SSA value.
154+
155+
When SROA is splitting a struct or tuple, it will also split the debug values,
156+
and add a fragment to specify which field is being updated.
157+
158+
```
159+
struct Pair { var a, b: Int }
160+
161+
alloc_stack $Pair, var, name "pair"
162+
// -->
163+
alloc_stack $Int, var, name "pair", type $Pair, expr op_fragment:#Pair.a
164+
alloc_stack $Int, var, name "pair", type $Pair, expr op_fragment:#Pair.b
165+
// -->
166+
alloc_stack $Builtin.Int64, var, name "pair", type $Pair, expr op_fragment:#Pair.a:op_fragment:#Int._value
167+
alloc_stack $Builtin.Int64, var, name "pair", type $Pair, expr op_fragment:#Pair.b:op_fragment:#Int._value
168+
```
169+
170+
Here, Pair is a struct containing two Ints, so each `alloc_stack` will receive a
171+
fragment with the field it is describing. Int, in Swift, is itself a struct
172+
containing one Builtin.Int64 (on 64-bit systems), so it can itself be SROA'ed.
173+
Fragments can be chained to describe this.
174+
175+
Tuple fragments use a different syntax, but work similarly:
176+
177+
```
178+
alloc_stack $(Int, Int), var, name "pair"
179+
// -->
180+
alloc_stack $Int, var, name "pair", type $(Int, Int), expr op_tuple_fragment:$(Int, Int):0
181+
alloc_stack $Int, var, name "pair", type $(Int, Int), expr op_tuple_fragment:$(Int, Int):1
182+
// -->
183+
alloc_stack $Builtin.Int64, var, name "pair", type $(Int, Int), expr op_tuple_fragment:$(Int, Int):0:op_fragment:#Int._value
184+
alloc_stack $Builtin.Int64, var, name "pair", type $(Int, Int), expr op_tuple_fragment:$(Int, Int):1:op_fragment:#Int._value
185+
```
186+
187+
Tuple fragments and struct fragments can be mixed freely; however, they must all
188+
be at the end of the expression. That is because the fragment operator can be
189+
seen as returning a struct containing a single element, with the rest undefined,
190+
and, apart from fragments, no debug expression operator takes a struct as input.
191+
192+
> [!Note]
193+
> When multiple fragments are present, they are evaluated in reverse order:
194+
> from the field within the variable first, to the SSA value's type at the end.
195+
196+
#### Arithmetic
197+
198+
An expression can add or subtract a constant offset to a value. To do so, an
199+
`op_constu` or `op_consts` can be used to push a constant integer to the stack,
200+
respectively unsigned and signed. Then, the `op_plus` and `op_minus` operators
201+
can be used to sum or subtract the two values on the stack.
202+
203+
```
204+
debug_value %0 : $Builtin.Int64, var, name "previous", type $Int, expr op_consts:1:op_minus:op_fragment:#Int._value
205+
debug_value %0 : $Builtin.Int64, var, name "next", type $Int, expr op_consts:1:op_plus:op_fragment:#Int._value
206+
```
207+
208+
> [!Caution]
209+
> This currently doesn't work if a fragment is present.
210+
211+
#### Constants
212+
213+
If a `debug_value` is describing a constant, such as in `let x = 1`, and the
214+
value is optimized out, we can keep it, using a constant expression, and no SSA
215+
value.
216+
217+
```
218+
debug_value undef : $Int, let, name "x", expr op_consts:1:op_fragment:#Int._value
219+
```
220+
221+
### Undef variables
222+
223+
If the value of the variable cannot be recovered as the value is entirely
224+
optimized away, an undef debug value should still be kept:
225+
226+
```
227+
debug_value undef : $Int, let, name "x"
228+
```
229+
230+
Additionally, if a previous `debug_value` exists for the variable, a debug value
231+
of undef invalidates the previous value, in case the value of the variable isn't
232+
known anymore:
233+
234+
```
235+
debug_value %0 : $Int, var, name "x" // var x = a
236+
...
237+
debug_value undef : $Int, var, name "x" // x = <optimized out>
238+
```
239+
240+
Combined with fragments, some parts of the variable can be undefined and some
241+
not:
242+
243+
```
244+
... // pair = ?
245+
debug_value %0 : $Int, var, name "pair", type $Pair, expr op_fragment:#Pair.a // pair.a = x
246+
debug_value %0 : $Int, var, name "pair", type $Pair, expr op_fragment:#Pair.b // pair.b = x
247+
... // pair = (x, x)
248+
debug_value undef : $Pair, var, name "pair", expr op_fragment:#Pair.a // pair.a = <optimized out>
249+
... // pair = (?, x)
250+
debug_value undef : $Pair, var, name "pair" // pair = <optimized out>
251+
... // pair = ?
252+
debug_value %1 : $Int, var, name "pair", type $Pair, expr op_fragment:#Pair.a // pair.a = y
253+
... // pair = (y, ?)
254+
```
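
The sequence above can be replayed with a small bookkeeping model of what a debugger knows about each field. This is a hedged C++ sketch, not the compiler's or LLDB's data structure: `FragmentState` and its methods are hypothetical, values are plain strings, and `"?"` stands for "optimized out or unknown". It also assumes fragments name whole fields and never partially overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model: tracks the last known value of each field of one variable.
class FragmentState {
public:
  explicit FragmentState(const std::vector<std::string> &fieldNames)
      : fields(fieldNames) {
    invalidateAll(); // nothing is known before the first debug_value
  }

  // Models `debug_value %v, ..., expr op_fragment:#T.field`.
  void updateFragment(const std::string &field, const std::string &value) {
    assert(values.count(field) && "unknown field");
    values[field] = value;
  }

  // Models `debug_value undef, ..., expr op_fragment:#T.field`.
  void invalidateFragment(const std::string &field) {
    updateFragment(field, "?");
  }

  // Models `debug_value undef` on the whole variable, which overlaps
  // every fragment and therefore invalidates all of them.
  void invalidateAll() {
    for (const std::string &field : fields)
      values[field] = "?";
  }

  // Renders the state as "(a, b, ...)" in field declaration order.
  std::string describe() const {
    std::string result = "(";
    for (std::size_t i = 0; i < fields.size(); ++i) {
      if (i != 0)
        result += ", ";
      result += values.at(fields[i]);
    }
    return result + ")";
  }

private:
  std::vector<std::string> fields;
  std::map<std::string, std::string> values;
};
```

Replaying the example on a state with fields `a` and `b` gives `(x, x)` after the two fragment updates, `(?, x)` after the undef on `pair.a`, and `(y, ?)` after the whole-variable undef followed by the update of `pair.a`.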
255+
256+
## Rules of thumb
42257
- Optimization passes may never drop a variable entirely. If a variable is
43258
entirely optimized away, an `undef` debug value should still be kept.
44259
- A `debug_value` must always describe a correct value for that source variable
45260
at that source location. If a value is only correct on some paths through that
46261
instruction, it must be replaced with `undef`. Debug info never speculates.
47-
- When a SIL instruction referenced by a `debug_value` is (really, any
48-
instruction) deleted, call salvageDebugInfo(). It will try to capture the
49-
effect of the deleted instruction in a debug expression, so the location can
50-
be preserved.
262+
- When a SIL instruction is deleted, call salvageDebugInfo(). It will try to
263+
capture the effect of the deleted instruction in a debug expression, so the
264+
location can be preserved. You can also use an `InstructionDeleter` which will
265+
automatically call `salvageDebugInfo`.

lib/IRGen/IRGenDebugInfo.cpp

Lines changed: 13 additions & 0 deletions
@@ -3138,6 +3138,9 @@ bool IRGenDebugInfoImpl::buildDebugInfoExpression(
31383138
return false;
31393139
}
31403140
}
3141+
if (Operands.size() && Operands.back() != llvm::dwarf::DW_OP_deref) {
3142+
Operands.push_back(llvm::dwarf::DW_OP_stack_value);
3143+
}
31413144
return true;
31423145
}
31433146

@@ -3429,6 +3432,16 @@ void IRGenDebugInfoImpl::emitDbgIntrinsic(
34293432
// /always/ emit an llvm.dbg.value of undef.
34303433
// If we have undef, always emit a llvm.dbg.value in the current position.
34313434
if (isa<llvm::UndefValue>(Storage)) {
3435+
if (Expr->getNumElements() &&
3436+
(Expr->getElement(0) == llvm::dwarf::DW_OP_consts
3437+
|| Expr->getElement(0) == llvm::dwarf::DW_OP_constu)) {
3438+
/// Convert `undef, expr op_consts:N:...` to `N, expr ...`
3439+
Storage = llvm::ConstantInt::get(
3440+
llvm::IntegerType::getInt64Ty(Builder.getContext()),
3441+
Expr->getElement(1));
3442+
Expr = llvm::DIExpression::get(Builder.getContext(),
3443+
Expr->getElements().drop_front(2));
3444+
}
34323445
DBuilder.insertDbgValueIntrinsic(Storage, Var, Expr, DL, ParentBlock);
34333446
return;
34343447
}
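
The second hunk above rewrites `undef, expr op_consts:N:...` into `N, expr ...`, so the constant becomes the storage of the debug value. The following is a minimal C++ sketch of that rewrite on plain vectors, assuming nothing about LLVM's classes: `DebugLocation` and `foldLeadingConstant` are hypothetical names, and only the two opcode values come from the DWARF standard. Unlike the hunk, the sketch also checks that the constant operand is present before reading it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Standard DWARF opcode encodings.
constexpr uint64_t DW_OP_constu = 0x10;
constexpr uint64_t DW_OP_consts = 0x11;

// Hypothetical stand-in for a (storage, expression) pair.
struct DebugLocation {
  bool isUndef;               // true when the storage is undef
  uint64_t storage;           // constant storage, valid when !isUndef
  std::vector<uint64_t> expr; // flat list of opcodes and operands
};

// If the storage is undef and the expression starts with a constant,
// move that constant into the storage and drop it from the expression.
DebugLocation foldLeadingConstant(DebugLocation loc) {
  if (loc.isUndef && loc.expr.size() >= 2 &&
      (loc.expr[0] == DW_OP_consts || loc.expr[0] == DW_OP_constu)) {
    loc.storage = loc.expr[1];
    loc.isUndef = false;
    loc.expr.erase(loc.expr.begin(), loc.expr.begin() + 2);
  }
  return loc;
}
```

Any operators that follow the constant are left in place, which matches the `drop_front(2)` in the hunk; a location whose expression does not start with a constant is returned unchanged.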

lib/IRGen/IRGenSIL.cpp

Lines changed: 2 additions & 3 deletions
@@ -5617,11 +5617,10 @@ void IRGenSILFunction::visitDebugValueInst(DebugValueInst *i) {
56175617

56185618
auto VarInfo = i->getVarInfo();
56195619
assert(VarInfo && "debug_value without debug info");
5620-
if (isa<SILUndef>(SILVal)) {
5620+
if (isa<SILUndef>(SILVal) && VarInfo->Name == "$error") {
56215621
// We cannot track the location of inlined error arguments because it has no
56225622
// representation in SIL.
5623-
if (!IsAddrVal &&
5624-
!i->getDebugScope()->InlinedCallSite && VarInfo->Name == "$error") {
5623+
if (!IsAddrVal && !i->getDebugScope()->InlinedCallSite) {
56255624
auto funcTy = CurSILFn->getLoweredFunctionType();
56265625
emitErrorResultVar(funcTy, funcTy->getErrorResult(), i);
56275626
}

lib/SILOptimizer/Transforms/DeadCodeElimination.cpp

Lines changed: 3 additions & 1 deletion
@@ -77,8 +77,10 @@ static bool seemsUseful(SILInstruction *I) {
7777
}
7878

7979
// Is useful if it's associating with a function argument
80+
// If undef, it is useful and it doesn't cost anything.
8081
if (isa<DebugValueInst>(I))
81-
return isa<SILFunctionArgument>(I->getOperand(0));
82+
return isa<SILFunctionArgument>(I->getOperand(0))
83+
|| isa<SILUndef>(I->getOperand(0));
8284

8385
return false;
8486
}

lib/SILOptimizer/Utils/InstOptUtils.cpp

Lines changed: 19 additions & 0 deletions
@@ -1964,6 +1964,25 @@ void swift::salvageDebugInfo(SILInstruction *I) {
19641964
}
19651965
}
19661966
}
1967+
1968+
if (auto *IL = dyn_cast<IntegerLiteralInst>(I)) {
1969+
APInt value = IL->getValue();
1970+
const SILDIExprElement ExprElements[2] = {
1971+
SILDIExprElement::createOperator(value.isNegative() ?
1972+
SILDIExprOperator::ConstSInt : SILDIExprOperator::ConstUInt),
1973+
SILDIExprElement::createConstInt(value.getLimitedValue()),
1974+
};
1975+
for (Operand *U : getDebugUses(IL)) {
1976+
auto *DbgInst = cast<DebugValueInst>(U->getUser());
1977+
auto VarInfo = DbgInst->getVarInfo();
1978+
if (!VarInfo)
1979+
continue;
1980+
VarInfo->DIExpr.prependElements(ExprElements);
1981+
// Create a new debug_value, with undef, and the correct const int
1982+
SILBuilder(DbgInst, DbgInst->getDebugScope())
1983+
.createDebugValue(DbgInst->getLoc(), SILUndef::get(IL), *VarInfo);
1984+
}
1985+
}
19671986
}
19681987

19691988
void swift::salvageLoadDebugInfo(LoadOperation load) {
