Commit ed83e3e

---
yaml --- r: 216357 b: refs/heads/stable c: 5993ae8 h: refs/heads/master i: 216355: 2043aed v: v3
1 parent 67daec1 commit ed83e3e

3 files changed

+203
-190
lines changed

[refs]

Lines changed: 1 addition & 1 deletion
@@ -29,5 +29,5 @@ refs/heads/tmp: 378a370ff2057afeb1eae86eb6e78c476866a4a6
2929
refs/tags/1.0.0-alpha.2: 4c705f6bc559886632d3871b04f58aab093bfa2f
3030
refs/tags/homu-tmp: a5286998df566e736b32f6795bfc3803bdaf453d
3131
refs/tags/1.0.0-beta: 8cbb92b53468ee2b0c2d3eeb8567005953d40828
32-
refs/heads/stable: 39e2e649cb0ef3da750d296af07d4cea6aadf51f
32+
refs/heads/stable: 5993ae86b85173d18fbc0cd620f61c011a8a7b03
3333
refs/tags/1.0.0: 55bd4f8ff2b323f317ae89e254ce87162d52a375
Lines changed: 189 additions & 0 deletions
@@ -0,0 +1,189 @@
1+
// Copyright 2015 The Rust Project Developers. See the COPYRIGHT
2+
// file at the top-level directory of this distribution and at
3+
// http://rust-lang.org/COPYRIGHT.
4+
//
5+
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
6+
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
7+
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
8+
// option. This file may not be copied, modified, or distributed
9+
// except according to those terms.
10+
11+
//! # Debug Info Module
12+
//!
13+
//! This module serves the purpose of generating debug symbols. We use LLVM's
14+
//! [source level debugging](http://llvm.org/docs/SourceLevelDebugging.html)
15+
//! features for generating the debug information. The general principle is
16+
//! this:
17+
//!
18+
//! Given the right metadata in the LLVM IR, the LLVM code generator is able to
19+
//! create DWARF debug symbols for the given code. The
20+
//! [metadata](http://llvm.org/docs/LangRef.html#metadata-type) is structured
21+
//! much like DWARF *debugging information entries* (DIE), representing type
22+
//! information such as datatype layout, function signatures, block layout,
23+
//! variable location and scope information, etc. It is the purpose of this
24+
//! module to generate correct metadata and insert it into the LLVM IR.
25+
//!
26+
//! As the exact format of metadata trees may change between different LLVM
27+
//! versions, we now use LLVM
28+
//! [DIBuilder](http://llvm.org/docs/doxygen/html/classllvm_1_1DIBuilder.html)
29+
//! to create metadata where possible. This will hopefully ease the adaptation of
30+
//! this module to future LLVM versions.
31+
//!
32+
//! The public API of the module is a set of functions that will insert the
33+
//! correct metadata into the LLVM IR when called with the right parameters.
34+
//! The module is thus driven from an outside client with functions like
35+
//! `debuginfo::create_local_var_metadata(bcx: block, local: &ast::local)`.
36+
//!
37+
//! Internally the module will try to reuse already created metadata by
38+
//! utilizing a cache. The way to get a shared metadata node when needed is
39+
//! thus to just call the corresponding function in this module:
40+
//!
41+
//!     let file_metadata = file_metadata(crate_context, path);
42+
//!
43+
//! The function will take care of probing the cache for an existing node for
44+
//! that exact file path.
45+
//!
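The get-or-create pattern described above can be sketched in isolation. This is a minimal illustration only: `MetadataId`, `DebugContext` and `nodes_created` are hypothetical stand-ins, not the compiler's actual types, and the real function takes a crate context rather than being a method.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a handle to an LLVM metadata node.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MetadataId(pub usize);

/// Hypothetical, heavily simplified context that owns the cache.
#[derive(Default)]
pub struct DebugContext {
    file_cache: HashMap<String, MetadataId>,
    nodes_created: usize,
}

impl DebugContext {
    /// Returns the node for `path`, creating it only on a cache miss.
    pub fn file_metadata(&mut self, path: &str) -> MetadataId {
        if let Some(&id) = self.file_cache.get(path) {
            return id; // cache hit: reuse the shared node
        }
        // Cache miss: "create" a node and remember it for later callers.
        let id = MetadataId(self.nodes_created);
        self.nodes_created += 1;
        self.file_cache.insert(path.to_string(), id);
        id
    }

    /// Number of nodes created so far (for illustration only).
    pub fn nodes_created(&self) -> usize {
        self.nodes_created
    }
}
```

Callers never consult the cache themselves; asking twice for the same path simply yields the same node.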
46+
//! All private state used by the module is stored within either the
47+
//! CrateDebugContext struct (owned by the CrateContext) or the
48+
//! FunctionDebugContext (owned by the FunctionContext).
49+
//!
50+
//! This file consists of three conceptual sections:
51+
//! 1. The public interface of the module
52+
//! 2. Module-internal metadata creation functions
53+
//! 3. Minor utility functions
54+
//!
55+
//!
56+
//! ## Recursive Types
57+
//!
58+
//! Some kinds of types, such as structs and enums, can be recursive. That means
59+
//! that the type definition of some type X refers to some other type which in
60+
//! turn (transitively) refers to X. This introduces cycles into the type
61+
//! referral graph. A naive algorithm doing an on-demand, depth-first traversal
62+
//! of this graph when describing types can get trapped in an endless loop
63+
//! when it reaches such a cycle.
64+
//!
65+
//! For example, the following simple type for a singly-linked list...
66+
//!
67+
//! ```
68+
//! struct List {
69+
//!     value: int,
70+
//!     tail: Option<Box<List>>,
71+
//! }
72+
//! ```
73+
//!
74+
//! will generate the following callstack with a naive DFS algorithm:
75+
//!
76+
//! ```
77+
//! describe(t = List)
78+
//!   describe(t = int)
79+
//!   describe(t = Option<Box<List>>)
80+
//!     describe(t = Box<List>)
81+
//!       describe(t = List) // at the beginning again...
82+
//!       ...
83+
//! ```
84+
//!
85+
//! To break cycles like these, we use "forward declarations". That is, when
86+
//! the algorithm encounters a possibly recursive type (any struct or enum), it
87+
//! immediately creates a type description node and inserts it into the cache
88+
//! *before* describing the members of the type. This type description is just
89+
//! a stub (as type members are not described and added to it yet) but it
90+
//! allows the algorithm to already refer to the type. After the stub is
91+
//! inserted into the cache, the algorithm continues as before. If it now
92+
//! encounters a recursive reference, it will hit the cache and will not try to
93+
//! describe the type anew.
94+
//!
95+
//! This behaviour is encapsulated in the `RecursiveTypeDescription` enum,
96+
//! which represents a kind of continuation, storing all state needed to
97+
//! continue traversal at the type members after the type has been registered
98+
//! with the cache. (This implementation approach might be a tad over-
99+
//! engineered and may change in the future.)
100+
//!
101+
//!
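The forward-declaration idea can be sketched on a toy type graph. Everything here is an assumption made for illustration: `Type`, `Node`, `Describer` and `list_defs` are hypothetical names, and the sketch fills in the stub directly instead of modelling the continuation-style `RecursiveTypeDescription` enum.

```rust
use std::collections::HashMap;

/// Toy type representation (hypothetical; not the compiler's `Ty`).
#[derive(Clone, Debug)]
pub enum Type {
    Int,
    /// A named struct; its field types are looked up in `Describer::defs`.
    Struct(String),
    /// A wrapper such as `Option<T>` or `Box<T>`.
    Wrapper(Box<Type>),
}

/// Toy description node. Indices refer to positions in `Describer::nodes`.
#[derive(Debug, PartialEq)]
pub enum Node {
    Int,
    Wrapper(usize),
    /// `members == None` marks a stub whose members are not described yet.
    Struct { name: String, members: Option<Vec<usize>> },
}

pub struct Describer {
    defs: HashMap<String, Vec<Type>>,
    cache: HashMap<String, usize>,
    pub nodes: Vec<Node>,
}

impl Describer {
    pub fn new(defs: HashMap<String, Vec<Type>>) -> Self {
        Describer { defs, cache: HashMap::new(), nodes: Vec::new() }
    }

    fn push(&mut self, node: Node) -> usize {
        self.nodes.push(node);
        self.nodes.len() - 1
    }

    pub fn describe(&mut self, ty: &Type) -> usize {
        match ty {
            Type::Int => self.push(Node::Int),
            Type::Wrapper(inner) => {
                let inner_id = self.describe(inner);
                self.push(Node::Wrapper(inner_id))
            }
            Type::Struct(name) => {
                if let Some(&id) = self.cache.get(name) {
                    return id; // recursive reference: hit the stub, stop here
                }
                // Register the stub *before* describing the members.
                let id = self.push(Node::Struct { name: name.clone(), members: None });
                self.cache.insert(name.clone(), id);
                let fields = self.defs.get(name).cloned().unwrap_or_default();
                let members: Vec<usize> = fields.iter().map(|f| self.describe(f)).collect();
                // Complete the stub now that all members are described.
                self.nodes[id] = Node::Struct { name: name.clone(), members: Some(members) };
                id
            }
        }
    }
}

/// Definitions for `struct List { value: int, tail: Option<Box<List>> }`.
pub fn list_defs() -> HashMap<String, Vec<Type>> {
    let list = Type::Struct("List".to_string());
    let tail = Type::Wrapper(Box::new(Type::Wrapper(Box::new(list))));
    let mut defs = HashMap::new();
    defs.insert("List".to_string(), vec![Type::Int, tail]);
    defs
}
```

With the stub in the cache, describing `List` terminates: the inner reference to `List` resolves to the node that is still being completed.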
102+
//! ## Source Locations and Line Information
103+
//!
104+
//! In addition to data type descriptions, the debugging information must also
105+
//! allow mapping machine code locations back to source code locations in order
106+
//! to be useful. This functionality is also handled in this module. The
107+
//! following functions control source mappings:
108+
//!
109+
//! + set_source_location()
110+
//! + clear_source_location()
111+
//! + start_emitting_source_locations()
112+
//!
113+
//! `set_source_location()` sets the current source location. All IR
114+
//! instructions created after a call to this function will be linked to the
115+
//! given source location, until another location is specified with
116+
//! `set_source_location()` or the source location is cleared with
117+
//! `clear_source_location()`. In the latter case, subsequent IR instructions
118+
//! will not be linked to any source location. As you can see, this is a
119+
//! stateful API (mimicking the one in LLVM), so be careful with source
120+
//! locations set by previous calls. It's probably best to not rely on any
121+
//! specific state being present at a given point in code.
122+
//!
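To make the stateful behaviour concrete, here is a small model of it. The `Builder`, `Instruction` and `SourceLoc` types and the `emit` method are hypothetical; the real functions operate on the function context and on LLVM's builder, with different signatures.

```rust
/// Hypothetical source position.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SourceLoc {
    pub line: u32,
    pub col: u32,
}

/// Hypothetical IR instruction that remembers the location it was linked to.
#[derive(Debug, PartialEq, Eq)]
pub struct Instruction {
    pub name: &'static str,
    pub loc: Option<SourceLoc>,
}

/// Toy builder that mimics the "current location" state.
#[derive(Default)]
pub struct Builder {
    current_loc: Option<SourceLoc>,
    pub instructions: Vec<Instruction>,
}

impl Builder {
    /// Every instruction emitted from now on is linked to this location.
    pub fn set_source_location(&mut self, line: u32, col: u32) {
        self.current_loc = Some(SourceLoc { line, col });
    }

    /// Every instruction emitted from now on is linked to no location.
    pub fn clear_source_location(&mut self) {
        self.current_loc = None;
    }

    /// Emits an instruction, capturing whatever state is current.
    pub fn emit(&mut self, name: &'static str) {
        self.instructions.push(Instruction { name, loc: self.current_loc });
    }
}
```

Because `emit` silently picks up whatever location was set last, a caller that forgets to set or clear the location inherits stale state, which is the hazard the paragraph above warns about.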
123+
//! One topic that deserves some extra attention is *function prologues*. At
124+
//! the beginning of a function's machine code there are typically a few
125+
//! instructions for loading argument values into allocas and checking if
126+
//! there's enough stack space for the function to execute. This *prologue* is
127+
//! not visible in the source code and LLVM puts a special PROLOGUE END marker
128+
//! into the line table at the first non-prologue instruction of the function.
129+
//! In order to find out where the prologue ends, LLVM looks for the first
130+
//! instruction in the function body that is linked to a source location. So,
131+
//! when generating prologue instructions we have to make sure that we don't
132+
//! emit source location information until the 'real' function body begins. For
133+
//! this reason, source location emission is disabled by default for any new
134+
//! function being translated and is only activated after a call to the third
135+
//! function from the list above, `start_emitting_source_locations()`. This
136+
//! function should be called right before regularly starting to translate the
137+
//! top-level block of the given function.
138+
//!
139+
//! There is one exception to the above rule: `llvm.dbg.declare` instructions
140+
//! must be linked to the source location of the variable being declared. For
141+
//! function parameters these `llvm.dbg.declare` instructions typically occur
142+
//! in the middle of the prologue; however, they are ignored by LLVM's prologue
143+
//! detection. The `create_argument_metadata()` and related functions take care
144+
//! of linking the `llvm.dbg.declare` instructions to the correct source
145+
//! locations even while source location emission is still disabled, so there
146+
//! is no need to do anything special with source location handling here.
147+
//!
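The interplay between prologue detection and the emission switch can be modelled as follows. This is a sketch under stated assumptions: `FunctionBuilder`, `declare_argument` and `prologue_end` are hypothetical, and `prologue_end` encodes the rule as described in the text above rather than LLVM's actual implementation.

```rust
/// Hypothetical source position (line only, for brevity).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SourceLoc {
    pub line: u32,
}

#[derive(Debug)]
pub struct Instruction {
    pub name: &'static str,
    pub loc: Option<SourceLoc>,
    pub is_dbg_declare: bool,
}

/// Toy per-function state; location emission is off by default.
#[derive(Default)]
pub struct FunctionBuilder {
    emitting: bool,
    current: Option<SourceLoc>,
    pub instructions: Vec<Instruction>,
}

impl FunctionBuilder {
    pub fn start_emitting_source_locations(&mut self) {
        self.emitting = true;
    }

    pub fn set_source_location(&mut self, line: u32) {
        self.current = Some(SourceLoc { line });
    }

    /// Ordinary instructions get a location only once emission is enabled.
    pub fn emit(&mut self, name: &'static str) {
        let loc = if self.emitting { self.current } else { None };
        self.instructions.push(Instruction { name, loc, is_dbg_declare: false });
    }

    /// The exception: a declare is always linked to the variable's location.
    pub fn declare_argument(&mut self, line: u32) {
        self.instructions.push(Instruction {
            name: "llvm.dbg.declare",
            loc: Some(SourceLoc { line }),
            is_dbg_declare: true,
        });
    }

    /// Index of the first located instruction, ignoring declares.
    pub fn prologue_end(&self) -> Option<usize> {
        self.instructions
            .iter()
            .position(|i| !i.is_dbg_declare && i.loc.is_some())
    }
}
```

In this model, instructions emitted before `start_emitting_source_locations()` carry no location, so the first located non-declare instruction is the first one of the real function body.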
148+
//! ## Unique Type Identification
149+
//!
150+
//! In order for link-time optimization to work properly, LLVM needs a unique
151+
//! type identifier that tells it across compilation units which types are the
152+
//! same as others. This type identifier is created by
153+
//! TypeMap::get_unique_type_id_of_type() using the following algorithm:
154+
//!
155+
//! (1) Primitive types have their name as ID
156+
//! (2) Structs, enums and traits have a multipart identifier
157+
//!
158+
//!     (1) The first part is the SVH (strict version hash) of the crate they
159+
//!         were originally defined in
160+
//!
161+
//!     (2) The second part is the ast::NodeId of the definition in their
162+
//!         original crate
163+
//!
164+
//!     (3) The final part is a concatenation of the type IDs of their concrete
165+
//!         type arguments if they are generic types.
166+
//!
167+
//! (3) Tuple, pointer and function types are structurally identified, which
168+
//!     means that they are equivalent if their component types are equivalent
169+
//!     (i.e. (int, int) is the same regardless of which crate it is used in).
170+
//!
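The three rules above can be sketched as a recursive function over a toy type representation. The `Ty` enum, the `unique_type_id` function and the concrete string format are all assumptions made for illustration; the source does not specify how the parts are encoded, and function types are omitted for brevity.

```rust
/// Toy type representation (hypothetical; not the compiler's `Ty`).
#[derive(Clone, Debug)]
pub enum Ty {
    /// Rule (1): identified by name.
    Primitive(&'static str),
    /// Rule (2): a struct, enum or trait, identified by defining crate,
    /// definition node, and concrete type arguments.
    Nominal {
        crate_hash: &'static str,
        node_id: u32,
        type_args: Vec<Ty>,
    },
    /// Rule (3): structurally identified.
    Tuple(Vec<Ty>),
    /// Rule (3): structurally identified.
    Pointer(Box<Ty>),
}

/// Builds an identifier string; the exact format is made up for this sketch.
pub fn unique_type_id(ty: &Ty) -> String {
    match ty {
        Ty::Primitive(name) => (*name).to_string(),
        Ty::Nominal { crate_hash, node_id, type_args } => {
            let args: Vec<String> = type_args.iter().map(unique_type_id).collect();
            format!("{{{}/{}<{}>}}", crate_hash, node_id, args.join(","))
        }
        Ty::Tuple(elems) => {
            let parts: Vec<String> = elems.iter().map(unique_type_id).collect();
            format!("({})", parts.join(","))
        }
        Ty::Pointer(inner) => format!("*{}", unique_type_id(inner)),
    }
}
```

Two tuples built independently from the same component types get the same ID, while nominal types with the same node ID but different crate hashes stay distinct.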
171+
//! This algorithm also provides a stable ID for types that are defined in one
172+
//! crate but instantiated from metadata within another crate. We just have to
173+
//! take care to always map crate and node IDs back to the original crate
174+
//! context.
175+
//!
176+
//! As a side-effect these unique type IDs also help to solve a problem arising
177+
//! from lifetime parameters. Since lifetime parameters are completely omitted
178+
//! in debuginfo, more than one `Ty` instance may map to the same debuginfo
179+
//! type metadata, that is, some struct `Struct<'a>` may have N instantiations
180+
//! with different concrete substitutions for `'a`, and thus there will be N
181+
//! `Ty` instances for the type `Struct<'a>` even though it is not generic
182+
//! otherwise. Unfortunately this means that we cannot use `ty::type_id()` as
183+
//! a cheap identifier for type metadata---we have done this in the past, but it
184+
//! led to unnecessary metadata duplication in the best case and LLVM
185+
//! assertions in the worst. However, the unique type ID as described above
186+
//! *can* be used as an identifier. Since it is comparatively expensive to
187+
//! construct, though, `ty::type_id()` is still used additionally as an
188+
//! optimization for cases where the exact same type has been seen before
189+
//! (which is most of the time).
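The two-level lookup described in the last paragraph might look roughly like this. It is a sketch, not the module's `TypeMap`: the `u32` key stands in for `ty::type_id()`, the closure stands in for the expensive unique-ID computation, and `unique_ids_computed` exists only to make the effect of the fast path visible.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a handle to type metadata.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MetadataId(pub usize);

#[derive(Default)]
pub struct TypeMap {
    /// Fast path: keyed by a cheap per-`Ty` id.
    by_ty: HashMap<u32, MetadataId>,
    /// Slow path: keyed by the unique type ID, which ignores lifetimes.
    by_unique_id: HashMap<String, MetadataId>,
    /// How often the expensive ID had to be computed (illustration only).
    pub unique_ids_computed: usize,
    next: usize,
}

impl TypeMap {
    pub fn metadata_for(
        &mut self,
        ty_id: u32,
        compute_unique_id: impl FnOnce() -> String,
    ) -> MetadataId {
        // The exact same `Ty` was seen before: skip the expensive ID entirely.
        if let Some(&md) = self.by_ty.get(&ty_id) {
            return md;
        }
        self.unique_ids_computed += 1;
        let uid = compute_unique_id();
        let md = match self.by_unique_id.get(&uid).copied() {
            // A different `Ty` instance mapped to the same debuginfo type.
            Some(md) => md,
            None => {
                let md = MetadataId(self.next);
                self.next += 1;
                self.by_unique_id.insert(uid, md);
                md
            }
        };
        self.by_ty.insert(ty_id, md);
        md
    }
}
```

Two `Ty` instances that differ only in lifetime substitutions share one metadata node through the slow path, and a repeated lookup of either one never recomputes the unique ID.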
