Skip to content

[RemoveDIs][NFC] Update docs to reflect new class structure #85109

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 22 additions & 20 deletions llvm/docs/RemoveDIsDebugInfo.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ The debug records are not instructions, do not appear in the instruction list, a

# Great, what do I need to do!

Approximately nothing -- we've already instrumented all of LLVM to handle these new records ("`DPValues`") and behave identically to past LLVM behaviour. We plan on turning this on by default some time soon, with IR converted to the intrinsic form of debug info at terminals (textual IR, bitcode) for a short while, before then changing the textual IR and bitcode formats.
Approximately nothing -- we've already instrumented all of LLVM to handle these new records ("`DbgRecords`") and behave identically to past LLVM behaviour. We plan on turning this on by default some time soon, with IR converted to the intrinsic form of debug info at terminals (textual IR, bitcode) for a short while, before then changing the textual IR and bitcode formats.

There are two significant changes to be aware of. Firstly, we're adding a single bit of debug relevant data to the `BasicBlock::iterator` class (it's so that we can determine whether ranges intend on including debug info at the beginning of a block or not). That means when writing passes that insert LLVM IR instructions, you need to identify positions with `BasicBlock::iterator` rather than just a bare `Instruction *`. Most of the time this means that after identifying where you intend on inserting something, you must also call `getIterator` on the instruction position -- however when inserting at the start of a block you _must_ use `getFirstInsertionPt`, `getFirstNonPHIIt` or `begin` and use that iterator to insert, rather than just fetching a pointer to the first instruction.

Expand All @@ -40,13 +40,15 @@ This will all happen transparently without needing to think about it!

## What exactly have you replaced debug intrinsics with?

We're using a dedicated C++ class called `DPValue` to store debug info, with a one-to-one relationship between each instance of a debug intrinsic and each `DPValue` object in any LLVM IR program. This class stores exactly the same information as is stored in debugging intrinsics. It also has almost entirely the same set of methods, that behave in the same way:
We're using a dedicated C++ class called `DbgRecord` to store debug info, with a one-to-one relationship between each instance of a debug intrinsic and each `DbgRecord` object in any LLVM IR program; these `DbgRecord`s are represented in the IR as non-instruction debug records, as described in the [Source Level Debugging](project:SourceLevelDebugging.rst#Debug Records) document. This class has a set of subclasses that store exactly the same information as is stored in debugging intrinsics. Each one also has almost entirely the same set of methods, that behave in the same way:

https://llvm.org/docs/doxygen/classllvm_1_1DbgRecord.html
https://llvm.org/docs/doxygen/classllvm_1_1DPValue.html
https://llvm.org/docs/doxygen/classllvm_1_1DPLabel.html

This allows you to treat a `DPValue` as if it's a `dbg.value` intrinsic most of the time, for example in generic (auto-param) lambdas.
This allows you to treat a `DPValue` as if it's a `dbg.value`/`dbg.declare`/`dbg.assign` intrinsic most of the time, for example in generic (auto-param) lambdas, and the same for `DPLabel` and `dbg.label`s.

## How do these DPValues fit into the instruction stream?
## How do these `DbgRecords` fit into the instruction stream?

Like so:

Expand All @@ -60,37 +62,37 @@ Like so:
|
v
+------------+
<-----+ DPMarker |<----
/ +------------+ \
/ \
/ \
v ^
+-----------+ +-----------+ +-----------+
| DPValue +--->| DPValue +-->| DPValue |
+-----------+ +-----------+ +-----------+
<-------+ DPMarker |<-------
/ +------------+ \
/ \
/ \
v ^
+-------------+ +-------------+ +-------------+
| DbgRecord +--->| DbgRecord +-->| DbgRecord |
+-------------+ +-------------+ +-------------+
```

Each instruction has a pointer to a `DPMarker` (which will become optional), that contains a list of `DPValue` objects. No debugging records appear in the instruction list at all. `DPValue`s have a parent pointer to their owning `DPMarker`, and each `DPMarker` has a pointer back to it's owning instruction.
Each instruction has a pointer to a `DPMarker` (which will become optional), that contains a list of `DbgRecord` objects. No debugging records appear in the instruction list at all. `DbgRecord`s have a parent pointer to their owning `DPMarker`, and each `DPMarker` has a pointer back to it's owning instruction.

Not shown are the links from DPValues to other parts of the `Value`/`Metadata` hierachy: `DPValue`s have raw pointers to `DILocalVariable`, `DIExpression` and `DILocation` objects, and references to `Value`s are stored in a `DebugValueUser` base class. This refers to a `ValueAsMetadata` object referring to `Value`s, via the `TrackingMetadata` facility.
Not shown are the links from DbgRecord to other parts of the `Value`/`Metadata` hierachy: `DbgRecord` subclasses have tracking pointers to the DIMetadata that they use, and `DPValue` has references to `Value`s that are stored in a `DebugValueUser` base class. This refers to a `ValueAsMetadata` object referring to `Value`s, via the `TrackingMetadata` facility.

The various kinds of debug intrinsic (value, declare, assign) are all stored in the `DPValue` object, with a "Type" field disamgibuating which is which.
The various kinds of debug intrinsic (value, declare, assign, label) are all stored in `DbgRecord` subclasses, with a "RecordKind" field distinguishing `DPLabel`s from `DPValue`s, and a `LocationType` field in the `DPValue` class further disambiguating the various debug variable intrinsics it can represent.

## Finding debug info records

Utilities such as `findDbgUsers` and the like now have an optional argument that will return the set of `DPValue` records that refer to a `Value`. You should be able to treat them the same as intrinsics.

## Examining debug info records at positions

Call `Instruction::getDbgRecordRange()` to get the range of `DPValue` objects that are attached to an instruction.
Call `Instruction::getDbgRecordRange()` to get the range of `DbgRecord` objects that are attached to an instruction.

## Moving around, deleting

You can use `DPValue::removeFromParent` to unlink a `DPValue` from it's marker, and then `BasicBlock::insertDbgRecordBefore` or `BasicBlock::insertDbgRecordAfter` to re-insert the `DPValue` somewhere else. You cannot insert a `DPValue` at an arbitary point in a list of `DPValue`s (if you're doing this with `dbg.value`s then it's unlikely to be correct).
You can use `DbgRecord::removeFromParent` to unlink a `DbgRecord` from it's marker, and then `BasicBlock::insertDbgRecordBefore` or `BasicBlock::insertDbgRecordAfter` to re-insert the `DbgRecord` somewhere else. You cannot insert a `DbgRecord` at an arbitary point in a list of `DbgRecord`s (if you're doing this with `dbg.value`s then it's unlikely to be correct).

Erase `DPValue`s by calling `eraseFromParent` or `deleteInstr` if it's already been removed.
Erase `DbgRecord`s by calling `eraseFromParent` or `deleteInstr` if it's already been removed.

## What about dangling `DPValue`s?
## What about dangling `DbgRecord`s?

If you have a block like so:

Expand All @@ -101,6 +103,6 @@ If you have a block like so:
br label %xyzzy
```

your optimisation pass may wish to erase the terminator and then do something to the block. This is easy to do when debug info is kept in instructions, but with `DPValue`s there is no trailing instruction to attach the variable information to in the block above, once the terminator is erased. For such degenerate blocks, `DPValue`s are stored temporarily in a map in `LLVMContext`, and are re-inserted when a terminator is reinserted to the block or other instruction inserted at `end()`.
your optimisation pass may wish to erase the terminator and then do something to the block. This is easy to do when debug info is kept in instructions, but with `DbgRecord`s there is no trailing instruction to attach the variable information to in the block above, once the terminator is erased. For such degenerate blocks, `DbgRecord`s are stored temporarily in a map in `LLVMContext`, and are re-inserted when a terminator is reinserted to the block or other instruction inserted at `end()`.

This can technically lead to trouble in the vanishingly rare scenario where an optimisation pass erases a terminator and then decides to erase the whole block. (We recommend not doing that).
2 changes: 1 addition & 1 deletion llvm/docs/UserGuides.rst
Original file line number Diff line number Diff line change
Expand Up @@ -181,7 +181,7 @@ Optimizations

:doc:`RemoveDIsDebugInfo`
This is a migration guide describing how to move from debug info using
intrinsics such as dbg.value to using the non-instruction DPValue object.
intrinsics such as dbg.value to using the non-instruction DbgRecord object.

:doc:`InstrProfileFormat`
This document explains two binary formats of instrumentation-based profiles.
Expand Down