Commit 0fde0f4

[AMDGPU][NFC] Update DW_OP_LLVM_overlay documentation

Update DWARF Extensions For Heterogeneous Debugging proposal for the
DW_OP_LLVM_overlay operation:

1. Add an example.
2. Correct typo in definition of rbss.
3. Correct definition to specify both operands of the DW_OP_bit_piece
   operations.

Reviewed By: zoran.zaric

Differential Revision: https://reviews.llvm.org/D135394

1 parent ac0fe5d commit 0fde0f4

1 file changed: +48 -6 lines changed

llvm/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.rst

Lines changed: 48 additions & 6 deletions
@@ -60,7 +60,7 @@ extensions that includes an example. Finally, appendix
 
 .. _amdgpu-dwarf-extensions:
 
-1. Extensions
+2. Extensions
 =============
 
 The extensions continue to evolve through collaboration with many individuals and
@@ -627,6 +627,49 @@ location description as an overlay of the first, positioned according to the
 offset and size. See ``DW_OP_LLVM_overlay`` and ``DW_OP_LLVM_bit_overlay`` in
 :ref:`amdgpu-dwarf-composite-location-description-operations`.
 
+Consider an array that has been partially registerized such that the currently
+processed elements are held in registers, whereas the remainder of the array
+remains in memory. Consider the loop in this C function, for example:
+
+.. code::
+  :number-lines:
+
+  extern void foo(uint32_t dst[], uint32_t src[], int len) {
+    for (int i = 0; i < len; ++i)
+      dst[i] += src[i];
+  }
+
+Inside the loop body, the machine code loads ``src[i]`` and ``dst[i]`` into
+registers, adds them, and stores the result back into ``dst[i]``.
+
+Considering the location of ``dst`` and ``src`` in the loop body, the elements
+``dst[i]`` and ``src[i]`` would be located in registers, all other elements are
+located in memory. Let register ``R0`` contain the base address of ``dst``,
+register ``R1`` contain ``i``, and register ``R2`` contain the registerized
+``dst[i]`` element. We can describe the location of ``dst`` as a memory location
+with a register location overlaid at a runtime offset involving ``i``:
+
+.. code::
+  :number-lines:
+
+  // 1. Memory location description of dst elements located in memory:
+  DW_OP_breg0 0
+
+  // 2. Register location description of element dst[i] is located in R2:
+  DW_OP_reg2
+
+  // 3. Offset of the register within the memory of dst:
+  DW_OP_breg1 0
+  DW_OP_lit4
+  DW_OP_mul
+
+  // 4. The size of the register element:
+  DW_OP_lit4
+
+  // 5. Make a composite location description for dst that is the memory #1 with
+  //    the register #2 positioned as an overlay at offset #3 of size #4:
+  DW_OP_LLVM_overlay
+
 .. _amdgpu-dwarf-changes-relative-to-dwarf-version-5:
 
 A. Changes Relative to DWARF Version 5
@@ -2843,7 +2886,7 @@ compatible with the definitions in DWARF Version 5.*
 
 *rbss(L)* is the minimum remaining bit storage size of L which is defined as
 follows. LS is the location storage and LO is the location bit offset
-specified by a single location descriptions SL of L. The remaining bit
+specified by a single location description SL of L. The remaining bit
 storage size RBSS of SL is the bit size of LS minus LO. *rbss(L)* is the
 minimum RBSS of each single location description SL of L.
 

@@ -2861,11 +2904,10 @@ compatible with the definitions in DWARF Version 5.*
 overlay starting at the overlay offset BO and covering overlay bit size BS.*
 
 1. If BO is not 0 then push BL followed by performing the ``DW_OP_bit_piece
-   BO`` operation.
-2. Push OL followed by performing the ``DW_OP_bit_piece BS`` operation.
+   BO, 0`` operation.
+2. Push OL followed by performing the ``DW_OP_bit_piece BS, 0`` operation.
 3. If *rbss(BL)* is greater than BO plus BS, push BL followed by performing
-   the ``DW_OP_LLVM_bit_offset (BO + BS); DW_OP_bit_piece (rbss(BL) - BO -
-   BS)`` operations.
+   the ``DW_OP_bit_piece (rbss(BL) - BO - BS), (BO + BS)`` operation.
 4. Perform the ``DW_OP_LLVM_piece_end`` operation.
 
 .. _amdgpu-dwarf-location-list-expressions:
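
Illustration (not part of the commit): the corrected overlay definition in the last hunk can be read as a small algorithm that splits the result into at most three pieces: the start of the base location BL, the overlay location OL, and the tail of BL. The Python sketch below models only that bookkeeping. The names `Piece` and `overlay_pieces` are hypothetical, as is the 256-bit base storage in the example (eight `uint32_t` elements of `dst`). Scaling the byte operands of `DW_OP_LLVM_overlay` by 8 to get bit values is an assumption made for the example. Each piece is recorded as (source, bit offset, bit size), mirroring the two operands of `DW_OP_bit_piece` (size first, then offset).

```python
from typing import List, NamedTuple


class Piece(NamedTuple):
    """One part of the composite (hypothetical helper type)."""
    source: str      # "BL" for the base location, "OL" for the overlay location
    bit_offset: int  # second operand of DW_OP_bit_piece
    bit_size: int    # first operand of DW_OP_bit_piece


def overlay_pieces(rbss_bl: int, bo: int, bs: int) -> List[Piece]:
    """Return the pieces produced by steps 1-3 of the overlay definition.

    rbss_bl is rbss(BL), bo is the overlay bit offset BO, and bs is the
    overlay bit size BS. Step 4 (DW_OP_LLVM_piece_end) only closes the
    composite, so it adds no piece here.
    """
    pieces = []
    if bo != 0:
        # Step 1: DW_OP_bit_piece BO, 0 applied to BL.
        pieces.append(Piece("BL", 0, bo))
    # Step 2: DW_OP_bit_piece BS, 0 applied to OL.
    pieces.append(Piece("OL", 0, bs))
    if rbss_bl > bo + bs:
        # Step 3: DW_OP_bit_piece (rbss(BL) - BO - BS), (BO + BS) applied to BL.
        pieces.append(Piece("BL", bo + bs, rbss_bl - bo - bs))
    return pieces


# Example under the stated assumptions: dst has 8 elements (256 bits) and
# i == 2, so the overlay sits at byte offset 2 * 4 with byte size 4.
i = 2
example = overlay_pieces(rbss_bl=8 * 32, bo=i * 4 * 8, bs=4 * 8)
```

In this example the composite consists of 64 bits from memory, then the 32-bit register, then the remaining 160 bits from memory. This matches the hunk's description of `dst` as memory with `dst[i]` overlaid from `R2`.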
