@@ -60,7 +60,7 @@ extensions that includes an example. Finally, appendix

.. _amdgpu-dwarf-extensions:

- 1. Extensions
+ 2. Extensions
=============

The extensions continue to evolve through collaboration with many individuals and
@@ -627,6 +627,49 @@ location description as an overlay of the first, positioned according to the
offset and size. See ``DW_OP_LLVM_overlay`` and ``DW_OP_LLVM_bit_overlay`` in
:ref:`amdgpu-dwarf-composite-location-description-operations`.

+ Consider an array that has been partially registerized such that the currently
+ processed elements are held in registers, whereas the remainder of the array
+ remains in memory. Consider the loop in this C function, for example:
+
+ .. code::
+   :number-lines:
+
+   extern void foo(uint32_t dst[], uint32_t src[], int len) {
+     for (int i = 0; i < len; ++i)
+       dst[i] += src[i];
+   }
+
+ Inside the loop body, the machine code loads ``src[i]`` and ``dst[i]`` into
+ registers, adds them, and stores the result back into ``dst[i]``.
+
+ Considering the location of ``dst`` and ``src`` in the loop body, the elements
+ ``dst[i]`` and ``src[i]`` would be located in registers, while all other
+ elements are located in memory. Let register ``R0`` contain the base address of
+ ``dst``, register ``R1`` contain ``i``, and register ``R2`` contain the
+ registerized ``dst[i]`` element. We can describe the location of ``dst`` as a
+ memory location with a register location overlaid at a runtime offset
+ involving ``i``:
+
+ .. code::
+   :number-lines:
+
+   // 1. Memory location description of dst elements located in memory:
+   DW_OP_breg0 0
+
+   // 2. Register location description of element dst[i], which is located in R2:
+   DW_OP_reg2
+
+   // 3. Byte offset of the register within the memory of dst (i * 4):
+   DW_OP_breg1 0
+   DW_OP_lit4
+   DW_OP_mul
+
+   // 4. The size in bytes of the register element:
+   DW_OP_lit4
+
+   // 5. Make a composite location description for dst that is the memory #1 with
+   //    the register #2 positioned as an overlay at offset #3 of size #4:
+   DW_OP_LLVM_overlay
+
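As a rough illustration of what the composite location above means for a consumer, the following Python sketch maps a byte offset within ``dst`` to the storage that holds it. It is a simplified model only, not part of the proposal: the function name ``resolve_dst_byte`` and the storage labels ``"R2"`` and ``"memory"`` are hypothetical, and the element size of 4 bytes is taken from the ``DW_OP_lit4`` operations in the expression.

```python
ELEM_SIZE = 4  # bytes per uint32_t element, matching DW_OP_lit4 in the expression


def resolve_dst_byte(byte_offset, i, elem_size=ELEM_SIZE):
    """Return (storage, offset) for one byte of dst under the overlay.

    Hypothetical model: "R2" stands for the overlay register and "memory"
    for the base memory location whose address is held in R0. Offsets are
    relative to the start of the respective storage.
    """
    # Overlay offset, as computed by DW_OP_breg1 0; DW_OP_lit4; DW_OP_mul.
    overlay_start = i * elem_size
    # Overlay size, as pushed by the second DW_OP_lit4.
    overlay_end = overlay_start + elem_size
    if overlay_start <= byte_offset < overlay_end:
        # This byte belongs to dst[i], which currently lives in the register.
        return ("R2", byte_offset - overlay_start)
    # Every other byte is still read from the array in memory.
    return ("memory", byte_offset)
```

With ``i == 2``, bytes 8 through 11 of ``dst`` resolve to the register and every other byte resolves to memory, which is the behavior the overlay is intended to describe.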
.. _amdgpu-dwarf-changes-relative-to-dwarf-version-5:

A. Changes Relative to DWARF Version 5
@@ -2843,7 +2886,7 @@ compatible with the definitions in DWARF Version 5.*

*rbss(L)* is the minimum remaining bit storage size of L which is defined as
follows. LS is the location storage and LO is the location bit offset
- specified by a single location descriptions SL of L. The remaining bit
+ specified by a single location description SL of L. The remaining bit
storage size RBSS of SL is the bit size of LS minus LO. *rbss(L)* is the
minimum RBSS of each single location description SL of L.

@@ -2861,11 +2904,10 @@ compatible with the definitions in DWARF Version 5.*
overlay starting at the overlay offset BO and covering overlay bit size BS.*

1. If BO is not 0 then push BL followed by performing the ``DW_OP_bit_piece
- BO`` operation.
- 2. Push OL followed by performing the ``DW_OP_bit_piece BS`` operation.
+ BO, 0`` operation.
+ 2. Push OL followed by performing the ``DW_OP_bit_piece BS, 0`` operation.
3. If *rbss(BL)* is greater than BO plus BS, push BL followed by performing
- the ``DW_OP_LLVM_bit_offset (BO + BS); DW_OP_bit_piece (rbss(BL) - BO -
- BS)`` operations.
+ the ``DW_OP_bit_piece (rbss(BL) - BO - BS), (BO + BS)`` operation.
4. Perform the ``DW_OP_LLVM_piece_end`` operation.
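
The updated steps (the ``+`` side of this hunk) can be sketched as a small Python function that lists the pieces making up the composite. This is an illustrative model under stated assumptions, not an implementation from the proposal: the name ``overlay_pieces`` and the tuple layout are hypothetical, and each tuple follows the ``DW_OP_bit_piece`` operand order of size first, then offset.

```python
def overlay_pieces(rbss_bl, bo, bs):
    """List the pieces of the composite as (source, bit_size, bit_offset).

    rbss_bl is rbss(BL), bo is the overlay bit offset BO, and bs is the
    overlay bit size BS. "BL" and "OL" name the base and overlay location
    descriptions. All names here are hypothetical.
    """
    pieces = []
    if bo != 0:
        # Step 1: the part of BL that precedes the overlay.
        pieces.append(("BL", bo, 0))
    # Step 2: the overlay itself, taken from the start of OL.
    pieces.append(("OL", bs, 0))
    if rbss_bl > bo + bs:
        # Step 3: the part of BL that follows the overlay.
        pieces.append(("BL", rbss_bl - bo - bs, bo + bs))
    # Step 4 (DW_OP_LLVM_piece_end) corresponds to returning the composite.
    return pieces
```

For example, ``overlay_pieces(128, 32, 32)`` yields three pieces whose sizes sum to 128 bits, while an overlay that starts at offset 0 and covers all of BL yields only the OL piece.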

.. _amdgpu-dwarf-location-list-expressions: