|
| 1 | +# Object layout |
| 2 | + |
| 3 | +## Common header |
| 4 | + |
| 5 | +Each Python object starts with two fields: |
| 6 | + |
| 7 | +* ob_refcnt |
| 8 | +* ob_type |
| 9 | + |
| 10 | +which form the header common to all Python objects, for all versions, |
| 11 | +and hold the reference count and class of the object, respectively. |
| 12 | + |
| 13 | +## Pre-header |
| 14 | + |
| 15 | +Since the introduction of the cycle GC, there has also been a pre-header. |
| 16 | +Before 3.11, this pre-header was two words in size. |
| 17 | +It should be considered opaque to all code except the cycle GC. |
| 18 | + |
| 19 | +## 3.11 pre-header |
| 20 | + |
| 21 | +In 3.11 the pre-header was extended to include pointers to the VM managed ``__dict__``. |
| 22 | +The reason for moving the ``__dict__`` to the pre-header is that it allows |
| 23 | +faster access, as it is at a fixed offset, and it also allows an object's |
| 24 | +dictionary to be created lazily, only when the ``__dict__`` attribute is |
| 25 | +specifically asked for. |
| 26 | + |
| 27 | +In 3.11 the non-GC part of the pre-header consists of two pointers: |
| 28 | + |
| 29 | +* dict |
| 30 | +* values |
| 31 | + |
| 32 | +The values pointer refers to the ``PyDictValues`` array which holds the |
| 33 | +values of the object's attributes. |
| 34 | +Should the dictionary be needed, then ``values`` is set to ``NULL`` |
| 35 | +and the ``dict`` field points to the dictionary. |
| 36 | + |
| 37 | +## 3.12 pre-header |
| 38 | + |
| 39 | +In 3.12 the pointer to the list of weak references is added to the |
| 40 | +pre-header. In order to make space for it, the ``dict`` and ``values`` |
| 41 | +pointers are combined into a single tagged pointer: |
| 42 | + |
| 43 | +* weakreflist |
| 44 | +* dict_or_values |
| 45 | + |
| 46 | +If the object has no physical dictionary, then ``dict_or_values`` |
| 47 | +has its low bit set to one, and points to the values array. |
| 48 | +If the object has a physical dictionary, then ``dict_or_values`` |
| 49 | +has its low bit set to zero, and points to the dictionary. |
| 50 | + |
| 51 | +The untagged form is chosen for the dictionary pointer, rather than |
| 52 | +the values pointer, to enable the (legacy) C-API function |
| 53 | +`_PyObject_GetDictPtr(PyObject *obj)` to work. |
| 54 | + |
| 55 | + |
| 56 | +## Layout of a "normal" Python object in 3.12: |
| 57 | + |
| 58 | +* weakreflist |
| 59 | +* dict_or_values |
| 60 | +* GC 1 |
| 61 | +* GC 2 |
| 62 | +* ob_refcnt |
| 63 | +* ob_type |
| 64 | + |
| 65 | +For a "normal" Python object, that is, one that doesn't inherit from a builtin |
| 66 | +class or have slots, the header and pre-header form the entire object. |
| 67 | + |
| 68 | + |
| 69 | + |
| 70 | +There are several advantages to this layout: |
| 71 | + |
| 72 | +* It allows lazy `__dict__`s, as described above. |
| 73 | +* The regular layout allows us to create tailored traversal and deallocation |
| 74 | + functions based on layout, rather than inheritance. |
| 75 | +* Multiple inheritance works properly, |
| 76 | + as the weakrefs and dict are always at the same offset. |
| 77 | + |
| 78 | +The full object layout, with an opaque part defined by a C extension |
| 79 | +and `__slots__`, looks like this: |
| 80 | + |
| 81 | + |
| 82 | + |
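
As a rough illustration of the tagged `dict_or_values` pointer described in the 3.12 pre-header section, the following Python sketch models pointers as plain integers. The helper names (`tag_values_pointer`, `decode_dict_or_values`) and the example addresses are hypothetical and are not CPython APIs. The real logic lives in C, and this only mirrors the low-bit rule stated in the text.

```python
# Illustrative model only: pointers are plain integers, and the helper
# names below are hypothetical, not CPython APIs.

TAG_MASK = 1  # the low bit of ``dict_or_values`` is the tag


def tag_values_pointer(values_addr: int) -> int:
    """Return the tagged word stored when the object has no physical dict."""
    if values_addr & TAG_MASK:
        raise ValueError("address must be at least 2-byte aligned")
    return values_addr | TAG_MASK


def decode_dict_or_values(word: int) -> tuple:
    """Classify a ``dict_or_values`` word and recover the real address."""
    if word & TAG_MASK:
        # Low bit set: no physical dictionary, the word refers to the values array.
        return ("values", word & ~TAG_MASK)
    # Low bit clear: the word is an ordinary, untagged dictionary pointer.
    return ("dict", word)


values_word = tag_values_pointer(0x7F00_0000_1000)  # made-up address
dict_word = 0x7F00_0000_2000  # dictionary pointers are stored untagged

print(decode_dict_or_values(values_word))
print(decode_dict_or_values(dict_word))
```

Because the dictionary case is the untagged one, code that reads the word as an ordinary pointer still gets a valid dictionary address when a dictionary exists. That matches the reason the text gives for choosing this form, namely keeping `_PyObject_GetDictPtr(PyObject *obj)` working.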