Commit d5420bb

docs: Add reference docs for internal block device structures
It's time to shine a light on this area of the codebase ahead of some much required cleanup. This documentation is based on an email sent almost 5 years ago but is still accurate today.

Change-Id: I66cc2c5549833f269872748fb1532438f9ba8489
1 parent 242002f commit d5420bb

3 files changed, +232 -2 lines changed
Lines changed: 224 additions & 0 deletions
@@ -0,0 +1,224 @@
1+
==========================
2+
Driver BDM Data Structures
3+
==========================
4+
5+
In addition to the :doc:`API BDM data format </user/block-device-mapping>`
6+
there are also several internal data structures within Nova that map out how
7+
block devices are attached to instances. This document aims to outline the two
8+
general data structures and two additional specific data structures used by the
9+
libvirt virt driver.
10+
11+
.. note::
12+
13+
This document is based on the email below, sent to the openstack-dev
14+
mailing list by Matthew Booth as a primer for developers working on virt
15+
drivers and interacting with these data structures.
16+
17+
http://lists.openstack.org/pipermail/openstack-dev/2016-June/097529.html
18+
19+
.. note::
20+
21+
References to local disks in the following document refer to any
22+
disk directly managed by nova compute. If nova is configured to use RBD or
23+
NFS for instance disks then these disks won't actually be local, but they
24+
are still managed locally and referred to as local disks, as opposed to RBD
25+
volumes provided by Cinder, which are not considered local.
26+
27+
Generic BDM data structures
28+
===========================
29+
30+
``BlockDeviceMapping``
31+
----------------------
32+
33+
The 'top level' data structure is the ``BlockDeviceMapping`` (BDM) object. It
34+
is a ``NovaObject``, persisted in the DB. Current code creates a BDM object for
35+
every disk associated with an instance, whether it is a volume or not.
36+
37+
The BDM object describes properties of each disk as specified by the user. It
38+
is initially derived from a user request. For more details on the format of
39+
these requests, see the :doc:`Block Device Mapping in Nova
40+
<../user/block-device-mapping>` document.
41+
42+
The Compute API transforms and consolidates all BDMs to ensure that all disks,
43+
explicit or implicit, have a BDM, and then persists them. Look in
44+
``nova.objects.block_device`` for all BDM fields, but in essence they contain
45+
information like (source_type='image', destination_type='local',
46+
image_id='<image uuid>'), or equivalents describing ephemeral disks, swap disks
47+
or volumes, and some associated data.
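As a rough illustration of the field combinations mentioned above, the following sketch uses plain dicts rather than real ``BlockDeviceMapping`` objects. The ``describe`` helper and the placeholder values are hypothetical and exist only for this example; consult ``nova.objects.block_device`` for the authoritative field list.

```python
# Illustrative only: plain dicts standing in for BlockDeviceMapping objects.
IMAGE_ROOT = {
    "source_type": "image",
    "destination_type": "local",
    "image_id": "<image uuid>",
    "boot_index": 0,
}
SWAP = {"source_type": "blank", "destination_type": "local",
        "guest_format": "swap"}
EPHEMERAL = {"source_type": "blank", "destination_type": "local",
             "guest_format": None}
VOLUME = {"source_type": "volume", "destination_type": "volume",
          "volume_id": "<volume uuid>"}


def describe(bdm):
    """Return a rough label for the kind of disk a BDM-like dict describes.

    Hypothetical helper for this example, not part of Nova.
    """
    if bdm["destination_type"] == "volume":
        return "volume"
    if bdm["source_type"] == "image":
        return "image-backed local disk"
    if bdm.get("guest_format") == "swap":
        return "swap"
    return "ephemeral"


labels = [describe(b) for b in (IMAGE_ROOT, SWAP, EPHEMERAL, VOLUME)]
```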
48+
49+
.. note::
50+
51+
BDM objects are typically stored in variables called ``bdm`` with lists
52+
in ``bdms``, although this is obviously not guaranteed (and unfortunately
53+
not always true: ``bdm`` in ``libvirt.block_device`` is usually a
54+
``DriverBlockDevice`` object). This is a useful reading aid (except when
55+
it's proactively confounding), as there is also something else typically
56+
called ``block_device_mapping`` which is not a ``BlockDeviceMapping``
57+
object.
58+
59+
``block_device_info``
60+
---------------------
61+
62+
Drivers do not directly use BDM objects. Instead, they are transformed into a
63+
different driver-specific representation. This representation is normally
64+
called ``block_device_info``, and is generated by
65+
``virt.driver.get_block_device_info()``. Its output is based on data in BDMs.
66+
``block_device_info`` is a dict containing:
67+
68+
``root_device_name``
69+
Hypervisor's notion of the root device's name
70+
``ephemerals``
71+
A list of all ephemeral disks
72+
``block_device_mapping``
73+
A list of all cinder volumes
74+
``swap``
75+
A swap disk, or None if there is no swap disk
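The shape of the dict described above can be sketched as follows. The values, the per-volume keys, and the ``volumes_of`` helper are assumptions made for illustration; the real entries are driver-specific representations rather than plain dicts like these.

```python
# Placeholder values only; real entries are driver-specific objects.
block_device_info = {
    "root_device_name": "/dev/vda",
    "ephemerals": [],
    "block_device_mapping": [
        {"mount_device": "/dev/vdb",
         "connection_info": {"driver_volume_type": "rbd"}},
    ],
    "swap": None,
}


def volumes_of(info):
    """Return the list of volume entries, tolerating a missing dict or key.

    Hypothetical helper for this example, not part of Nova.
    """
    return (info or {}).get("block_device_mapping") or []


# Despite the name, this is a list of volume entries, not a list of
# BlockDeviceMapping objects.
block_device_mapping = volumes_of(block_device_info)
```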
76+
77+
The disks are represented in one of two ways, depending on the specific
78+
driver currently in use. There's the 'new' representation, used by the libvirt
79+
and vmwareAPI drivers, and the 'legacy' representation used by all other
80+
drivers. The legacy representation is a plain dict. It does not contain the
81+
same information as the new representation.
82+
83+
The new representation involves subclasses of
84+
``nova.virt.block_device.DriverBlockDevice``. As well as containing different
85+
fields, the new representation, significantly, also retains a reference to the
86+
underlying BDM object. This means that by manipulating the
87+
``DriverBlockDevice`` object, the driver is able to persist data to the BDM
88+
object in the DB.
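The idea of a driver-side object that keeps a reference to its backing record, and writes changes back through it, can be sketched as below. This is a simplified illustration under stated assumptions, not the actual ``DriverBlockDevice`` implementation: ``FakeBDM``, ``DriverDiskSketch`` and the choice of copied fields are all hypothetical.

```python
class FakeBDM:
    """Hypothetical stand-in for a persisted BlockDeviceMapping record."""

    def __init__(self, **fields):
        self.__dict__.update(fields)
        self.saved = 0

    def save(self):
        # A real object would write to the database here.
        self.saved += 1


class DriverDiskSketch(dict):
    """Dict-like driver view that keeps a reference to its backing record."""

    # Fields copied back to the backing record on save (assumed for the demo).
    _copy_back = ("connection_info",)

    def __init__(self, bdm):
        super().__init__(mount_device=bdm.device_name,
                         connection_info=bdm.connection_info)
        self._bdm = bdm

    def save(self):
        for name in self._copy_back:
            setattr(self._bdm, name, self[name])
        self._bdm.save()


bdm = FakeBDM(device_name="/dev/vdb", connection_info=None)
disk = DriverDiskSketch(bdm)
disk["connection_info"] = {"driver_volume_type": "iscsi"}
disk.save()  # the change made on the driver view reaches the backing record
```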
89+
90+
.. note::
91+
92+
Common usage is to pull ``block_device_mapping`` out of this
93+
dict into a variable called ``block_device_mapping``. This is not a
94+
``BlockDeviceMapping`` object, or list of them.
95+
96+
.. note::
97+
98+
If ``block_device_info`` was passed to the driver by compute manager, it
99+
was probably generated by ``_get_instance_block_device_info()``.
100+
By default, this function filters out all cinder volumes from
101+
``block_device_mapping`` which don't currently have ``connection_info``.
102+
In other contexts this filtering will not have happened, and
103+
``block_device_mapping`` will contain all volumes.
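The filtering described in this note amounts to something like the following sketch. The ``drop_unconnected`` function and the sample entries are hypothetical; they only illustrate the effect of dropping volumes that have no ``connection_info`` yet, not how the compute manager implements it.

```python
def drop_unconnected(volumes):
    """Keep only volume entries that already carry connection_info.

    Hypothetical helper for this example, not part of Nova.
    """
    return [vol for vol in volumes if vol.get("connection_info")]


volumes = [
    {"mount_device": "/dev/vdb",
     "connection_info": {"driver_volume_type": "iscsi"}},
    {"mount_device": "/dev/vdc", "connection_info": None},
]

connected = drop_unconnected(volumes)
```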
104+
105+
.. note::
106+
107+
Unlike BDMs, ``block_device_info`` does not currently represent all
108+
disks that an instance might have. Significantly, it will not contain any
109+
representation of an image-backed local disk, i.e. the root disk of a
110+
typical instance which isn't boot-from-volume. Other representations used
111+
by the libvirt driver explicitly reconstruct this missing disk.
112+
113+
libvirt driver specific BDM data structures
114+
===========================================
115+
116+
``instance_disk_info``
117+
----------------------
118+
119+
The virt driver API defines a method ``get_instance_disk_info``, which returns
120+
a JSON blob. The compute manager calls this and passes the data over RPC
121+
between calls without ever looking at it. This is driver-specific opaque data.
122+
It is also only used by the libvirt driver, despite being part of the API for
123+
all drivers. Other drivers do not return any data. The most interesting aspect
124+
of ``instance_disk_info`` is that it is generated from the libvirt XML, not
125+
from nova's state.
126+
127+
.. note::
128+
129+
``instance_disk_info`` is often named ``disk_info`` in code, which
130+
is unfortunate as this clashes with the normal naming of the next
131+
structure. Occasionally the two are used in the same block of code.
132+
133+
.. note::
134+
135+
RBD disks (including non-volume disks) and cinder volumes
136+
are not included in ``instance_disk_info``.
137+
138+
``instance_disk_info`` is a list of dicts for some of an instance's disks. Each
139+
dict contains the following:
140+
141+
``type``
142+
libvirt's notion of the disk's type
143+
``path``
144+
libvirt's notion of the disk's path
145+
``virt_disk_size``
146+
The disk's virtual size in bytes (the size the guest OS sees)
147+
``backing_file``
148+
libvirt's notion of the backing file path
149+
``disk_size``
150+
The file size of ``path``, in bytes.
151+
``over_committed_disk_size``
152+
As-yet-unallocated disk size, in bytes.
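Because ``instance_disk_info`` travels as a JSON blob, a consumer has to decode it before using the fields listed above. The sketch below shows one way to do that; the sample values, the placeholder paths and the ``total_over_committed`` helper are invented for illustration and are not taken from a real instance.

```python
import json

GiB = 1024 ** 3

# Invented sample data using the keys listed above.
blob = json.dumps([
    {"type": "qcow2", "path": "<instance dir>/disk",
     "virt_disk_size": 20 * GiB, "backing_file": "<base image>",
     "disk_size": 2 * GiB, "over_committed_disk_size": 18 * GiB},
    {"type": "raw", "path": "<instance dir>/disk.swap",
     "virt_disk_size": 1 * GiB, "backing_file": "",
     "disk_size": 1 * GiB, "over_committed_disk_size": 0},
])


def total_over_committed(disk_info_json):
    """Sum the as-yet-unallocated bytes across all listed disks."""
    return sum(d["over_committed_disk_size"]
               for d in json.loads(disk_info_json))


total = total_over_committed(blob)
```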
153+
154+
``disk_info``
155+
-------------
156+
157+
.. note::
158+
159+
As opposed to ``instance_disk_info``, which is frequently called
160+
``disk_info``.
161+
162+
This data structure is actually described pretty well in the comment block at
163+
the top of ``nova.virt.libvirt.blockinfo``. It is internal to the libvirt
164+
driver. It contains:
165+
166+
``disk_bus``
167+
The default bus used by disks
168+
``cdrom_bus``
169+
The default bus used by cdrom drives
170+
``mapping``
171+
Defined below
172+
173+
``mapping`` is a dict which maps disk names to a dict describing how that disk
174+
should be passed to libvirt. This mapping contains every disk connected to the
175+
instance, both local and volumes.
176+
177+
First, a note on disk naming. Local disk names used by the libvirt driver are
178+
well defined. They are:
179+
180+
``disk``
181+
The root disk
182+
``disk.local``
183+
The flavor-defined ephemeral disk
184+
``disk.ephX``
185+
Where X is a zero-based index for BDM defined ephemeral disks
186+
``disk.swap``
187+
The swap disk
188+
``disk.config``
189+
The config disk
190+
191+
These names are hardcoded, reliable, and used in lots of places.
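The naming scheme above can be written out as a small sketch. The ``local_disk_names`` function and its parameters are hypothetical; the sketch only enumerates the names and makes no claim about which combinations the libvirt driver produces in practice.

```python
def local_disk_names(flavor_ephemeral=False, num_bdm_ephemerals=0,
                     swap=False, config_drive=False):
    """List local disk names following the scheme described above.

    Hypothetical helper for this example, not part of Nova.
    """
    names = ["disk"]  # the root disk
    if flavor_ephemeral:
        names.append("disk.local")
    # BDM-defined ephemeral disks use a zero-based index.
    names.extend(f"disk.eph{i}" for i in range(num_bdm_ephemerals))
    if swap:
        names.append("disk.swap")
    if config_drive:
        names.append("disk.config")
    return names


names = local_disk_names(num_bdm_ephemerals=2, swap=True)
```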
192+
193+
In ``disk_info``, volumes are keyed by device name, e.g. 'vda', 'vdb'. Different
194+
buses will be named differently, approximately according to legacy Linux
195+
device naming.
196+
197+
Additionally, ``disk_info`` will contain a mapping for 'root', which is the
198+
root disk. This will duplicate one of the other entries, either 'disk' or a
199+
volume mapping.
200+
201+
Each dict within the ``mapping`` dict contains three required fields, ``bus``,
202+
``dev`` and ``type``, and two optional fields, ``format`` and ``boot_index``:
203+
204+
``bus``
205+
The guest bus type ('ide', 'virtio', 'scsi', etc.)
206+
``dev``
207+
The device name, e.g. 'vda', 'hdc', 'sdf', 'xvde'
208+
``type``
209+
Type of device, e.g. 'disk', 'cdrom', 'floppy'
210+
``format``
211+
Which format to apply to the device if applicable
212+
``boot_index``
213+
Number designating the boot order of the device
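Putting the pieces of ``disk_info`` together, a sample structure might look like the sketch below. All values are invented, the value types are illustrative, and ``check_mapping`` is a hypothetical helper that only checks entries against the required and optional fields listed above. Note how ``root`` duplicates the ``disk`` entry.

```python
REQUIRED = {"bus", "dev", "type"}
OPTIONAL = {"format", "boot_index"}

# Invented sample; value types are illustrative.
disk_info = {
    "disk_bus": "virtio",
    "cdrom_bus": "ide",
    "mapping": {
        "root": {"bus": "virtio", "dev": "vda", "type": "disk",
                 "boot_index": 1},
        "disk": {"bus": "virtio", "dev": "vda", "type": "disk",
                 "boot_index": 1},
        "disk.swap": {"bus": "virtio", "dev": "vdb", "type": "disk"},
        "vdc": {"bus": "virtio", "dev": "vdc", "type": "disk"},
    },
}


def check_mapping(mapping):
    """Return names of entries with missing required or unknown fields."""
    bad = []
    for name, entry in mapping.items():
        keys = set(entry)
        if not REQUIRED <= keys or keys - REQUIRED - OPTIONAL:
            bad.append(name)
    return bad


problems = check_mapping(disk_info["mapping"])
```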
214+
215+
.. note::
216+
217+
``BlockDeviceMapping`` and ``DriverBlockDevice`` store the boot index as a
218+
zero-based value. However, libvirt's boot index is 1-based, so the value
219+
stored here is 1-based.
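The off-by-one conversion in this note can be sketched as follows. The ``to_libvirt_boot_index`` function is hypothetical, and treating ``None`` or a negative value as "not bootable" is an assumption of this sketch rather than something stated in this document.

```python
def to_libvirt_boot_index(bdm_boot_index):
    """Convert a zero-based boot index to a 1-based one.

    Assumption of this sketch: None or a negative value means the device
    is not bootable, in which case None is returned.
    """
    if bdm_boot_index is None or bdm_boot_index < 0:
        return None
    return bdm_boot_index + 1


first = to_libvirt_boot_index(0)
```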
220+
221+
.. todo::
222+
223+
Add a section for the per-disk ``disk.info`` file within the instance
224+
directory when using the libvirt driver.

doc/source/reference/index.rst

Lines changed: 3 additions & 0 deletions
@@ -39,6 +39,8 @@ The following is a dive into some of the internals in nova.
3939
works in nova to isolate groups of hosts.
4040
* :doc:`/reference/attach-volume`: Describes the attach volume flow, using the
4141
libvirt virt driver as an example.
42+
* :doc:`/reference/block-device-structs`: Block Device Data Structures
43+
4244

4345
.. # NOTE(amotoki): toctree needs to be placed at the end of the secion to
4446
# keep the document structure in the PDF doc.
@@ -59,6 +61,7 @@ The following is a dive into some of the internals in nova.
5961
isolate-aggregates
6062
api-microversion-history
6163
attach-volume
64+
block-device-structs
6265

6366
Debugging
6467
=========

doc/source/user/block-device-mapping.rst

Lines changed: 5 additions & 2 deletions
@@ -48,15 +48,18 @@ When we talk about block device mapping, we usually refer to one of two things
4848
virt driver code). We will refer to this format as 'Driver BDMs' from now
4949
on.
5050

51+
For more details on this please refer to the :doc:`Driver BDM Data
52+
Structures <../reference/block-device-structs>` reference document.
53+
5154
.. note::
5255

5356
The maximum limit on the number of disk devices allowed to attach to
5457
a single server is configurable with the option
5558
:oslo.config:option:`compute.max_disk_devices_to_attach`.
5659

5760

58-
Data format and its history
59-
----------------------------
61+
API BDM data format and its history
62+
-----------------------------------
6063

6164
In the early days of Nova, block device mapping general structure closely
6265
mirrored that of the EC2 API. During the Havana release of Nova, block device
