Commit 99a3cac

Merge pull request #959 from stackhpc/tempest-docs
Add docs page for running Tempest with Kayobe Automation
2 parents 3919a23 + 0bc60fc commit 99a3cac

2 files changed

+327
-0
lines changed

doc/source/operations/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,3 +13,4 @@ This guide is for operators of the StackHPC Kayobe configuration project.
1313
rocky-linux-9
1414
ubuntu-jammy
1515
secret-rotation
16+
tempest

doc/source/operations/tempest.rst

Lines changed: 326 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,326 @@
1+
======================================
2+
Running Tempest with Kayobe Automation
3+
======================================
4+
5+
Overview
6+
========
7+
8+
This document describes how to configure and run `Tempest
9+
<https://docs.openstack.org/tempest/latest/>`_ using `kayobe-automation
10+
<https://github.com/stackhpc/kayobe-automation>`_ from the ``.automation``
11+
submodule included with ``stackhpc-kayobe-config``.
12+
13+
The best way of running Tempest is to use CI/CD workflows. Before proceeding,
14+
consider whether it would be possible to use/set up a CI/CD workflow instead.
15+
For more information, see the :doc:`CI/CD workflows page
16+
</configuration/ci-cd>`.
17+
18+
The following guide will assume all commands are run from your
19+
``kayobe-config`` root and the environment has been configured to run Kayobe
20+
commands unless stated otherwise.
21+
22+
Prerequisites
23+
=============
24+
25+
Installing Docker
26+
-----------------
27+
28+
``kayobe-automation`` runs in a container on the Ansible control host. This
29+
means that Docker must be installed on the Ansible control host if it is not
30+
already.
31+
32+
.. warning::
33+
34+
Docker can cause networking issues when it is installed. By default, it
35+
will create a bridge and change ``iptables`` rules. These can be disabled
36+
by setting the following in ``/etc/docker/daemon.json``:
37+
38+
.. code-block:: json
39+
40+
{
41+
"bridge": "none",
42+
"iptables": false
43+
}
44+
45+
The bridge is the most common cause of issues and is *usually* safe to
46+
disable. Disabling the ``iptables`` rules will break any GitHub Actions
47+
runners running on the host.
48+
49+
To install Docker on Ubuntu:
50+
51+
.. code-block:: bash
52+
53+
# Add Docker's official GPG key:
54+
sudo apt-get update
55+
sudo apt-get install ca-certificates curl
56+
sudo install -m 0755 -d /etc/apt/keyrings
57+
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
58+
sudo chmod a+r /etc/apt/keyrings/docker.asc
59+
60+
# Add the repository to Apt sources:
61+
echo \
62+
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
63+
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
64+
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
65+
sudo apt-get update
66+
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
67+
68+
To install Docker on CentOS/Rocky:
69+
70+
.. code-block:: bash
71+
72+
sudo dnf install -y dnf-utils
73+
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
74+
sudo dnf install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
75+
76+
Ensure Docker is running & enabled:
77+
78+
.. code-block:: bash
79+
80+
sudo systemctl start docker
81+
sudo systemctl enable docker
82+
83+
The Docker ``buildx`` plugin must be installed. If you are using an existing
84+
installation of Docker, you may need to install it with:
85+
86+
.. code-block:: bash
87+
88+
sudo dnf install docker-buildx-plugin  # on Ubuntu: sudo apt-get install docker-buildx-plugin
89+
sudo docker buildx install
90+
# or if that fails:
91+
sudo docker plugin install buildx
92+
93+
Building a Kayobe container
94+
---------------------------
95+
96+
Build a Kayobe automation image:
97+
98+
.. code-block:: bash
99+
100+
git submodule init
101+
git submodule update
102+
# If running on Ubuntu, the fact cache can confuse Kayobe in the CentOS-based container
103+
mv etc/kayobe/facts{,-old}
104+
sudo DOCKER_BUILDKIT=1 docker build --file .automation/docker/kayobe/Dockerfile --tag kayobe:latest .
105+
106+
Configuration
107+
=============
108+
109+
Kayobe automation configuration files are stored in the ``.automation.conf/``
110+
directory. It contains:
111+
112+
- A script used to export environment variables for meta configuration of
113+
Tempest - ``.automation.conf/config.sh``.
114+
- Tempest configuration override files, stored in ``.automation.conf/tempest/``
115+
and conventionally named ``tempest.overrides.conf`` or
116+
``tempest-<environment>.overrides.conf``.
117+
- Tempest load lists, stored in ``.automation.conf/tempest/load-lists``.
118+
- Tempest skip lists, stored in ``.automation.conf/tempest/skip-lists``.
119+
120+
config.sh
121+
---------
122+
123+
``config.sh`` is a mandatory shell script, primarily used to export environment
124+
variables for the meta configuration of Tempest.
125+
126+
See:
127+
https://github.com/stackhpc/docker-rally/blob/master/bin/rally-verify-wrapper.sh
128+
for a full list of Tempest parameters that can be overridden.
129+
130+
The most common variables to override are:
131+
132+
- ``TEMPEST_CONCURRENCY`` - The maximum number of tests to run in parallel at
133+
one time. Higher values are faster but increase the risk of timeouts. 1-2 is
134+
safest in CI/Tenks/Multinode/AIO etc. 8-32 is typical in production. Default
135+
value is 2.
136+
- ``KAYOBE_AUTOMATION_TEMPEST_LOADLIST``: the filename of a load list in the
137+
``load-lists`` directory. Default value is ``default`` (symlink to refstack).
138+
- ``KAYOBE_AUTOMATION_TEMPEST_SKIPLIST``: the filename of a skip list in the
139+
``skip-lists`` directory. Default value is unset.
140+
- ``TEMPEST_OPENRC``: The **contents** of an ``openrc.sh`` file, to be used by
141+
Tempest to create resources on the cloud. Default is to read in the contents
142+
of ``etc/kolla/public-openrc.sh``.
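
As a rough illustration, a ``config.sh`` that overrides these variables might
look like the sketch below. The values are examples only, and
``example-skip-list`` is a hypothetical file name; ``tempest-full`` is one of
the load lists described later on this page.

.. code-block:: bash

   # .automation.conf/config.sh (illustrative values, adjust for your cloud)

   # Keep concurrency low on small test environments
   export TEMPEST_CONCURRENCY=2
   # File name of a load list in .automation.conf/tempest/load-lists/
   export KAYOBE_AUTOMATION_TEMPEST_LOADLIST=tempest-full
   # File name of a skip list in .automation.conf/tempest/skip-lists/ (hypothetical name)
   export KAYOBE_AUTOMATION_TEMPEST_SKIPLIST=example-skip-list
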
143+
144+
tempest.overrides.conf
145+
----------------------
146+
147+
Tempest uses a configuration file to define which tests are run and how to run
148+
them. A full sample configuration file can be found `here
149+
<https://docs.openstack.org/tempest/latest/sampleconf.html>`_. Sensible
150+
defaults exist for all values and in most situations, a blank
151+
``*overrides.conf`` file will successfully run many tests. It will however also
152+
skip many tests which may otherwise be appropriate to run.
153+
154+
`Shakespeare <https://github.com/stackhpc/shakespeare>`_ is a tool for
155+
generating Tempest configuration files. It contains elements for different
156+
cloud features, which can be combined to template out a detailed configuration
157+
file. This is the best-practice approach.
158+
159+
Below is an example of a manually generated file including many of the most
160+
common overrides. It makes many assumptions about the environment, so make sure
161+
you understand all the options before applying them.
162+
163+
.. NOTE(upgrade): Microversions change for each release
164+
.. code-block:: ini
165+
166+
[openstack]
167+
# Use a StackHPC-built image without a default password.
168+
img_url=https://github.com/stackhpc/cirros/releases/download/20231206/cirros-d231206-x86_64-disk.img
169+
170+
[auth]
171+
# Expect unlimited quotas for CPU cores and RAM
172+
compute_quotas = cores:-1,ram:-1
173+
174+
[compute]
175+
# Required for migration testing
176+
min_compute_nodes = 2
177+
# Required to test some API features
178+
min_microversion = 2.1
179+
max_microversion = 2.90
180+
# Flavors for creating test servers and server resize. The ``alt`` flavor should be larger.
181+
flavor_ref = <flavor UUID>
182+
flavor_ref_alt = <different flavor UUID>
183+
volume_multiattach = true
184+
185+
[compute-feature-enabled]
186+
# Required for migration testing
187+
resize = true
188+
live_migration = true
189+
block_migration_for_live_migration = false
190+
volume_backed_live_migration = true
191+
192+
[placement]
193+
min_microversion = "1.0"
194+
max_microversion = "1.39"
195+
196+
[volume]
197+
storage_protocol = ceph
198+
# Required to test some API features
199+
min_microversion = 3.0
200+
max_microversion = 3.68
201+
202+
Tempest configuration override files are stored in
203+
``.automation.conf/tempest/``. The default file used is
204+
``tempest.overrides.conf`` or ``tempest-<environment>.overrides.conf``
205+
depending on whether a Kayobe environment is enabled. This can be changed by
206+
setting ``KAYOBE_AUTOMATION_TEMPEST_CONF_OVERRIDES`` to a different file path.
207+
An ``overrides.conf`` file must be supplied, even if it is blank.
208+
209+
Load Lists
210+
----------
211+
212+
Load lists are a newline-separated list of tests to run. They are stored in
213+
``.automation.conf/tempest/load-lists/``. The directory contains three objects
214+
by default:
215+
216+
- ``tempest-full`` - A complete list of all possible tests.
217+
- ``platform.2022.11-test-list.txt`` - A reduced list of tests to match the
218+
`Refstack <https://docs.opendev.org/openinfra/refstack/latest/>`_ standard.
219+
- ``default`` - A symlink to ``platform.2022.11-test-list.txt``.
220+
221+
Test lists can be selected by changing ``KAYOBE_AUTOMATION_TEMPEST_LOADLIST``
222+
in ``config.sh``. The default value is ``default``, which symlinks to
223+
``platform.2022.11-test-list.txt``.
224+
225+
A common use case is to use the ``failed-tests`` list output from a previous
226+
Tempest run as a load list, to retry the failed tests after making changes.
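
A minimal sketch of that retry workflow, assuming the previous run left its
results in ``tempest-artifacts/`` and that the copied list keeps the name
``failed-tests`` (any file name would do):

.. code-block:: bash

   # Copy the failures from the previous run into the load list directory
   cp tempest-artifacts/failed-tests .automation.conf/tempest/load-lists/failed-tests

   # Then select it in .automation.conf/config.sh:
   #   export KAYOBE_AUTOMATION_TEMPEST_LOADLIST=failed-tests

Copy the file before starting the next run, since the artifacts directory is
overwritten each time.
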
227+
228+
Skip Lists
229+
----------
230+
231+
Skip lists are a newline-separated list of tests to skip. They are stored in
232+
``.automation.conf/tempest/skip-lists/``. Each line consists of a pattern to
233+
match against test names, and a string explaining why the test is being
234+
skipped e.g.
235+
236+
.. code-block::
237+
238+
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details.*: "Cirros image doesn't have /var/run/udhcpc.eth0.pid"
239+
240+
There is no requirement for a skip list, and none is selected by default. A
241+
skip list can be selected by setting ``KAYOBE_AUTOMATION_TEMPEST_SKIPLIST`` in
242+
``config.sh``.
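
For illustration, a skip list could be created and selected as follows. The
file name ``example-skip-list`` is hypothetical, and the entry reuses the
pattern shown above.

.. code-block:: bash

   # Create the skip list with a single pattern and its reason
   mkdir -p .automation.conf/tempest/skip-lists
   printf '%s\n' "tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details.*: \"Cirros image doesn't have /var/run/udhcpc.eth0.pid\"" > .automation.conf/tempest/skip-lists/example-skip-list

   # Then select it in .automation.conf/config.sh:
   #   export KAYOBE_AUTOMATION_TEMPEST_SKIPLIST=example-skip-list
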
243+
244+
Tempest runner
245+
--------------
246+
247+
While the Kayobe automation container is always deployed to the Ansible control
248+
host, the Tempest container is deployed to the host in the ``tempest_runner``
249+
group, which can be any host in the Kayobe inventory. The group should only
250+
ever contain one host. The seed is usually used as the Tempest runner; however,
251+
it is also common to use the Ansible control host or an infrastructure VM. The
252+
main requirement of the host is that it can reach the OpenStack API.
253+
254+
Running Tempest
255+
===============
256+
257+
Kayobe automation will need to SSH to the Tempest runner (even if they are on
258+
the same host), so requires an SSH key exported as
259+
``KAYOBE_AUTOMATION_SSH_PRIVATE_KEY`` e.g.
260+
261+
.. code-block:: bash
262+
263+
export KAYOBE_AUTOMATION_SSH_PRIVATE_KEY="$(cat ~/.ssh/id_rsa)"
264+
265+
Tempest outputs will be sent to the ``tempest-artifacts/`` directory. Create
266+
one if it does not exist.
267+
268+
.. code-block:: bash
269+
270+
mkdir -p tempest-artifacts
271+
272+
The contents of ``tempest-artifacts`` will be overwritten. Ensure any previous
273+
test results have been copied away.
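
One possible way to preserve earlier results is to rename the directory with a
timestamp before starting a new run. The naming scheme here is only a
suggestion:

.. code-block:: bash

   # Move any existing results aside, then recreate an empty output directory
   if [ -d tempest-artifacts ]; then
       mv tempest-artifacts "tempest-artifacts-$(date +%Y%m%d-%H%M%S)"
   fi
   mkdir -p tempest-artifacts
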
274+
275+
The Tempest playbook is invoked through the Kayobe container using this
276+
command from the base of the ``kayobe-config`` directory:
277+
278+
.. code-block:: bash
279+
280+
sudo -E docker run --detach -it --rm --network host -v $(pwd):/stack/kayobe-automation-env/src/kayobe-config -v $(pwd)/tempest-artifacts:/stack/tempest-artifacts -e KAYOBE_ENVIRONMENT -e KAYOBE_VAULT_PASSWORD -e KAYOBE_AUTOMATION_SSH_PRIVATE_KEY kayobe:latest /stack/kayobe-automation-env/src/kayobe-config/.automation/pipeline/tempest.sh -e ansible_user=stack
281+
282+
By default, ``no_log`` is set to stop credentials from leaking. This can be
283+
disabled by adding ``-e rally_no_sensitive_log=false`` to the end.
284+
285+
To follow the progress of the Kayobe automation container, either remove
286+
``--detach`` from the above command, or follow the docker logs of the
287+
``kayobe`` container.
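
Because the ``docker run`` command above does not name the container, one way
to find and follow it is to filter by image. This is a sketch that assumes only
one container started from ``kayobe:latest`` is running; the variable name is
arbitrary:

.. code-block:: bash

   # Look up the ID of the running container started from the kayobe:latest image
   kayobe_container=$(sudo docker ps --quiet --filter ancestor=kayobe:latest | head -n 1)
   sudo docker logs --follow "$kayobe_container"
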
288+
289+
To follow the progress of the Tempest tests themselves, follow the logs of the
290+
``tempest`` container on the ``tempest_runner`` host.
291+
292+
.. code-block:: bash
293+
294+
ssh <tempest-runner>
295+
sudo docker logs -f tempest
296+
297+
Tempest will keep running until completion even if the ``kayobe`` container is
298+
stopped. To halt the tests, the ``tempest`` container must be stopped manually. Doing so will
299+
however stop test resources (such as networks, images, and VMs) from being
300+
automatically cleaned up. They must instead be manually removed. They should be
301+
clearly labeled with either rally or tempest in the name, often alongside some
302+
randomly generated string.
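
A sketch for locating leftover resources, assuming admin credentials are loaded
in the shell and the ``openstack`` client is installed. It only lists
candidates; review them before deleting anything:

.. code-block:: bash

   # List servers, networks and images whose names mention rally or tempest
   openstack server list --all-projects -f value -c ID -c Name | grep -Ei 'rally|tempest'
   openstack network list -f value -c ID -c Name | grep -Ei 'rally|tempest'
   openstack image list -f value -c ID -c Name | grep -Ei 'rally|tempest'
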
303+
304+
Outputs
305+
-------
306+
307+
Tempest outputs will be sent to the ``tempest-artifacts/`` directory. It
308+
contains the following artifacts:
309+
310+
- ``docker.log`` - The logs from the ``tempest`` docker container
311+
- ``failed-tests`` - A simple list of tests that failed
312+
- ``rally-junit.xml`` - An XML file listing all tests in the test list and
313+
their status (skipped/succeeded/failed). Usually not useful.
314+
- ``rally-verify-report.html`` - An HTML page with all test results including
315+
an error trace for failed tests. It is often best to ``scp`` this file back
316+
to your local machine to view it. This is the most user-friendly way to view
317+
the test results; however, it can be awkward to host.
318+
- ``rally-verify-report.json`` - A JSON blob with all test results including an
319+
error trace for failed tests. It contains all the same data as the HTML
320+
report but without formatting.
321+
- ``stderr.log`` - The stderr log. Usually not useful.
322+
- ``stdout.log`` - The stdout log. Usually not useful.
323+
- ``tempest-load-list`` - The load list that Tempest was invoked with.
324+
- ``tempest.log`` - Detailed logs from Tempest. Contains more data than the
325+
``verify`` reports, but can be difficult to parse. Useful for tracing specific
326+
errors.
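
As a quick first look at a finished run, the plain-text artifacts can be
inspected from the shell. A small sketch, assuming the run has completed and
that ``failed-tests`` contains one test per line:

.. code-block:: bash

   # Count the failed tests and show the first few
   wc -l < tempest-artifacts/failed-tests
   head -n 20 tempest-artifacts/failed-tests
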