Commit 321ac35

Merge pull request #45 from stackhpc/leafcloud

Support deploying multinodes on Leafcloud

2 parents c8867ae + afd9866

16 files changed: +334 −243 lines

.terraform.lock.hcl

Lines changed: 2 additions & 0 deletions
Some generated files are not rendered by default.

README.rst

Lines changed: 113 additions & 97 deletions
@@ -2,16 +2,43 @@
 Terraform Kayobe Multinode
 ==========================
 
-This Terraform configuration deploys a requested amount of Instances on an OpenStack cloud, to be
-used as a Multinode Kayobe test environment.
+This Terraform configuration deploys a requested number of instances on an OpenStack cloud, to be
+used as a Multinode Kayobe test environment. This includes:
 
-Usage
-=====
+* 1x Ansible control host
+* 1x seed host
+* controller hosts
+* compute hosts
+* Ceph storage hosts
+* Optional Wazuh manager host
+
+The high-level workflow to deploy a cluster is as follows:
+
+* Prerequisites
+* Configure Terraform and Ansible
+* Deploy infrastructure on OpenStack using Terraform
+* Configure Ansible control host using Ansible
+* Deploy multi-node OpenStack using Kayobe
+
+This configuration is typically used with the `ci-multinode` environment in the
+`StackHPC Kayobe Configuration
+<https://stackhpc-kayobe-config.readthedocs.io/en/stackhpc-yoga/contributor/environments/ci-multinode.html>`__
+repository.
+
+Prerequisites
+=============
 
 These instructions show how to use this Terraform configuration manually. They
 assume you are running an Ubuntu host that will be used to run Terraform. The
-machine should have network access to the environment that will be created by this
-configuration.
+machine should have access to the API of the OpenStack cloud that will host the
+infrastructure, and network access to the Ansible control host once it has been
+deployed. This may be achieved by direct SSH access, a floating IP on the
+Ansible control host, or using an SSH bastion.
+
+The OpenStack cloud should have sufficient capacity to deploy the
+infrastructure, and a suitable image registered in Glance. Ideally the image
+should be one of the overcloud host images defined in StackHPC Kayobe
+configuration and available in `Ark <https://ark.stackhpc.com>`__.
 
 Install Terraform:
 
@@ -22,21 +49,24 @@ Install Terraform:
    sudo apt update
    sudo apt install git terraform
 
-Clone and initialise the Kayobe config:
+Clone and initialise this Terraform config repository:
 
 .. code-block:: console
 
    git clone https://github.com/stackhpc/terraform-kayobe-multinode
    cd terraform-kayobe-multinode
-
 Initialise Terraform:
 
 .. code-block:: console
 
    terraform init
 
-Generate an SSH keypair:
+Generate an SSH keypair. The public key will be registered in OpenStack as a
+keypair and authorised by the instances deployed by Terraform. The private and
+public keys will be transferred to the Ansible control host to allow it to
+connect to the other hosts. Note that password-protected keys are not currently
+supported.
 
 .. code-block:: console
4272
@@ -94,59 +124,74 @@ Or you can source the provided `init.sh` script which shall initialise terraform
    OpenStack Cloud Name: sms-lab
    Password:
 
-Generate Terraform variables:
+You must ensure that you have `Ansible installed <https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html>`_ on your local machine.
 
 .. code-block:: console
 
-   cat << EOF > terraform.tfvars
-   prefix = "changeme"
+   pip install --user ansible
 
-   ansible_control_vm_flavor = "general.v1.small"
-   ansible_control_vm_name = "ansible-control"
-   ansible_control_disk_size = 100
+Install the Ansible Galaxy requirements.
 
-   seed_vm_flavor = "general.v1.small"
-   seed_disk_size = 100
+.. code-block:: console
 
-   multinode_flavor = "general.v1.medium"
-   multinode_image = "Rocky9-lvm"
-   multinode_keypair = "changeme"
-   multinode_vm_network = "stackhpc-ipv4-geneve"
-   multinode_vm_subnet = "stackhpc-ipv4-geneve-subnet"
-   compute_count = "2"
-   controller_count = "3"
-   compute_disk_size = 100
-   controller_disk_size = 100
+   ansible-galaxy install -r ansible/requirements.yml
 
-   ssh_public_key = "~/.ssh/changeme.pub"
-   ssh_user = "cloud-user"
+If the deployed instances are behind an SSH bastion, you must ensure that your SSH config is set up appropriately with a proxy jump.
 
-   storage_count = "3"
-   storage_flavor = "general.v1.small"
-   storage_disk_size = 100
+.. code-block::
 
-   deploy_wazuh = true
-   infra_vm_flavor = "general.v1.small"
-   infra_vm_disk_size = 100
+   Host lab-bastion
+       HostName BastionIPAddr
+       User username
+       IdentityFile ~/.ssh/key
 
-   EOF
+   Host 10.*
+       ProxyJump=lab-bastion
+       ForwardAgent no
+       IdentityFile ~/.ssh/key
+       UserKnownHostsFile /dev/null
+       StrictHostKeyChecking no
+
+Configure Terraform variables
+=============================
+
+Populate Terraform variables in `terraform.tfvars`. Examples are provided in
+files named `*.tfvars.example`. The available variables are defined in
+`variables.tf` along with their type, description, and optional default.
 
 You will need to set the `multinode_keypair`, `prefix`, and `ssh_public_key`.
 By default, Rocky Linux 9 will be used, but Ubuntu Jammy is also supported by
-changing `multinode_image` to `Ubuntu-22.04-lvm` and `ssh_user` to `ubuntu`.
-Other LVM images should also work but are untested.
+changing `multinode_image` to `overcloud-ubuntu-jammy-<release>-<datetime>` and
+`ssh_user` to `ubuntu`.
 
 The `multinode_flavor` will change the flavor used for controller and compute
 nodes. Both virtual machines and baremetal are supported, but the `*_disk_size`
 variables must be set to 0 when using baremetal hosts. This will stop a block
 device being allocated. When any baremetal hosts are deployed, the
 `multinode_vm_network` and `multinode_vm_subnet` should also be changed to
-`stackhpc-ipv4-vlan-v2` and `stackhpc-ipv4-vlan-subnet-v2` respectively.
+a VLAN network and associated subnet.
 
 If `deploy_wazuh` is set to true, an infrastructure VM will be created that
 hosts the Wazuh manager. The Wazuh deployment playbooks will also be triggered
 automatically to deploy Wazuh agents to the overcloud hosts.
 
+If `add_ansible_control_fip` is set to `true`, a floating IP will be created
+and attached to the Ansible control host. In that case
+`ansible_control_fip_pool` should be set to the name of the pool (network) from
+which to allocate the floating IP, and the floating IP will be used for SSH
+access to the control host.
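As a sketch, a minimal `terraform.tfvars` for a virtualised deployment might
look like the following. All values are illustrative (flavors, image, and
network names are cloud-specific); consult `variables.tf` and the
`*.tfvars.example` files for the authoritative set.

.. code-block::

   prefix = "changeme"

   ansible_control_vm_flavor = "general.v1.small"
   seed_vm_flavor            = "general.v1.small"
   multinode_flavor          = "general.v1.medium"

   multinode_image      = "overcloud-rocky-9-<release>-<datetime>"
   multinode_keypair    = "changeme"
   multinode_vm_network = "changeme-network"
   multinode_vm_subnet  = "changeme-subnet"

   ssh_public_key = "~/.ssh/changeme.pub"
   ssh_user       = "cloud-user"

   controller_count = "3"
   compute_count    = "2"
   storage_count    = "3"

   deploy_wazuh            = false
   add_ansible_control_fip = false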
+Configure Ansible variables
+===========================
+
+Review the vars defined within `ansible/vars/defaults.yml`. Here you can customise the version of kayobe, kayobe-config, or openstack-config.
+Make sure to define `ssh_key_path` to point to the location of the SSH key in use by the nodes, and also `vxlan_vni`, which should be a unique value between 1 and 100,000.
+The VNI should be much smaller than the officially supported limit of 16,777,215, as we encounter errors when attempting to bring up interfaces that use a high VNI.
+You must set `vault_password_path`; this should be the path to a file containing the Ansible vault password.
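For example, the overrides in `ansible/vars/defaults.yml` might look like the
following sketch (the paths and the VNI value are illustrative):

.. code-block:: yaml

   # Private key matching the keypair generated earlier.
   ssh_key_path: ~/.ssh/changeme
   # Unique VXLAN VNI, between 1 and 100,000.
   vxlan_vni: 1234
   # File containing the Ansible vault password.
   vault_password_path: ~/vault.password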
+Deployment
+==========
 
 Generate a plan:
 
 .. code-block:: console
@@ -159,91 +204,62 @@ Apply the changes:
 
    terraform apply -auto-approve
 
-You should have requested a number of resources spawned on Openstack, and an ansible_inventory file produced as output for Kayobe.
-
-Copy your generated id_rsa and id_rsa.pub to ~/.ssh/ on Ansible control host if you want Kayobe to automatically pick them up during bootstrap.
+You should have requested a number of resources to be spawned on OpenStack.
 
 Configure Ansible control host
+==============================
 
-Using the `deploy-openstack-config.yml` playbook you can setup the Ansible control host to include the kayobe/kayobe-config repositories with `hosts` and `admin-oc-networks`.
-It shall also setup the kayobe virtual environment, allowing for immediate configuration and deployment of OpenStack.
-
-First you must ensure that you have `Ansible installed <https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html>`_ on your local machine.
+Run the `configure-hosts.yml` playbook to configure the Ansible control host.
 
 .. code-block:: console
 
-   pip install --user ansible
-
-Secondly if the machines are behind an SSH bastion you must ensure that your ssh config is setup appropriately with a proxy jump
+   ansible-playbook -i ansible/inventory.yml ansible/configure-hosts.yml
 
-.. code-block:: console
+This playbook sequentially executes 2 other playbooks:
 
-   Host lab-bastion
-       HostName BastionIPAddr
-       User username
-       IdentityFile ~/.ssh/key
+#. ``grow-control-host.yml`` - Applies LVM configuration to the control host to ensure it has enough space to continue with the rest of the deployment. Tag: ``lvm``
+#. ``deploy-openstack-config.yml`` - Prepares the Ansible control host as a Kayobe control host, cloning the Kayobe configuration and installing virtual environments. Tag: ``deploy``
 
-   Host 10.*
-       ProxyJump=lab-bastion
-       ForwardAgent no
-       IdentityFile ~/.ssh/key
-       UserKnownHostsFile /dev/null
-       StrictHostKeyChecking no
+These playbooks are tagged so that they can be invoked or skipped using `--tags` or `--skip-tags` as required, as in the example below.
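For example, to skip the LVM configuration step on a subsequent run (the tag
names are those listed above):

.. code-block:: console

   ansible-playbook -i ansible/inventory.yml ansible/configure-hosts.yml --skip-tags lvm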

-Install the ansible requirements.
+Deploy OpenStack
+================
 
-.. code-block:: console
+Once the Ansible control host has been configured with a Kayobe/OpenStack configuration, you can begin the process of deploying OpenStack.
+This can be achieved either by manually running the various commands to configure the hosts and deploy the services, or automatically by using the generated `deploy-openstack.sh` script.
+`deploy-openstack.sh` should be available within the home directory on your Ansible control host, provided you ran `deploy-openstack-config.yml` earlier.
+This script performs the following tasks (the equivalent manual commands are sketched after the list):
 
-   ansible-galaxy install -r ansible/requirements.yml
+* kayobe control host bootstrap
+* kayobe seed host configure
+* kayobe overcloud host configure
+* cephadm deployment
+* kayobe overcloud service deploy
+* openstack configuration
+* tempest testing
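For the manual route, the Kayobe steps above correspond roughly to the
following commands, run from the Kayobe configuration environment on the
Ansible control host. This is a sketch only: the cephadm deployment, OpenStack
configuration, and Tempest steps are driven by separate playbooks and scripts
not shown here.

.. code-block:: console

   kayobe control host bootstrap
   kayobe seed host configure
   kayobe overcloud host configure
   kayobe overcloud service deploy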

-Review the vars defined within `ansible/vars/defaults.yml`. In here you can customise the version of kayobe, kayobe-config or openstack-config.
-However, make sure to define `ssh_key_path` to point to the location of the SSH key in use amongst the nodes and also `vxlan_vni` which should be unique value between 1 to 100,000.
-VNI should be much smaller than the officially supported limit of 16,777,215 as we encounter errors when attempting to bring interfaces up that use a high VNI. You must set``vault_password_path``; this should be set to the path to a file containing the Ansible vault password.
+Tempest test results will be written to `~/tempest-artifacts`.
 
-Finally, run the configure-hosts playbook.
+If you choose to opt for the automated method, you must first SSH into your Ansible control host.
 
 .. code-block:: console
 
-   ansible-playbook -i ansible/inventory.yml ansible/configure-hosts.yml
-
-This playbook sequentially executes 4 other playbooks:
-
-#. ``fix-homedir-ownership.yml`` - Ensures the ``ansible_user`` owns their home directory. Tag: ``fix-homedir``
-#. ``add-fqdn.yml`` - Ensures FQDNs are added to ``/etc/hosts``. Tag: ``fqdn``
-#. ``grow-control-host.yml`` - Applies LVM configuration to the control host to ensure it has enough space to continue with the rest of the deployment. Tag: ``lvm``
-#. ``deploy-openstack-config.yml`` - Deploys the OpenStack configuration to the control host. Tag: ``deploy``
+   ssh $(terraform output -raw ssh_user)@$(terraform output -raw ansible_control_access_ip_v4)
 
-These playbooks are tagged so that they can be invoked or skipped as required. For example, if designate is not being deployed, some time can be saved by skipping the FQDN playbook:
+Start a `tmux` session to avoid halting the deployment if you are disconnected.
 
 .. code-block:: console
 
-   ansible-playbook -i ansible/inventory.yml ansible/configure-hosts.yml --skip-tags fqdn
-
-Deploy OpenStack
-----------------
-
-Once the Ansible control host has been configured with a Kayobe/OpenStack configuration you can then begin the process of deploying OpenStack.
-This can be achieved by either manually running the various commands to configures the hosts and deploy the services or automated by using `deploy-openstack.sh`,
-which should be available within the homedir on your Ansible control host provided you ran `deploy-openstack-config.yml` earlier.
+   tmux
 
-If you choose to opt for automated method you must first SSH into your Ansible control host and then run the `deploy-openstack.sh` script
+Run the `deploy-openstack.sh` script.
 
 .. code-block:: console
 
-   ssh $(terraform output -raw ssh_user)@$(terraform output -raw ansible_control_access_ip_v4)
    ~/deploy-openstack.sh
 
-This script will go through the process of performing the following tasks
-* kayobe control host bootstrap
-* kayobe seed host configure
-* kayobe overcloud host configure
-* cephadm deployment
-* kayobe overcloud service deploy
-* openstack configuration
-* tempest testing
-
 Accessing OpenStack
--------------------
+===================
 
 After a successful deployment of OpenStack you may access the OpenStack API and Horizon by proxying your connection via the seed node, as it has an interface on the public network (192.168.39.X).
 Using software such as sshuttle will allow for easy access.
@@ -260,15 +276,15 @@ Important to note this will proxy all DNS requests from your machine to the firs
 
    sshuttle -r $(terraform output -raw ssh_user)@$(terraform output -raw seed_access_ip_v4) 192.168.39.0/24 --dns --to-ns 192.168.39.4
 
 Tear Down
----------
+=========
 
 After you are finished with the multinode environment, please destroy the nodes to free up resources for others.
 This can be accomplished by using the provided `scripts/tear-down.sh`, which will destroy your controllers, compute, seed, and storage nodes whilst leaving your Ansible control host and keypair intact.
 
 If you would like to delete your Ansible control host then you can pass the `-a` flag; if you would also like to remove your keypair then pass `-a -k`, as below.
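For example:

.. code-block:: console

   ./scripts/tear-down.sh        # destroy the cluster, keep control host and keypair
   ./scripts/tear-down.sh -a     # also destroy the Ansible control host
   ./scripts/tear-down.sh -a -k  # also destroy the control host and remove the keypair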

 Issues & Fixes
---------------
+==============
 
 Sometimes a compute instance fails to be provisioned by Terraform, or fails on boot for any reason.
 If this happens, the solution is to mark the resource as tainted and perform `terraform apply` again, which shall destroy and rebuild the failed instance.
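For example, if the second compute instance failed (the exact resource address
is hypothetical; use `terraform state list` to find the real one):

.. code-block:: console

   terraform taint 'openstack_compute_instance_v2.compute[1]'
   terraform apply -auto-approve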
File renamed without changes.

ansible/add-fqdn.yml

Lines changed: 0 additions & 16 deletions
This file was deleted.

ansible/configure-hosts.yml

Lines changed: 0 additions & 4 deletions
@@ -1,8 +1,4 @@
 ---
-- import_playbook: fix-homedir-ownership.yml
-  tags: fix-homedir
-- import_playbook: add-fqdn.yml
-  tags: fqdn
 - import_playbook: grow-control-host.yml
   tags: lvm
 - import_playbook: deploy-openstack-config.yml

ansible/deploy-openstack-config.yml

Lines changed: 20 additions & 0 deletions
@@ -11,10 +11,30 @@
       - ssh_key_path != None
     fail_msg: "Please provide a path to the SSH key used within the multinode environment."
 
+- name: Verify ssh key exists
+  ansible.builtin.assert:
+    that:
+      - ssh_key_path | expanduser is exists
+    fail_msg: "Could not find SSH key at {{ ssh_key_path | expanduser }}"
+
+- name: Verify vault password path has been set
+  ansible.builtin.assert:
+    that:
+      - vault_password_path != None
+    fail_msg: "Please provide a path to the vault password used within the multinode environment."
+
+- name: Verify vault password exists
+  ansible.builtin.assert:
+    that:
+      - vault_password_path | expanduser is exists
+    fail_msg: "Could not find vault password at {{ vault_password_path | expanduser }}"
+
 - name: Verify VXLAN VNI has been set
   ansible.builtin.assert:
     that:
       - vxlan_vni != None
+      - vxlan_vni | int > 0
+      - vxlan_vni | int <= 100000
     fail_msg: "Please provide a VXLAN VNI. A unique value from 1 to 100,000."
 
 - name: Gather facts about the host
