Support EESSI #252

Merged 16 commits on May 3, 2023
6 changes: 6 additions & 0 deletions .github/workflows/stackhpc.yml
@@ -99,6 +99,12 @@ jobs:
           . environments/.stackhpc/activate
           ansible-playbook -vv ansible/adhoc/hpctests.yml

+      - name: Run EESSI tests
+        run: |
+          . venv/bin/activate
+          . environments/.stackhpc/activate
+          ansible-playbook -vv ansible/ci/check_eessi.yml
+
       - name: Confirm Open Ondemand is up (via SOCKS proxy)
         run: |
           . venv/bin/activate
9 changes: 9 additions & 0 deletions ansible/bootstrap.yml
@@ -96,6 +96,15 @@
         tasks_from: config.yml
       tags: config

+- name: Setup EESSI
+  hosts: eessi
+  tags: eessi
+  become: true
+  tasks:
+    - name: Install and configure EESSI
+      import_role:
+        name: eessi
+
 - hosts: update
   gather_facts: false
   become: yes
26 changes: 26 additions & 0 deletions ansible/ci/check_eessi.yml
@@ -0,0 +1,26 @@
+---
+- name: Run EESSI test job
+  hosts: login # TODO: Limit this to single node when there are multiple?
+  become: true
+  become_user: testuser
+  tasks:
+    - name: Clone eessi-demo repo
+      ansible.builtin.git:
+        repo: "https://github.com/eessi/eessi-demo.git"
+        dest: ~/eessi-demo
+
+    - name: Run test job
+      ansible.builtin.shell:
+        cmd: |
+          source /cvmfs/pilot.eessi-hpc.org/latest/init/bash
+          srun ./run.sh
+        chdir: ~/eessi-demo/TensorFlow
+        executable: /bin/bash
+      register: job_output
+
+    - name: Fail if job output contains error
+      fail:
+        # Note: Job prints live progress bar to terminal, so use regex filter to remove this from stdout
+        msg: "Test job using EESSI modules failed. Job output was: {{ job_output.stdout | regex_replace('\b', '') }}"
+      when: '"Epoch 5/5" not in job_output.stdout'
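The TODO on the `hosts` line asks how to limit the test job to a single node when there are several login nodes. One possible approach, sketched here as an assumption and not something this PR implements, is Ansible's host-pattern subscript, which selects one member of a group. The debug task is only a placeholder for the real tasks.

```yaml
# Hypothetical variant of the play header in check_eessi.yml (not part of this PR).
# "login[0]" is an Ansible host pattern selecting the first host in the login group.
- name: Run EESSI test job
  hosts: "login[0]"
  become: true
  become_user: testuser
  tasks:
    # Placeholder task: the real clone/run/fail tasks would go here unchanged.
    - name: Show which host was selected
      ansible.builtin.debug:
        msg: "EESSI test would run on {{ inventory_hostname }}"
```

The subscript keeps the play tied to the `login` group, so it would still work if the group later grows.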

6 changes: 3 additions & 3 deletions ansible/ci/check_sacct_hpctests.yml
@@ -2,7 +2,7 @@
   gather_facts: false
   become: true
   vars:
-    sacct_stdout_expected: |- # based on CI running hpctests as the first job - NB note no trailing newline
+    sacct_stdout_expected: |- # based on CI running hpctests as the first job
       JobID,JobName,State
       1,pingpong.sh,COMPLETED
       2,pingmatrix.sh,COMPLETED
@@ -18,10 +18,10 @@
       register: sacct
     - name: Check info for ended jobs
       assert:
-        that: sacct.stdout == sacct_stdout_expected
+        that: sacct_stdout_expected in sacct.stdout
         fail_msg: |
           Expected:
           --{{ sacct_stdout_expected }}--
           Got:
           --{{ sacct.stdout }}--
-        success_msg: sacct shows hpctests jobs as first and only jobs
+        success_msg: sacct shows hpctests jobs as first jobs in list
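The assertion moves from strict equality to a containment test, so that later jobs (such as the new EESSI test job) no longer break the check. If the intent is specifically that the hpctests jobs come first, a prefix test states that more directly. The sketch below is a hypothetical alternative, not part of this PR; it assumes the registered `sacct.stdout` is a plain string, so that Python's `str.startswith` can be called from the Jinja2 expression.

```yaml
# Hypothetical stricter variant (not part of this PR): require the expected
# block at the very start of the sacct output, while still allowing later jobs.
- name: Check info for ended jobs
  ansible.builtin.assert:
    that: sacct.stdout.startswith(sacct_stdout_expected)
    fail_msg: |
      Expected output to start with:
      --{{ sacct_stdout_expected }}--
      Got:
      --{{ sacct.stdout }}--
    success_msg: sacct shows hpctests jobs as first jobs in list
```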
3 changes: 3 additions & 0 deletions ansible/roles/eessi/defaults/main.yaml
@@ -0,0 +1,3 @@
+---
+# Default to 10GB
+cvmfs_quota_limit_mb: 10000
22 changes: 22 additions & 0 deletions ansible/roles/eessi/tasks/main.yaml
@@ -0,0 +1,22 @@
+---
+- name: Install CVMFS repo
+  yum:
+    name: https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm
+    # NOTE: Can't find any docs on obtaining gpg key
+    disable_gpg_check: true
+- name: Install CVMFS
+  yum:
+    name: cvmfs
+- name: Install EESSI CVMFS config
+  yum:
+    name: https://github.com/EESSI/filesystem-layer/releases/download/latest/cvmfs-config-eessi-latest.noarch.rpm
+    # NOTE: Can't find any docs on obtaining gpg key
+    disable_gpg_check: true
+- name: Add base CVMFS config
+  template:
+    src: default.local.j2
+    dest: /etc/cvmfs/default.local
+# NOTE: Not clear how to make this idempotent
+- name: Ensure CVMFS config is setup
+  command:
+    cmd: "cvmfs_config setup"
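The NOTE above the last task says it is unclear how to make `cvmfs_config setup` idempotent. A common Ansible workaround, sketched here under the assumption that re-running the command is harmless (the PR does not confirm this), is to stop the task from reporting a change on every run:

```yaml
# Hypothetical variant (not part of this PR). Assumes re-running
# `cvmfs_config setup` is safe, so the task is never reported as "changed".
- name: Ensure CVMFS config is setup
  ansible.builtin.command:
    cmd: "cvmfs_config setup"
  changed_when: false
```

This only affects reporting: the command still runs on every play, so it does not make the task idempotent in the strict sense.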
2 changes: 2 additions & 0 deletions ansible/roles/eessi/templates/default.local.j2
@@ -0,0 +1,2 @@
+CVMFS_CLIENT_PROFILE="single"
+CVMFS_QUOTA_LIMIT={{ cvmfs_quota_limit_mb }}
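The template writes the role variable `cvmfs_quota_limit_mb` into `CVMFS_QUOTA_LIMIT`, which CVMFS reads as a cache size in megabytes. To use a different quota, the role default can be overridden from inventory variables. The file path and value below are illustrative assumptions, not part of this PR.

```yaml
# Hypothetical group_vars file for the eessi group, e.g.
# environments/<your-env>/inventory/group_vars/eessi.yml
# Overrides the role default of 10000 (about 10 GB).
cvmfs_quota_limit_mb: 20000  # about 20 GB of local CVMFS cache
```

With this override in place, the rendered `/etc/cvmfs/default.local` would contain `CVMFS_QUOTA_LIMIT=20000`.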
3 changes: 3 additions & 0 deletions environments/common/inventory/groups
@@ -13,6 +13,9 @@ login
 control
 compute

+[eessi:children]
+# Hosts on which EESSI stack should be configured
+
 [hpctests:children]
 # Login group to use for running mpi-based testing.
 login
3 changes: 3 additions & 0 deletions environments/common/layouts/everything
@@ -54,3 +54,6 @@ compute

 [etc_hosts]
 # Hosts to manage /etc/hosts e.g. if no internal DNS. See ansible/roles/etc_hosts/README.md
+
+[eessi:children]
+openhpc