Replace opendistro #197
It is leaving massive gaps between jobs. ETA: fixed by 3a25308.
The slurm-stats datasource was not getting its "database" (i.e. index) set. OpenSearch adds 'security-auditlog*' indices that are not present in OpenDistro, so the dashboard query was returning non-slurm-stats documents without the expected fields, which produced empty rows.
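To illustrate the failure mode, here is a minimal sketch that assumes an unset index behaves like a match-all pattern. The index names and the `slurm-stats*` pattern are hypothetical and chosen only for illustration; this is not the datasource's actual query logic.

```python
from fnmatch import fnmatch


def matching_indices(indices, pattern):
    """Return the index names that match a glob-style pattern."""
    return [name for name in indices if fnmatch(name, pattern)]


# Hypothetical index names, for illustration only.
indices = ["slurm-stats", "security-auditlog-2022.01.01"]

# Assumption: with no index set, the query effectively sees every index.
unscoped = matching_indices(indices, "*")

# With the index set, only slurm-stats documents can be returned.
scoped = matching_indices(indices, "slurm-stats*")
```

With the pattern scoped, audit-log documents that lack the job fields can no longer show up as empty rows.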
3a25308 fails when checking that the hpctests jobs exist in Grafana/OpenSearch. It turns out that neither the dashboard nor the datasources were provisioned in Grafana after rebuilding the control node with the packer-built image (although Grafana is running). Possibly this has always been broken and was just not checked for until this PR.
direct configuration:
control image build:
Note that the certs have a hardcoded 2-year life.
@m-bull I tried using
FIXED: that merge won't be right, as we need an image using the updated Grafana etc.
Ticket: https://stackhpc.atlassian.net/browse/DEV-855
OpenDistro is EOL.
This PR:
- Replaces OpenDistro with OpenSearch.
- Updates `filebeat` to the newest-supported version.
- Adds the required version faking to enable filebeat.
- Configures important OpenSearch settings for production use.
- Updates the Grafana version.
- Updates the OpenSearch Grafana datasource plugin definition.
- Removes the appliance's `grafana-datasources` role, as we can use Grafana's provisioning mode with the `cloudalchemy.grafana` role rather than requiring a customised API-based approach. <-- TODO CHECK: think this was merged already.
- Adds a test in CI that the expected jobs from the hpctests runs are found via Grafana (NB: for slurm-stats this has to be 5 mins past job completion, so it may add some delay).
- Changes the storage used for open{distro,search} from a podman volume to a host directory to enable easier future upgrades/migration/backups.
- Adds a playbook `ansible/adhoc/migrate-opendistro.yml` to migrate OpenDistro data to OpenSearch (checked by upgrading a running cluster from `main` 7bcacb0).
- Uses a new "prebuilt" image in arcus with the updated Grafana version (note that CI was actually using this before this PR, so Grafana has been getting downgraded during CI deployments).
- Merge workaround for `ohpc-base-compute` dependency on singularity.
- Update image build with correct Grafana version, in a PR, and test that.
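
The CI check on hpctests jobs has to tolerate the delay before slurm-stats data becomes visible. One way to sketch this is a polling loop with a deadline. The `fetch_job_ids` callable is a hypothetical stand-in for whatever query the test runs against Grafana, and the timeout and interval values are assumptions, not the values used in this PR.

```python
import time


def wait_for_jobs(fetch_job_ids, expected, timeout_s=600, interval_s=30,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until every expected job ID is visible or the timeout passes.

    Returns the set of job IDs still missing (empty on success).
    fetch_job_ids is a hypothetical callable returning the job IDs
    currently visible via Grafana.
    """
    expected = set(expected)
    deadline = clock() + timeout_s
    while True:
        missing = expected - set(fetch_job_ids())
        if not missing or clock() >= deadline:
            return missing
        sleep(interval_s)
```

Returning the missing IDs rather than a boolean lets the CI failure message name the jobs that never appeared.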
TODO: fix/document migration. Currently it will always run if the opendistro service exists at all.
Once merged and passed:
- [ ] Move appropriate image to release bucket
Closes #70.