Commit b90e12b

[obs] re-enable regular not active alerts (#18341)
* [obs] Add back critical regular not active alerts

  Related to ENG-15

  Now that we have related data, we should resume triggering alerts if the data condition occurs.

* [obs] Fix runbook_url for GitpodImageBuildDurationAnomaly

  Was getting 404

* [obs] Fix GitpodWorkspaceTooManyRegularNotActiveMk2

  given https://www.gitpodstatus.com/incidents/bsrqgmsxw1gr

* [obs] share why regular not active is excluded from Dedicated

* [obs] consolidate runbook for regular not active alerts

1 parent d2b220f commit b90e12b

2 files changed, +12 -8 lines changed

operations/observability/mixins/workspace/rules/central/image-builder.yaml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ spec:
     severity: critical
     dedicated: included
   annotations:
-    runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/GitpodImageBuildDurationAnomaly.md
+    runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/GitpodImagebuildDurationAnomaly.md
     summary: image-builder duration is unusually high in cluster {{ $labels.cluster }}
     description: Users are waiting too long for image builds
   expr: |

operations/observability/mixins/workspace/rules/satellite/workspaces.yaml

Lines changed: 11 additions & 7 deletions
@@ -21,26 +21,30 @@ spec:
 rules:
 - alert: GitpodWorkspaceTooManyRegularNotActiveMk2
   labels:
-    severity: warning
+    severity: critical
+    # TODO: uncomment after recording rule import is working in Grafana Cloud
+    # dedicated: included
   for: 10m
   annotations:
-    runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/GitpodWorkspaceTooManyRegularNotActive.md
+    runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/GitpodWorkspaceRegularNotActive.md
     summary: too many running but inactive workspaces
     description: too many running but inactive workspaces.
   expr: |
-    gitpod_workspace_regular_not_active_percentage_mk2 > 0.08
+    sum(gitpod_workspace_regular_not_active_percentage_mk2) by(cluster) > 0.08
     AND
     sum(gitpod_ws_manager_mk2_workspace_activity_total) by(cluster) > 25
 
 - alert: GitpodWorkspacesNotStartingMk2
   labels:
-    severity: warning
+    severity: critical
+    # TODO: uncomment after recording rule import is working in Grafana Cloud
+    # dedicated: included
   for: 10m
   annotations:
-    runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/GitpodWorkspaceNotStarting.md
+    runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/GitpodWorkspaceRegularNotActive.md
     summary: workspaces are not starting.
     description: inactive regular workspaces exists but workspaces are not being started.
   expr: |
-    avg_over_time(gitpod_workspace_regular_not_active_percentage_mk2[1m]) > 0
+    sum by(cluster) (avg_over_time(gitpod_workspace_regular_not_active_percentage_mk2[1m]) > 0)
     AND
-    rate(gitpod_ws_manager_mk2_workspace_startup_seconds_sum{type="Regular"}[1m]) == 0
+    sum by(cluster) (rate(gitpod_ws_manager_mk2_workspace_startup_seconds_sum{type="Regular"}[1m])) == 0
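
Note on the `sum ... by(cluster)` wrapping: PromQL's `AND` keeps a left-hand sample only when the right-hand side contains a sample with exactly the same label set, so aggregating both operands to `cluster` puts them on matching labels. The Python sketch below models that behaviour for the `GitpodWorkspaceTooManyRegularNotActiveMk2` expression. It is an illustration under assumptions, not the project's tooling: the helper names (`vec`, `sum_by`, `gt`, `vector_and`), the sample values, and the extra `pod` label are hypothetical, and the commit does not say which labels the recording rule actually carries.

```python
from collections import defaultdict

# An instant vector is modelled as {frozenset of (label, value) pairs: sample value}.
def vec(*samples):
    return {frozenset(labels.items()): value for labels, value in samples}

def sum_by(vector, *keep):
    """Mimic `sum by(<keep>) (vector)`: drop every label not in `keep`, then add up."""
    out = defaultdict(float)
    for labels, value in vector.items():
        key = frozenset((k, v) for k, v in labels if k in keep)
        out[key] += value
    return dict(out)

def gt(vector, threshold):
    """Mimic `vector > threshold`: keep only samples above the threshold."""
    return {labels: v for labels, v in vector.items() if v > threshold}

def vector_and(lhs, rhs):
    """Mimic `lhs AND rhs`: keep lhs samples whose full label set also exists in rhs."""
    return {labels: v for labels, v in lhs.items() if labels in rhs}

# Hypothetical samples; the `pod` label is an assumption made for illustration.
not_active = vec(({"cluster": "c1", "pod": "a"}, 0.10))
activity = vec(
    ({"cluster": "c1", "pod": "a"}, 20.0),
    ({"cluster": "c1", "pod": "b"}, 10.0),
)

# Right-hand side, unchanged by the commit: sum(...) by(cluster) > 25
rhs = gt(sum_by(activity, "cluster"), 25)

# Pre-change left-hand side: raw series > 0.08
old_result = vector_and(gt(not_active, 0.08), rhs)

# Post-change left-hand side: sum(...) by(cluster) > 0.08
new_result = vector_and(gt(sum_by(not_active, "cluster"), 0.08), rhs)
```

With these hypothetical samples, the pre-change form produces no result because the label set `{cluster, pod}` never equals `{cluster}`, while the aggregated form produces one sample for cluster `c1`. If the recording rule only ever exposed a `cluster` label, both forms would match and the aggregation would serve a different purpose.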
