Commit 9ce3618

Merge pull request #858 from rabbitmq/document-health-checks
Document new health check endpoints
2 parents 07ec5b6 + 4fd2ec4 commit 9ce3618

1 file changed

+88
-20
lines changed

deps/rabbitmq_management/priv/www/api/index.html

Lines changed: 88 additions & 20 deletions
@@ -951,42 +951,110 @@ <h2>Reference</h2>
951951
<td></td>
952952
<td class="path">/api/aliveness-test/<i>vhost</i></td>
953953
<td>
954-
Declares a test queue, then publishes and consumes a
955-
message. Intended for use by monitoring tools. If everything
956-
is working correctly, will return HTTP status 200 with
957-
body: <pre>{"status":"ok"}</pre> Note: the test queue will
958-
not be deleted (to prevent queue churn if this is
959-
repeatedly pinged).
954+
Declares a test queue on the target node, then publishes and consumes a
955+
message. Intended to be used as a very basic health check.
956+
Responds with a 200 OK if the check succeeded,
957+
otherwise responds with a 503 Service Unavailable.
960958
</td>
961959
</tr>
962960
<tr>
963961
<td>X</td>
964962
<td></td>
965963
<td></td>
966964
<td></td>
967-
<td class="path">/api/healthchecks/node</td>
965+
<td class="path">/api/health/checks/alarms</td>
968966
<td>
969-
Runs basic healthchecks in the current node. Checks that the rabbit
970-
application is running, channels and queues can be listed successfully, and
971-
that no alarms are in effect. If everything is working correctly, will
972-
return HTTP status 200 with body: <pre>{"status":"ok"}</pre> If
973-
something fails, will return HTTP status 200 with the body of
974-
<pre>{"status":"failed","reason":"string"}</pre>
967+
Responds with a 200 OK if there are no alarms in effect in the cluster,
968+
otherwise responds with a 503 Service Unavailable.
975969
</td>
976970
</tr>
977971
<tr>
978972
<td>X</td>
979973
<td></td>
980974
<td></td>
981975
<td></td>
982-
<td class="path">/api/healthchecks/node/<i>node</i></td>
976+
<td class="path">/api/health/checks/local-alarms</td>
983977
<td>
984-
Runs basic healthchecks in the given node. Checks that the rabbit
985-
application is running, list_channels and list_queues return, and
986-
that no alarms are raised. If everything is working correctly, will
987-
return HTTP status 200 with body: <pre>{"status":"ok"}</pre> If
988-
something fails, will return HTTP status 200 with the body of
989-
<pre>{"status":"failed","reason":"string"}</pre>
978+
Responds with a 200 OK if there are no local alarms in effect on the target node,
979+
otherwise responds with a 503 Service Unavailable.
980+
</td>
981+
</tr>
982+
<tr>
983+
<td>X</td>
984+
<td></td>
985+
<td></td>
986+
<td></td>
987+
<td class="path">/api/health/checks/certificate-expiration/<i>within</i>/<i>unit</i></td>
988+
<td>
989+
<p>
990+
Checks the expiration date on the certificates for every listener configured to use TLS.
991+
Responds with a 200 OK if all certificates are valid (have not expired),
992+
otherwise responds with a 503 Service Unavailable.
993+
</p>
994+
<p>
995+
Valid units: days, weeks, months, years. The value of the <i>within</i> argument is the number of
996+
units. So, when <i>within</i> is 2 and <i>unit</i> is "months", the expiration period used by the check
997+
will be the next two months.
998+
</p>
999+
</td>
1000+
</tr>
1001+
<tr>
1002+
<td>X</td>
1003+
<td></td>
1004+
<td></td>
1005+
<td></td>
1006+
<td class="path">/api/health/checks/port-listener/<i>port</i></td>
1007+
<td>
1008+
Responds with a 200 OK if there is an active listener on the given port,
1009+
otherwise responds with a 503 Service Unavailable.
1010+
</td>
1011+
</tr>
1012+
<tr>
1013+
<td>X</td>
1014+
<td></td>
1015+
<td></td>
1016+
<td></td>
1017+
<td class="path">/api/health/checks/protocol-listener/<i>protocol</i></td>
1018+
<td>
1019+
Responds with a 200 OK if there is an active listener for the given protocol,
1020+
otherwise responds with a 503 Service Unavailable. Valid protocol names are: amqp091, amqp10, mqtt, stomp, web-mqtt, web-stomp.
1021+
</td>
1022+
</tr>
1023+
<tr>
1024+
<td>X</td>
1025+
<td></td>
1026+
<td></td>
1027+
<td></td>
1028+
<td class="path">/api/health/checks/virtual-hosts</td>
1029+
<td>
1030+
Responds with a 200 OK if all virtual hosts are running on the target node,
1031+
otherwise responds with a 503 Service Unavailable.
1032+
</td>
1033+
</tr>
1034+
<tr>
1035+
<td>X</td>
1036+
<td></td>
1037+
<td></td>
1038+
<td></td>
1039+
<td class="path">/api/health/checks/node-is-mirror-sync-critical</td>
1040+
<td>
1041+
Checks if there are classic mirrored queues without synchronised mirrors online
1042+
(queues that would potentially lose data if the target node is shut down).
1043+
Responds with a 200 OK if there are no such classic mirrored queues,
1044+
otherwise responds with a 503 Service Unavailable.
1045+
</td>
1046+
</tr>
1047+
<tr>
1048+
<td>X</td>
1049+
<td></td>
1050+
<td></td>
1051+
<td></td>
1052+
<td class="path">/api/health/checks/node-is-quorum-critical</td>
1053+
<td>
1054+
Checks if there are quorum queues with minimum online quorum (queues that
1055+
would lose their quorum and availability if the target node is shut down).
1056+
Responds with a 200 OK if there are no such quorum queues,
1057+
otherwise responds with a 503 Service Unavailable.
9901058
</td>
9911059
</tr>
9921060
<tr>
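
The endpoints added in this diff share one convention: a 200 OK response means the check passed and a 503 Service Unavailable means it failed. The following Python sketch shows how a monitoring script might build the parameterised paths and interpret the response. It is only an illustration: the helper names, the base URL, and the credentials are hypothetical and not part of this commit, HTTP basic authentication is assumed, and the virtual host name is percent-encoded on the assumption that the management API expects that (for example `%2F` for the default virtual host `/`).

```python
import base64
import urllib.error
import urllib.request
from urllib.parse import quote

HEALTH_PREFIX = "/api/health/checks"
# Values listed in the documentation added by this commit.
VALID_UNITS = {"days", "weeks", "months", "years"}
VALID_PROTOCOLS = {"amqp091", "amqp10", "mqtt", "stomp", "web-mqtt", "web-stomp"}


def aliveness_test_path(vhost: str) -> str:
    """Path for /api/aliveness-test/<vhost>; the vhost is percent-encoded (assumption)."""
    return "/api/aliveness-test/" + quote(vhost, safe="")


def certificate_expiration_path(within: int, unit: str) -> str:
    """Path for the certificate expiration check, e.g. within=2, unit='months'."""
    if unit not in VALID_UNITS:
        raise ValueError(f"unsupported unit: {unit!r}")
    if within < 0:
        raise ValueError("within must be non-negative")
    return f"{HEALTH_PREFIX}/certificate-expiration/{within}/{unit}"


def port_listener_path(port: int) -> str:
    """Path for the check that a listener is active on the given port."""
    return f"{HEALTH_PREFIX}/port-listener/{port}"


def protocol_listener_path(protocol: str) -> str:
    """Path for the check that a listener is active for the given protocol."""
    if protocol not in VALID_PROTOCOLS:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    return f"{HEALTH_PREFIX}/protocol-listener/{protocol}"


def is_healthy(status_code: int) -> bool:
    """Map the documented status codes to a boolean; anything else is an error."""
    if status_code == 200:
        return True
    if status_code == 503:
        return False
    raise ValueError(f"unexpected HTTP status: {status_code}")


def run_check(base_url: str, path: str, username: str, password: str,
              timeout: float = 5.0) -> bool:
    """Send a GET request to one health check endpoint and report the result."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_healthy(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses, including the 503
        # that signals a failed check.
        return is_healthy(err.code)
```

A call such as `run_check("http://localhost:15672", HEALTH_PREFIX + "/alarms", "guest", "guest")` would then return `True` or `False`, assuming a management listener at that address and an account with those credentials. Any status other than 200 or 503 (for example an authentication failure) raises `ValueError` rather than being reported as an unhealthy node.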
