Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady

From Wikitech
The procedures in this runbook require admin permissions to complete.

The ToolforgeKubernetesNodeNotReady alert fires when a Toolforge Kubernetes node is marked as not ready. A separate paging alert fires when at least 5 nodes are marked as not ready.

Debugging

On a bastion, run the following as your own user to list all nodes and then inspect the affected one:

$ kubectl sudo get node
$ kubectl sudo describe node <node>
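
On a large cluster it can help to reduce the node list to only the affected nodes. The following is a minimal sketch, not part of the official tooling: the helper names not_ready_nodes and not_ready_count are hypothetical, and the filter assumes the default table layout of the node listing (NAME in the first column, STATUS in the second).

```shell
#!/bin/sh
# Sketch: filter the tabular output of `kubectl sudo get node`, read from stdin.
# Assumption: default column layout, NAME in column 1 and STATUS in column 2.

# Print the names of nodes whose STATUS does not start with "Ready".
# A cordoned node ("Ready,SchedulingDisabled") still counts as ready here.
not_ready_nodes() {
    awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Print how many nodes are not ready (prints 0 if there are none).
not_ready_count() {
    awk 'NR > 1 && $2 !~ /^Ready/ { n++ } END { print n + 0 }'
}

# Hypothetical usage on a bastion (not executed by this block):
#   kubectl sudo get node | not_ready_nodes
#   kubectl sudo get node | not_ready_count
```

A count of 5 or more corresponds to the paging threshold described above. Each name printed by not_ready_nodes can then be passed to kubectl sudo describe node to read its conditions and recent events.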

Support contacts

Old incidents