Restarting a machine in a Kubernetes cluster
Note
Know which kind of machine is going to be restarted:
  a. control plane (api-server, controllers, etc.)
  b. node (runs the actual workload, e.g. Brig or Webapp)
  c. a and b combined
The kind of machine in question must be deployed redundantly
Take out machines in a rolling fashion (sequentially, one at a time)
Control plane
If etcd is hosted on the same machine as the control plane (common practice), restarting that machine also takes an etcd member offline, so the implications described in How to rolling-restart an etcd cluster apply as well.
Regardless of where etcd is located, before turning off any machine that is part of the control plane, one should back up the cluster state.
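One common way to back up the cluster state is an etcd snapshot. The following is a minimal sketch, not a prescribed procedure: the function name backup_etcd, the ETCD_* environment variables, and the default endpoint and certificate paths (kubeadm-style locations) are assumptions and must be adapted to the actual installation. It also assumes an etcdctl that speaks the v3 API by default.

```shell
# Sketch: take an etcd snapshot before touching a control-plane machine.
# backup_etcd and the ETCD_* variables are hypothetical names; the default
# endpoint and certificate paths are assumptions (kubeadm-style layout).
backup_etcd() {
  # Default to a timestamped file name when no target path is given.
  snapshot_file="${1:-etcd-snapshot-$(date +%Y%m%d-%H%M%S).db}"

  etcdctl \
    --endpoints="${ETCD_ENDPOINT:-https://127.0.0.1:2379}" \
    --cacert="${ETCD_CACERT:-/etc/kubernetes/pki/etcd/ca.crt}" \
    --cert="${ETCD_CERT:-/etc/kubernetes/pki/etcd/server.crt}" \
    --key="${ETCD_KEY:-/etc/kubernetes/pki/etcd/server.key}" \
    snapshot save "$snapshot_file"
}
```

Called as `backup_etcd /var/backups/etcd-before-restart.db` (an example path), it writes the snapshot to that file. Copy the file off the machine before shutting it down.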
If a part of the control plane is not running with sufficient redundancy, avoid any mutating interaction with the cluster during the procedure, until the cluster is healthy again. The node status can be checked with:
kubectl get nodes
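When restarting several machines one after another, it can help to turn that check into a gate that must pass before the next machine is taken out. The sketch below is one possible way to do this; the function name all_nodes_ready is hypothetical, and it assumes the default column layout of `kubectl get nodes` (NAME, STATUS, ...).

```shell
# Sketch: succeed only if every node reports exactly "Ready".
# all_nodes_ready is a hypothetical helper name. A cordoned node shows up as
# "Ready,SchedulingDisabled" and is deliberately treated as not ready here.
all_nodes_ready() {
  # Fail early if the API server cannot be reached at all.
  nodes="$(kubectl get nodes --no-headers)" || return 1

  # Column 1 is the node name, column 2 its status.
  printf '%s\n' "$nodes" | awk '
    NF && $2 != "Ready" { bad = 1; print "not ready: " $1 " (" $2 ")" }
    END { exit (bad ? 1 : 0) }
  '
}
```

A possible use is `all_nodes_ready || echo "cluster not healthy yet, do not continue"` between two restarts.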
Node
High-level steps:
1. Drain the node so that all workload is rescheduled on other nodes
2. Restart / Update / Decommission
3. Mark the node as being schedulable again (if not decommissioned)
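The steps above can be sketched as a small shell function. This is an illustration under assumptions, not the only way to do it: restart_node is a hypothetical name, the maintenance command is supplied by the caller, and the drain flags (`--ignore-daemonsets`, `--delete-emptydir-data`) are a common choice that discards emptyDir data and requires a reasonably recent kubectl.

```shell
# Sketch: drain a node, run a caller-supplied maintenance command, uncordon.
# restart_node is a hypothetical helper name.
# Usage: restart_node <node-name> [maintenance command ...]
restart_node() {
  node_name="$1"
  shift

  # Step 1: evict the workload. DaemonSet-managed pods are left in place,
  # and pods using emptyDir volumes are evicted (their data is lost).
  kubectl drain "$node_name" --ignore-daemonsets --delete-emptydir-data || return 1

  # Step 2: restart or update the machine, if a command was given.
  if [ "$#" -gt 0 ]; then
    "$@" || return 1
  fi

  # Step 3: make the node schedulable again.
  # Skip this step when the node is being decommissioned.
  kubectl uncordon "$node_name"
}
```

For example, `restart_node worker-1 ssh worker-1 sudo systemctl reboot` (host name and command are placeholders). Since a reboot command usually returns before the machine is back, confirm that the node reports Ready again before moving on to the next one.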
For more details please refer to the official documentation: Safely Drain a Node