Restarting a machine in a Kubernetes cluster

NOTE

Know which kind of machine is going to be restarted
1. control plane (api-server, controllers, etc.)
2. node (runs actual workload, e.g. Brig or Webapp)
3. both of the above combined on one machine
The kind of machine in question must be deployed redundantly
Take out machines in a rolling fashion (sequentially, one at a time)

Control plane

If etcd is hosted on the same machine as the control plane (common practice), you also need to take the implications for etcd into account when restarting that machine (see How to rolling-restart an etcd cluster).

Regardless of where etcd is located, before turning off any machine that is part of the control plane, one should back up the cluster state.
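
One common way to back up the cluster state is an etcd snapshot. The following is a minimal sketch, not a prescribed procedure: it assumes etcdctl is installed on the machine, and the variable names ETCD_ENDPOINT, ETCD_CACERT, ETCD_CERT and ETCD_KEY are placeholders introduced here whose values depend on how the cluster was installed.

```shell
#!/usr/bin/env bash
# Sketch: save an etcd snapshot before taking a control plane machine down.
# ETCD_ENDPOINT, ETCD_CACERT, ETCD_CERT and ETCD_KEY are assumed names;
# set them to match your installation before calling the function.
backup_etcd() {
  local target="$1"   # file the snapshot is written to
  ETCDCTL_API=3 etcdctl \
    --endpoints="${ETCD_ENDPOINT}" \
    --cacert="${ETCD_CACERT}" \
    --cert="${ETCD_CERT}" \
    --key="${ETCD_KEY}" \
    snapshot save "${target}"
}
```

It could be called as `backup_etcd /var/backups/etcd-snapshot.db` (the target path is only an example), after which the snapshot file should be copied off the machine that is about to be restarted.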

If a part of the control plane is not sufficiently redundant, it is advisable to prevent any mutating interaction with the cluster during the procedure, until the cluster is healthy again.

`kubectl get nodes`
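
To turn the node listing into a yes/no health check between restarts, something like the following could be used. It is a sketch under the assumption that "healthy" means every node shows a STATUS of exactly Ready; the function name all_nodes_ready is made up for this example, and a real health check may also want to look at the control plane pods.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if every node reports STATUS exactly "Ready".
# Nodes shown as "NotReady" or "Ready,SchedulingDisabled" count as unhealthy.
all_nodes_ready() {
  kubectl get nodes --no-headers | awk '
    $2 != "Ready" { bad = 1; print "not ready: " $1 " (" $2 ")" }
    # Fail as well when no nodes were listed (e.g. the API server is down).
    END { exit ((NR == 0 || bad) ? 1 : 0) }
  '
}
```

Running `all_nodes_ready && echo "safe to continue"` before moving on to the next machine keeps the rolling restart from proceeding while a node is still down or cordoned.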

Node

High-level steps:

Drain the node so that all workload is rescheduled on other nodes
Restart / Update / Decommission
Mark the node as being schedulable again (if not decommissioned)
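
The steps above can be sketched with kubectl as two small helpers, with the actual restart or update happening in between. This is an illustration only: the function names drain_node and restore_node and the timeout values are assumptions, and the drain flags shown (which skip DaemonSet pods and discard emptyDir data) may not be appropriate for every workload.

```shell
#!/usr/bin/env bash
# Sketch: drain a node before maintenance and make it schedulable afterwards.

# Step 1: cordon the node and evict its pods so they are rescheduled elsewhere.
drain_node() {
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data --timeout=300s
}

# Step 3: wait until the node reports Ready, then allow scheduling again.
# Note: shortly after a reboot the API may still show the old Ready status,
# so run this only once the machine has actually come back up.
restore_node() {
  kubectl wait --for=condition=Ready "node/$1" --timeout=600s \
    && kubectl uncordon "$1"
}
```

For a node named worker-1 (a made-up name), the sequence would be `drain_node worker-1`, then the restart or update, then `restore_node worker-1`. A decommissioned node is drained but never restored.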

For more details please refer to the official documentation: Safely Drain a Node