Taints and tolerations
When we run a cluster that hosts multiple environments, for example dev, testing, staging, and production, we usually want the production nodes to be the strongest machines with the most resources, while the dev nodes only need a few. When deploying, we quickly run into a requirement: dev pods must not be scheduled onto the production worker nodes, and vice versa.
Besides letting the scheduler automatically choose which worker node a pod runs on, we can also control that decision ourselves. These techniques are called advanced scheduling, and in this article we will look at ways to place pods on exactly the worker nodes we want. This first part covers two properties that help us restrict which pods can be scheduled onto a node.
Taints and tolerations
The first two advanced scheduling features we'll look at are taints and tolerations.
Taints are used to prevent pods from being scheduled onto a worker node that carries them. We apply taints to a worker node with a kubectl command. For example, if we have a group of worker nodes running the production environment, we can taint those nodes, and pods can no longer be scheduled onto them. But if that were the whole story, no pods at all could run there, so wouldn't the tainted worker nodes become useless? The answer is no.
Tolerations are assigned to pods: a pod whose tolerations satisfy a node's taints can still be scheduled onto that tainted node. For the production worker nodes above, after tainting them we add the matching tolerations to the production pods; now only production pods with the appropriate tolerations can be scheduled onto the production nodes, while dev pods cannot.
To make this easier to understand, let's look at some of the default taints and tolerations in Kubernetes.
Node's Taints
In a Kubernetes cluster, if you pay attention, you will notice that by default our own pods cannot be scheduled onto the master node; only the system pods created when the cluster is bootstrapped run there. The reason is that the master node is tainted, and the system pods carry tolerations matching the master node's taints, so only they can be scheduled onto it.
For example, I have an environment deployed using the kubeadm tool.
We can check the taints of the master node as follows:
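A minimal check, assuming a kubeadm cluster whose master node is named kube-master (the node name here is an assumption):

```bash
# Show only the Taints field of the master node
kubectl describe node kube-master | grep Taints
# Taints:             node-role.kubernetes.io/master:NoSchedule
```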
In the Taints field you will see the value node-role.kubernetes.io/master:NoSchedule; this is the default taint applied to the master node. A taint has the format <key>=<value>:<effect>, so for the value above the key is node-role.kubernetes.io/master, the value is null, and the effect is NoSchedule.
This taint prevents normal pods from being scheduled onto the master node; only system pods that carry the toleration node-role.kubernetes.io/master=:NoSchedule can be scheduled onto it.
Pod's tolerations
List the pods in the kube-system namespace and check whether their tolerations match what we just described.
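For example (the exact pod list will vary between clusters):

```bash
# List all system pods in the kube-system namespace
kubectl get pods -n kube-system
```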
These are the system pods created when I use the kubeadm tool. Let's describe the pod kube-apiserver-kube-master to see its tolerations:
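A sketch of that check, using the pod name from this kubeadm cluster:

```bash
# Show the Tolerations field of the kube-apiserver static pod
kubectl describe pod kube-apiserver-kube-master -n kube-system | grep Tolerations
```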
You will see that the Tolerations field has the value node-role.kubernetes.io/master=:NoSchedule; this toleration is what allows the pod to be scheduled onto the master node. Notice the extra = sign: when the value is null, taints and tolerations display it slightly differently.
Taint effect
In taints and tolerations, the key and value are easy to understand: they are just strings. The effect is different; it can take the following values (example commands for applying each effect are sketched after the list):
NoSchedule: when a taint with this effect is applied to a node, pods that do not tolerate it will not be scheduled onto that node.
PreferNoSchedule: this has the same intent as NoSchedule, but it is a soft preference: if a pod cannot be scheduled onto any other node and the node carrying the PreferNoSchedule taint has enough resources to run it, the pod may still be scheduled onto that node.
NoExecute: unlike NoSchedule and PreferNoSchedule, which only take effect at scheduling time, this effect also applies to pods already running on the node. If a running pod does not have a toleration matching the taint, it is evicted from the node. For example, if we have pods running on a worker node and then taint that node with the NoExecute effect, any pods on it without a matching toleration will be evicted from that node.
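As a sketch, the same key/value pair can be applied with each of the three effects, and appending - removes a taint again (the node names below are hypothetical):

```bash
# Apply a taint with each of the three effects
kubectl taint nodes node1 env=production:NoSchedule
kubectl taint nodes node2 env=production:PreferNoSchedule
kubectl taint nodes node3 env=production:NoExecute

# Remove a taint by appending "-" to the effect
kubectl taint nodes node1 env=production:NoSchedule-
```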
Add Taints to node
As mentioned above, we can add taints to a node. Taking the dev and production example again, we taint the production worker nodes as follows:
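A minimal sketch, assuming the production worker node is named worker-production and using env=production as the taint key and value (both names are illustrative):

```bash
# Only pods that tolerate env=production:NoSchedule can now be scheduled on this node
kubectl taint nodes worker-production env=production:NoSchedule
```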
Now when we create pods, we will see that all of them are scheduled onto the dev worker node.
To schedule pods onto the production node, we add tolerations to them. Create a file called production-deployment.yaml with the following configuration:
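A minimal sketch of production-deployment.yaml, assuming the env=production:NoSchedule taint from the previous step; the image and labels are placeholders, the tolerations block is the relevant part:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: production-app
  template:
    metadata:
      labels:
        app: production-app
    spec:
      containers:
        - name: app
          image: nginx:1.25   # placeholder image
      tolerations:
        # Matches the env=production:NoSchedule taint on the production node
        - key: "env"
          operator: "Equal"
          value: "production"
          effect: "NoSchedule"
```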
When we create this deployment and list its pods, we will see that its pods are scheduled onto the production node.
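Roughly, those steps look like this:

```bash
kubectl apply -f production-deployment.yaml
# The NODE column shows which worker node each pod was scheduled onto
kubectl get pods -o wide
```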
Understand how to use Taints and tolerations
A node can have many taints and a pod can also declare many tolerations. In a taint, the key and effect are mandatory while the value is optional. In a pod's tolerations we set the operator to either Equal (the default) or Exists.
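For illustration, both of the tolerations below match an env=production:NoSchedule taint; Equal requires the value to match exactly, while Exists matches any taint with that key (the key and value are the ones assumed in the earlier example):

```yaml
tolerations:
  # Equal (the default): key, value, and effect must all match the taint
  - key: "env"
    operator: "Equal"
    value: "production"
    effect: "NoSchedule"
  # Exists: matches any taint with this key, regardless of its value
  - key: "env"
    operator: "Exists"
    effect: "NoSchedule"
```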
The first way to use taints and tolerations is the one shown above: separating environments in a Kubernetes cluster that hosts several of them.
The second way to use taints and tolerations, built on the NoExecute effect, is to configure how long pods may stay on a node after it dies before they are evicted and rescheduled onto another node. For example, when we describe a pod, we can see that it has some default tolerations, as follows:
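For example, describing an ordinary pod typically shows the two default NoExecute tolerations that Kubernetes adds automatically (the pod name is a placeholder):

```bash
kubectl describe pod <pod-name> | grep -A 2 Tolerations
# Tolerations:  node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
#               node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
```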
This means that when the Kubernetes control plane detects a dead node, it taints that node with node.kubernetes.io/not-ready:NoExecute. The pod's matching toleration is only valid for 300 seconds; after that time the toleration no longer applies, and the pod is evicted from the node because it no longer tolerates the taint.