Smooth Transition to Kubernetes
If you are not familiar with Pods, you can visit Ephemeral Pods in Kubernetes, and please check the sample project used in this post. Here, we’re going to focus on the Replication Controller in Kubernetes.
After creating a Pod, it should always be accessible and should remain healthy without any manual intervention. There are tons of traditional (and expensive) ways to achieve that, but our scope is Kubernetes, and we should utilize Kubernetes features: some resource in Kubernetes should manage the Pods on our behalf. At that exact point, a ReplicationController comes into play to ensure its Pods are always kept running.
What is a ReplicationController?
Just think: what can you do in case of a Pod failure or a node failure, or when you have to shut down a node for maintenance, or when you want to create a new Pod due to a performance issue, and so on? If you create a Pod manually, or via a Jenkins job, without a Replication Controller, you have to keep it running by hand. We all already know those pains 🙂 . To get rid of them, we hand this job over to a Replication Controller, which manages a set of Pods and makes sure our Pods are in the desired state (current state vs. desired state; I’ll explain later on). In brief, a Replication Controller is harnessed for high resiliency.
Side Note: Actually, a Deployment resource is preferred over a Replication Controller these days, but I’m going to cover the Deployment resource in another post.
How to use a ReplicationController?
Please check the sample replication controller YAML file:
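The file itself isn’t reproduced in the post, so here is a minimal ReplicationController manifest along those lines, as a sketch: the metadata name, labels, and container image below are assumptions for illustration, not copied from the repository.

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: newsproducer-rc            # name is an assumption
spec:
  replicas: 3                      # desired number of Pods
  selector:
    app: newsproducer              # Pods with this label are managed by the RC
  template:                        # Pod template used whenever a new Pod is needed
    metadata:
      labels:
        app: newsproducer          # must match the selector above
    spec:
      containers:
      - name: newsproducer
        image: news-producer:latest   # image name is an assumption
```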
To hand over Pod management to a Replication Controller, we should use a “selector” like the one above, which matches the related Pods. Let’s apply it and check it (we will see three Pods after executing the commands below, due to “replicas: 3”).
$ kubectl apply -f https://raw.githubusercontent.com/ErhanCetin/k8s-smooth-transition/develop/k8s/medium-post/replication-controller/news-tracker-job-rc.yaml
$ kubectl get pods -o wide
Here we have three Pods, all belonging to the “news producer” application. When you check the node column, you will see minikube, because we have just one node in our local minikube installation. If we had more than one node, we would probably (but not certainly) see other node names, too. I wrote “not certainly” because Kubernetes decides which node is suitable for new Pods by doing scheduling calculations (more details in the section below). All three Pods can be on one node or distributed across the nodes.
How does a ReplicationController work?
Before talking about the Replication Controller itself, I want to give some details about what the controllers (ReplicaSet, Deployment, Namespace, StatefulSet controllers, etc.) in Kubernetes aim to do:
- The Controllers register themselves through the API Server to be notified of interesting resource changes.
- The Controllers watch the API Server for changes to resources (Pods, Namespaces, ReplicaSets, Deployments, Services, etc.) and take action on each change, such as the creation of a new object or the update or deletion of an existing one.
- The Controllers run the reconciliation loop to reconcile the actual (or current) state with the desired state (specified in the resource’s spec section) and write the new actual state to the resource’s status section.
- The Controllers don’t know that any other controllers exist and never talk to each other directly. Each controller connects to the API Server and, through the watch mechanism, asks to be notified when a change occurs in the list of resources of the types that controller is responsible for.
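To make the controller pattern above concrete, here is a toy Python sketch (not real Kubernetes client code; all names are invented for illustration) of a controller that registers a watch with a stand-in API Server and reconciles the status section toward the spec section:

```python
# Toy sketch of the controller pattern: register a watch for "my" resource
# kind, then reconcile actual state (status) toward desired state (spec).
from dataclasses import dataclass, field

@dataclass
class Resource:
    kind: str
    spec: dict                                   # desired state
    status: dict = field(default_factory=dict)   # actual state, written back

class FakeApiServer:
    """Stands in for the API Server's watch/notify mechanism."""
    def __init__(self):
        self.watchers = {}                       # kind -> list of callbacks
    def watch(self, kind, callback):
        self.watchers.setdefault(kind, []).append(callback)
    def notify(self, event, resource):
        for cb in self.watchers.get(resource.kind, []):
            cb(event, resource)

class Controller:
    def __init__(self, api, kind):
        api.watch(kind, self.reconcile)          # register for changes to "my" kind
    def reconcile(self, event, resource):
        # Reconciliation loop body: move actual toward desired, then
        # record the new actual state in the resource's status section.
        resource.status["observed_replicas"] = resource.spec["replicas"]

api = FakeApiServer()
Controller(api, "ReplicationController")
rc = Resource("ReplicationController", spec={"replicas": 3})
api.notify("ADDED", rc)
print(rc.status)  # {'observed_replicas': 3}
```

Note that the controller never talks to other controllers: it only reacts to notifications delivered through the (fake) API Server, mirroring the design described above.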
As I mentioned above, a ReplicationController should ensure its Pods are always kept running. Let’s test this:
$ kubectl get pods -o wide
I’m going to delete one of them to simulate a failure scenario:
$ kubectl delete pods newsproducer-pod-q2p89
$ kubectl get pods -o wide
And now, newsproducer-pod-nj7bp has been created automatically by the Replication Controller. A small reminder, though: if a whole node fails, Pods on that node which are not managed by a controller are lost and will not be replaced with new ones.
To understand how Kubernetes creates a new Pod when one has failed, has been deleted, or is needed for scaling up, we need to dive into the event system in Kubernetes:
The Kubernetes system components communicate only with the API Server. They don’t talk to each other directly, and only the API Server talks directly to etcd. There are reasons behind that, but they are not the topic here. When Kubernetes comes up, every component registers itself through the API Server. That means a master-node component (Controller Manager, Scheduler, …) can request to be notified when a resource is created, modified, or deleted (the watch mechanism; see the figure above).
Replication Controller resource manifests (e.g. YAML files) are stored in etcd through the API Server, as are all the resources you’ve created. In case of a failure or a modification, those manifests are then used. Let’s list briefly (not with every step) what happens when you submit a Replication Controller YAML through kubectl:
- Steps 1, 2, and 3: kubectl sends the manifest (e.g. a YAML file) to the API Server. After the API Server performs authentication, authorization, validation, and some modification of the manifest, it writes the manifest to etcd. The Replication Controller is notified and, if a new Pod is needed, writes a new Pod definition to etcd through the API Server.
- Step 4: The Scheduler gets notified and fetches the Pod definition it is interested in. After a calculation based on computational resources to choose a target node, it puts the node info into the Pod definition and writes it back to etcd through the API Server.
- Steps 5 and 6: Finally, the API Server notifies (through the watch mechanism) the Kubelet on the node to which the Scheduler assigned the new Pod. As soon as the Kubelet on the target node sees that the Pod has been scheduled to its node, it creates and runs the Pod’s containers.
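As an illustration of the scheduling calculation in step 4, here is a toy Python sketch: filter out nodes without enough free resources, then pick the best remaining one. This is only a sketch of the idea; the real Scheduler applies many more filtering and scoring criteria than free CPU, and the function and data shapes here are invented.

```python
# Toy sketch of node selection: keep nodes with enough free CPU for the
# Pod's request, then pick the node with the most free CPU.
def pick_node(nodes, pod_cpu_request):
    """nodes: list of (name, free_cpu) tuples; returns a node name or None."""
    candidates = [(name, free) for name, free in nodes if free >= pod_cpu_request]
    if not candidates:
        return None  # no node fits: the Pod stays Pending
    return max(candidates, key=lambda nf: nf[1])[0]

nodes = [("minikube", 2.0), ("node-b", 0.2)]
print(pick_node(nodes, 0.5))  # minikube
```

This also explains the “not certainly” remark earlier: with several nodes, where a new Pod lands depends on this calculation, so all replicas may end up on one node or be spread across several.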
When a Pod fails :
When a Pod fails, the controller is notified and rechecks the desired vs. actual replica count of the resources it is responsible for. A Replication Controller constantly checks the list of running Pods to make sure the “desired” replica count equals the “actual” (current) replica count (see the figure below). The main idea here is to reconcile the actual count with the desired one: in each iteration, the controller finds the number of Pods matching its Pod selector and compares it to the desired replica count. If too many Pods are running, it removes some; if too few, it creates new ones. Say a new Pod is to be created: the controller writes a new Pod definition (extracted from the template in the ReplicationController YAML file), and then the steps above start over.
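The comparison the loop makes on each iteration can be sketched in a few lines of Python (illustrative only; the real controller works with Pod objects selected by label, not plain names):

```python
# Toy sketch of one reconciliation-loop iteration: compare the actual
# replica count with the desired one and decide what action to take.
def reconcile_replicas(desired, running_pods):
    """running_pods: names of Pods matching the selector; returns (action, detail)."""
    actual = len(running_pods)
    if actual > desired:
        return ("delete", running_pods[desired:])   # too many: remove the surplus
    if actual < desired:
        return ("create", desired - actual)         # too few: create from the template
    return ("noop", None)                           # actual == desired: nothing to do

print(reconcile_replicas(3, ["pod-a", "pod-b"]))    # ('create', 1)
print(reconcile_replicas(3, ["a", "b", "c", "d"]))  # ('delete', ['d'])
```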
Useful Replication Controller Commands :
$ kubectl get rc -n <your-namespace> # list Replication Controllers
$ kubectl describe rc <your-rc-name> # get details of an rc
# The command below also deletes the Pods which the Replication Controller is responsible for.
# If you don’t want to delete the Pods at the same time, you can add the --cascade=false flag.
$ kubectl delete rc <your-rc-name>
We can also hand over the “Rolling Update” process of Pods. I don’t want to give details here, because I will expand on this process in the Deployment resource post.
See you there.