Terminator for Kubernetes Cluster

This post is going to be short. It addresses a specific issue on a Kubernetes development cluster that you may also face.

Issue:

You have a development cluster that is used by your CI/CD chain. During working hours, the cluster scales up and down depending on the build jobs (I assume you have already enabled auto-scaling for your cluster). In some cases, pods that act as runners for the CI/CD chain stay in the Running state indefinitely. Since runner pods are technically healthy, they will not be killed or removed automatically by the CI configuration unless your CI system has a feature for that.

In this case, these pods keep running until they are manually removed from the cluster. This prevents the cluster from scaling down outside working hours.

Solution:

We set up a CronJob that runs periodically and checks whether any pod is older than a threshold we specify. If so, the CronJob terminates those pods. I called it terminator.

What we need:

  • a Docker image with kubectl installed
  • a service account
  • a Role and RoleBinding for the service account

1. Step: Create a CronJob

A Kubernetes CronJob is a K8S object for performing periodic tasks such as backups, notifications, or a task for terminator :) You simply specify when the CronJob should perform the task, and it runs at that time. Schedules use standard cron syntax, so you can run a job every few minutes, hourly, daily, weekly, and so on.

A CronJob fits our needs because the cluster has to be checked regularly (say, every hour or every 5 hours), and any pod that has been running longer than a threshold (let's say 5 hours) should be removed.

Here is our CronJob YAML file:

apiVersion: batch/v1 # use batch/v1beta1 on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name: terminator
spec:
  schedule: "0 * * * *" # every hour
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: terminator # not created yet, see step 2
          containers:
          - name: terminator
            image: bitnami/kubectl # trusted image
            command:
            - /bin/sh
            - -c
            - kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' | awk '$2 <= "'$(date -d'now-5 hours' -Ins --utc | sed 's/+0000/Z/')'" { print $1 }' | xargs --no-run-if-empty kubectl delete pod;
            imagePullPolicy: IfNotPresent
          restartPolicy: OnFailure

In the YAML file above, we define a CronJob named terminator with an hourly schedule, so the job is triggered at minute 0 of every hour. We use the image from Bitnami, but you could of course use your own custom image as well. The key command is:

kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' | awk '$2 <= "'$(date -d'now-5 hours' -Ins --utc | sed 's/+0000/Z/')'" { print $1 }' | xargs --no-run-if-empty kubectl delete pod;

It is a really long command: it lists all pods with their creation timestamps, keeps the ones created more than 5 hours ago, and deletes them.
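To make the one-liner easier to follow, here is a minimal sketch that splits it into two small shell functions. The names MAX_AGE_HOURS, cutoff_timestamp and filter_older_than are hypothetical and not part of the original setup. It also assumes a date command that accepts -d @EPOCH (GNU coreutils or BusyBox) and formats the cutoff directly instead of patching the output with sed.

```shell
#!/bin/sh
# Sketch only: the function and variable names below are illustrative assumptions.

MAX_AGE_HOURS=5

# Print the UTC time N hours ago in the format Kubernetes uses for
# creationTimestamp, e.g. 2024-01-01T07:00:00Z.
# Assumes a date that understands "-d @EPOCH" (GNU coreutils, BusyBox).
cutoff_timestamp() {
  hours="$1"
  now=$(date +%s)
  date -u -d "@$((now - hours * 3600))" +%Y-%m-%dT%H:%M:%SZ
}

# Read "NAME CREATION_TIMESTAMP" lines on stdin and print the names whose
# timestamp is at or before the cutoff given as $1. Timestamps in this fixed
# format sort chronologically as plain strings; appending "" forces awk to
# compare them as strings.
filter_older_than() {
  awk -v cutoff="$1" '($2 "") <= (cutoff "") { print $1 }'
}

# Usage against a real cluster (not executed here):
#   kubectl get pods -o go-template \
#     --template '{{range .items}}{{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' \
#     | filter_older_than "$(cutoff_timestamp "$MAX_AGE_HOURS")" \
#     | xargs --no-run-if-empty kubectl delete pod --dry-run=client
```

The --dry-run=client flag in the usage comment only reports which pods would be deleted. Dropping it once the list looks right gives the same behaviour as the original command.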

2. Step: Create a service account. The kubectl container created by the CronJob needs credentials to access your cluster, and we will use a service account for that.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: terminator

3. Step: Create a role and bind it. We have created the service account, so now let's grant it the required permissions.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: modify-pods
rules:
- apiGroups: [""]
  resources:
  - pods
  verbs:
  - get
  - list
  - delete

We have defined which resources are accessible (pods) and which actions are allowed (get, list, delete).

Finally, bind the role:

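apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: modify-pods-to-sa
subjects:
- kind: ServiceAccount
  name: terminator # must match the service account from step 2
  namespace: default # assumption: the namespace where the service account lives
roleRef:
  kind: Role
  name: modify-pods
  apiGroup: rbac.authorization.k8s.io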
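
Before waiting for the first scheduled run, it can help to check that the binding actually grants what the terminator needs. The sketch below uses kubectl auth can-i with impersonation. The helper names sa_subject and check_pod_access are hypothetical, the service account is assumed to be named terminator, and the namespace default in the usage comment is an assumption. Your own user also needs permission to impersonate service accounts.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative assumptions.

# Build the user name Kubernetes assigns to a service account:
# system:serviceaccount:<namespace>:<name>
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether the terminator service account in NAMESPACE
# may perform VERB on pods. kubectl prints "yes" or "no".
check_pod_access() {
  namespace="$1"
  verb="$2"
  kubectl auth can-i "$verb" pods \
    --namespace "$namespace" \
    --as "$(sa_subject "$namespace" terminator)"
}

# Usage against a real cluster (not executed here):
#   for verb in get list delete; do check_pod_access default "$verb"; done
```

If any of the three checks prints "no", the usual culprits are a service account name or namespace in the RoleBinding that does not match the one used by the CronJob.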

Now we are done with the configuration, so let's try it out.

As you can see, the pod named web-79d88c97d6-7h5hp had been running for 5 days. Two minutes later the CronJob started the terminator container, and 20 seconds later the web-79d88c97d6-7h5hp pod was terminated. It started again, but only because I deployed it as a Deployment, so K8S keeps it alive. The essential part is that the CronJob runs periodically and terminates pods that are older than the time you specified. This is not meant for Deployments. It is meant for a CI/CD cluster that runs pods as runners.

Everything is available in one YAML file as a gist.

NOTES:

For some Kubernetes objects, the apiVersion might differ depending on your Kubernetes version.
