In the last article we looked at sidecar and init containers. In this one we will learn how to schedule Jobs and CronJobs in a Kubernetes cluster.
Kubernetes Jobs and CronJobs are Kubernetes objects that are primarily meant for short-lived and batch workloads. Let's look at them in detail in the steps below.
Types of Jobs in Kubernetes
There are two types of jobs in Kubernetes:
- Schedulers (CronJob): like scheduling tasks in crontab in Linux.
- Run to Completion (Job): creates one or more Pods, optionally in parallel, and runs them until they complete successfully.
A Kubernetes Job creates one or more Pods and keeps retrying them until a specified number of them terminate successfully. As Pods complete, the Job tracks the successful completions. When the specified number of successful completions is reached, the task (that is, the Job) is complete. Deleting a Job cleans up the Pods it created.
A simple case is to create one Job object to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).
You can also use a Job to run multiple Pods in parallel, and you control how many run at the same time.
When the specified number of Pods has completed successfully, the Job itself is complete. If a Pod fails or is deleted, the Job controller starts a replacement Pod.
A Job is designed for parallel processing of independent but related work items, such as sending emails, transcoding files, or scanning database keys. An example appears in the section on running multiple Job Pods in parallel.
Here is the manifest for a Job:
---
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  template:
    metadata:
      name: example-job
    spec:
      containers:
        - name: pi
          image: perl
          command:
            - perl
          args:
            - "-Mbignum=bpi"
            - "-wle"
            - "print bpi(2000)"
      restartPolicy: Never
Create a Job:
# kubectl apply -f example-job.yaml
job.batch "example-job" created
Display your jobs:
# kubectl get jobs
NAME                   COMPLETIONS   DURATION   AGE
example-job-27540900   1/1           2s         4m33s
Get details of a job:
# kubectl describe job example-job
Edit a job:
# kubectl edit job example-job
Delete a job:
# kubectl delete job example-job
Running Multiple Job Pods in Parallel
Sometimes we need a Job that runs several Pods in parallel to get a task done. Let's create a sample multiple-jobs.yaml file and see how it behaves.
apiVersion: batch/v1
kind: Job
metadata:
  # name is left out so that generateName takes effect; if both are set, name wins
  generateName: kube-jobs-
  labels:
    jobgroup: kubecron-group
spec:
  completions: 3
  parallelism: 2
  template:
    metadata:
      name: kube-parallel-job
      labels:
        jobgroup: kubecron-group
    spec:
      containers:
        - name: busybox
          image: busybox
          command: ["echo", "kubernetes jobs parallel"]
      restartPolicy: OnFailure
Let's understand the parameters used in the file above.
generateName: creating a second Job with the same name fails with an error reporting that a Job with that name already exists. To avoid this, use the generateName field in the metadata section in place of name. Each time the manifest is submitted, the Job gets a name made of the prefix kube-jobs- plus a random suffix, and its Pods are named after the Job.
completions: the number of Pods that must finish successfully for the Job to be complete.
parallelism: the maximum number of Pods that run at the same time.
restartPolicy: a Pod accepts Always, Never, or OnFailure. Because a Job is meant to run its Pods to completion, only Never and OnFailure are allowed here.
To see what it does, create the Job with kubectl create -f multiple-jobs.yaml (kubectl apply does not work with generateName), check it with kubectl get jobs and kubectl get pods, and remove it with kubectl delete.
Cron Jobs
A CronJob object is just like an entry in crontab in Unix/Linux. It runs a Job periodically on a given schedule. You need a working Kubernetes cluster at version >= 1.8, where CronJob is available as batch/v1beta1. CronJob became stable as batch/v1 in version 1.21, and batch/v1beta1 is no longer served from version 1.25, so use batch/v1 on current clusters.
For older clusters (< 1.8) you need to explicitly enable the batch/v2alpha1 API by passing --runtime-config=batch/v2alpha1=true to the API server.
Here is the manifest for a CronJob:
---
apiVersion: batch/v1  # CronJob is batch/v1 since Kubernetes 1.21; older clusters use batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              args:
                - /bin/sh
                - "-c"
                - "date; echo Hello FoxuTech, from your AKS cluster"
          restartPolicy: OnFailure
Create a CronJob:
# kubectl create -f cronjob.yaml
cronjob.batch "hello" created
List your CronJobs:
# kubectl get cronjobs
NAME    SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
hello   */1 * * * *   False     0        <none>          32s
Get details of a cronjob:
# kubectl describe cronjob hello
Edit a cronjob:
# kubectl edit cronjob hello
Delete a cronjob:
# kubectl delete cronjob hello
Writing a Cron Job Spec
As with all other Kubernetes configs, a cron job needs apiVersion, kind, and metadata fields.
Schedule
The .spec.schedule field is a required field of the .spec. It takes a Cron format string, such as 0 * * * * or @hourly, as the schedule on which its Jobs are created and executed.
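As a rough illustration of the Cron format, the fragment below annotates the five fields and lists a few common values. It shows only the schedule portion of a CronJob spec, not a complete manifest, and the alternative values in the comments are examples chosen for illustration.

```yaml
# Field order:  minute  hour  day-of-month  month  day-of-week
#               0-59    0-23  1-31          1-12   0-6 (Sunday = 0)
spec:
  schedule: "0 * * * *"      # minute 0 of every hour, same as "@hourly"
  # Other values that could be used instead:
  # schedule: "*/5 * * * *"  # every five minutes
  # schedule: "0 2 * * *"    # every day at 02:00
  # schedule: "0 9 * * 1"    # every Monday at 09:00
```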
restartPolicy: a Pod accepts Always, Never, or OnFailure. Because Jobs are meant to run their Pods to completion, only Never and OnFailure are allowed in the Pod template of a CronJob.
Additional parameters can be set when creating a CronJob, as follows.
apiVersion: batch/v1  # CronJob is batch/v1 since Kubernetes 1.21; older clusters use batch/v1beta1
kind: CronJob
metadata:
  name: test-job
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              command: ["echo", "Hello Kubernetes Job"]
          restartPolicy: OnFailure
Concurrency Policy: .spec.concurrencyPolicy controls what happens when a new run comes due while the Job from a previous run is still running. The following values can be used:
- Allow: the default; Jobs from the same CronJob may run concurrently.
- Forbid: concurrent runs are not allowed; if the previous Job has not finished, the new run is skipped.
- Replace: if the previous Job has not finished when the next run is due, it is replaced by the new Job.
Job History Limits: successfulJobsHistoryLimit and failedJobsHistoryLimit are optional fields that set how many finished successful and failed Jobs are kept on the cluster. They default to 3 and 1 respectively. With a limit of 0, no Jobs of that kind are kept after they finish.
backoffLimit: the number of retries allowed for failed Pods before the Job is marked as failed. It is a field of the Job spec, so in a CronJob it goes under .spec.jobTemplate.spec.
activeDeadlineSeconds: use this parameter to put a hard limit on how long a Job may run. For example, to allow each Job started by the CronJob to run for at most one minute, set it to 60 under .spec.jobTemplate.spec.
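The manifests earlier in this article do not set backoffLimit or activeDeadlineSeconds, so here is a minimal sketch of where the two fields could go. The CronJob name deadline-demo, the container name, and the sleep command are made up for illustration, and the apiVersion assumes a cluster at version 1.21 or newer.

```yaml
apiVersion: batch/v1              # assumption: Kubernetes 1.21 or newer
kind: CronJob
metadata:
  name: deadline-demo             # hypothetical name
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      backoffLimit: 3             # allow up to 3 retries before the Job is marked failed
      activeDeadlineSeconds: 60   # stop the Job and its Pods once it has run for 60 seconds
      template:
        spec:
          containers:
            - name: worker        # hypothetical container
              image: busybox
              command: ["sh", "-c", "echo working; sleep 30"]
          restartPolicy: Never
```

Both fields belong to the Job spec, which is why they sit under jobTemplate.spec and not next to schedule.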
Kubernetes Jobs & CronJobs Use Cases
The best use cases for Kubernetes Jobs are:
Batch processing: Let's say you want to run a batch task once a day or on a specific schedule. It could be something like reading files from storage or a database and feeding them to a service that processes them.
Operations/ad-hoc tasks: Let's say you want to run a script that performs a database cleanup, or even one that backs up the Kubernetes cluster itself.
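To make the once-a-day case concrete, a nightly cleanup task could be sketched roughly as below. Everything specific here is a placeholder: the name nightly-db-cleanup, the 02:00 schedule, and the echo command stand in for a real image and script, and the apiVersion again assumes a cluster at version 1.21 or newer.

```yaml
apiVersion: batch/v1              # assumption: Kubernetes 1.21 or newer
kind: CronJob
metadata:
  name: nightly-db-cleanup        # hypothetical name
spec:
  schedule: "0 2 * * *"           # once a day at 02:00
  concurrencyPolicy: Forbid       # skip a run if the previous one is still going
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: cleanup
              image: busybox      # placeholder; a real task would use an image that contains the cleanup script
              command: ["sh", "-c", "echo running cleanup"]
          restartPolicy: OnFailure
```

For a one-off, ad-hoc run of the same task, the Pod template could be placed in a plain Job instead of a CronJob.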