FoxuTech

How to take Azure Kubernetes Backup using Velero

Backup is another important topic. Whatever the system or environment, backups are mandatory, and Kubernetes, including any managed Kubernetes, is no exception. Even if you’re using Infrastructure as Code and all deployments are automated, taking backups of AKS clusters brings additional benefits. You can check some best practices from Microsoft.

Why Should You Take Backups?

Most SREs, and anyone else who manages Kubernetes, have probably experienced the risk and the time involved in taking dumps of a large database running on a Kubernetes platform. Some configuration can be even more crucial to preserve. To reduce this risk, AKS offers a couple of options.

In this post, let’s check how to use Velero to back up Azure Kubernetes Service.

Mean Time To Recover (MTTR)

You are always expected to define the MTTR for bringing back any given application or system. Nowadays most operations, such as cluster creation and application deployment, are automated through pipelines built with CI/CD tools like Jenkins, CircleCI, etc. But does that bring the cluster back quickly? There is a lot to check here. We also cannot be sure that the cluster will return to exactly its previous state. And recovery takes longer as the number of services grows, because the number of parallel executions is usually limited.

Let’s assume each execution takes ~3-4 minutes and you have to deploy 10+ microservices or applications. The whole run may then take approximately 30-45 minutes. That number might have been acceptable 4-5 years ago, but by today’s standards it is high.
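The estimate above is simple arithmetic, and it can be sketched in a few lines of shell. The function name and the idea of deploying in fixed-size parallel batches are assumptions made for illustration; real pipelines rarely behave this neatly.

```shell
# Rough redeploy-time estimate: services are deployed in batches of
# $parallel, and each batch takes about $minutes_each minutes.
estimate_redeploy_minutes() (
  services=$1
  minutes_each=$2
  parallel=${3:-1}
  # Ceiling division: number of batches needed to cover all services.
  batches=$(( (services + parallel - 1) / parallel ))
  echo $(( batches * minutes_each ))
)

estimate_redeploy_minutes 10 3      # 10 services, one at a time, 3 min each
estimate_redeploy_minutes 10 4      # 10 services, one at a time, 4 min each
estimate_redeploy_minutes 12 4 2    # 12 services, two in parallel, 4 min each
```

Ten sequential deployments at 3-4 minutes each land in the 30-40 minute range, which matches the figure quoted above once a few more services are added.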

So during any disaster or failure, recreating new infrastructure and redeploying all components can take time. Depending on the criticality of the incident and the importance of the app, it can feel like an eternity.

To address this kind of challenge, a tool like Velero backs up all Kubernetes resources so that a cluster can be quickly restored to a known state. It can reduce the recovery time from ~45 minutes to approximately 15 minutes.

Another advantage of a backup tool is that it can back up the data in persistent volumes. If you are running a stateful application on the cluster, it uses a persistent volume to store its data.

Velero

Velero (formerly Heptio Ark) is an open source tool for safely backing up and restoring resources in a Kubernetes cluster, performing disaster recovery, and migrating resources and persistent volumes to another Kubernetes cluster.

Velero offers key data protection features, such as scheduled backups, retention schedules, and pre- or post-backup hooks for custom actions. Velero can help protect data stored in persistent volumes and makes your entire Kubernetes cluster more resilient.

Velero Use Cases

Here are some of the things Velero can do:

How It Works

Each Velero operation–on-demand backup, scheduled backup, restoration–is a custom resource that is defined with a Kubernetes custom resource definition, or CRD, and stored in etcd. Velero includes controllers that process the CRDs to back up and restore resources. You can back up or restore all objects in your cluster, or you can filter objects by type, namespace, or label.
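To make the "operation as a custom resource" idea concrete, here is a minimal sketch that writes a Backup resource to a file. The file name, the namespace "my-app" and the storage location "default" are placeholders chosen for this example, and it assumes Velero is installed in the "velero" namespace.

```shell
# Write a Backup custom resource that describes an on-demand backup of a
# single namespace. All names below are placeholders for illustration.
cat > example-backup.yaml <<'EOF'
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: example-backup
  namespace: velero          # namespace where the Velero server runs
spec:
  includedNamespaces:
    - my-app                 # filter: back up only this namespace
  storageLocation: default   # name of a BackupStorageLocation
  ttl: 720h0m0s              # keep the backup for 30 days
EOF

# With Velero running in the cluster, its controller would act on this object:
#   kubectl apply -f example-backup.yaml
```

Creating a backup with the velero CLI produces the same kind of object, which is why backups can also be listed with kubectl in the Velero namespace.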

Data protection is a chief concern for application owners who want to make sure that they can restore a cluster to a known good state, recover from a crashed cluster, or migrate to a new environment. Velero provides those capabilities.

Velero Components and Architecture

Velero contains two main components: a server that runs in your cluster and a command-line client that runs locally.

Velero supports plug-ins to enable it to work with different storage systems and Kubernetes platforms. You can run Velero in clusters on a cloud provider or on premises.

Installation:

CLI:

You can download the CLI from the official release page. Here let’s see how to install version 1.8.1; you can pick any version from the release page based on your requirements.

# cd /tmp

# wget https://github.com/vmware-tanzu/velero/releases/download/v1.8.1/velero-v1.8.1-linux-amd64.tar.gz

# tar -xvf velero-v1.8.1-linux-amd64.tar.gz

# cd velero-v1.8.1-linux-amd64 && mv velero /usr/local/bin/

# velero help
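If you want to install a different version, the download URL can be built from a variable. This is a small sketch; the function name is made up here, and it assumes other releases follow the same file naming pattern as the v1.8.1 link above.

```shell
# Build the release tarball URL for a given Velero version, OS and CPU
# architecture, following the naming pattern of the v1.8.1 link.
velero_release_url() (
  version=$1
  os=${2:-linux}
  arch=${3:-amd64}
  echo "https://github.com/vmware-tanzu/velero/releases/download/${version}/velero-${version}-${os}-${arch}.tar.gz"
)

velero_release_url v1.8.1

# To download and unpack in one step (requires network access):
#   wget -qO- "$(velero_release_url v1.8.1)" | tar -xz -C /tmp
```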

The Server Side

For the server-side component, there are two main methods of installation:

Installing Velero

Of the two options, we are going to pick Helm and install it with some modifications. In a future post we can also see how to use it with GitOps.

Velero uses an Azure plugin to interact with Azure. To authenticate, use a service principal for now. Usually I prefer Azure Active Directory Pod Identity, a project that allows pods to authenticate against Azure using managed identities, but there’s an open issue with managed identities. With pod identities you would not need an API secret for Velero to authenticate against Azure. Remember, though, that managed identities only work on Azure.

Prerequisites:

Before we start, the following tools should be installed.

Dynamic Resource Group

Azure created the “foxutech-velero” resource group to hold the dynamic resources created for my Kubernetes cluster, for example agent pools and dynamic disks for persistent volumes.

Once that is done, the next step is to set up a storage account.

Setup Storage Account

Create the storage account, then a blob container inside it:

# az storage account create --name mystoragevelero --resource-group myResourceGroup --sku Standard_GRS --encryption-services blob --https-only true --kind BlobStorage --access-tier Cold

# az storage container create -n velero --public-access off --account-name mystoragevelero

Get your subscription and tenant ID:

# az account list --query '[?isDefault].id' -o tsv
# az account list --query '[?isDefault].tenantId' -o tsv

Create a service principal with contributor access:

# export SUBSCRIPTION_ID=XXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXX

# export STORAGE_RESOURCE_GROUP=myResourceGroup

# export MC_RESOURCE_GROUP=foxutech-velero

# az ad sp create-for-rbac \
  --name "velero" \
  --role "Contributor" \
  --query 'password' \
  -o tsv \
  --scopes /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$STORAGE_RESOURCE_GROUP /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$MC_RESOURCE_GROUP

Save the password that you got while creating the service principal.

Get the app ID for the service principal:

# az ad sp list --display-name "velero" --query '[0].appId' -o tsv

Create a credentials file named velero-credentials for Velero. Make sure to update the subscription ID, tenant ID, client ID (service principal app ID), client secret (service principal password), and resource group name.

# cat velero-credentials
AZURE_SUBSCRIPTION_ID=XXXX-XXXX-XXX-XXX-XXXX-XXXXXXXX
AZURE_TENANT_ID=XXXX-XXXX-XXX-XXX-XXXX-XXXXXXXX
AZURE_CLIENT_ID=SERVICE_PRINCIPAL_APPID
AZURE_CLIENT_SECRET=SERVICE_PRINCIPAL_PASSWORD
AZURE_RESOURCE_GROUP=foxutech-velero
AZURE_CLOUD_NAME=AzurePublicCloud
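Instead of editing the file by hand, it can be generated from values you already hold in shell variables. The helper below is only a sketch: the function name and the argument order are choices made for this example, not a Velero convention.

```shell
# Write a Velero credentials file from positional arguments:
#   $1 output path, $2 subscription ID, $3 tenant ID,
#   $4 service principal app ID, $5 service principal password,
#   $6 resource group that holds the cluster's dynamic resources
write_velero_credentials() (
  out=$1
  umask 077   # a newly created file is readable by the owner only
  cat > "$out" <<EOF
AZURE_SUBSCRIPTION_ID=$2
AZURE_TENANT_ID=$3
AZURE_CLIENT_ID=$4
AZURE_CLIENT_SECRET=$5
AZURE_RESOURCE_GROUP=$6
AZURE_CLOUD_NAME=AzurePublicCloud
EOF
)
```

A call could look like `write_velero_credentials ./velero-credentials "$SUBSCRIPTION_ID" "$TENANT_ID" "$CLIENT_ID" "$CLIENT_SECRET" foxutech-velero`, where the variables other than SUBSCRIPTION_ID are hypothetical and would hold the values collected in the previous steps.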

Install Velero

Add the repo:

# helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts

Create a namespace for Velero:

# kubectl create ns velero

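Install the chart, setting resourceGroup and storageAccount to the resource group and storage account that hold your backup container: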

# helm install velero vmware-tanzu/velero --namespace velero \
--set-file credentials.secretContents.cloud=./velero-credentials \
--set configuration.provider=azure \
--set configuration.backupStorageLocation.name=azure \
--set configuration.backupStorageLocation.bucket='velero' \
--set configuration.backupStorageLocation.config.resourceGroup=foxutech \
--set configuration.backupStorageLocation.config.storageAccount=foxtfstate \
--set snapshotsEnabled=true \
--set deployRestic=true \
--set configuration.volumeSnapshotLocation.name=azure \
--set image.repository=velero/velero \
--set image.pullPolicy=Always \
--set initContainers[0].name=velero-plugin-for-microsoft-azure \
--set initContainers[0].image=velero/velero-plugin-for-microsoft-azure:master \
--set initContainers[0].volumeMounts[0].mountPath=/target \
--set initContainers[0].volumeMounts[0].name=plugins
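A long list of --set flags is hard to review, so the same settings can be kept in a values file. The sketch below simply mirrors the flags above one to one; the file name is arbitrary, and the key layout is assumed to match the chart version used in this post (other chart versions may arrange these keys differently).

```shell
# Write the same settings as the --set flags above into a values file.
cat > velero-values.yaml <<'EOF'
configuration:
  provider: azure
  backupStorageLocation:
    name: azure
    bucket: velero
    config:
      resourceGroup: foxutech
      storageAccount: foxtfstate
  volumeSnapshotLocation:
    name: azure
snapshotsEnabled: true
deployRestic: true
image:
  repository: velero/velero
  pullPolicy: Always
initContainers:
  - name: velero-plugin-for-microsoft-azure
    image: velero/velero-plugin-for-microsoft-azure:master
    volumeMounts:
      - mountPath: /target
        name: plugins
EOF

# The credentials file is still passed separately:
#   helm install velero vmware-tanzu/velero --namespace velero \
#     -f velero-values.yaml \
#     --set-file credentials.secretContents.cloud=./velero-credentials
```

A values file like this is also easier to keep in Git than a shell command.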

Once the configuration is done, it is time to take backups and snapshots.

By default, Velero takes snapshots of all the persistent volumes mounted in the namespace being backed up.

Backup and Snapshot

Check the backup location:

# velero backup-location get
NAME    PROVIDER   BUCKET/PREFIX   PHASE       LAST VALIDATED                  ACCESS MODE   DEFAULT
azure   azure      velero          Available   2022-05-27 17:11:01 +0000 UTC   ReadWrite
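In a script you may want to stop early when the backup location is not ready. The helper below is a sketch that reads the table shown above from standard input; the function name is made up, and it assumes PHASE stays the fourth column of the output.

```shell
# Succeed only if the named backup location is listed with phase "Available".
# Reads the table printed by `velero backup-location get` on standard input.
bsl_available() {
  awk -v name="$1" '$1 == name && $4 == "Available" { ok = 1 } END { exit !ok }'
}

# Example with the output shown above. Against a live cluster you would run:
#   velero backup-location get | bsl_available azure
printf '%s\n' \
  'NAME    PROVIDER   BUCKET/PREFIX   PHASE       LAST VALIDATED                  ACCESS MODE   DEFAULT' \
  'azure   azure      velero          Available   2022-05-27 17:11:01 +0000 UTC   ReadWrite' |
  bsl_available azure && echo "backup location is available"
```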

Let’s install a sample application to test backup and restore.

Install WordPress:

Create a new namespace:

# kubectl create ns wordpress
namespace/wordpress created

# helm repo add bitnami https://charts.bitnami.com/bitnami
"bitnami" has been added to your repositories

# helm install test-app bitnami/wordpress --namespace wordpress
NAME: test-app
LAST DEPLOYED: Fri May 27 16:22:08 2022
NAMESPACE: wordpress
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
CHART NAME: wordpress
CHART VERSION: 14.3.1
APP VERSION: 5.9.3

** Please be patient while the chart is being deployed **

Your WordPress site can be accessed through the following DNS name from within your cluster:

    test-app-wordpress.wordpress.svc.cluster.local (port 80)

To access your WordPress site from outside the cluster follow the steps below:

1. Get the WordPress URL by running these commands:

  NOTE: It may take a few minutes for the LoadBalancer IP to be available.
        Watch the status with: 'kubectl get svc --namespace wordpress -w test-app-wordpress'

   export SERVICE_IP=$(kubectl get svc --namespace wordpress test-app-wordpress --template "{{ range (index .status.loadBalancer.ingress 0) }}{{ . }}{{ end }}")
   echo "WordPress URL: http://$SERVICE_IP/"
   echo "WordPress Admin URL: http://$SERVICE_IP/admin"

2. Open a browser and access WordPress using the obtained URL.

3. Login with the following credentials below to see your blog:

  echo Username: user
  echo Password: $(kubectl get secret --namespace wordpress test-app-wordpress -o jsonpath="{.data.wordpress-password}" | base64 --decode)

Now WordPress is up and running. You can reach it through either a port-forward or the load balancer to check it.

Now we are all set. Run the following command to take the backup.

# velero backup create wp-backup --include-namespaces wordpress --storage-location azure --wait
Backup request "wp-backup" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
....
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe wp-backup` and `velero backup logs wp-backup`.
# velero backup describe wp-backup
Name:         wp-backup
Namespace:    velero
Labels:       velero.io/storage-location=azure
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.22.6
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=22

Phase:  Completed


Errors:    0
Warnings:  0

Namespaces:
  Included:  wordpress
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Storage Location:  azure

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2022-05-27 16:23:29 +0000 UTC
Completed:  2022-05-27 16:23:33 +0000 UTC

Expiration:  2022-06-26 16:23:29 +0000 UTC

Total items to be backed up:  52
Items backed up:              52

Velero-Native Snapshots: <none included>
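The describe output is meant for people, but a script can still pull the key facts out of it. Below is a small sketch that checks the phase and compares the two item counters; the function name is made up, and it relies on the field labels staying exactly as printed above.

```shell
# Read `velero backup describe` output on stdin and succeed only if the
# phase is Completed and every item that was planned got backed up.
backup_ok() {
  awk -F': *' '
    { sub(/[ \t\r]+$/, "") }                       # drop trailing whitespace
    $1 == "Phase"                       { phase = $2 }
    $1 == "Total items to be backed up" { total = $2 }
    $1 == "Items backed up"             { saved = $2 }
    END { exit !(phase == "Completed" && total != "" && total == saved) }
  '
}

# Example with lines taken from the output above. Against a live cluster:
#   velero backup describe wp-backup | backup_ok
printf '%s\n' \
  'Phase:  Completed' \
  'Total items to be backed up:  52' \
  'Items backed up:              52' |
  backup_ok && echo "backup looks complete"
```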

Once the backup has completed, you can go to your storage account and check the backup objects.

As we have seen, the backup files are present. Now let’s delete the resources and restore them.

# kubectl delete ns wordpress

Once it is deleted, confirm that all the resources are gone by checking the pods and volumes.

# kubectl get po -n wordpress
No resources found.
# kubectl get pv -A

Once confirmed, let’s restore the backup:

# velero restore create --from-backup wp-backup
Restore request "wp-backup-20220527162735" submitted successfully.
Run `velero restore describe wp-backup-20220527162735` or `velero restore logs wp-backup-20220527162735` for more details.

# velero restore describe wp-backup-20220527162735
Name:         wp-backup-20220527162735
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:                       Completed
Total items to be restored:  26
Items restored:              26

Started:    2022-05-27 16:27:36 +0000 UTC
Completed:  2022-05-27 16:27:38 +0000 UTC

Backup:  wp-backup

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto
Namespace mappings:  <none>
Label selector:  <none>
Restore PVs:  auto
Preserve Service NodePorts:  auto
# kubectl get po -n wordpress
NAME                                  READY   STATUS    RESTARTS   AGE
test-app-mariadb-0                    1/1     Running   0          74s
test-app-wordpress-76ccf4865b-8kfs6   1/1     Running   0          74s

Since that works, we can go ahead and create a schedule.

Setting up the schedule “Back up my cluster every day at 4 am”

# velero schedule create every-day-at-4 --schedule "0 4 * * *"

Note: you might run into an issue with the webhook admission configuration. If so, exclude it from the schedule:

# velero schedule create every-day-at-7 --schedule "0 7 * * *" --exclude-resources MutatingWebhookConfiguration.admissionregistration.k8s.io
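The --schedule value is a standard five-field cron expression. As a quick reminder of what each position means, here is a tiny sketch that splits an expression into its named fields; the function name is made up for this example.

```shell
# Split a five-field cron expression into its named parts.
describe_cron() {
  read -r minute hour day_of_month month day_of_week <<EOF
$1
EOF
  echo "minute=$minute hour=$hour day-of-month=$day_of_month month=$month day-of-week=$day_of_week"
}

describe_cron "0 4 * * *"
```

So "0 4 * * *" means minute 0 of hour 4 on every day of every month, and "0 7 * * *" is the same at 7 am.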

That’s it for now. We will see more examples in coming posts, with GitOps integration and other use cases.
