In our previous posts, we covered various scenarios for troubleshooting Kubernetes errors. Today, we'll look at the basics of Kubernetes PVCs: how to create a PV and PVC, and how to troubleshoot common PVC errors.
In Kubernetes, there are separate mechanisms for managing compute resources and storage resources. A storage volume is a construct that allows Kubernetes users and administrators to gain access to storage resources, while abstracting the underlying storage implementation.
Kubernetes provides two API resources that allow pods to access persistent storage:
1. PersistentVolume (PV)
A PV represents storage in the cluster, provisioned manually by an administrator, or automatically using a Storage Class. A PV is an independent resource in the cluster, with a separate lifecycle from any individual pod that uses it. When a pod shuts down, the PV remains in place and can be mounted by other pods. Behind the scenes, the PV object interfaces with physical storage equipment using NFS, iSCSI, or public cloud storage services.
2. PersistentVolumeClaim (PVC)
A PVC represents a request for storage by a Kubernetes user. Users define a PVC configuration and reference it in a pod, and Kubernetes then looks for an appropriate PV that can provide the requested storage. When it finds one, the PV “binds” to the PVC.
PVs and PVCs are analogous to nodes and pods. Just like a node is a computing resource, and a pod seeks a node to run on, a PersistentVolume is a storage resource, and a PersistentVolumeClaim seeks a PV to bind to.
The PVC is a complex mechanism and the cause of many Kubernetes issues, some of which can be difficult to diagnose and resolve. In this post, let’s look at the most common issues and basic strategies for troubleshooting them.
Persistent Volume and Claim Lifecycle
PVs and PVCs follow a lifecycle that includes the following stages:
- Provisioning: a PV can be provisioned either manually, via an administrator, or dynamically, based on pre-configured PVCs.
- Binding: when a user creates a PVC and references it in a pod, Kubernetes searches for a PV with the required amount of storage and other requested storage criteria, and binds it exclusively to that claim.
- Using: at this stage, the bound PV is reserved for a specific pod.
- Storage Object in Use Protection: this feature prevents PVCs that are actively used by a pod, and PVs bound to PVCs, from being removed from the system, to avoid data loss.
- Reclaiming: when users no longer need a volume, they can delete the PVC object. Once the claim has been released, the cluster uses its reclaim policy to determine what to do with the PV: retain, recycle, or delete it.
- Retain: this policy enables PVs to be manually reclaimed. The PV continues to exist without binding to any PVC. However, because it still includes data belonging to the previous user, it needs to be manually cleaned and reconfigured before reuse.
- Delete: this policy enables the cluster to remove the PV object and disassociate it from the storage resources in the external infrastructure. This is the default for dynamically provisioned PVs.
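For example, you can change the reclaim policy of an existing PV with kubectl patch (here [pv-name] is a placeholder for your PV's name):
# kubectl patch pv [pv-name] -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'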
How to Create a PersistentVolumeClaim (PVC) and Bind to a PV
Let’s quickly walk through how PVs and PVCs work. This walkthrough is based on the full PV tutorial in the Kubernetes documentation.
1. Setting Up a Node
To follow this tutorial, set up a Kubernetes cluster with only one node, and ensure your kubectl command line can communicate with the control plane. On the node, create a directory as follows:
# sudo mkdir /mnt/data
Within the directory, create an index.html file.
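For example, you can write a recognizable line of text into the file (the exact content is up to you):
# sudo sh -c "echo 'Hello from Kubernetes storage' > /mnt/data/index.html"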
2. Creating a PersistentVolume
Let’s create a YAML file defining a PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
Run the following command to create the PersistentVolume on the node:
# kubectl apply -f https://k8s.io/examples/pods/storage/pv-volume.yaml
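At this point the PV exists but is not yet bound to any claim. You can verify this with the following command; the STATUS column should show Available:
# kubectl get pv task-pv-volume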
3. Creating a PersistentVolumeClaim and Binding It to the PV
Now, let’s create a PersistentVolumeClaim that requests a PV with the following criteria, which match the PV we created earlier:
- Storage volume of at least 3 GB
- Enables read-write access
Let’s create a YAML file for the PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
Run this command to apply the PVC:
# kubectl apply -f https://k8s.io/examples/pods/storage/pv-claim.yaml
As soon as you create the PVC, the Kubernetes control plane starts looking for an appropriate PV. When it finds one, it binds the PVC to the PV.
Run this command to see the status of the PV we created earlier:
# kubectl get pv task-pv-volume
The output should look like this, indicating binding was successful:
NAME             CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    ...
task-pv-volume   10Gi       RWO            Retain           Bound    default/task-pv-claim
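You can also check the claim side with kubectl get pvc; the output should show the claim bound to the volume:
# kubectl get pvc task-pv-claim
NAME            STATUS   VOLUME           CAPACITY   ACCESS MODES   STORAGECLASS   ...
task-pv-claim   Bound    task-pv-volume   10Gi       RWO            manual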
4. Creating a Pod and Mounting the PVC
The final step is to create a pod that uses your PVC. Run a pod with an NGINX image, and specify the PVC we created earlier in the relevant part of the pod specification:
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      ...
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
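For reference, the complete pod manifest from the Kubernetes tutorial (pv-pod.yaml) looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
Apply it with:
# kubectl apply -f https://k8s.io/examples/pods/storage/pv-pod.yaml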
Open a shell into your pod, install curl, and run the command curl http://localhost/. The output should show the content of the index.html file you created in step 1. This shows that the new pod was able to access the data on the PV via the PersistentVolumeClaim.
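For example (the apt commands assume the Debian-based nginx image used above):
# kubectl exec -it task-pv-pod -- /bin/bash
Then, inside the pod:
# apt update && apt install -y curl
# curl http://localhost/
The curl output should be the text you wrote to /mnt/data/index.html on the node.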
Kubernetes PVC Errors
The Kubernetes PVC is a complex mechanism, and can result in errors that are difficult to diagnose and resolve. In general, PVC errors are related to three broad categories:
- Persistent Volume creation issue: Kubernetes had a problem creating the persistent volume or enabling access to it, even though the underlying storage resources exist.
- Persistent Volume provisioning issue: Kubernetes could not create the required persistent volume because storage resources were unavailable.
- Changes in specs: Kubernetes had a problem connecting a pod to the required Persistent Volume because of a configuration change in the PV or PVC.
All of these issues can happen at different stages of the PVC lifecycle. We’ll review a few common errors you might encounter:
- FailedAttachVolume
- FailedMount
- CrashLoopBackOff caused by a PersistentVolumeClaim
FailedAttachVolume and FailedMount Errors
FailedAttachVolume and FailedMount are two errors that indicate a pod had a problem mounting a PV. There is a difference between these two errors:
- FailedAttachVolume: occurs when a volume cannot be detached from a previous node to be mounted on the current one.
- FailedMount: occurs when a volume cannot be mounted on the required path. If the FailedAttachVolume error occurred, FailedMount will also occur as a result. But it is also possible that the volume is available, but there was a specific issue mounting on the path required.
Common causes
Cause | Possible Errors
--- | ---
Failure on the new node | FailedMount
Incorrect access mode defined on the new node | FailedMount
New node has too many disks attached | FailedMount
New node does not have enough mount points | FailedMount
Network partitioning error | FailedMount
Incorrect path specified | FailedMount
Service API call failure | FailedAttachVolume, FailedMount
Failure of storage infrastructure on previous node | FailedAttachVolume, FailedMount
Diagnosing the Problem
To diagnose why the FailedAttachVolume and FailedMount issues occurred, run the command:
# kubectl describe pod [name]
In the output, look at the Events section. Look for a message indicating one of the errors and the cause.
Events:
  Type     Reason              Age  From     Message
  ----     ------              ---- ----     -------
  Warning  FailedAttachVolume  5m   kubelet  Multi-Attach error for volume "pvc-xxxxxxxxxxxx": Volume is already exclusively attached to one node and can’t be attached to another
  Warning  FailedMount         5m   kubelet  Unable to mount volumes for pod "sample-pod": timeout expired waiting for volumes to attach/mount for pod "sample-pod".
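If the pod has already been deleted, or you want a cluster-wide view, you can also filter events by reason:
# kubectl get events --field-selector reason=FailedAttachVolume
# kubectl get events --field-selector reason=FailedMount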
Resolving the Problem
Kubernetes cannot always handle FailedAttachVolume and FailedMount errors on its own, so you sometimes have to take manual steps.
If the problem is Failure to Detach:
Use the storage provider’s interface to detach the volume manually. For example, in AWS you can use the following CLI command to detach a volume from a node:
# aws ec2 detach-volume --volume-id [persistent-volume-id] --force
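To find the volume ID, look at the PV object itself. For an in-tree AWS EBS volume it is stored under spec.awsElasticBlockStore; for a CSI-provisioned volume, check spec.csi.volumeHandle instead:
# kubectl get pv [pv-name] -o jsonpath="{.spec.awsElasticBlockStore.volumeID}"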
If the problem is Failure to Attach or Mount:
The easiest problem to fix is a mistake in the mount configuration. Check for a wrong network path or a network partitioning issue that is preventing the PV from mounting.
Next, try to force Kubernetes to run the pod on another node. The PV may be able to mount there. Here are a few options for moving the pod:
- Mark the node as unschedulable via the kubectl cordon command.
- Run kubectl delete pod. This will usually cause Kubernetes to run the pod on another node (see the example after this list).
- Use node selectors, affinity, or taints to specify that the pod should schedule on another node.
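For example, assuming the stuck pod is named sample-pod and runs on a node named node-1 (both placeholders), you could cordon the node and delete the pod so the scheduler places the replacement elsewhere:
# kubectl cordon node-1
# kubectl delete pod sample-pod
Once the underlying volume issue is resolved, make the node schedulable again with kubectl uncordon node-1.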
If you do not have other available nodes, or you tried the above and the problem recurs, try to resolve the problem on the node:
- Reduce the number of disks attached to the node, or add mount points
- Check access mode on the new node
- Identify and resolve issues with underlying storage
CrashLoopBackOff Errors Caused by PersistentVolumeClaim
The CrashLoopBackOff error means that a pod repeatedly crashes, restarts, and crashes again. This error can happen for a variety of reasons: see our guide to CrashLoopBackOff. However, it can also happen due to a corrupted PersistentVolumeClaim.
Diagnosing the Problem
To identify if CrashLoopBackOff is caused by a PVC, do the following:
- Check logs from the previous container instance (see the example commands after this list)
- Check Deployment logs
- Failing the above, open a shell into the container and identify the issue
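For example, assuming a pod named failing-pod belonging to a Deployment named failed-deployment (both placeholders):
# kubectl logs failing-pod --previous
# kubectl describe deployment failed-deployment
# kubectl exec -it failing-pod -- sh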
Resolving the Problem
If CrashLoopBackOff is due to an issue with a PVC, try the following:
- Scale the failed deployment to 0 using the following command. This ensures no other entities on the cluster are writing to the PV during your maintenance.
# kubectl scale deployment [deployment-name] --replicas=0
- Get the Deployment configuration to identify which PVC it uses:
# kubectl get deployment failed-deployment -o jsonpath="{.spec.template.spec.volumes[*].persistentVolumeClaim.claimName}"
- The output of the previous command is the identifier of the failed PVC. Use a debugging tool like Busybox to create a debugging pod that mounts the same PVC (a minimal sketch of such a pod appears after this list).
- Create the debugging pod and run a shell in it using this command:
# kubectl exec -it volume-debugger -- sh
- Identify which volume is currently mounted in the /data directory and resolve the issue.
- Exit the shell and delete the debugger pod.
- Scale the deployment up again using this command (setting the replicas argument to the required number of replicas):
# kubectl scale deployment failed-deployment --replicas=1
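For reference, here is a minimal sketch of such a Busybox debugging pod, assuming the PVC identified by the jsonpath command above is named failed-pvc (a placeholder):
apiVersion: v1
kind: Pod
metadata:
  name: volume-debugger
spec:
  volumes:
    - name: volume-to-debug
      persistentVolumeClaim:
        claimName: failed-pvc   # placeholder: use the claim name returned by the jsonpath command
  containers:
    - name: debugger
      image: busybox
      command: ["sleep", "3600"]   # keep the container running so you can exec into it
      volumeMounts:
        - mountPath: "/data"
          name: volume-to-debug
Apply it with kubectl apply -f, then open a shell in it with the kubectl exec command shown above.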