---
title: Kubernetes Persistent Volumes and RBD
date: 2017-12-03 15:00:00
---

Since the amount of stuff I'm deploying on my small Kubernetes cluster is
increasing and manually managing the volumes is becoming a pain, I
decided to start learning about Storage Classes, Persistent Volumes
and Volume Claims.

Even if it seems intimidating at first, it was really easy to integrate
them with the small Ceph cluster that I also play with.

# Ceph
On the Ceph side, the configuration consists of creating a new pool and
a user that will be used by our Kubernetes cluster.

* First, create a new pool:
  `ceph osd pool create kubernetes 64 64`
* Then, to reduce compatibility problems, I decided to reduce the features
  to the bare minimum:
  `rbd feature disable --pool kubernetes exclusive-lock object-map fast-diff deep-flatten`
* Once the pool is created, I created a new client key that will be used
  to provision and claim the volumes stored in this pool:
  `ceph auth get-or-create-key client.kubernetes`
* We need to add the correct capabilities to this new client so that it
  can create new images, handle the locks and retrieve the images. The `rbd`
  profile automatically allows these operations:
  `ceph auth caps client.kubernetes mon "profile rbd" osd "profile rbd pool=kubernetes"`
* Then, we export the key in base64 so it can be inserted shortly in the Kubernetes
  storage class configuration:
  `ceph auth get client.kubernetes | grep key | awk '{print $3}' | base64`

That's all for the Ceph part of the storage configuration. Easy so far, no?

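
Before moving on to Kubernetes, a quick sanity check on the Ceph side doesn't
hurt. Just a sketch, using the pool and client names created above:

```sh
# The new pool should be listed
ceph osd lspools

# The client should exist with the "profile rbd" capabilities on mon and osd
ceph auth get client.kubernetes
```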

# Storage class
In Kubernetes, a Storage Class is a way to configure the storage that is
available and can be used by Persistent Volumes. It's really an easy
way to describe the storage so that you don't have to worry about it when
creating new pods.

I created a new file that contains everything needed for the configuration
of a new `rbd` storage class in my cluster. I will describe it part by
part, but you can merge everything into one file to apply it with `kubectl`.

```yaml
kind: ServiceAccount
apiVersion: v1
metadata:
  name: rbd-provisioner
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rbd-provisioner
subjects:
- kind: ServiceAccount
  name: rbd-provisioner
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:controller:persistent-volume-binder
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:v0.1.0"
      serviceAccountName: rbd-provisioner
```

An rbd provisioner pod and its related service account, based on the
[RBD Volume Provisioner for Kubernetes 1.5+ incubator project](https://github.com/kubernetes-incubator/external-storage/tree/master/ceph/rbd/deploy/rbac).

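
If the provisioner ever misbehaves later on, its logs are the first place to
look. A quick check, assuming the deployment name and namespace used above:

```sh
# Make sure the provisioner pod started, then watch what it does
kubectl -n kube-system get pods -l app=rbd-provisioner
kubectl -n kube-system logs deploy/rbd-provisioner
```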

Not much to add for now on this part. Let's look into the storage class
configuration.

```yaml
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: rbd
provisioner: ceph.com/rbd
parameters:
  monitors: 10.42.100.1:6789,10.42.100.2:6789,10.42.100.3:6789
  adminId: kubernetes
  adminSecretName: ceph-secret
  adminSecretNamespace: kube-system
  pool: kubernetes
  userId: kubernetes
  userSecretName: ceph-secret-user
reclaimPolicy: Retain
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
  namespace: kube-system
type: kubernetes.io/rbd
data:
  key: QV[...]QPo=
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-user
  namespace: default
type: kubernetes.io/rbd
data:
  key: QV[...]QPo=
```

This is the part where the storage itself is described. Update the `monitors` to
match your Ceph configuration and the secrets to match the key you got from
the earlier `ceph auth get client.kubernetes | grep key | awk '{print $3}' | base64`
command.

Here, I cheated a little and used the same client for both the administration
and the user part of the storage, in part because I didn't want to bother
with the capabilities needed for each.

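
If you'd rather not paste base64 keys into the YAML at all, the two secrets can
also be created directly from the Ceph key with `kubectl`. A sketch, reusing the
names from the storage class above (note that `kubectl` does the base64
encoding for you here):

```sh
# Grab the raw key of the client created earlier
KEY=$(ceph auth get-key client.kubernetes)

# Create the admin and user secrets referenced by the storage class
kubectl -n kube-system create secret generic ceph-secret \
  --type=kubernetes.io/rbd --from-literal=key="$KEY"
kubectl -n default create secret generic ceph-secret-user \
  --type=kubernetes.io/rbd --from-literal=key="$KEY"
```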

Once everything seems correct, you can save the file (or files) and apply
the configuration to the Kubernetes cluster with `kubectl apply -f ceph-rbd.yaml`
(or whatever you named your file).

And that's all for the configuration... We can check that everything is
working with `kubectl get sc,deploy,po -n kube-system`:

```sh
NAME                 PROVISIONER
storageclasses/rbd   ceph.com/rbd

NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
[...]
deploy/rbd-provisioner   1         1         1            1           5m

NAME                                  READY     STATUS    RESTARTS   AGE
[...]
po/rbd-provisioner-5cc5947c77-xdcn5   1/1       Running   0          5m
```

There should be an `rbd-provisioner` deployment with everything as desired,
an `rbd-provisioner-...-...` pod running, and a `storageclasses/rbd` storage
class with the correct provisioner.

# PersistentVolumeClaim and Volumes

Now to the usage in the deployments:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myservice-data-claim
spec:
  storageClassName: rbd
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
kind: Pod
apiVersion: v1
metadata:
  name: myservice-pod
spec:
  volumes:
  - name: myservice-data
    persistentVolumeClaim:
      claimName: myservice-data-claim
  containers:
  - name: myservice-cont
    image: nginx
    ports:
    - containerPort: 80
      name: "http-server"
    volumeMounts:
    - mountPath: "/usr/share/nginx/html"
      name: myservice-data
```

If the volume does not yet exist, a new image is automatically created
on Ceph, formatted (in `ext4` by default) and mounted. If it already exists,
it is simply mounted.

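
To see the dynamic provisioning at work, you can watch the claim get bound and
the matching image appear in the pool. A rough check, using the names from the
manifests above:

```sh
# The claim should move to "Bound" and reference a generated PV
kubectl get pvc myservice-data-claim
kubectl get pv

# A new image backing that PV should show up in the Ceph pool
rbd ls --pool kubernetes
```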

All in all, two hours were enough to migrate from manually created and managed
volumes to a storage class and volume claims. I learnt that even though
Kubernetes can really look hard and scary at first, everything is there to
help you with your stuff.

# Possible errors
## Filesystem error - Access denied
By default, the pods will have access to the newly generated filesystems.
If you start them with `securityContext` parameters, you can put them in
a state where the user the container is running as does not have access
to the filesystem content, either for reading or writing.

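
One way around this is to also set an `fsGroup`, so that Kubernetes makes the
mounted filesystem accessible to the group of the user the container runs as.
A minimal sketch, with example ids:

```yaml
# Hypothetical pod spec fragment: run as an unprivileged user and let
# Kubernetes chgrp the mounted RBD filesystem to a group that user is in.
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 1000
```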

## Image is locked by other nodes
If, like me, you battle with `rbd: image is locked by other nodes`
errors when a pod is migrated between nodes, it usually means that the client
you created doesn't have the capabilities to remove the locks after detaching.
I fixed that simply by setting the caps to the `rbd` profile instead of
manually configuring the `rwx` operations:
`ceph auth caps client.myclient mon "profile rbd" osd "profile rbd pool=mypool"`
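
If an image is already stuck with a stale lock, it can also be inspected and
cleared by hand. A sketch with a made-up image name (images created by the
provisioner are typically named `kubernetes-dynamic-pvc-<uuid>`):

```sh
# List the current locks on the image
rbd lock list kubernetes/kubernetes-dynamic-pvc-example

# Remove a stale lock, using the lock id and locker shown by the command above
rbd lock remove kubernetes/kubernetes-dynamic-pvc-example "<lock id>" <locker>
```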