diff --git a/articles/2017-12-03-kubernetes-ceph-rbd.md b/articles/2017-12-03-kubernetes-ceph-rbd.md
new file mode 100644
index 0000000..4e48506
--- /dev/null
+++ b/articles/2017-12-03-kubernetes-ceph-rbd.md
@@ -0,0 +1,219 @@
---
title: Kubernetes Persistent Volumes and RBD
date: 2017-12-03 15:00:00
---

Since the amount of stuff I'm deploying on my small Kubernetes cluster keeps
increasing and manually managing the volumes is becoming a pain, I decided
to start learning about Storage Classes, Persistent Volumes and Volume
Claims.

Even though it seemed intimidating at first, it was really easy to integrate
them with the small Ceph cluster I also play with.

# Ceph
On the Ceph side, the configuration consists of creating a new pool and a
user that will be used by our Kubernetes cluster.

* First, create a new pool
`ceph osd pool create kubernetes 64 64`
* Then, to reduce compatibility problems, I decided to trim the image
features down to the bare minimum
`rbd feature disable --pool kubernetes exclusive-lock object-map fast-diff deep-flatten`
* Once the pool is created, I created a new client key that will be used
to provision and claim the volumes stored in this pool
`ceph auth get-or-create-key client.kubernetes`
* We need to add the correct capabilities to this new client so that it
can create new images, handle the locks and retrieve the images. The `rbd`
profile automatically allows these operations.
`ceph auth caps client.kubernetes mon "profile rbd" osd "profile rbd pool=kubernetes"`
* Then, we export the key, base64-encoded, to be inserted shortly into the
Kubernetes storage class configuration.
`ceph auth get client.kubernetes | grep key | awk '{print $3}' | base64`

That's all for the Ceph part of the storage configuration. Easy so far, no?

# Storage class
In Kubernetes, a Storage Class is a way to configure the storage that is
available and can be used by Persistent Volumes. It's really an easy way
to describe the storage so that you don't have to worry about it when
creating new pods.

I created a new file that contains everything needed for the configuration
of a new `rbd` storage class in my cluster. I will describe it part by
part, but you can merge everything into one file to apply it with `kubectl`.

```yaml
kind: ServiceAccount
apiVersion: v1
metadata:
  name: rbd-provisioner
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rbd-provisioner
subjects:
- kind: ServiceAccount
  name: rbd-provisioner
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:controller:persistent-volume-binder
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:v0.1.0"
      serviceAccountName: rbd-provisioner
```

An RBD provisioner pod and its related service account, based on the
[RBD Volume Provisioner for Kubernetes 1.5+ incubator project](https://github.com/kubernetes-incubator/external-storage/tree/master/ceph/rbd/deploy/rbac).

Not much to add for now on this part.
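If you want to make sure the ClusterRoleBinding really gives the provisioner
what it needs before going further, a quick check is possible by impersonating
the service account defined above. This is just a sketch; it assumes you run
it as a user allowed to impersonate service accounts (a cluster admin, for
example), and both commands should answer `yes`:

```sh
# Can the provisioner's service account create PersistentVolumes?
kubectl auth can-i create persistentvolumes \
  --as=system:serviceaccount:kube-system:rbd-provisioner

# Can it watch the claims it has to provision for?
kubectl auth can-i list persistentvolumeclaims \
  --as=system:serviceaccount:kube-system:rbd-provisioner
```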
Now, the storage class configuration itself:

```yaml
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: rbd
provisioner: ceph.com/rbd
parameters:
  monitors: 10.42.100.1:6789,10.42.100.2:6789,10.42.100.3:6789
  adminId: kubernetes
  adminSecretName: ceph-secret
  adminSecretNamespace: kube-system
  pool: kubernetes
  userId: kubernetes
  userSecretName: ceph-secret-user
reclaimPolicy: Retain
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
  namespace: kube-system
type: kubernetes.io/rbd
data:
  key: QV[...]QPo=
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-user
  namespace: default
type: kubernetes.io/rbd
data:
  key: QV[...]QPo=
```

This is the part where the storage is described. Update the `monitors` to
match your Ceph configuration, and the secrets to hold the key you got from
the earlier
`ceph auth get client.kubernetes | grep key | awk '{print $3}' | base64`
command.

Here, I cheated a little and used the same client for both the administration
and the user part of the storage, partly because I didn't want to bother
with the capabilities needed for each.

Once everything looks correct, you can save the file or files and apply
the configuration to the Kubernetes cluster with `kubectl apply -f ceph-rbd.yaml`
(or whatever your file is named).

And that's all for the configuration... We can check that everything is
working with `kubectl get sc,deploy,po -n kube-system`:

```sh
NAME                 PROVISIONER
storageclasses/rbd   ceph.com/rbd

NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
[...]
deploy/rbd-provisioner   1         1         1            1           5m

NAME                                  READY     STATUS    RESTARTS   AGE
[...]
po/rbd-provisioner-5cc5947c77-xdcn5   1/1       Running   0          5m
```

There should be an `rbd-provisioner` deployment with everything as desired,
an `rbd-provisioner-...-...` pod running and a `storageclasses/rbd` storage
class with the correct provisioner.

# PersistentVolumeClaim and Volumes

Now on to the usage in deployments:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myservice-data-claim
spec:
  storageClassName: rbd
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
kind: Pod
apiVersion: v1
metadata:
  name: myservice-pod
spec:
  volumes:
  - name: myservice-data
    persistentVolumeClaim:
      claimName: myservice-data-claim
  containers:
  - name: myservice-cont
    image: nginx
    ports:
    - containerPort: 80
      name: "http-server"
    volumeMounts:
    - mountPath: "/usr/share/nginx/html"
      name: myservice-data
```

If the volume does not exist yet, a new image is automatically created on
Ceph, formatted (in `ext4` by default) and mounted. If it already exists,
it is simply mounted.

All in all, two hours were enough to migrate from manually created and
managed volumes to a storage class and volume claims. I learnt that even
though Kubernetes can look hard and scary at first, everything is there
to help you with your stuff.

# Possible errors
## Filesystem error - Access denied
By default, the pods have access to the newly created filesystems. If you
start them with `securityContext` parameters, you can end up in a state
where the user the container runs as cannot read or write the content of
the filesystem.
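In my experience this shows up as permission errors when the container tries
to write to the mount point. A quick way to see what is going on is to
compare the identity the container runs as with the ownership of the mounted
directory (a sketch, reusing the `myservice-pod` example above); if they
don't match, setting an `fsGroup` in the pod's `securityContext` so the
volume is group-owned by a group the container user belongs to should do
the trick:

```sh
# Which user and groups does the container actually run as?
kubectl exec myservice-pod -- id

# Who owns the mounted filesystem?
kubectl exec myservice-pod -- ls -ld /usr/share/nginx/html
```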
## Image is locked by other nodes
If, like me, you battle with `rbd: image is locked by other nodes` errors
when a pod is migrated between nodes, it usually means that the client you
created doesn't have the capabilities required to remove the lock after
detaching. I fixed that simply by using the `rbd` profiles instead of
manually configuring the `rwx` operations:
`ceph auth caps client.myclient mon "profile rbd" osd "profile rbd pool=mypool"`
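
If the error persists even with the right capabilities (for example because
a node died while holding the lock), the stale lock can also be inspected
and removed by hand. A rough sketch, run with an admin keyring; the image
name, lock id and locker below are hypothetical placeholders, use the values
printed by the first command:

```sh
# List the locks held on the image (image name is just an example)
rbd lock list kubernetes/kubernetes-dynamic-pvc-example

# Remove the stale lock, using the lock id and locker reported above
rbd lock remove kubernetes/kubernetes-dynamic-pvc-example \
  "kubelet_lock_magic_node1" client.4567
```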