Ceph

    PLEASE NOTE: This document applies to v1.1 version and not to the latest stable release v1.8

    Documentation for other releases can be found by using the version selector in the top right of any doc page.

    Shared File System

    A shared file system can be mounted with read/write permission from multiple pods. This may be useful for applications which can be clustered using a shared filesystem.

    This example runs a shared file system for the kube-registry.

    Prerequisites

    This guide assumes you have created a Rook cluster as explained in the main Kubernetes guide

    Multiple File Systems Not Supported

    By default only one shared file system can be created with Rook. Multiple file system support in Ceph is still considered experimental and can be enabled with the environment variable ROOK_ALLOW_MULTIPLE_FILESYSTEMS defined in operator.yaml.

    Please refer to cephfs experimental features page for more information.

    Create the File System

    Create the file system by specifying the desired settings for the metadata pool, data pools, and metadata server in the CephFilesystem CRD. In this example we create the metadata pool with replication of three and a single data pool with replication of three. For more options, see the documentation on creating shared file systems.

    Save this shared file system definition as filesystem.yaml:

    apiVersion: ceph.rook.io/v1
    kind: CephFilesystem
    metadata:
      name: myfs
      namespace: rook-ceph
    spec:
      metadataPool:
        replicated:
          size: 3
      dataPools:
        - replicated:
            size: 3
      metadataServer:
        activeCount: 1
        activeStandby: true
    

    The Rook operator will create all the pools and other resources necessary to start the service. This may take a minute to complete.

    # Create the file system
    $ kubectl create -f filesystem.yaml
    
    # To confirm the file system is configured, wait for the mds pods to start
    $ kubectl -n rook-ceph get pod -l app=rook-ceph-mds
    NAME                                      READY     STATUS    RESTARTS   AGE
    rook-ceph-mds-myfs-7d59fdfcf4-h8kw9       1/1       Running   0          12s
    rook-ceph-mds-myfs-7d59fdfcf4-kgkjp       1/1       Running   0          12s
    

    To see detailed status of the file system, start and connect to the Rook toolbox. A new line will be shown with ceph status for the mds service. In this example, there is one active instance of MDS which is up, with one MDS instance in standby-replay mode in case of failover.

    $ ceph status
      ...
      services:
        mds: myfs-1/1/1 up {[myfs:0]=mzw58b=up:active}, 1 up:standby-replay
    

    Provision Storage

    Before Rook can start provisioning storage, a StorageClass needs to be created based on the filesystem. This is needed for Kubernetes to interoperate with the CSI driver to create persistent volumes.

    NOTE This example uses the CSI driver, which is the preferred driver going forward for K8s 1.13 and newer. Examples are found in the CSI CephFS directory. For an example of a volume using the flex driver (required for K8s 1.12 and earlier), see the Flex Driver section below.

    Save this storage class definition as storageclass.yaml:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: rook-cephfs
    # Change "rook-ceph" provisioner prefix to match the operator namespace if needed
    provisioner: rook-ceph.cephfs.csi.ceph.com
    parameters:
      # clusterID is the namespace where operator is deployed.
      clusterID: rook-ceph
    
      # CephFS filesystem name into which the volume shall be created
      fsName: myfs
    
      # Ceph pool into which the volume shall be created
      # Required for provisionVolume: "true"
      pool: myfs-data0
    
      # Root path of an existing CephFS volume
      # Required for provisionVolume: "false"
      # rootPath: /absolute/path
    
      # The secrets contain Ceph admin credentials. These are generated automatically by the operator
      # in the same namespace as the cluster.
      csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
      csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
      csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
      csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
    
    reclaimPolicy: Delete
    

    If you’ve deployed the Rook operator in a namespace other than “rook-ceph” as is common change the prefix in the provisioner to match the namespace you used. For example, if the Rook operator is running in “rook-op” the provisioner value should be “rook-op.rbd.csi.ceph.com”.

    Create the storage class.

    kubectl create -f cluster/examples/kubernetes/ceph/csi/cephfs/storageclass.yaml
    

    Consume the Shared File System: K8s Registry Sample

    As an example, we will start the kube-registry pod with the shared file system as the backing store. Save the following spec as kube-registry.yaml:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: cephfs-pvc
    spec:
      accessModes:
      - ReadWriteMany
      resources:
        requests:
          storage: 1Gi
      storageClassName: csi-cephfs
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: kube-registry
      namespace: kube-system
      labels:
        k8s-app: kube-registry
        kubernetes.io/cluster-service: "true"
    spec:
      replicas: 3
      selector:
        matchLabels:
          k8s-app: kube-registry
      template:
        metadata:
          labels:
            k8s-app: kube-registry
            kubernetes.io/cluster-service: "true"
        spec:
          containers:
          - name: registry
            image: registry:2
            imagePullPolicy: Always
            resources:
              limits:
                cpu: 100m
                memory: 100Mi
            env:
            # Configuration reference: https://docs.docker.com/registry/configuration/
            - name: REGISTRY_HTTP_ADDR
              value: :5000
            - name: REGISTRY_HTTP_SECRET
              value: "Ple4seCh4ngeThisN0tAVerySecretV4lue"
            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
              value: /var/lib/registry
            volumeMounts:
            - name: image-store
              mountPath: /var/lib/registry
            ports:
            - containerPort: 5000
              name: registry
              protocol: TCP
            livenessProbe:
              httpGet:
                path: /
                port: registry
            readinessProbe:
              httpGet:
                path: /
                port: registry
          volumes:
          - name: image-store
            persistentVolumeClaim:
              claimName: cephfs-pvc
              readOnly: false
    

    Create the Kube registry deployment:

    kubectl create -f cluster/examples/kubernetes/ceph/csi/cephfs/kube-registry.yaml
    

    You now have a docker registry which is HA with persistent storage.

    Kernel Version Requirement

    If the Rook cluster has more than one filesystem and the application pod is scheduled to a node with kernel version older than 4.7, inconsistent results may arise since kernels older than 4.7 do not support specifying filesystem namespaces.

    Consume the Shared File System: Toolbox

    Once you have pushed an image to the registry (see the instructions to expose and use the kube-registry), verify that kube-registry is using the filesystem that was configured above by mounting the shared file system in the toolbox pod. See the Direct Filesystem topic for more details.

    Teardown

    To clean up all the artifacts created by the file system demo:

    kubectl delete -f kube-registry.yaml
    

    To delete the filesystem components and backing data, delete the Filesystem CRD. Warning: Data will be deleted

    kubectl -n rook-ceph delete cephfilesystem myfs
    

    Flex Driver

    To create a volume based on the flex driver instead of the CSI driver, see the kube-registry.yaml example manifest or refer to the complete flow in the Rook v1.0 Shared File System documentation.

    Advanced Example: Erasure Coded Filesystem

    The Ceph filesystem example can be found here: Ceph Shared File System - Samples - Erasure Coded.