Module 3 - Persistent Storage for Apps
Overview
In this module you will use CNS as a developer would in OpenShift. For that purpose you will dynamically provision storage both in a standalone fashion and in the context of an application deployment.
This module requires that you have completed Module 2.
OpenShift Storage 101
OpenShift uses Kubernetes’ PersistentStorage facility to dynamically allocate storage of any kind for applications. This is a fairly simple framework in which only 3 components are relevant: the storage provider, the storage volume and the request for a storage volume.
OpenShift knows non-ephemeral storage as “persistent” volumes. This is storage that is decoupled from pod lifecycles. Users can request such storage by submitting a PersistentVolumeClaim to the system, which carries aspects like desired capacity or access mode (shared, single, read-only).
A storage provider in the system is represented by a StorageClass and is referenced in the claim. Upon receiving the claim OpenShift talks to the API of the actual storage system to provision the storage.
The provisioned storage is represented in OpenShift as a PersistentVolume which pods can mount directly.
With these basics defined we can try CNS in our system. First examine the StorageClass the installer has automatically created for us.
⇨ Remain logged in as operator for now:
oc login -u operator
⇨ Examine the StorageClass objects available:
oc get storageclass
openshift-ansible defined a StorageClass for CNS:
NAME                TYPE
glusterfs-storage   kubernetes.io/glusterfs
⇨ Let’s look at the details:
oc describe storageclass/glusterfs-storage
The output indicates the backing storage type: GlusterFS
Name:            glusterfs-storage
IsDefaultClass:  No
Annotations:     <none>
Provisioner:     kubernetes.io/glusterfs
Parameters:      resturl=http://heketi-storage-app-storage.cloudapps.52.28.134.154.nip.io,restuser=admin,secretName=heketi-storage-admin-secret,secretNamespace=app-storage
Note
The exact value for resturl will again be different for you because it’s based on the route/IP address on your system.
The Provisioner is a module in OpenShift/Kubernetes that can talk to the CNS API service: heketi. The parameters supplied in the StorageClass tell the Provisioner the URL of the API as well as the password of the admin user (defined in restuser) in the form of an OpenShift secret (which stores the base64-encoded password).
The Provisioner is not an entity directly accessible to users.
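For reference, the StorageClass created by the installer corresponds to a manifest roughly like the following sketch. The resturl hostname is a placeholder for your environment, and the apiVersion may be storage.k8s.io/v1 or v1beta1 depending on your OpenShift version:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi-storage-app-storage.cloudapps.<YOUR-IP-HERE>.nip.io"
  restuser: "admin"
  secretName: "heketi-storage-admin-secret"
  secretNamespace: "app-storage"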
Requesting Storage
To get storage provisioned via this StorageClass as a user you have to “claim” storage. The PersistentVolumeClaim (PVC) object basically acts as a request to the system to provision storage with certain properties, like a specific capacity.
The access mode is also set here, where ReadWriteMany allows one or more containers to mount and access this storage in parallel. This capability is dependent on the storage backend. In our case, with GlusterFS, we have one of the few systems that can reliably implement shared storage.
⇨ Create a claim by creating a file called cns-pvc.yml with the following contents:
cns-pvc.yml:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-container-storage
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: glusterfs-storage
With the above PVC we are requesting 10 GiB of shared storage. Instead of ReadWriteMany you could also have specified ReadOnlyMany (for read-only access) or ReadWriteOnce (for non-shared storage, where only one pod can mount at a time).
⇨ Submit the PVC to the system like so:
oc create -f cns-pvc.yml
⇨ After a couple of seconds, look at the request’s state with the following command:
oc get pvc
You should see the PVC listed and in Bound state.
NAME                   STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS        AGE
my-container-storage   Bound     pvc-848cbc48-9fe3-11e7-83c3-022238c6a515   10Gi       RWX           glusterfs-storage   6s
Caution
If the PVC is stuck in the Pending state you will need to investigate. Run oc describe pvc/my-container-storage to see a more detailed explanation. Typically there are two root causes: the StorageClass is not properly specified in the PVC (wrong name or not specified at all), or (less likely here) the backing storage system has a problem (in our case: an error on the heketi side, an incorrect URL in the StorageClass, etc.).
Tip
Alternatively, you can also do this step in the UI. Log on as operator and select any Project. Then go to the “Storage” tab. Select “Create” storage and make selections according to the PVC described before.
When the claim has been fulfilled successfully it is in the Bound state. That means the system has successfully (via the StorageClass) reached out to the storage backend (in our case GlusterFS). The backend in turn provisioned the storage and provided a handle back to OpenShift. In OpenShift the provisioned storage is then represented by a PersistentVolume (PV) which is bound to the PVC.
⇨ Look at the PVC for these details:
oc describe pvc/my-container-storage
The details of the PVC show all the desired properties of the requested storage and against which StorageClass it has been submitted. Since it’s already bound thanks to dynamic provisioning it also displays the name of the PersistentVolume which was generated to fulfil the claim.
The name of the PV always follows the pattern pvc-....
Name:          my-container-storage
Namespace:     app-storage
StorageClass:  glusterfs-storage
Status:        Bound
Volume:        pvc-848cbc48-9fe3-11e7-83c3-022238c6a515
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed=yes
               pv.kubernetes.io/bound-by-controller=yes
               volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/glusterfs
Capacity:      10Gi
Access Modes:  RWX
Events:
  FirstSeen  LastSeen  Count  From                         SubObjectPath  Type    Reason                 Message
  ---------  --------  -----  ----                         -------------  ------  ------                 -------
  10m        10m       1      persistentvolume-controller                 Normal  ProvisioningSucceeded  Successfully provisioned volume pvc-848cbc48-9fe3-11e7-83c3-022238c6a515 using kubernetes.io/glusterfs
Note
The PV name will be different in your environment since it’s automatically generated.
In order to look at the details of a PV in a default setup like this you need more privileges.
⇨ Look at the corresponding PV by its name. Use the following command, which extracts the exact name with an oc command and stores it in an environment variable for copy&paste-friendliness:
PV_NAME=$(oc get pvc/my-container-storage -o jsonpath="{.spec.volumeName}")
oc describe pv/${PV_NAME}
The output shows several interesting things, like the access mode (RWX = ReadWriteMany), the reclaim policy (what happens when the PV object gets deleted), the capacity and the type of storage backing this PV (in our case GlusterFS as part of CNS):
Name: pvc-848cbc48-9fe3-11e7-83c3-022238c6a515
Labels: <none>
Annotations: pv.beta.kubernetes.io/gid=2001
pv.kubernetes.io/bound-by-controller=yes
pv.kubernetes.io/provisioned-by=kubernetes.io/glusterfs
volume.beta.kubernetes.io/mount-options=auto_unmount
StorageClass: glusterfs-storage
Status: Bound
Claim: app-storage/my-container-storage
Reclaim Policy: Delete
Access Modes: RWX
Capacity: 10Gi
Message:
Source:
Type: Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
EndpointsName:glusterfs-dynamic-my-container-storage
Path: vol_7e1733b13e1b46c028a71590f8cfe8b5
ReadOnly: false
Events: <none>
Note how all the properties exactly match up with what the PVC requested.
Why is it called Bound?
Originally PVs weren’t automatically created. Hence in earlier documentation you may also find references about administrators actually pre-provisioning PVs. Later PVCs would “pick up”/match a suitable PV by looking at it’s capacity and access mode. When successful they are bound to this PV.
This was needed for storage like NFS that does not have an API and therefore does not support dynamic provisioning. That’s called static provisioning.
This kind of storage should not be used anymore as it requires manual intervention, risky capacity planning and incurs inefficient storage utilization.
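For illustration only - nothing you need in this lab - a statically pre-provisioned NFS PV would have looked roughly like this sketch (server and path are made-up placeholders):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv0001
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.com
    path: /exports/pv0001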
Although the storage is provisioned on the GlusterFS side it’s not yet used by any application/pod/host. So let’s release this storage capacity again.
Storage is freed up by deleting the PVC. The PVC controls the lifecycle of the storage, not the PV.
Important
Never delete PVs that were dynamically provisioned. They are only handles for pods mounting the storage. With dynamic provisioning the storage lifecycle is entirely controlled via PVCs.
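If you want to confirm what happens to the backing volume when the claim is deleted, you can check the PV’s reclaim policy (reusing the PV_NAME variable set above); for dynamically provisioned volumes it defaults to Delete, meaning the GlusterFS volume is removed together with the PVC:

oc get pv/${PV_NAME} -o jsonpath="{.spec.persistentVolumeReclaimPolicy}"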
⇨ Delete the storage by deleting the PVC like this:
oc delete pvc/my-container-storage
Make CNS the default storage
For the following example it is required to make the StorageClass that got created for CNS the system-wide default. This simplifies the following steps.
⇨ Use the oc patch command to change the definition of the StorageClass on the fly:
oc patch storageclass glusterfs-storage \
-p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
⇨ Look at the StorageClass again to see the change reflected:
oc describe storageclass/glusterfs-storage
Verify it’s indeed the default (see highlighted line):
Name: glusterfs-storage
IsDefaultClass: Yes
Annotations: <none>
Provisioner: kubernetes.io/glusterfs
Parameters: resturl=http://heketi-storage-app-storage.cloudapps.52.28.134.154.nip.io,restuser=admin,secretName=heketi-storage-admin-secret,secretNamespace=app-storage
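Should you ever need to revert this change later (not required in this lab), the same annotation can be set back to "false" in the same way:

oc patch storageclass glusterfs-storage \
 -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'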
Important
It is crucial that you do not skip this step as it is fundamental for the next example to work.
Using non-shared storage for databases
Normally a user doesn’t request storage with a PVC directly. Rather the PVC is part of a larger template that describes the entire application stack. Such examples ship with OpenShift out of the box.
Alternative
The steps described in this section to launch the Rails/Postgres example app can again also be done in the UI. For this purpose follow these steps, similar to the ones in Module 1:
- Log on to the OpenShift UI as the developer user
- Create a new project called ‘my-test-project’; label and description are optional
- In the Overview, next to the project’s name, select Add to project
- In the Browse Catalog view select Ruby from the list of programming languages
- Select the example app entitled Rails + PostgreSQL (Persistent)
- (optional) Change the Volume Capacity parameter to 5GiB
- Select Create to start deploying the app
- Select Continue to Overview in the confirmation screen
- Wait for the application deployment to finish, then continue below
To create an application from the OpenShift Example templates on the CLI follow these steps.
⇨ Log in as developer with the password r3dh4t:
oc login -u developer
⇨ Create a new project with a name of your choice:
oc new-project my-test-project
To use the example applications that ship with OpenShift we can use the new-app command of the oc client. It allows us to specify one of the application stack templates in the system. There are a lot of example templates shipped in the pre-defined namespace called openshift, which is the default place where oc new-app will look.
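If you are curious which example templates are available in that namespace, you can list them (the exact set depends on your installation):

oc get templates -n openshift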
Let’s pick a database application that definitely needs persistent storage. It’s going to be part of a simple example blog application based on Rails and PostgreSQL.
⇨ Instantiate this application with the following command
oc new-app rails-pgsql-persistent -p VOLUME_CAPACITY=5Gi
Among the various OpenShift resources our PVC will also be created:
[...output omitted...]
secret "rails-pgsql-persistent" created
service "rails-pgsql-persistent" created
route "rails-pgsql-persistent" created
imagestream "rails-pgsql-persistent" created
buildconfig "rails-pgsql-persistent" created
deploymentconfig "rails-pgsql-persistent" created
persistentvolumeclaim "postgresql" created
service "postgresql" created
deploymentconfig "postgresql" created
The deployment process for the application stack continues in the background.
We have given the new-app command an additional switch: -p VOLUME_CAPACITY=5Gi. This causes a parameter in the template called VOLUME_CAPACITY to be set to 5GiB. Parameters make templates more generic. In our case the template contains a PersistentVolumeClaim (as highlighted above) which will take its size from this parameter.
What other parameters does this template have?
Plenty. If you are interested in all the variables/parameters this particular template supports, you can run oc process openshift//rails-pgsql-persistent --parameters.
What else does the template file contain?
The template describes all OpenShift resources necessary to stand up the rails pod and the postgres pod and make them accessible via services and routes. If you are curious: oc get template/rails-pgsql-persistent -n openshift -o yaml
In essence it creates a Ruby on Rails instance in a pod whose functionality mimics a very basic blogging application. The blog articles are saved in a PostgreSQL database that runs in a separate pod.
The above-mentioned PVC can be found there as well (around line 194); it supplies the postgres pod with persistent storage below the mount point /var/lib/pgsql/data (around line 275), as sketched below.
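Simplified, the relevant stanzas of the template look roughly like the following excerpt (shown here with the parameter values already substituted; the real template uses placeholders such as ${VOLUME_CAPACITY} and the database service name):

- kind: PersistentVolumeClaim
  apiVersion: v1
  metadata:
    name: postgresql
  spec:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: ${VOLUME_CAPACITY}
[...]
        volumeMounts:
          - name: postgresql-data
            mountPath: /var/lib/pgsql/data
      volumes:
        - name: postgresql-data
          persistentVolumeClaim:
            claimName: postgresql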
You can now either use the OpenShift UI (while logged in as developer in the project my-test-project) or the CLI to follow the deployment process.
In the UI you will observe both pods being deployed.
⇨ On the CLI watch the containers deploy like this:
oc get pods -w
The complete output should look like this:
NAME                                READY     STATUS              RESTARTS   AGE
postgresql-1-deploy                 0/1       ContainerCreating   0          11s
rails-pgsql-persistent-1-build      0/1       ContainerCreating   0          11s
NAME                                READY     STATUS              RESTARTS   AGE
postgresql-1-deploy                 1/1       Running             0          14s
postgresql-1-81gnm                  0/1       Pending             0          0s
postgresql-1-81gnm                  0/1       Pending             0          0s
rails-pgsql-persistent-1-build      1/1       Running             0          19s
postgresql-1-81gnm                  0/1       Pending             0          15s
postgresql-1-81gnm                  0/1       ContainerCreating   0          16s
postgresql-1-81gnm                  0/1       Running             0          47s
postgresql-1-81gnm                  1/1       Running             0          4m
postgresql-1-deploy                 0/1       Completed           0          4m
postgresql-1-deploy                 0/1       Terminating         0          4m
postgresql-1-deploy                 0/1       Terminating         0          4m
rails-pgsql-persistent-1-deploy     0/1       Pending             0          0s
rails-pgsql-persistent-1-deploy     0/1       Pending             0          0s
rails-pgsql-persistent-1-deploy     0/1       ContainerCreating   0          0s
rails-pgsql-persistent-1-build      0/1       Completed           0          11m
rails-pgsql-persistent-1-deploy     1/1       Running             0          6s
rails-pgsql-persistent-1-hook-pre   0/1       Pending             0          0s
rails-pgsql-persistent-1-hook-pre   0/1       Pending             0          0s
rails-pgsql-persistent-1-hook-pre   0/1       ContainerCreating   0          0s
rails-pgsql-persistent-1-hook-pre   1/1       Running             0          6s
rails-pgsql-persistent-1-hook-pre   0/1       Completed           0          15s
rails-pgsql-persistent-1-dkj7w      0/1       Pending             0          0s
rails-pgsql-persistent-1-dkj7w      0/1       Pending             0          0s
rails-pgsql-persistent-1-dkj7w      0/1       ContainerCreating   0          0s
rails-pgsql-persistent-1-dkj7w      0/1       Running             0          1m
rails-pgsql-persistent-1-dkj7w      1/1       Running             0          1m
rails-pgsql-persistent-1-deploy     0/1       Completed           0          1m
rails-pgsql-persistent-1-deploy     0/1       Terminating         0          1m
rails-pgsql-persistent-1-deploy     0/1       Terminating         0          1m
rails-pgsql-persistent-1-hook-pre   0/1       Terminating         0          1m
rails-pgsql-persistent-1-hook-pre   0/1       Terminating         0          1m
Exit out of the watch mode with: Ctrl + c
Note
It may take up to 5-7 minutes for the deployment to complete.
If you did it via the UI, the deployment is finished when both the rails app and the postgres database are up and running.
You should also see a PVC being issued and in the Bound state.
⇨ Look at the PVC created:
oc get pvc/postgresql
Output:
NAME         STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
postgresql   Bound     pvc-6c348fbb-4e9d-11e7-970e-0a9938370404   5Gi        RWO           4m
Now go ahead and try out the application. The overview page in the OpenShift UI will tell you the route which has been deployed as well (the http://… link in the upper right hand corner). Use it and append /articles to the URL to get to the actual app.
⇨ Alternatively, get it on the CLI like this:
oc get route
Output:
NAME                     HOST/PORT                                                                PATH      SERVICES                 PORT      TERMINATION   WILDCARD
rails-pgsql-persistent   rails-pgsql-persistent-my-test-project.cloudapps.34.252.58.209.nip.io              rails-pgsql-persistent   <all>                   None
Note
Again, the URL will be slightly different for you.
Following this output, point your browser to the URL (prepend it with http:// and append /articles) to reach the actual application, in this case:
http://rails-pgsql-persistent-my-test-project.cloudapps.<YOUR-IP-HERE>.nip.io/articles
You should be able to successfully create articles and comments. The username/password to create articles and comments is by default openshift/secret.
When articles and comments are saved they are actually stored in the PostgreSQL database, which keeps its tablespaces on a GlusterFS volume provided by CNS.
⇨ You can verify that the postgres pod indeed mounted the PVC under the path where PostgreSQL normally stores its data with this command:
oc volumes dc --all
You will see that the DeploymentConfig of the postgres pod did indeed include a PVC:
deploymentconfigs/postgresql
pvc/postgresql (allocated 5GiB) as postgresql-data
mounted at /var/lib/pgsql/data
deploymentconfigs/rails-pgsql-persistent
Now let’s take a look at how this was actually achieved.
⇨ A normal user cannot see the details of a PersistentVolume. Log back in as operator:
oc login -u operator -n my-test-project
⇨ Look at the PVC to determine the PV:
oc get pvc
Output:
NAME         STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
postgresql   Bound     pvc-6c348fbb-4e9d-11e7-970e-0a9938370404   5Gi        RWO           10m
Note
Your volume (PV) name will be different as it’s dynamically generated.
The PV name is found in the VOLUME column of the above output.
⇨ Look at the details of this PV with the following copy&paste-friendly short-hand:
PV_NAME=$(oc get pvc/postgresql -o jsonpath="{.spec.volumeName}")
oc describe pv/${PV_NAME}
The output shows the name of the volume, the backend type (GlusterFS) and the volume name GlusterFS uses internally:
Name:            pvc-c638ba71-a070-11e7-890c-02ed99595f95
Labels:          <none>
Annotations:     pv.beta.kubernetes.io/gid=2000
                 pv.kubernetes.io/bound-by-controller=yes
                 pv.kubernetes.io/provisioned-by=kubernetes.io/glusterfs
                 volume.beta.kubernetes.io/mount-options=auto_unmount
StorageClass:    glusterfs-storage
Status:          Bound
Claim:           my-test-project/postgresql
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        5Gi
Message:
Source:
    Type:           Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
    EndpointsName:  glusterfs-dynamic-postgresql
    Path:           vol_4b22dda4c9681f4325ba5e24cb4f64c6
    ReadOnly:       false
Events:          <none>
Note the GlusterFS volume name, in this case vol_4b22dda4c9681f4325ba5e24cb4f64c6.
⇨ Save the generated volume name of GlusterFS in a shell variable for later use:
GLUSTER_VOL_NAME=$(oc get pv/${PV_NAME} -o jsonpath="{.spec.glusterfs.path}")
⇨ Now let’s switch to the namespace we used for CNS deployment:
oc project app-storage
⇨ Look at the GlusterFS pods (filtered by label) running:
oc get pods -o wide -l glusterfs=storage-pod
You will see the GlusterFS pods, one per node:
NAME READY STATUS RESTARTS AGE IP NODE
glusterfs-storage-16pb0 1/1 Running 0 23m 10.0.2.201 node-1.lab
glusterfs-storage-37tqx 1/1 Running 0 23m 10.0.4.203 node-3.lab
glusterfs-storage-68lxn 1/1 Running 0 23m 10.0.3.202 node-2.lab
Pick the first pod in the list, in this example glusterfs-storage-16pb0, and note its host IP address.
⇨ Use the following commands to conveniently save its name and host IP address in shell variables for later use (copy & paste these lines into your shell):
FIRST_GLUSTER_POD=$(oc get pods -l glusterfs=storage-pod -o jsonpath="{.items[0].metadata.name}")
HOST_IP=$(oc get pod/$FIRST_GLUSTER_POD -o jsonpath="{.status.hostIP}")
echo $FIRST_GLUSTER_POD
echo $HOST_IP
Next we are going to use the remote session capability of the oc client to execute a command in that pod’s namespace. We are going to leverage the GlusterFS CLI utilities present in that pod.
⇨ Ask GlusterFS from inside the CNS pod about all the GlusterFS volumes defined:
oc rsh $FIRST_GLUSTER_POD gluster vol list
You will see two volumes:
heketidbstorage
vol_4b22dda4c9681f4325ba5e24cb4f64c6

- heketidbstorage is an internal-only volume dedicated to heketi’s internal database.
- The second is the volume backing the PV of the PostgreSQL database deployed earlier, in this example vol_4b22dda4c9681f4325ba5e24cb4f64c6 - yours will be named differently.
⇨ Ask GlusterFS about the topology of this volume:
oc rsh $FIRST_GLUSTER_POD gluster vol info $GLUSTER_VOL_NAME
The output of the gluster command will show you how the volume has been created. You will also see that the pod you are currently logged on to serves one of the bricks.
Volume Name: vol_4b22dda4c9681f4325ba5e24cb4f64c6
Type: Replicate
Volume ID: 37d53d51-34bc-4853-b564-3b0ea9bdd935
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.2.201:/var/lib/heketi/mounts/vg_50f5d808e04ccab8d6fd0231c268db35/brick_4b59cd1f4a8ff8d8a3eddf7317829e73/brick
Brick2: 10.0.4.203:/var/lib/heketi/mounts/vg_7cb3be478376539d0c4b54cf69688c8e/brick_688627cc5dca8d01a81fa504487116c0/brick
Brick3: 10.0.3.202:/var/lib/heketi/mounts/vg_fb1a45c7853f415a3a09a164f0d717fb/brick_931730cb987383a605c1d1ff5d796fa9/brick
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
The above output tells us GlusterFS created this volume as a 3-way replica set across 3 bricks. Bricks are local directories on GlusterFS nodes; they make up the replication targets.
In our case the GlusterFS nodes are our CNS pods, and since they share the physical hosts’ network they are displayed with these IP addresses (the Brick1-3 lines above). The volume type Replicate is currently the only volume type supported in production. It synchronously replicates all data across those 3 bricks.
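As an optional sanity check you can also ask GlusterFS whether all three replicas are in sync; on a healthy volume the self-heal info should list zero entries per brick:

oc rsh $FIRST_GLUSTER_POD gluster vol heal $GLUSTER_VOL_NAME info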
Let’s take a look at what’s inside a brick.
⇨ Paste this little piece of bash magic into your shell to conveniently store the brick directory from the first CNS pod you saw earlier in an environment variable:
BRICK_DIR=$(echo -n $(oc rsh $FIRST_GLUSTER_POD gluster vol info $GLUSTER_VOL_NAME | grep $HOST_IP) | cut -d ':' -f 3 | tr -d $'\r')
echo $BRICK_DIR
⇨ Now let’s look at a brick directory from inside a CNS pod:
oc rsh $FIRST_GLUSTER_POD ls -ahl $BRICK_DIR
What you see is the content of the brick directory from within the GlusterFS pod, which makes up 1 out of 3 copies of our postgres volume:
total 16K
drwxrwsr-x.   5 root       2001   57 Jun  6 14:44 .
drwxr-xr-x.   3 root       root   19 Jun  6 14:44 ..
drw---S---. 263 root       2001 8.0K Jun  6 14:46 .glusterfs
drwxr-sr-x.   3 root       2001   25 Jun  6 14:44 .trashcan
drwx------.  20 1000080000 2001 8.0K Jun  6 14:46 userdata
⇨ Going one level deeper, we see a data structure familiar to PostgreSQL users:
oc rsh $FIRST_GLUSTER_POD ls -ahl $BRICK_DIR/userdata
This is one of 3 copies of the postgres data directory hosted by CNS:
total 68K
drwx------. 20 1000080000 2001 8.0K Jun  6 14:46 .
drwxrwsr-x.  5 root       2001   57 Jun  6 14:44 ..
-rw-------.  2 1000080000 root    4 Jun  6 14:44 PG_VERSION
drwx------.  6 1000080000 root   54 Jun  6 14:46 base
drwx------.  2 1000080000 root 8.0K Jun  6 14:47 global
drwx------.  2 1000080000 root   18 Jun  6 14:44 pg_clog
drwx------.  2 1000080000 root    6 Jun  6 14:44 pg_commit_ts
drwx------.  2 1000080000 root    6 Jun  6 14:44 pg_dynshmem
-rw-------.  2 1000080000 root 4.6K Jun  6 14:46 pg_hba.conf
-rw-------.  2 1000080000 root 1.6K Jun  6 14:44 pg_ident.conf
drwx------.  2 1000080000 root   32 Jun  6 14:46 pg_log
drwx------.  4 1000080000 root   39 Jun  6 14:44 pg_logical
drwx------.  4 1000080000 root   36 Jun  6 14:44 pg_multixact
drwx------.  2 1000080000 root   18 Jun  6 14:46 pg_notify
drwx------.  2 1000080000 root    6 Jun  6 14:44 pg_replslot
drwx------.  2 1000080000 root    6 Jun  6 14:44 pg_serial
drwx------.  2 1000080000 root    6 Jun  6 14:44 pg_snapshots
drwx------.  2 1000080000 root    6 Jun  6 14:46 pg_stat
drwx------.  2 1000080000 root   84 Jun  6 15:16 pg_stat_tmp
drwx------.  2 1000080000 root   18 Jun  6 14:44 pg_subtrans
drwx------.  2 1000080000 root    6 Jun  6 14:44 pg_tblspc
drwx------.  2 1000080000 root    6 Jun  6 14:44 pg_twophase
drwx------.  3 1000080000 root   60 Jun  6 14:44 pg_xlog
-rw-------.  2 1000080000 root   88 Jun  6 14:44 postgresql.auto.conf
-rw-------.  2 1000080000 root  21K Jun  6 14:46 postgresql.conf
-rw-------.  2 1000080000 root   46 Jun  6 14:46 postmaster.opts
-rw-------.  2 1000080000 root   89 Jun  6 14:46 postmaster.pid
You are looking at the PostgreSQL internal data file structure from the perspective of the GlusterFS server side - evidence that the database indeed uses CNS.
Clients, like the OpenShift nodes and their application pods, talk to this storage with the GlusterFS protocol as if it were an ordinary local mount.
When a pod starts that mounts storage from a PV backed by CNS the GlusterFS mount plugin in OpenShift will mount the GlusterFS volume on the right OpenShift node and then bind-mount this directory to the right pod’s file namespace.
This happens transparently to the application and looks like a normal local filesystem inside the pod as you just saw. Let’s have a look from the container host perspective:
⇨ Get the name and the host IP of the postgres pod with this shell shortcut into environment variables for easy copy&paste later:
POSTGRES_POD=$(oc get pods -l name=postgresql -n my-test-project -o jsonpath="{.items[0].metadata.name}")
POSTGRES_CONTAINER_HOST=$(oc get pod/$POSTGRES_POD -n my-test-project -o jsonpath="{.status.hostIP}")
echo $POSTGRES_POD
echo $POSTGRES_CONTAINER_HOST
Since you are acting from the master node master.lab you can use SSH without a password to execute a remote command on the OpenShift node hosting the postgres pod.
⇨ Look for the GlusterFS mount points on the host, searching for the GlusterFS volume that was provisioned for the database:
ssh $POSTGRES_CONTAINER_HOST mount | grep $GLUSTER_VOL_NAME
Tip
Answer the SSH client’s question “Are you sure you want to continue connecting (yes/no)?” with yes.
The host should have mounted this GlusterFS volume, for example:
10.0.2.201:vol_4b22dda4c9681f4325ba5e24cb4f64c6 on /var/lib/origin/openshift.local.volumes/pods/c7029a5a-a070-11e7-890c-02ed99595f95/volumes/kubernetes.io~glusterfs/pvc-c638ba71-a070-11e7-890c-02ed99595f95 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
This sums up the relationship between PVCs, PVs, GlusterFS volumes and container mounts in CNS.
The mounting and unmounting of GlusterFS volumes is facilitated automatically by the GlusterFS mount plugin that ships with OpenShift.
Providing shared storage to multiple application instances
In the previous example we provisioned an RWO PV - the volume is only usable by one pod at a time. RWO is what most OpenShift storage backends support and it just happened to be the default in the example template.
So far only very few options existed, like basic NFS support, to provide a PersistentVolume to more than one container at once. The reason is that most supported storage backends are actually block-based: a block device is made available to one of the container hosts and is then formatted with an XFS filesystem, which is inherently not cluster-aware (it cannot safely be written to from multiple operating systems / containers).
GlusterFS on the other hand is a true scale-out cluster filesystem with distributed locking. Hence we can use the access mode ReadWriteMany on OpenShift.
With CNS this capability is now available to all OpenShift deployments, no matter where they are deployed. To demonstrate this capability with an application we will deploy a PHP-based file uploader that has multiple front-end instances sharing a common storage repository.
⇨ Log back in as developer to our project my-test-project:
oc login -u developer -n my-test-project
⇨ Next deploy the example application:
oc new-app openshift/php:7.0~https://github.com/christianh814/openshift-php-upload-demo --name=file-uploader
Note
This is yet another way to build and launch an application from source code in OpenShift. The content before the ~ is the name of a Source-to-Image builder (a container that knows how to build applications of a certain type from source, in this case PHP) and the URL following it is a GitHub repository hosting the source code.
Output:
--> Found image a1ebebb (6 weeks old) in image stream "openshift/php" under tag "7.0" for "openshift/php:7.0"
Apache 2.4 with PHP 7.0
-----------------------
Platform for building and running PHP 7.0 applications
Tags: builder, php, php70, rh-php70
* A source build using source code from https://github.com/christianh814/openshift-php-upload-demo will be created
* The resulting image will be pushed to image stream "file-uploader:latest"
* Use 'start-build' to trigger a new build
* This image will be deployed in deployment config "file-uploader"
* Port 8080/tcp will be load balanced by service "file-uploader"
* Other containers can access this service through the hostname "file-uploader"
--> Creating resources ...
imagestream "file-uploader" created
buildconfig "file-uploader" created
deploymentconfig "file-uploader" created
service "file-uploader" created
--> Success
Build scheduled, use 'oc logs -f bc/file-uploader' to track its progress.
Run 'oc status' to view your app.
⇨ Observe the application being deployed with the suggested command:
oc logs -f bc/file-uploader
The follow-mode of the above command ends automatically when the build is successful and you return to your shell.
[ ...output omitted...]
Cloning "https://github.com/christianh814/openshift-php-upload-demo" ...
Commit: 7508da63d78b4abc8d03eac480ae930beec5d29d (Update index.html)
Author: Christian Hernandez <christianh814@users.noreply.github.com>
Date: Thu Mar 23 09:59:38 2017 -0700
---> Installing application source...
Pushing image 172.30.120.134:5000/my-test-project/file-uploader:latest ...
Pushed 0/5 layers, 2% complete
Pushed 1/5 layers, 20% complete
Pushed 2/5 layers, 40% complete
Push successful
⇨ When the build is completed ensure the pods are running:
oc get pods
Among your existing pods you should see new pods running.
NAME                     READY     STATUS      RESTARTS   AGE
file-uploader-1-build    0/1       Completed   0          2m
file-uploader-1-g7b0h    1/1       Running     0          1m
...
As part of the deployment a Service has been created for our app automatically. It load balances traffic to our PHP pods internally but not externally. For that a Route needs to expose it to the network outside of OpenShift.
⇨ Let’s fix this:
oc expose svc/file-uploader
⇨ Check the route that has been created:
oc get route/file-uploader
The route forwards all traffic on port 80 of its automatically generated subdomain of the OpenShift router to port 8080 of the container running the app.
NAME            HOST/PORT                                                      PATH      SERVICES        PORT       TERMINATION   WILDCARD
file-uploader   file-uploader-my-test-project.cloudapps.34.252.58.209.nip.io             file-uploader   8080-tcp                 None
Point your browser to the URL advertised by the route, that is http://file-uploader-my-test-project.cloudapps.<YOUR-IP-HERE>.nip.io
Alternatively, in the OpenShift UI, while logged on as developer to the project called my-test-project, click the down arrow in the Overview section next to the deployment called file-uploader. The URL to your app will be in the section called ROUTES.
The application is again very simple: it lists all previously uploaded files and offers the ability to upload new ones, as well as download the existing uploads. Right now there is nothing there yet.
Try it out in your browser: select an arbitrary file from your local system and upload it to the app.
After uploading a file validate it has been stored successfully by following the link List Uploaded Files in the browser.
Let’s see how this is stored locally in the container.
⇨ List the running pods of our application:
oc get pods -l app=file-uploader
You will see two entries:
file-uploader-1-build    0/1       Completed   0          7m
file-uploader-1-g7b0h    1/1       Running     0          6m
The name of the single pod currently running the app in this example is file-uploader-1-g7b0h.
The pod called file-uploader-1-build is the builder pod that built the application image, and it has already terminated.
Note
The exact name of the pod will be different in your environment.
⇨ Use the following shell command to store the exact name of the file-uploader application pod in your environment in a shell variable called UPLOADER_POD:
UPLOADER_POD=$(oc get pods -l app=file-uploader -o jsonpath="{.items[0].metadata.name}")
echo $UPLOADER_POD
⇨ Use the remote shell capability of the oc client to list the contents of the uploaded/ directory inside the pod after you uploaded a file in the PHP app:
oc rsh $UPLOADER_POD ls -ahl /opt/app-root/src/uploaded
In the below example output we’ve uploaded a file named cns-deploy-4.0.0-15.el7rhgs.x86_64.rpm.gz in the app via the browser, and we see it stored from within the pod:
total 16K
-rw-r--r--. 1 1000080000 root 16K May 26 09:32 cns-deploy-4.0.0-15.el7rhgs.x86_64.rpm.gz
The app should also list the file in its overview.
However, in its default configuration this pod does not use any persistent storage. It uses its local filesystem - that is, it stores the file inside the container’s root filesystem.
Important
Never store important data inside a pod’s root filesystem or emptyDir. It’s ephemeral by definition and will be lost as soon as the pod terminates.
Worse, the container’s root filesystem is even slower than emptyDir as it needs to traverse the overlay2 stack that Red Hat Enterprise Linux uses by default as of version 7.4 for running container images.
Also, pods using this kind of storage inherently cannot be scaled out trivially.
Let’s see when this becomes a problem.
⇨ Let’s scale the deployment to 3 instances of the app:
oc scale dc/file-uploader --replicas=3
⇨ Watch the additional pods getting spawned:
oc get pods -l app=file-uploader
You will see 2 additional pods being spawned:
NAME                     READY     STATUS    RESTARTS   AGE
file-uploader-1-3cgh1    1/1       Running   0          20s
file-uploader-1-3hckj    1/1       Running   0          20s
file-uploader-1-g7b0h    1/1       Running   0          3m
Note
The pod names will be different in your environment since they are automatically generated. It takes a couple of seconds until they are ready.
Alternatively, in the UI, wait for the file-uploader application to reach 3 healthy pods (the blue circle is completely filled).
On the command line this will look like this:
oc get pods -l app=file-uploader
NAME                     READY     STATUS    RESTARTS   AGE
file-uploader-1-98fwm    1/1       Running   0          2m
file-uploader-1-g7b0h    1/1       Running   0          8m
file-uploader-1-rwt2p    1/1       Running   0          2m
These 3 pods now make up our application. OpenShift will load balance incoming traffic between them.
However, when you log on to one of the new instances you will see they have no data.
⇨ Store the names of all pods in environment variables for easy copy&paste:
UPLOADER_POD_1=$(oc get pods -l app=file-uploader -o jsonpath="{.items[0].metadata.name}")
UPLOADER_POD_2=$(oc get pods -l app=file-uploader -o jsonpath="{.items[1].metadata.name}")
UPLOADER_POD_3=$(oc get pods -l app=file-uploader -o jsonpath="{.items[2].metadata.name}")
⇨ Let’s check the upload directories of all pods:
oc rsh $UPLOADER_POD_1 ls -ahl /opt/app-root/src/uploaded
oc rsh $UPLOADER_POD_2 ls -ahl /opt/app-root/src/uploaded
oc rsh $UPLOADER_POD_3 ls -ahl /opt/app-root/src/uploaded
Uh oh, only one of the pods has the previously uploaded file. Looks like our application data is not consistent anymore:
oc rsh $UPLOADER_POD_1 ls -ahl /opt/app-root/src/uploaded
total 0
drwxrwxr-x. 2 default    root  22 Sep 24 11:31 .
drwxrwxr-x. 1 default    root 124 Sep 24 11:31 ..
-rw-rw-r--. 1 default    root   0 Sep 24 11:31 .gitkeep

oc rsh $UPLOADER_POD_2 ls -ahl /opt/app-root/src/uploaded
total 108K
drwxrwxr-x. 1 default    root  52 Sep 24 11:35 .
drwxrwxr-x. 1 default    root  22 Sep 24 11:31 ..
-rw-rw-r--. 1 default    root   0 Sep 24 11:31 .gitkeep
-rw-r--r--. 1 1000080000 root 16K May 26 09:32 cns-deploy-4.0.0-15.el7rhgs.x86_64.rpm.gz

oc rsh $UPLOADER_POD_3 ls -ahl /opt/app-root/src/uploaded
total 0
drwxrwxr-x. 2 default    root  22 Sep 24 11:31 .
drwxrwxr-x. 1 default    root 124 Sep 24 11:31 ..
-rw-rw-r--. 1 default    root   0 Sep 24 11:31 .gitkeep
It’s empty because the previously uploaded files were stored locally in the first container and are not available to the others.
Similarly, other users of the app will sometimes see your uploaded files and sometimes not. With the deployment scaled to 3 instances OpenShifts router will simply round-robin across them. You can simulate this with another instance of your browser in “Incognito mode” pointing to your app.
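If you prefer the CLI, the following rough sketch shows the same effect. It assumes that the app’s index page lists the uploaded file names and that no session cookie is kept between requests; replace the grep pattern with the name of the file you actually uploaded. The printed count should alternate between 0 and 1 as requests hit different pods:

ROUTE_HOST=$(oc get route/file-uploader -o jsonpath="{.spec.host}")
for i in $(seq 1 6); do
  curl -s http://${ROUTE_HOST}/ | grep -c "<YOUR-UPLOADED-FILENAME>"
done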
The app is of course not usable like this. We can fix this by providing shared storage to this app.
⇨ First create a PVC with the appropriate settings in a file called cns-rwx-pvc.yml with the below contents:
cns-rwx-pvc.yml:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-shared-storage
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: glusterfs-storage
Notice that the access mode is explicitly requested to be ReadWriteMany (also referred to as RWX). Storage provisioned like this can be mounted by multiple containers on multiple hosts at the same time.
⇨ Submit the request to the system:
oc create -f cns-rwx-pvc.yml
⇨ Let’s look at the result:
oc get pvc
ACCESSMODES is set to RWX:
NAME                STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
my-shared-storage   Bound     pvc-62aa4dfe-4ad2-11e7-b56f-2cc2602a6dc8   5Gi        RWX           22s
...
We can now update the DeploymentConfig of our application to use this PVC to provide the application with persistent, shared storage for uploads.
⇨ Update the configuration of the application by adding a volume claim like this:
oc volume dc/file-uploader --add --name=shared-storage --type=persistentVolumeClaim --claim-name=my-shared-storage --mount-path=/opt/app-root/src/uploaded
Our app will now re-deploy (in a rolling fashion) with the new settings - all pods will mount the volume identified by the PVC under /opt/app-root/src/uploaded (the path is predictable so we can hard-code it here).
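You can verify the result with oc volume dc/file-uploader or oc get dc/file-uploader -o yaml; the relevant part of the pod template then looks roughly like this sketch (the names are the ones passed on the command line above, the container name is assumed to match the app name):

      containers:
        - name: file-uploader
          volumeMounts:
            - name: shared-storage
              mountPath: /opt/app-root/src/uploaded
      volumes:
        - name: shared-storage
          persistentVolumeClaim:
            claimName: my-shared-storage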
⇨ You can watch it like this:
oc logs dc/file-uploader -f
The new DeploymentConfig will supersede the old one.
--> Scaling up file-uploader-2 from 0 to 3, scaling down file-uploader-1 from 3 to 0 (keep 3 pods available, don't exceed 4 pods)
Scaling file-uploader-2 up to 1
Scaling file-uploader-1 down to 2
Scaling file-uploader-2 up to 2
Scaling file-uploader-1 down to 1
Scaling file-uploader-2 up to 3
Scaling file-uploader-1 down to 0
--> Success
Exit out of the follow mode with: Ctrl + c
Warning
Changing the storage settings of a pod can be destructive. Any existing data will not be preserved. You are responsible for taking care of data migration.
One strategy here could have been to use oc rsync to save the data to a local directory on the machine running the oc client, as sketched below.
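A minimal sketch of that approach, assuming the UPLOADER_POD variable from earlier and a local directory called ./uploads-backup (run this before changing the storage settings):

mkdir -p ./uploads-backup
oc rsync $UPLOADER_POD:/opt/app-root/src/uploaded ./uploads-backup/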
You can also observe the rolling upgrade of the file uploader application in the OpenShift UI.
The new DeploymentConfig named file-uploader-2 will have 3 pods all sharing the same storage.
⇨ Get the names of the new pods:
oc get pods -l app=file-uploader
Output:
NAME                     READY     STATUS      RESTARTS   AGE
file-uploader-1-build    0/1       Completed   0          18m
file-uploader-2-jd22b    1/1       Running     0          1m
file-uploader-2-kw9lq    1/1       Running     0          2m
file-uploader-2-xbz24    1/1       Running     0          1m
Try it out in your application: upload new files and watch them become visible from within all application pods. In new browser Incognito sessions, simulating other users, the application now behaves consistently as the router cycles through the pods between browser requests.
That’s it. You have successfully provided shared storage to pods throughout the entire system, thereby avoiding the need for data to be replicated at the application level to each pod.
With CNS this is available wherever OpenShift is deployed with no external dependency.