ECK Storage Recommendations

Dear All,
We are planning to deploy a storage solution for an ECK cluster on our internally built k8s platform.
The storage solution provided by our team does not currently support dynamic provisioning, so we are left with the following approach for Elasticsearch storage management.

Manually create the SC/PVC/PV with standard naming for Elasticsearch. Once the Elasticsearch cluster spins up, it can use the existing PVC to mount storage:

Naming Strategy for PVC: volClaimTempName-clustername-es-nodeName-[number seq 0-x]
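The naming strategy above can be sketched as a small helper. This is only an illustration: the function name `pvc_names`, the cluster name `quickstart` and the nodeSet name `default` are made up for the example, and `elasticsearch-data` is assumed as the volume claim template name (it is the one mentioned later in this thread).

```python
def pvc_names(cluster, node_set, replicas, claim_template="elasticsearch-data"):
    """Return the PVC names expected for one nodeSet, ordinals 0..replicas-1.

    Pattern: <claim template>-<cluster>-es-<nodeSet>-<ordinal>
    """
    return [
        f"{claim_template}-{cluster}-es-{node_set}-{ordinal}"
        for ordinal in range(replicas)
    ]


# Hypothetical cluster "quickstart" with a 3-node nodeSet called "default".
for name in pvc_names("quickstart", "default", 3):
    print(name)
```

Generating the names this way, rather than typing them by hand, helps avoid a mismatch between the manually created PVCs and the names the pods will look for.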

So once the PersistentVolumeClaim is in place, the control plane looks for a PersistentVolume that satisfies the claim's requirements. If the control plane finds a suitable PersistentVolume with the same StorageClass, it binds the claim to the volume. The status of the PVC and the PV changes to Bound. We have tested this with our storage solution and it works.

The above approach introduces one piece of manual overhead, on which I would like some guidance.

Overhead details: If a PVC gets removed (accidentally, pod kill, any issue on the cluster, etc.), the PV and the physical storage remain. K8s will then try to recreate the Elasticsearch resource and mount the PV with the retained data.

Is keeping the volume data intact and allowing the Pod to mount it again a recommended practice for Elasticsearch deployments?

**Or does Elasticsearch recommend cleaning the data on the PV every time before mounting, when a fresh PVC request comes in? (the default approach for dynamic provisioning)**

Regards
AK

I'm not sure I understood it all, but here are a few things that can maybe help:

  • as long as you have existing PVs with a storageClass defined, you shouldn't have to create PVCs yourself. K8s itself will create PVCs that match the storage class of your existing PVs. So it should be enough to manage StorageClass and PersistentVolumes only.
  • if a PVC is deleted, and the Pod is removed, a new PVC will be created automatically. If there is a corresponding PV already, that PV will be reused for the new PVC. Which is fine, since we are talking about recreating the same Elasticsearch node by reusing the existing data. Edit March 2021: not sure what I had in mind, this is false. Reusing PVs when PVCs have been removed requires some intervention: pre-create the PVCs in advance, and modify the claimRef of the existing PVs to match the new PVCs. See this tool as an example.
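To illustrate the claimRef change mentioned in the edit above, here is a minimal sketch (not the tool referred to). It builds a JSON merge patch that points an existing PV at a new PVC. It assumes the PV uses the Retain reclaim policy, so it survived the PVC deletion; the helper name `claim_ref_patch`, the PVC name and the `elastic` namespace are hypothetical.

```python
import json


def claim_ref_patch(pvc_name, namespace):
    """Build a JSON merge patch that re-targets a PV's claimRef.

    In a JSON merge patch a null value removes the key, so the stale uid and
    resourceVersion left behind by the deleted PVC are dropped and the PV can
    be bound to the new claim.
    """
    return {
        "spec": {
            "claimRef": {
                "apiVersion": "v1",
                "kind": "PersistentVolumeClaim",
                "name": pvc_name,
                "namespace": namespace,
                "uid": None,              # serialized as null: remove old uid
                "resourceVersion": None,  # serialized as null: remove old version
            }
        }
    }


# Hypothetical PVC name and namespace, for illustration only.
patch = claim_ref_patch("elasticsearch-data-quickstart-es-default-0", "elastic")
print(json.dumps(patch))
```

The printed JSON could then be applied with something like `kubectl patch pv <pv-name> --type merge -p '<json>'`, once the new PVC with the matching name exists.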

Thanks @sebgl

If I recreate the cluster, I'd like to reuse the already created storage volumes (which contain data created by the deleted cluster). Should I register only the PVs in advance before running elastic-operator, so that elastic-operator creates the PVCs automatically? Also, will the new cluster be able to use the data that the old one created?

You also have to make sure you have a PVC that is referenced in the volume via the claimRef attribute. If both are in place and the naming of the PVC follows the naming convention the ECK operator uses, it will be adopted. The naming convention is elasticsearch-data-$CLUSTER_NAME-es-$NODESET_NAME-$ORDINAL
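As a rough sketch of what such a pre-created pair could look like, the snippet below builds a PV and a PVC that reference each other (claimRef on the PV, volumeName on the PVC) and prints them as a single JSON List that `kubectl apply -f -` can read. Everything specific here is an assumption for illustration: the helper name `prebound_pair`, the names, namespace, storage class, size and the hostPath volume source would all need to be replaced with values that match your storage backend and cluster.

```python
import json


def prebound_pair(pv_name, pvc_name, namespace, storage_class, size, volume_source):
    """Return (pv, pvc) manifests that are pre-bound to each other."""
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc_name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "volumeName": pv_name,  # claim asks for this specific PV
            "resources": {"requests": {"storage": size}},
        },
    }
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": pv_name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",  # keep data if the PVC goes away
            "storageClassName": storage_class,
            "claimRef": {"namespace": namespace, "name": pvc_name},  # reserve PV for the PVC
            **volume_source,  # backend-specific part, e.g. hostPath, nfs, csi
        },
    }
    return pv, pvc


# Hypothetical values; the PVC name follows the convention quoted above.
pv, pvc = prebound_pair(
    pv_name="es-data-pv-0",
    pvc_name="elasticsearch-data-quickstart-es-default-0",
    namespace="elastic",
    storage_class="manual-es",
    size="10Gi",
    volume_source={"hostPath": {"path": "/mnt/es-data-0"}},
)
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}, indent=2))
```

The storage class, requested size and access modes need to agree between the two objects, otherwise the claim will stay Pending instead of binding.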

We are making this easier in one of the next releases of ECK with an option that allows you to tell the ECK operator that you want to keep and reuse the PVCs and PVs https://github.com/elastic/cloud-on-k8s/pull/4050

Thanks. I tried to create the PVs and PVCs using exactly the same yml files that were created by elastic operator in the previous cluster. Unfortunately, the Elasticsearch instances were not able to use these PVs and PVCs. So the approach you suggested doesn't work. And 'DeleteOnScaledownOnly' is not exactly what I wanted. What I'd like to do is create the PVs and PVCs manually and then let elastic-operator map and use them.