elasticsearch.yml supports a configuration setting (path.data) that lets me specify multiple data paths for an ES data node.
I can define additional volumes and/or PVCs using spec.nodeSets.volumeClaimTemplates.
However, when I specify the following in the nodeSet config:
- name: default
path.data: [ "/mnt/data1", "/mnt/data2", "/mnt/data3" ]
I get an error from the Kubernetes operator saying that path.data is not configurable.
Is there a way to achieve this configuration using ECK?
Thank you in advance!
Thanks for opening this thread and sorry for the late answer. I spent some time investigating.
Here is an example manifest that should achieve what you want:
- name: default
- name: elasticsearch
- name: path.data
- name: elasticsearch-data
- name: elasticsearch-data2
- name: elasticsearch-data3
- name: chown-data-volumes
command: ["sh", "-c", "chown elasticsearch:elasticsearch /mnt/data && chown elasticsearch:elasticsearch /mnt/data2 && chown elasticsearch:elasticsearch /mnt/data3"]
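For reference, a fuller manifest along these lines could look roughly like the sketch below. Only the container, environment variable, volume, and init container names and the chown command come from the example above. The cluster name, apiVersion, Elasticsearch version, node count, mount-path mapping, storage sizes, and the root security context are assumptions for illustration, so adjust them to your environment.

```yaml
# Sketch only: metadata.name, apiVersion, version, count and storage sizes are assumed values.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart                 # hypothetical cluster name
spec:
  version: 7.5.2                   # assumed version
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          # Elasticsearch setting passed as an environment variable
          # instead of through the nodeSet config section.
          - name: path.data
            value: "/mnt/data,/mnt/data2,/mnt/data3"
          volumeMounts:
          - name: elasticsearch-data   # keep this name so ECK does not add its default volume
            mountPath: /mnt/data
          - name: elasticsearch-data2
            mountPath: /mnt/data2
          - name: elasticsearch-data3
            mountPath: /mnt/data3
        initContainers:
        - name: chown-data-volumes
          # Assumption: the init container runs as root so that chown succeeds,
          # and ECK fills in the Elasticsearch image since none is set here.
          securityContext:
            runAsUser: 0
          command: ["sh", "-c", "chown elasticsearch:elasticsearch /mnt/data && chown elasticsearch:elasticsearch /mnt/data2 && chown elasticsearch:elasticsearch /mnt/data3"]
          volumeMounts:
          - name: elasticsearch-data
            mountPath: /mnt/data
          - name: elasticsearch-data2
            mountPath: /mnt/data2
          - name: elasticsearch-data3
            mountPath: /mnt/data3
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 2Gi           # assumed size
    - metadata:
        name: elasticsearch-data2
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 2Gi           # assumed size
    - metadata:
        name: elasticsearch-data3
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 2Gi           # assumed size
```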
A few things to note:
- See how I override the Elasticsearch settings through an environment variable (path.data). That's an undocumented way of providing Elasticsearch settings. I opened an issue in our GitHub repo, because you should also be able to set this through the config section: https://github.com/elastic/cloud-on-k8s/issues/2573.
- The message you get when you use a "blacklisted" setting is only a warning. It does not prevent you from actually using that setting if you know what you are doing.
- Make sure one of your volumes is named elasticsearch-data, otherwise ECK will still create the default elasticsearch-data volume for you, in addition to the other volumes you may have defined in the manifest. I created an issue in our GitHub repo so we fix that: https://github.com/elastic/cloud-on-k8s/issues/2574
- See how the chown-data-volumes init container changes ownership of the volumes' underlying filesystems so that the elasticsearch user is able to write data into them. ECK sets this up for the default volume, but not for your own volumes.
- I created https://github.com/elastic/cloud-on-k8s/issues/2575 so we simplify the overall experience of setting up multiple volumes for Elasticsearch data.
What if we do not want to use PVCs at all?
I don't want to bind persistent volumes to master and client nodes. Is that possible with ECK?
I want to create an architecture like the one below.
@hmz see the end of this doc page to use emptyDirs for some of your NodeSets (I guess client nodes in your case).
Note that if you also use emptyDirs for your master nodes and happen to lose more than half of your master node Pods, chances are your cluster won't be able to recover. I would advise against it. It seems fine for client nodes though.
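As a rough illustration, a client-only NodeSet backed by an emptyDir could be declared along the lines below. This is a fragment of the Elasticsearch resource spec, not a full manifest. The NodeSet name, the count, and the 7.x-style node role settings are assumptions. The key point is that the volume is named elasticsearch-data, so it takes the place of the default PVC.

```yaml
# Fragment of an Elasticsearch resource; names and counts are hypothetical.
spec:
  nodeSets:
  - name: client                 # hypothetical NodeSet name
    count: 2                     # assumed
    config:
      # Assumed 7.x-style role settings for a coordinating-only (client) node.
      node.master: false
      node.data: false
      node.ingest: false
    podTemplate:
      spec:
        volumes:
        # Using the default volume name with an emptyDir means no PVC is
        # created for this NodeSet; data is lost when the Pod goes away.
        - name: elasticsearch-data
          emptyDir: {}
```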
On your diagram, please also note that all Pods will talk to each other directly: data Pods won't go through a headless service to talk to Master Pods.
It does make sense, as your diagram suggests, to create your own service to route traffic to client Pods only. It does also make sense to scale data Pods independently from masters and clients.
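A custom Service of that kind might look like the sketch below. It assumes a cluster named quickstart with a NodeSet named client, both hypothetical, and it relies on the labels ECK puts on Elasticsearch Pods (cluster name and StatefulSet name). Check the labels on your own Pods with kubectl get pods --show-labels before relying on the selector.

```yaml
# Sketch only: Service name, cluster name and NodeSet name are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: quickstart-es-client-only
spec:
  selector:
    # Labels assumed to be set by ECK on the Pods of the "client" NodeSet;
    # the StatefulSet name is assumed to follow <cluster>-es-<nodeSet>.
    elasticsearch.k8s.elastic.co/cluster-name: quickstart
    elasticsearch.k8s.elastic.co/statefulset-name: quickstart-es-client
  ports:
  - name: https
    port: 9200
    targetPort: 9200
```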