I have a few powerful machines, and running only one instance of Elasticsearch on each seems wasteful, so I was thinking of running multiple instances per machine. Each of these machines has 12 SSDs in a JBOD configuration.
I have mounted them as /data/1/, /data/2/, and so on.
My worry is whether multiple instances sharing the same data directories would cause any issues, e.g. data corruption.
I have included max_local_storage_nodes in the config. Is this appropriate for a production setup?
Also, my RAID setup is JBOD. The purpose is to be able to swap out a faulty SSD while maintaining redundancy. Using Docker containers, I would still be specifying volumes pointing to multiple data paths.
Is there some other design that you have in mind for supporting multiple instances?
max_local_storage_nodes is not recommended for production; you should use separate data directories for each instance of Elasticsearch (meaning each instance needs its own config directory). Additionally, that setting is being deprecated in 7.x (see https://github.com/elastic/elasticsearch/pull/42426).
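To make "separate data directories for each instance" concrete, here is a rough sketch of one way to divide the 12 mounts so that no two instances ever share a path. The helper name split_data_paths and the choice of 3 instances are assumptions for illustration only, and it assumes your version accepts several comma-separated paths in path.data.

```python
def split_data_paths(mount_points, instances):
    """Give each instance its own disjoint, contiguous slice of mount points."""
    if instances < 1 or instances > len(mount_points):
        raise ValueError("need between 1 and len(mount_points) instances")
    base, extra = divmod(len(mount_points), instances)
    groups = []
    start = 0
    for i in range(instances):
        # The first `extra` instances take one additional disk each.
        size = base + (1 if i < extra else 0)
        groups.append(mount_points[start:start + size])
        start += size
    return groups


# Mount points as described in the question: /data/1 ... /data/12.
mounts = [f"/data/{n}" for n in range(1, 13)]

# Hypothetical choice: 3 instances per machine, 4 disks each.
layout = split_data_paths(mounts, 3)

for index, paths in enumerate(layout, start=1):
    print(f"instance {index}: path.data={','.join(paths)}")
```

With this layout, losing one SSD only affects the single instance that owns it, and replica allocation across instances and hosts still has to be planned separately.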
Hi Ryan, thanks for the reply and info. I will allocate separate data directories for each instance.
For the config directory, I am trying to "eliminate" the need for a separate elasticsearch.yml per instance by having each instance start up with its settings passed in as environment variables.
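Roughly what I have in mind, as a sketch only: each instance reads its settings from environment variables and passes them to Elasticsearch as -E command-line overrides. The variable names (ES_NODE_NAME, ES_DATA_PATHS, ES_HTTP_PORT), the defaults, and the binary path are placeholders I made up, and I have not confirmed that this removes the need for a config directory entirely, since jvm.options and the logging config still live there.

```python
import os


def build_command(env, binary="bin/elasticsearch"):
    """Build an Elasticsearch command line from environment-style settings.

    The ES_* variable names are hypothetical; only the -E override syntax
    and the setting names come from Elasticsearch itself.
    """
    settings = {
        "node.name": env.get("ES_NODE_NAME", "node-1"),
        "path.data": env.get("ES_DATA_PATHS", "/data/1"),
        "http.port": env.get("ES_HTTP_PORT", "9200"),
    }
    command = [binary]
    for key, value in settings.items():
        # Each -E flag overrides one setting from elasticsearch.yml.
        command.append(f"-E{key}={value}")
    return command


# Example environment for a second instance on the same host (made-up values).
example_env = {
    "ES_NODE_NAME": "node-2",
    "ES_DATA_PATHS": "/data/5,/data/6,/data/7,/data/8",
    "ES_HTTP_PORT": "9201",
}
command = build_command(example_env)
print(" ".join(command))

# Using the real process environment instead would look like this:
command_from_os = build_command(os.environ)
```

The resulting list could be handed to subprocess.run or used as a container entrypoint, with each container getting its own volumes and variables.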