Multiple JVM (instances) in same machine sharing same data and log directory

I have a few powerful machines, and running only one instance of Elasticsearch on each of them seems wasteful, so I was thinking of running multiple instances. Each of these machines has 12 SSDs in a JBOD configuration.
I have mounted them as /data/1/, /data/2/, and so on.

My worry is whether multiple instances sharing the same data directories would cause any issues, e.g. data corruption.

path:
  data: /data/1/,/data/2/,/data/3/ ...
  logs: /var/log/elasticsearch

Yes, Elasticsearch will detect this and not run another instance using the same data directory.

Have you considered using containers?

I have included max_local_storage_nodes in the config. Is this appropriate for a production setup?

Also, my RAID setup is JBOD. The purpose is to be able to swap out a faulty SSD while maintaining redundancy. With Docker containers, I would still be specifying volumes pointing to multiple data paths.
Is there some other design that you have in mind for supporting multiple instances?

max_local_storage_nodes is not recommended for production; you should use separate data directories for each instance of elasticsearch (meaning each instance needs its own config directory). Additionally, that setting is being deprecated in 7.x (see https://github.com/elastic/elasticsearch/pull/42426).
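
To illustrate the separate-directories advice, here is a minimal sketch of what two per-instance config files might look like. The config directory names, the way the 12 SSD mounts are split between instances, and the second HTTP port are assumptions made for the example and are not taken from this thread. Each instance would be pointed at its own config directory through the ES_PATH_CONF environment variable.

```yaml
# File 1: /etc/elasticsearch/es01/elasticsearch.yml  (hypothetical config directory)
# Start with: ES_PATH_CONF=/etc/elasticsearch/es01 /opt/elasticsearch/bin/elasticsearch
cluster.name: production
node.name: es01
http.port: 9200
path:
  data:            # assumed split: this instance owns mounts 1-3 only
    - /data/1
    - /data/2
    - /data/3
  logs: /var/log/elasticsearch/es01
# Avoid allocating a primary and its replica to two nodes on the same host.
cluster.routing.allocation.same_shard.host: true

# File 2: /etc/elasticsearch/es02/elasticsearch.yml  (hypothetical config directory)
# Start with: ES_PATH_CONF=/etc/elasticsearch/es02 /opt/elasticsearch/bin/elasticsearch
# cluster.name: production
# node.name: es02
# http.port: 9201
# path:
#   data:          # assumed split: mounts 4-6, no overlap with es01
#     - /data/4
#     - /data/5
#     - /data/6
#   logs: /var/log/elasticsearch/es02
# cluster.routing.allocation.same_shard.host: true
```

The point of the layout is that no data path appears in more than one file, so the instances never compete for the same directory. The same_shard.host setting is worth considering whenever several nodes share one machine, since otherwise losing that machine could take out both copies of a shard.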


Hi Ryan, thanks for the reply and info. I will allocate separate data directories for each instance.

As for the config directory, I am trying to "eliminate" the need for a separate elasticsearch.yml per instance by having each instance start up with environment variables and command-line settings.

E.g., using systemd:
Environment="ES_JAVA_OPTS=-Xms1g -Xmx1g"
ExecStart=/opt/elasticsearch/bin/elasticsearch \
    -Ecluster.name=production \
    -Ecluster.initial_master_nodes=es01,es02,es03 \
    -Ediscovery.seed_hosts=192.168.100.1,192.168.100.2,192.168.100.3 \
    -Enetwork.host=192.168.100.1 \
    -Ehttp.port=9200 \
    -Enode.name=es01 \
    -Epath.logs=/var/log/elasticsearch/es01 \
    -Epath.data=/data/es01

It seems that all the required settings can be placed in the service file, which would "eliminate" the need for elasticsearch.yml.

Is there any caveat in doing this?
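
For reference, the per-instance service file described above could be generalized with a systemd template unit, where %i expands to the instance name given after the @ sign. This is only a sketch under assumptions: the unit file name, the elasticsearch user and group, and the restart policy are hypothetical and not taken from this thread, and the remaining -E settings (discovery, network host, ports) would be added in the same way.

```ini
# /etc/systemd/system/elasticsearch@.service  (hypothetical file name)
[Unit]
Description=Elasticsearch instance %i
After=network-online.target
Wants=network-online.target

[Service]
# Assumed service account; adjust to the account that owns the data paths.
User=elasticsearch
Group=elasticsearch
Environment="ES_JAVA_OPTS=-Xms1g -Xmx1g"
# %i is replaced by the instance name, e.g. es01 for elasticsearch@es01.
ExecStart=/opt/elasticsearch/bin/elasticsearch \
    -Ecluster.name=production \
    -Enode.name=%i \
    -Epath.logs=/var/log/elasticsearch/%i \
    -Epath.data=/data/%i
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

An instance would then be started with, for example, systemctl start elasticsearch@es01. Note that even with every setting passed via -E, Elasticsearch still reads jvm.options and log4j2.properties from its config directory, so a config directory cannot be dropped entirely.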
