I'm planning to set up a high-availability cluster with 2 physical host machines, each hosting a number of VMs / Elasticsearch nodes.
How many nodes do I need on each host machine so that I can safely restart one host machine and keep everything operational? Data integrity is crucial; I can't lose any data.
Would this work: 2 nodes on each host machine?
Host A: node 1, node 2
Host B: node 3, node 4
Or is this no different from having 1 node per host?
Yes, assuming that you have at least one replica configured for all indices. With two nodes per host you should also enable shard allocation awareness, so that a primary shard and its replica are never allocated to the same physical host; otherwise restarting a host could take both copies of a shard offline at once.
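As a minimal sketch, assuming a recent elasticsearch-py (8.x) client, a hypothetical local address, and a custom node attribute named `host_id` that you have already set in each node's elasticsearch.yml (e.g. `node.attr.host_id: host-a` on Host A and `host-b` on Host B), the awareness attribute itself can be enabled dynamically:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # hypothetical address

# Tell the allocator to spread copies of each shard across host_id values,
# so a primary and its replica never end up on the same physical host.
es.cluster.put_settings(
    persistent={"cluster.routing.allocation.awareness.attributes": "host_id"}
)
```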
So basically I just need to set up a cluster with 3 nodes on 3 separate hosts, with the default settings for primary and replica shards? And Elasticsearch will automatically manage the shards, etc.?
The cluster handles the load, and any request can be sent to any node. Exactly how load is distributed is, however, a large topic.
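As a minimal sketch, assuming elasticsearch-py 8.x and hypothetical node addresses, you would normally give the client more than one node URL so it can fail over if a node is down:

```python
from elasticsearch import Elasticsearch

# Hypothetical addresses: one node from each physical host, so the client can
# still reach the cluster while either machine is being restarted.
es = Elasticsearch([
    "http://node1.example.com:9200",
    "http://node3.example.com:9200",
])

# Whichever node receives this acts as the coordinating node for the request;
# it does not have to be the elected master.
resp = es.search(index="my-index", query={"match_all": {}})  # hypothetical index
print(resp["hits"]["total"]["value"])
```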
When running a request (index or search), does it have to be sent to the current master node, or is any node in the cluster fine? Does the client need to know the IPs of all the nodes?
Or will any node's IP work, and it's just good to know them all in case of connection failure?
For a 3-node cluster, a good suggested index setting is 3 primary shards with 1 replica, which can handle the failure of one host.
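As a minimal sketch, assuming elasticsearch-py 8.x and a hypothetical index name, those settings can be applied explicitly when creating the index:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://node1.example.com:9200")  # hypothetical address

# 3 primaries spread the data across the 3 nodes; 1 replica per primary gives
# each shard a second copy on a different node, so one node (or host) can be
# restarted without data loss.
es.indices.create(
    index="my-index",  # hypothetical name
    settings={
        "number_of_shards": 3,
        "number_of_replicas": 1,
    },
)
```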