I want to load balance across our Elasticsearch cluster. In NEST I use a SniffingConnectionPool, which should help our client choose an active node. This seems to work.
However, I'm having problems setting up my nodes: I have one node on each of my two servers. I want them both to be able to work independently of the other, and to make sure they both contain all the data and sync it between themselves, so that if one of the servers goes down the clients don't notice. I cannot get this to work as I want; in my current setup, when I shut down one of the servers, shards get lost and so does the data they contain.
It sounds like you are running low on disk space on your machines:
[2018-01-16T09:34:06,298][WARN ][o.e.c.r.a.DiskThresholdMonitor] [VM-SOADEV04] high disk watermark [90%] exceeded on [uXIJFTi7Q9KntaZuHDtiYQ][VM-SOADEV03][C:\ProgramData\Elastic\Elasticsearch\data\nodes\0] free: 2.8gb[7.1%], shards will be relocated away from this node
[2018-01-16T09:34:06,298][WARN ][o.e.c.r.a.DiskThresholdMonitor] [VM-SOADEV04] high disk watermark [90%] exceeded on [hKdY08xETjmgB-O_iuP5eQ][VM-SOADEV04][C:\ProgramData\Elastic\Elasticsearch\data\nodes\0] free: 3.7gb[9.4%], shards will be relocated away from this node
[2018-01-16T09:34:06,298][INFO ][o.e.c.r.a.DiskThresholdMonitor] [VM-SOADEV04] rerouting shards: [high disk watermark exceeded on one or more nodes]
Also, it's not recommended to run with only 2 nodes. You should add a 3rd one, even a small, master-only node. Then set discovery.zen.minimum_master_nodes to 2.
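A minimal elasticsearch.yml sketch for such a third, master-only node could look like the following (the cluster name and node name are placeholders; the host names are taken from your logs):

```yaml
# elasticsearch.yml for a small, master-only "tiebreaker" node
cluster.name: my-cluster              # placeholder: must match your existing cluster
node.name: tiebreaker-node            # placeholder name
node.master: true                     # eligible to be elected master
node.data: false                      # holds no shard data
node.ingest: false                    # runs no ingest pipelines
discovery.zen.ping.unicast.hosts: ["VM-SOADEV03", "VM-SOADEV04"]
discovery.zen.minimum_master_nodes: 2 # quorum of 3 master-eligible nodes
```

With this in place, all three nodes should carry `discovery.zen.minimum_master_nodes: 2`, not just the new one.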
[2018-01-16T09:34:06,251][WARN ][o.e.d.z.ElectMasterService] [VM-SOADEV04] value for setting "discovery.zen.minimum_master_nodes" is too low. This can result in data loss! Please set it to at least a quorum of master-eligible nodes (current value: [-1], total number of master-eligible nodes used for publishing in this round: [2])
We should just add 1 other node with the following settings then?
But that doesn't explain why we should add another node. What does the extra, master-only node add? Do we risk data loss with 'only' 2 combined master and data nodes?
If we add a third, master-only node, can it be on one of the existing servers? And if that server crashes, does the other node still have all the data?
I'm just being curious here and want to understand this fully.
In order to avoid split-brain scenarios and the resulting data loss, Elasticsearch requires a majority of master-eligible nodes to be available in order to elect a master node. With only 2 master-eligible nodes, the majority is 2 nodes, which means that you cannot elect a master if one of the nodes is missing.
Once you have 3 master-eligible nodes in the cluster, the size of the majority is still 2, which means you can lose one node and still be able to elect a master.
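The quorum arithmetic behind this can be sketched in a few lines (this is just the standard majority formula, not Elasticsearch code):

```python
def quorum(master_eligible_nodes: int) -> int:
    """Majority of n master-eligible nodes: floor(n / 2) + 1."""
    return master_eligible_nodes // 2 + 1

# With 2 master-eligible nodes the majority is 2, so losing one node
# leaves no quorum and no master can be elected.
print(quorum(2))  # 2

# With 3 master-eligible nodes the majority is still 2, so the cluster
# survives the loss of any single node.
print(quorum(3))  # 2
```

This is also why the warning in your logs asks you to set `discovery.zen.minimum_master_nodes` to at least a quorum of master-eligible nodes.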
Yes, I think that looks good. Your 2 data nodes will both hold a full copy of the data as long as you have 1 replica enabled, and you can lose any one node while still keeping a majority of master-eligible nodes available to elect a master.
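For completeness, the replica count is a per-index setting; a sketch of the update-settings request (the index name is a placeholder) would be:

```json
PUT /my-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
```

With 1 replica, each primary shard has a copy on the other data node, so either data node alone holds a full set of the data.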