Hello,
We recently upgraded our cluster from 6.8 to 7.4.2 and are now facing a major stability issue: the cluster randomly loses its master and is then unable to elect a new one.
Our current configuration is:
- 6 nodes on distinct hosts, all master-eligible
- Java heap size: 8 GB
- about 200 open indices and 300 closed ones
- relevant elasticsearch.yml settings (identical on all nodes apart from path.data and network.host):
bootstrap.memory_lock: true
network.host: x.x.x.x
http.port: 9200
discovery.seed_hosts:
- es01
- es02
- es03
cluster.initial_master_nodes:
- es01
- es02
- es03
xpack.security.enabled: false
node.ml: false
xpack.ml.enabled: false
cluster.publish.timeout: 90s
xpack.monitoring.enabled: false
Here are the logs from 2 different nodes, covering the same period. I can upload logs from the other machines if needed:
es03
es04
Occasionally the cluster forms again but loses its master shortly afterwards.
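To pin down when the master is lost and whether the nodes disagree about it, each node could be polled for the master it currently sees. The following is only a minimal sketch, not something taken from our setup: the function names `fetch_master` and `summarize` are made up for illustration, and it assumes plain HTTP on port 9200 without authentication (as in the settings above) and that hostnames such as es01 resolve from wherever the script runs.

```python
import json
import urllib.request
from typing import Dict, Optional


def fetch_master(host: str, port: int = 9200, timeout: float = 5.0) -> Optional[str]:
    """Ask one node which master it sees via the _cat/master API.

    Returns the master's node name, or None if the node is unreachable,
    times out, or answers with an error (e.g. no master discovered).
    """
    url = f"http://{host}:{port}/_cat/master?format=json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            rows = json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection errors, HTTP errors and socket timeouts;
        # ValueError covers a response body that is not valid JSON.
        return None
    return rows[0].get("node") if rows else None


def summarize(views: Dict[str, Optional[str]]) -> str:
    """Condense the per-node answers (host -> master name or None) into one line."""
    masters = sorted({m for m in views.values() if m is not None})
    if not masters:
        return "no master visible from any node"
    if len(masters) > 1:
        return "nodes disagree on master: " + ", ".join(masters)
    blind = sorted(h for h, m in views.items() if m is None)
    if blind:
        return f"master {masters[0]}, but not visible from: " + ", ".join(blind)
    return f"all nodes agree on master {masters[0]}"
```

Calling `summarize({h: fetch_master(h) for h in hosts})` every few seconds and printing the result with a timestamp would give a timeline that can be lined up against the node logs. Here `hosts` stands for the list of node hostnames, which has to be filled in by hand.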
We have tried many combinations of settings, which did not seem to make any difference.
Any help will be greatly appreciated.
Thanks