Thanks. We ran some tests and can confirm that we definitely ended up with a split brain (as you pointed out) with the current configuration; this is why the nodes were not able to rejoin the cluster.
We also tested your suggested configuration (concrete sketch below):
- two master nodes per rack instead of three
- discovery.zen.minimum_master_nodes: 3
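Concretely, on each of the four master-eligible nodes (two per rack) this boils down to something like the following, with the rack value differing per location (the full file is further below):

node.master: true
node.data: true
node.attr.rack: LOCATION1   # LOCATION2 on the nodes in the other rack
cluster.routing.allocation.awareness.attributes: rack
discovery.zen.minimum_master_nodes: 3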
This works fine as long as the network between the two locations (racks) is up.
But when the network between them is down, the cluster becomes completely unresponsive. For example, the cluster health request:
[root@~]# curl -i "http://localhost:9200/_cluster/health?pretty"
HTTP/1.1 503 Service Unavailable
content-type: application/json; charset=UTF-8
content-length: 228
{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}
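If I understand it correctly, _cluster/health is normally served via the elected master, so the 503 itself may be expected while no master is elected; presumably the local node can still be queried directly with the local flag (we have not verified this in the failure scenario):

[root@~]# curl -i "http://localhost:9200/_cluster/health?local=true&pretty"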
In elasticsearch.log we also see the error "not enough master nodes discovered during pinging (found [[]], but needed [3]), pinging again", which is consistent with the partition: each rack can see at most its own two master-eligible nodes, fewer than the required three, so neither side can elect a master.
Below is our elasticsearch.yml file:
# ======================== Elasticsearch Configuration =========================
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
# Please consult the documentation for further information on configuration options:
# ---------------------------------- Cluster -----------------------------------
# Use a descriptive name for your cluster:
# Rotterdam configuration to be used during upgrade: step 1
cluster.name: elasticsearch
# ------------------------------------ Node ------------------------------------
# Use a descriptive name for the node:
node.name: aapps292
# Add custom attributes to the node:
#node.attr.rack: r1
node.attr.rack: LOCATION1
cluster.routing.allocation.awareness.attributes: rack
# Only one node per datacenter must be a pure data node
node.master: true
node.data: true
# ----------------------------------- Paths ------------------------------------
# Path to directory where to store the data (separate multiple locations by comma):
#path.data: /path/to/data
# Path to log files:
#path.logs: /path/to/logs
# ----------------------------------- Memory -----------------------------------
# Lock the memory on startup:
bootstrap.memory_lock: true
bootstrap.system_call_filter: false
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
# Elasticsearch performs poorly when the system is swapping the memory.
# ---------------------------------- Network -----------------------------------
# Set the bind address to a specific IP (IPv4 or IPv6):
network.host: 0.0.0.0
#network.host: 10.133.76.32
transport.host: 10.133.76.32
# Set a custom port for HTTP:
http.port: 9200
# For more information, consult the network module documentation.
# --------------------------------- Discovery ----------------------------------
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
discovery.zen.ping.unicast.hosts: ["ELKNODE01", "ELKNODE02", "ELKNODE03", "ELKNODE04", "ELKNODE05", "ELKNODE06", "ELKNODE07", "ELKNODE08"]
#discovery.zen.ping.unicast.hosts: ["ELKNODE05", "ELKNODE06", "ELKNODE07", "ELKNODE08"]
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
discovery.zen.minimum_master_nodes: 3
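# In our case: 4 master-eligible nodes in total, so 4 / 2 + 1 = 3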
discovery.zen.fd.ping_timeout: 30s
# For more information, consult the zen discovery module documentation.
# ---------------------------------- Gateway -----------------------------------
# Block initial recovery after a full cluster restart until N nodes are started:
#gateway.recover_after_nodes: 3
gateway.expected_nodes: 3
# For more information, consult the gateway module documentation.
# ---------------------------------- Various -----------------------------------
# Require explicit names when deleting indices:
#action.destructive_requires_name: true
script.inline: true
script.stored: true
script.file: true
action.destructive_requires_name: true
thread_pool.bulk.queue_size: 300
thread_pool.search.size: 8
thread_pool.search.queue_size: 3000
indices.recovery.max_bytes_per_sec: 100mb
cluster.routing.allocation.node_concurrent_recoveries: 10
Based on your previous response, I thought that read operations should keep working in case of a location failover, but in our case they do not. Maybe I am missing something?
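For what it's worth, we did not set discovery.zen.no_master_block explicitly, so if I read the docs correctly it is at its default (write), which should reject only writes and keep serving reads from the last known cluster state. A minimal example of the kind of read we mean (the index name myindex is just a placeholder):

[root@~]# curl -i "http://localhost:9200/myindex/_search?size=1"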
Could you please help?
Thanks a lot
M