Hi there
I am running a recreational cluster at home for firewall logs and some nginx traffic logs. Recently I added two nodes to my setup, so there are now two data nodes and one dedicated (non-data) master node.
Today I tried to restart the Elasticsearch service on one of the data nodes, and suddenly Kibana reported both data nodes as down. Only the master was reported as up.
/_cat/shards
shows a bunch of unassigned shards after the reallocation stops, and restarting the node again resulted in even more. From the looks of it, all primary shards have been allocated, but the replicas have not.
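In case it helps, the unassigned shards and their reasons can be listed with the cat shards API (something like this; the unassigned.reason column is standard in 7.x):

```
GET /_cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state
```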
Checking one of the indices with unassigned shards confirms that replication is enabled:
{
  "fortigate-2019.11.19" : {
    "settings" : {
      "index" : {
        "creation_date" : "1574121601277",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "_lUL_6mjQYKGtP_LJ5K3ww",
        "version" : {
          "created" : "6080299",
          "upgraded" : "7050099"
        },
        "provided_name" : "fortigate-2019.11.19"
      }
    }
  }
}
All nodes seem to be detected by each of the members; cluster health reports:
{
  "cluster_name" : "siem",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 2182,
  "active_shards" : 3398,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 965,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 77.88219115287646
}
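For any one replica that stays unassigned, the allocation explain API should say exactly why the cluster refuses to place it. A sketch, using shard 0 of the fortigate index above as an example:

```
GET /_cluster/allocation/explain
{
  "index": "fortigate-2019.11.19",
  "shard": 0,
  "primary": false
}
```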
Cluster config from each node (grep -e '^[^#]' /etc/elasticsearch/elasticsearch.yml):
cluster.name: siem
node.name: siem-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.70.150
discovery.seed_hosts:
- 192.168.70.150
- 192.168.70.161
cluster.max_shards_per_node: 4000

cluster.name: siem
node.name: siem-2
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.70.161
discovery.seed_hosts:
- 192.168.70.150
- 192.168.70.161
cluster.max_shards_per_node: 4000

cluster.name: siem
node.name: siem-master
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.70.162
discovery.seed_hosts:
- 192.168.70.150
- 192.168.70.161
- 192.168.70.162
node.master: true
node.voting_only: false
node.data: false
node.ingest: false
node.ml: false
xpack.ml.enabled: true
cluster.remote.connect: false
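To rule out a role mix-up, the roles each node actually runs with can be checked with the cat nodes API (node.role shows the role letters, and master marks the elected master with a *):

```
GET /_cat/nodes?v&h=ip,name,node.role,master
```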
The cluster log shows a bunch of these messages:
Caused by: org.elasticsearch.action.UnavailableShardsException: [.monitoring-es-7-2019.12.20][0] primary shard is not active Timeout: [1m]
I can probably fix this by running one of the reallocation scripts, but I'd rather understand why this happened, if anyone is up for explaining.
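(For context, by "allocation scripts" I mean something that essentially boils down to retrying the failed allocations, roughly:

```
POST /_cluster/reroute?retry_failed=true
```

but I'd like to know the root cause before papering over it.)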
Kind regards,
Patrik