The ES cluster consists of 4 nodes (3 master/data nodes, 1 data-only node) running ES 6.4.2. Disk usage is at 34, 65, 28 and 29 percent.
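(Per-node disk usage like this can be read from the _cat allocation API, e.g. GET _cat/allocation?v&h=node,disk.percent.)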
Kibana runs on one of the master nodes; I access it via SSH.
When creating an index pattern, Kibana freezes at 'Creating index pattern...'.
The ES logs show:
[2019-05-15T10:17:15,542][WARN ][r.suppressed ] path: /_template/kibana_index_template%3A.kibana, params: {name=kibana_index_template:.kibana}
org.elasticsearch.transport.RemoteTransportException: [data_2][IP:PORT][indices:admin/template/put]
Caused by: org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (create-index-template [kibana_index_template:.kibana], cause [api]) within 30s
at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:125) ~[elasticsearch-6.4.2.jar:6.4.2]
at java.util.ArrayList.forEach(ArrayList.java:1257) ~[?:1.8.0_181]
at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:124) ~[elasticsearch-6.4.2.jar:6.4.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) ~[elasticsearch-6.4.2.jar:6.4.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
What could be the issue here? 30s should be enough to create an index pattern.
Are you seeing any errors in the Kibana server output?
What is the state of the ES cluster? Can I get the output of GET /_cluster/settings and GET _cluster/health from ES?
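Since the error is a timeout while processing a cluster-state update on the master, it may also be worth checking whether cluster-state tasks are queueing up while the index pattern creation hangs, e.g.:

GET _cluster/pending_tasks

A long pending-task queue would explain the 30s timeout.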
No, the log shows no errors.
The cluster state is yellow.
GET _cluster/settings:
{
  "persistent": {
    "xpack": {
      "monitoring": {
        "collection": {
          "enabled": "true"
        }
      }
    }
  },
  "transient": {}
}
GET _cluster/health:
{
  "cluster_name": "cluster",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 4,
  "number_of_data_nodes": 4,
  "active_primary_shards": 165,
  "active_shards": 273,
  "relocating_shards": 0,
  "initializing_shards": 4,
  "unassigned_shards": 55,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 82.2289156626506
}
The unassigned shards occurred after a cluster restart. Could the ES setting
discovery.zen.minimum_master_nodes: 2
have led to issues while rebooting the cluster (3 master/data + 1 data)?
That is probably the cause of the error. Do you have two master-eligible nodes currently in the cluster? What is the output of GET /_nodes?
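For reference, the usual quorum formula for that setting is

discovery.zen.minimum_master_nodes = (master_eligible_nodes / 2) + 1 = (3 / 2) + 1 = 2

so with three master-eligible nodes the value 2 is correct in itself. It only becomes a problem during a restart if fewer than two master-eligible nodes are up at the same time, because then no master can be elected.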
GET _cat/nodes?v says:
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
MASTER_IP_1 51 99 34 7.21 7.32 7.46 mdi - master_1
DATA_IP_1 71 99 13 1.18 1.53 1.62 di - data_3
MASTER_IP_2 30 99 36 6.25 6.35 6.78 mdi - data_1
MASTER_IP_3 42 91 2 1.15 0.94 1.07 mdi * data_2
Note: data_1 and data_2 are just named that way; they are also master-eligible nodes.
Are there any best practices for an ES cluster reboot?
Should I remove and re-add one node at a time?
EDIT:
GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason shows me
index shard prirep state unassigned.reason
<index-name> 0 r UNASSIGNED CLUSTER_RECOVERED
...
for every unassigned shard.
Looks like you have three master-eligible nodes (the three with node.role mdi; the * marks the elected master). You probably need to look at GET _cat/shards to see why the shards are unassigned, and reassign them to get the cluster back to a green state; a couple of sketches below may help.
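To diagnose the CLUSTER_RECOVERED shards, two commands that may help (<index-name> is your placeholder; fill in a real index):

GET _cluster/allocation/explain
{
  "index": "<index-name>",
  "shard": 0,
  "primary": false
}

POST /_cluster/reroute?retry_failed=true

The explain API tells you why a particular shard is unassigned; retry_failed only helps if earlier allocation attempts actually failed, so check the explain output first.

As for reboot best practices, the rolling-restart procedure documented for 6.x is roughly this sketch, one node at a time:

# 1. Disable replica allocation so shards are not rebuilt while the node is down
#    (persistent so it survives a full cluster restart)
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}

# 2. Synced flush to speed up shard recovery afterwards
POST _flush/synced

# 3. Restart the node and wait for it to rejoin the cluster

# 4. Re-enable allocation and wait for green before moving to the next node
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  }
}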
That was it. Once I brought the cluster back to a green state, everything worked like a charm. Thanks again!