I've been trying to figure out why an Elasticsearch install I've inherited is running poorly.
My initial suspects were hardware and cluster sizing.
There is a lot I don't know about the install: I don't know how much data is being sent to the cluster, what is retained, or what our query types are.
I now have X-Pack installed and working, but it has highlighted a major performance issue: Kibana constantly shows its status page with a status of Red, and because the status page always loads I can't get to any of the other apps!
I started to dig further into the timeouts and my cluster health showed:
{
"cluster_name" : "elasticsearch",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 2224,
"active_shards" : 2224,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 2223,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 50.0112435349674
}
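As a sanity check on those numbers, here is a minimal sketch that recomputes the active percentage from the shard counts in the output above. The formula (active divided by active plus initializing plus unassigned) is my assumption about how the figure is derived, not something I have confirmed in the docs, but it does land on the value the cluster reported.

```shell
# Shard counts copied from the cluster health output above.
active=2224
initializing=0
unassigned=2223

# Assumed formula: total shard copies = active + initializing + unassigned.
total=$((active + initializing + unassigned))

# Share of shard copies that are active (awk does the floating point).
pct=$(awk -v a="$active" -v t="$total" 'BEGIN { printf "%.4f", 100 * a / t }')

echo "total shard copies: $total"
echo "active percent:     $pct"
```

So roughly half of the shard copies the cluster expects are not allocated anywhere.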
The number of unassigned shards was a worry, so I listed them:
curl -u elastic -XGET '172.20.3.247:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED
utm-2017.08.11 4 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.11 1 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.11 2 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.11 3 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.11 0 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.06.16 4 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.06.16 1 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.06.16 2 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.06.16 3 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.06.16 0 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.09.08 4 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.09.08 1 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.09.08 2 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.09.08 3 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.09.08 0 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.14 4 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.14 1 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.14 2 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.14 3 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.14 0 r UNASSIGNED CLUSTER_RECOVERED
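To avoid reading thousands of rows by eye, the listing can be summarised by shard type and reason. This is only a rough sketch: `tally_unassigned` is a helper name I made up, it assumes the same five columns as the command above, and the sample input is just a handful of the rows pasted here rather than the full output.

```shell
# Summarise _cat/shards rows read from stdin.
# Assumed columns: index shard prirep state unassigned.reason
tally_unassigned() {
  awk '$4 == "UNASSIGNED" {
         kind = ($3 == "p") ? "primary" : "replica"
         count[kind " " $5]++
       }
       END { for (k in count) print count[k], k }' | sort -rn
}

# A few rows copied from the output above, used as sample input.
sample='utm-2017.08.11 4 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.11 1 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.06.16 4 r UNASSIGNED CLUSTER_RECOVERED
winlogbeat-2017.09.08 0 r UNASSIGNED CLUSTER_RECOVERED
utm-2017.08.14 3 r UNASSIGNED CLUSTER_RECOVERED'

summary=$(printf '%s\n' "$sample" | tally_unassigned)
echo "$summary"
```

Against the live cluster, the curl command above (without the grep) could be piped straight into `tally_unassigned` instead of the sample. Every row in the excerpt I pasted is a replica (`r`) with reason CLUSTER_RECOVERED; I have not checked whether that holds for all 2223.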
I'm stuck on where to even start with sizing this cluster, and I don't know what to do next to make the install usable.
The node is running in AWS on an m4.xlarge.
I can throw more resources at it in the short term, but I want to get it into a better state long term.