Elasticsearch - Active shards


I have a 4-node cluster that I've set up for ELK. It usually handles around 5-10K events per second, so it gets a good amount of traffic from all our hosts. We use Curator to keep only the last 10 days of indices.

Each of our hosts has 32 GB of RAM and runs ES + Logstash + Kibana. The disks are SSDs, and for the most part the cluster had been doing pretty well until recently.

We noticed that ES would start timing out on searches or respond very slowly. While looking at our graphs, we noticed something odd:

{
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards" : 2325,
  "unassigned_shards" : 0,
  "number_of_data_nodes" : 4,
  "status" : "green",
  "cluster_name" : "logger",
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "relocating_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "active_shards_percent_as_number" : 100,
  "timed_out" : false,
  "initializing_shards" : 0,
  "number_of_nodes" : 4,
  "active_primary_shards" : 1161
}
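One thing I did notice about the numbers: if active_shards counted every allocated shard copy (primaries plus replicas) rather than only shards with indexing activity, then with 1 replica the total should be roughly twice the primaries, which is about what we see. A quick sketch using the values pasted above:

```python
# Values pasted from the cluster health output above.
health = {
    "active_primary_shards": 1161,
    "active_shards": 2325,
}

# If active_shards counts every allocated shard copy (primaries + replicas),
# then with 1 replica the ratio should be about 2.
ratio = health["active_shards"] / health["active_primary_shards"]
print(round(ratio, 3))  # -> 2.003
```

The small excess over 2.0 would just mean a few indices have different settings than the rest.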

It is my understanding that active shards are only shards that are receiving indexing activity. Am I right? We keep hourly indices, meaning each hour we generate a new index (so that with Curator we can keep only a certain number of hours). Is there something I'm missing here related to active shards? I'm wondering whether all those active shards are really in use: the indices rarely get searched after the first 1-2 hours, and since we index by the hour, there is no way an index created two hours ago would still be getting index/bulk/delete operations.

Before this troubleshooting we used 1 replica and 3 shards per index; we have now moved to 1 replica and 2 shards.

Am I missing something related to active shards?


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.