Ignore lost shards and change cluster health from RED to GREEN

no_jihun · October 1, 2015, 12:46am

Hello.

Recently some shards disappeared and cannot be recovered (4 shards lost out of 6 months of Logstash indices, each with 5 shards + 1 replica).
I changed the replica count from 1 to 0 because of a storage shortage, but I don't know why this removed both the primary and the replica of some shards.
Because of this the cluster's health is RED, and a rolling restart of the cluster doesn't help.

There is nothing more I can do, so I have decided to accept this as normal even though some data is lost.
How can I make the cluster health green in this situation?

Some shards are nowhere in the cluster.

/_cluster/health
{
  "status": "red",
  "timed_out": false,
  "number_of_nodes": 5,
  "number_of_data_nodes": 5,
  "active_primary_shards": 1400,
  "active_shards": 2800,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 8,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0
}
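
To see which indices are holding the cluster at red, the same health endpoint can be asked for per-index detail with `level=indices`. Below is a minimal sketch, assuming the response has already been fetched and parsed into a dict. The function name `red_indices` is hypothetical, and the index names and counts in the sample are invented for illustration; they are not taken from this cluster.

```python
def red_indices(health):
    """Return the sorted names of indices whose status is red.

    `health` is the parsed JSON body of GET /_cluster/health?level=indices,
    which carries an "indices" object keyed by index name.
    """
    indices = health.get("indices", {})
    return sorted(
        name for name, info in indices.items()
        if info.get("status") == "red"
    )


# Hypothetical sample response (index names and numbers are invented).
sample = {
    "status": "red",
    "indices": {
        "logstash-2015.03.01": {"status": "green", "unassigned_shards": 0},
        "logstash-2015.03.02": {"status": "red", "unassigned_shards": 2},
    },
}

print(red_indices(sample))
```

The indices it reports are the ones that would have to be restored, closed, or deleted before the cluster can leave red.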

dadoonet · October 1, 2015, 5:30am

So you have 5 nodes and 2800 shards. That means 560 shards per node.
Probably 280 days of data so far.

That's a lot IMO.
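
The arithmetic behind those two numbers can be written out as a short sketch. It assumes the shards are spread evenly over the nodes and that every daily index has 5 primaries plus 1 replica of each, as described in the question; the variable names are only illustrative.

```python
nodes = 5
active_shards = 2800      # "active_shards" from the /_cluster/health output
shards_per_index = 5      # primaries per daily index
copies = 2                # each primary plus its 1 replica

# Even spread of all shard copies over the data nodes.
shards_per_node = active_shards // nodes

# Each daily index accounts for shards_per_index * copies shard copies.
daily_indices = active_shards // (shards_per_index * copies)

print(shards_per_node, daily_indices)
```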

I'm unsure if you really need to have 5 shards per daily index. I would most likely reduce that in the logstash template to 1 shard, 1 replica. Also, if you don't use all indices, you can close them.

To answer your question, if you don't really care about the incomplete indices, you can always delete them.
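
The three suggestions above (fewer shards in the template, closing unused indices, deleting incomplete ones) map onto three REST calls. The sketch below only builds the requests and does not send them. It assumes the index template format of Elasticsearch 1.x/2.x, where the pattern lives under the `template` key; the template name `logstash_single_shard` and the helper names are hypothetical.

```python
import json


def template_request(name="logstash_single_shard", pattern="logstash-*"):
    """Build a request for a template with 1 shard and 1 replica.

    A template only affects indices created after it is stored;
    existing daily indices keep their current shard count.
    """
    body = {
        "template": pattern,  # ES 1.x/2.x key; assumption for this thread's era
        "settings": {"number_of_shards": 1, "number_of_replicas": 1},
    }
    return ("PUT", "/_template/" + name, json.dumps(body))


def close_request(index):
    """Build a request that closes an index that is not being queried."""
    return ("POST", "/" + index + "/_close", None)


def delete_request(index):
    """Build a request that deletes an index, e.g. one with lost shards."""
    return ("DELETE", "/" + index, None)


# Hypothetical index name, for illustration only.
for method, path, body in (
    template_request(),
    close_request("logstash-2015.03.01"),
    delete_request("logstash-2015.03.02"),
):
    print(method, path, body or "")
```

Deleting is irreversible, so it is worth checking the index name against the list of red indices before sending the `DELETE`.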

no_jihun · October 1, 2015, 5:56am

@dadoonet thanks for your advice.
I understand you to mean that in the usual case 1 shard + 1 replica is good for Logstash indices.
I've read this post and it says the same: https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index

My further questions are:

If one day's shard grows as large as 10GB~20GB, will it be OK when most queries are made by Kibana (aggregations, filters, searches)?
Is there no way to keep the 4 remaining shards (1 lost) open and make the status GREEN without closing that index?

magnusbaeck · October 1, 2015, 6:14am

That depends on the performance of your nodes, the mappings, and the query patterns, so I don't think it's possible to give a straight answer.
A green cluster requires all shards to be present.
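
The rule behind that statement can be sketched as follows: the status is red when any primary shard is not active, yellow when all primaries are active but some replica is not, and green only when every shard copy is active. This is a simplified illustration, not Elasticsearch's own code; the `(is_primary, is_active)` tuple representation and the function name are assumptions.

```python
def cluster_status(shards):
    """Derive a health colour from (is_primary, is_active) tuples."""
    status = "green"
    for is_primary, is_active in shards:
        if is_active:
            continue
        if is_primary:
            # A missing primary means part of the data is unavailable.
            return "red"
        # A missing replica only reduces redundancy.
        status = "yellow"
    return status


# One index with 5 primaries and no replicas, where 1 primary is lost.
lost_one_primary = [(True, True)] * 4 + [(True, False)]
print(cluster_status(lost_one_primary))
```

Under this rule an open index with a lost primary keeps the cluster red no matter how many other shards are healthy.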

no_jihun · October 1, 2015, 9:29am

Thanks @magnusbaeck.
By the way, could you please tell me more about

That depends on the performance of your nodes, the mappings, and the query patterns

I have seen many articles which generally say:

A little overallocation is good for scalability.
Shards consume resources, so too many shards cause performance/resource issues.

In my situation:

There are no single-document get requests.
You can see my mapping in this thread:
How to add not analyzed field with dynamic tempate? - #3 by no_jihun
The cluster is under pressure on both CPU and RAM.
There are about 10 full GCs per day per node, each taking 10~20 secs.
Almost all of the queries are aggregations made by Kibana dashboards over date ranges of 1 day ~ 30 days.

Every node's hardware is:

22GB RAM (for ES), 24 vCores.

It would be great if you could tell me more or point me to a URL, something like Trifork Blog - Keep updated on the technical solutions Trifork is working on!
But that post only shows the upside of a lower shard count, and it was tested with facets, not aggs.

Thanks!
