Ignore lost shards and change cluster health from RED to GREEN

(no jihun) #1


Recently some shards disappeared and cannot be recovered (4 shards lost out of 6 months of logstash indices, each with 5 shards + 1 replica).
I changed the replica count from 1 to 0 because of a storage shortage, but I don't know why this removed both the primary and the replica of some shards.
Because of this the cluster's health is RED, and a rolling restart of the cluster doesn't help.

There is nothing more I can do, so I have decided to accept this state as normal even though some data is lost.
How can I make the cluster health green in this situation?

Some shards are nowhere in the cluster.

"status": "red",
"timed_out": false,
"number_of_nodes": 5,
"number_of_data_nodes": 5,
"active_primary_shards": 1400,
"active_shards": 2800,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 8,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0
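
To find out which shards are missing, the output of `GET _cat/shards` can be filtered for the UNASSIGNED state. A minimal Python sketch, assuming the default column order (index, shard, prirep, state, ...); the function name `unassigned_shards` and the sample rows are hypothetical and are not taken from the cluster in this thread:

```python
def unassigned_shards(cat_shards_text):
    """Return (index, shard, 'p' or 'r') for every UNASSIGNED row."""
    missing = []
    for line in cat_shards_text.splitlines():
        cols = line.split()
        # Assumed default columns: index shard prirep state ...
        if len(cols) >= 4 and cols[3] == "UNASSIGNED":
            missing.append((cols[0], int(cols[1]), cols[2]))
    return missing


# Hypothetical sample rows, for illustration only.
sample = """\
logstash-2015.06.01 0 p STARTED 1200 1.1gb 10.0.0.1 node-1
logstash-2015.06.01 1 p UNASSIGNED
logstash-2015.06.02 3 p UNASSIGNED
"""

for index, shard, kind in unassigned_shards(sample):
    print(index, shard, kind)
```

A `p` in the third column marks a primary shard; an index with an unassigned primary is what keeps the cluster RED.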

(David Pilato) #2

So you have 5 nodes and 2800 shards. That means 560 shards per node.
Probably 280 days of data so far.

That's a lot IMO.
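
The arithmetic behind these figures can be sketched as follows. The inputs come from the health output and the "5 shards + 1 replica" layout described in the question; the helper name `shards_per_node` is hypothetical:

```python
def shards_per_node(daily_indices, primaries, replicas, nodes):
    """Average number of shard copies each node holds."""
    total = daily_indices * primaries * (1 + replicas)
    return total // nodes


nodes = 5
total_shards = 2800                  # active_shards in the health output
copies_per_index = 5 * (1 + 1)       # 5 primaries, each with 1 replica

daily_indices = total_shards // copies_per_index
print(daily_indices)                               # 280
print(shards_per_node(daily_indices, 5, 1, nodes))  # 560
```

With the same helper, other layouts can be compared before changing anything on the cluster.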

I'm unsure if you really need to have 5 shards per daily index. I would most likely reduce that in the logstash template to 1 shard, 1 replica. Also, if you don't use all indices, you can close them.
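
As a rough illustration of the template change, the Python sketch below only builds the JSON body that would be sent to the index template API. It assumes the legacy `PUT _template/<name>` endpoint, where the pattern key is `template` in older releases and `index_patterns` in newer ones; the function name and the `logstash-*` pattern are assumptions, so check them against your own version and index naming:

```python
import json


def logstash_template_body(pattern="logstash-*", shards=1, replicas=1):
    """Build a template body with the suggested shard and replica counts."""
    return {
        # Assumption: older releases expect "template" here,
        # newer ones expect "index_patterns" instead.
        "template": pattern,
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
    }


print(json.dumps(logstash_template_body(), indent=2))
```

Existing indices keep their shard count; only indices created after the template is stored pick up the new settings.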

To answer your question, if you don't really care about the incomplete indices, you can always delete them.
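
One way to find the incomplete indices is to ask for cluster health with `level=indices` and pick the ones reported as red. A minimal Python sketch working on the parsed JSON response; the function name and the sample fragment are hypothetical, and the loop only prints the delete requests instead of sending them:

```python
def red_indices(health):
    """Names of indices whose status is red in a level=indices response."""
    return sorted(
        name
        for name, info in health.get("indices", {}).items()
        if info.get("status") == "red"
    )


# Hypothetical response fragment, for illustration only.
sample_health = {
    "status": "red",
    "indices": {
        "logstash-2015.06.01": {"status": "red", "unassigned_shards": 1},
        "logstash-2015.06.02": {"status": "green", "unassigned_shards": 0},
    },
}

for name in red_indices(sample_health):
    # Printed for review only; deleting an index is irreversible.
    print("DELETE /" + name)
```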

(no jihun) #3

@dadoonet thanks for your advice.
I understand you to mean that in the usual case 1 shard + 1 replica is good for logstash indices.
I've read this post and it says the same: https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index

My further questions are:

  1. If one day's shard grows to 10GB~20GB, will that be OK when most queries come from Kibana (aggregations, filters, searches)?
  2. Is there no way to keep the index with 4 remaining shards (1 lost) open and still get GREEN status, without closing that index?

(Magnus Bäck) #4
  1. That depends on the performance of your nodes, the mappings, and the query patterns so I don't think it's possible to give a straight answer.
  2. A green cluster requires all shards to be present.
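
The rule behind the second answer can be written down as a small Python sketch. It is a simplified model that ignores initializing shards and closed indices, and the function name and example counts are hypothetical:

```python
def cluster_status(unassigned_primaries, unassigned_replicas):
    """Simplified health rule: any missing primary means red."""
    if unassigned_primaries > 0:
        return "red"
    if unassigned_replicas > 0:
        return "yellow"
    return "green"


print(cluster_status(4, 0))
print(cluster_status(0, 3))
print(cluster_status(0, 0))
```

Under this rule, an open index with a lost primary keeps the cluster red no matter how many of its other shards are healthy.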

(no jihun) #5

Thanks @magnusbaeck.
By the way, could you please tell me more about this:

"That depends on the performance of your nodes, the mappings, and the query patterns"

I have seen many articles which generally say:

  • a little overallocation is good for scalability
  • shards consume resources, so too many shards cause performance and resource issues.

In my situation:

  • there are no single-document GET requests.
  • you can see my mapping in this thread:
    How to add not analyzed field with dynamic tempate?
  • the cluster is under both CPU and RAM pressure.
  • each node sees about 10 full GCs per day, each taking 10~20 seconds.
  • almost all queries are aggregations issued by Kibana dashboards over date ranges of 1 to 30 days.

Every node's hardware is:

  • 22GB RAM (allocated to ES), 24 virtual cores.

It would be great if you could explain, or point me to a URL similar to http://blog.trifork.com/2014/01/07/elasticsearch-how-many-shards/
but that post only shows the upside of a lower shard count, and it was tested with facets, not aggs.

