Deleting too many indices at once turns the cluster red

I work on an on-premises ES 5.6 cluster, and we are working on switching to ES 7.x.
Before that, we aim to fix our over-sharding problem: we generate far too many indices (one per day when one per month would be enough), each with 5 shards (the default value).
So we have been reindexing the bad indices into good ones and then deleting the bad ones.
Every time, a delete on my-index-...-* (about 400 indices behind this pattern) turns the cluster yellow (the good case) or even red (the very bad case).
I can imagine what is going on (too heavy a rebalance?), but is this behaviour avoidable?
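
For reference, each conversion looks roughly like this (a sketch only: the host and index names are illustrative, and the real delete pattern matches about 400 daily indices):

# 1) Reindex one month of daily indices into a single monthly index.
curl -s -XPOST "http://localhost:9200/_reindex" -H 'Content-Type: application/json' -d '
{
  "source": { "index": "my-index-2020-07-*" },
  "dest":   { "index": "my-index-2020-07" }
}'
# 2) Delete the daily indices; this is the step that turns the cluster yellow or red.
curl -i -XDELETE "http://localhost:9200/my-index-2020-07-*"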

It's hard to say without more info.
What do your Elasticsearch logs show when this happens?

This sounds like one of the side effects of having let the shard count get out of hand. I assume you have already fixed the sharding on the input side so you are now only generating monthly indices and are working through converting older indices. Is this correct?
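
For the input side, fixing the sharding usually means an index template along these lines; this is only a sketch, and the template name and pattern are assumptions:

# ES 5.x index template capping matching new indices at one primary shard (illustrative).
curl -s -XPUT "http://localhost:9200/_template/monthly-my-index" -H 'Content-Type: application/json' -d '
{
  "template": "my-index-*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'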

If you are moving from ES 5.x to ES 7.x, you will either have to reindex from remote or reindex in place while upgrading via ES 6.8.x. If you go the reindex-from-remote route and can spin up a second cluster, it might be worth exploring whether you can start writing new data to both clusters in parallel and, at the same time, reindex the daily indices into monthly ones from remote. That way you never need to delete old indices in the ES 5.x cluster, which will reduce the amount of reallocation and cluster state updates.
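
A remote reindex run against the new cluster would look roughly like this (a sketch: hosts and index names are assumptions, the old cluster's address must be allowed in reindex.remote.whitelist on the new nodes, and you would repeat this per daily index or use a wildcard if your version accepts one for remote sources):

# Pull one daily index from the old 5.x cluster into a monthly index on the new cluster.
curl -s -XPOST "http://new-cluster:9200/_reindex" -H 'Content-Type: application/json' -d '
{
  "source": {
    "remote": { "host": "http://old-cluster:9200" },
    "index": "my-index-2020-07-01"
  },
  "dest": { "index": "my-index-2020-07" }
}'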

I assume you have already fixed the sharding on the input side so you are now only generating monthly indices and are working through converting older indices. Is this correct?

This is correct, and thanks for your advice.

I will ask the ops team for more logs and get back to you.

11:25: we ran curl -i -XDELETE ...

Many lines like:

[2021-02-01T11:25:29,472][INFO ][o.e.c.m.MetaDataDeleteIndexService] [nodexxxx_master-adm_90] [my-index-2020-07-20/QGPFxEJfRjqlS_xkarBCeg] deleting index

and then:

[2021-02-01T11:26:00,916][WARN ][o.e.d.z.PublishClusterStateAction] [nodexxxx_master-adm_90] timed out waiting for all nodes to process published state [36397]

Many lines like:

[2021-02-01T11:31:31,860][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [nodexxxx-adm_90] failed to execute on node [u_ll-MJ1SA-aVkXkmscMCA]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [nodexxxx_data_03][0.0.0.0:9303][cluster:monitor/nodes/stats[n]] request_id [559821693] timed out after [15023ms]

[2021-02-01T11:31:38,602][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [nodexxxx_master-adm_90] collector [index-recovery] timed out when collecting data

Many lines like:

[2021-02-01T11:32:24,167][WARN ][o.e.t.TransportService ] [nodexxxx_master-adm_90] Received response for a request that has timed out, sent [96919ms] ago, timed out [66917ms] ago, action [internal:discovery/zen/fd/ping], node [{nodexxxx_data_03}{u_ll-MJ1SA-aVkXkmscMCA}{O0gG1f-RR_ex9U3HpotRGA}{0.0.0.0}{0.0.0.0:9303}{rack_id=nodexxxx}], id [559820848]

[2021-02-01T11:32:26,476][WARN ][o.e.g.GatewayAllocator$InternalReplicaShardAllocator] [nodexxxx_master-adm_90] [my-index-2020-11-08][2]: failed to list shard for shard_store on node [rCc5h51dRcGKUO9vD7tP0g]
org.elasticsearch.action.FailedNodeException: Failed node [rCc5h51dRcGKUO9vD7tP0g] at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.onFailure(TransportNodesAction.java:239) ~[elasticsearch-5.6.14.jar:5.6.14]

Caused by: org.elasticsearch.transport.RemoteTransportException: [nodexxx_data_04][0.0.0.0:9304][internal:cluster/nodes/indices/shard/store[n]]
Caused by: org.elasticsearch.ElasticsearchException: Failed to list store metadata for shard [[my-index-2020-11-08][2]] at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:114) ~[elasticsearch-5.6.14.jar:5.6.14]

Caused by: java.io.FileNotFoundException: no segments* file found in store(mmapfs(/var/opt/data/flat/elastic/files03/04/nodes/0/indices/v5KrU_ZQQi28oyNlxyxn3g/2/index)): files: [recovery.AXddJCueclgzgH391Ga1._1851.dii, recovery.AXddJCueclgzgH391Ga1._1851.fdx, recovery.AXddJCueclgzgH391Ga1._1851.fnm, recovery.AXddJCueclgzgH391Ga1._1851.nvm, recovery.AXddJCueclgzgH391Ga1._1851.si, recovery.AXddJCueclgzgH391Ga1._1851_1.liv, recovery.AXddJCueclgzgH391Ga1._1851_Lucene50_0.tip, recovery.AXddJCueclgzgH391Ga1._1851_Lucene54_0.dvm, recovery.AXddJCueclgzgH391Ga1._1b4p.cfe, recovery.AXddJCueclgzgH391Ga1._1b4p.si, recovery.AXddJCueclgzgH391Ga1._1hmg.dii, recovery.AXddJCueclgzgH391Ga1._1hmg.fdx, recovery.AXddJCueclgzgH391Ga1._1hmg.fnm, recovery.AXddJCueclgzgH391Ga1._1hmg.nvd, recovery.AXddJCueclgzgH391Ga1._1hmg.nvm, recovery.AXddJCueclgzgH391Ga1._1hmg.si, recovery.AXddJCueclgzgH391Ga1._1hmg_Lucene50_0.tip, recovery.AXddJCueclgzgH391Ga1._1hmg_Lucene54_0.dvm, recovery.AXddJCueclgzgH391Ga1._1i5t.cfe, recovery.AXddJCueclgzgH391Ga1._1i5t.si, recovery.AXddJCueclgzgH391Ga1._1jcq.cfe, recovery.AXddJCueclgzgH391Ga1._1jcq.si, recovery.AXddJCueclgzgH391Ga1._1kaw.cfe, recovery.AXddJCueclgzgH391Ga1._1kaw.si, recovery.AXddJCueclgzgH391Ga1._1lrs.cfe, recovery.AXddJCueclgzgH391Ga1._1lrs.si, recovery.AXddJCueclgzgH391Ga1._1lvv.cfe, recovery.AXddJCueclgzgH391Ga1._1lvv.si, recovery.AXddJCueclgzgH391Ga1._1m28.cfe, recovery.AXddJCueclgzgH391Ga1._1m28.cfs, recovery.AXddJCueclgzgH391Ga1._1m28.si, recovery.AXddJCueclgzgH391Ga1._1mb8.cfe, recovery.AXddJCueclgzgH391Ga1._1mb8.cfs, recovery.AXddJCueclgzgH391Ga1._1mb8.si, recovery.AXddJCueclgzgH391Ga1._1mea.cfe, recovery.AXddJCueclgzgH391Ga1._1mea.cfs, recovery.AXddJCueclgzgH391Ga1._1mea.si, recovery.AXddJCueclgzgH391Ga1._1mkv.cfe, recovery.AXddJCueclgzgH391Ga1._1mkv.si, recovery.AXddJCueclgzgH391Ga1._1mp1.cfe, recovery.AXddJCueclgzgH391Ga1._1mp1.cfs, recovery.AXddJCueclgzgH391Ga1._1mp1.si, recovery.AXddJCueclgzgH391Ga1._1msn.cfe, recovery.AXddJCueclgzgH391Ga1._1msn.cfs, recovery.AXddJCueclgzgH391Ga1._1msn.si, recovery.AXddJCueclgzgH391Ga1._1msw.cfe, recovery.AXddJCueclgzgH391Ga1._1msw.cfs, recovery.AXddJCueclgzgH391Ga1._1msw.si, recovery.AXddJCueclgzgH391Ga1._1mtz.cfe, recovery.AXddJCueclgzgH391Ga1._1mtz.cfs, recovery.AXddJCueclgzgH391Ga1._1mtz.si, recovery.AXddJCueclgzgH391Ga1._1muw.cfe, recovery.AXddJCueclgzgH391Ga1._1muw.cfs, recovery.AXddJCueclgzgH391Ga1._1muw.si, recovery.AXddJCueclgzgH391Ga1._1mv9.cfe, recovery.AXddJCueclgzgH391Ga1._1mv9.cfs, recovery.AXddJCueclgzgH391Ga1._1mv9.si, recovery.AXddJCueclgzgH391Ga1._1mvi.cfe, recovery.AXddJCueclgzgH391Ga1._1mvi.cfs, recovery.AXddJCueclgzgH391Ga1._1mvi.si, recovery.AXddJCueclgzgH391Ga1._1mvj.cfe, recovery.AXddJCueclgzgH391Ga1._1mvj.cfs, recovery.AXddJCueclgzgH391Ga1._1mvj.si, recovery.AXddJCueclgzgH391Ga1._1mvy.cfe, recovery.AXddJCueclgzgH391Ga1._1mvy.cfs, recovery.AXddJCueclgzgH391Ga1._1mvy.si, recovery.AXddJCueclgzgH391Ga1._1mvz.cfe, recovery.AXddJCueclgzgH391Ga1._1mvz.cfs, recovery.AXddJCueclgzgH391Ga1._1mvz.si, recovery.AXddJCueclgzgH391Ga1._1mw6.cfe, recovery.AXddJCueclgzgH391Ga1._1mw6.cfs, recovery.AXddJCueclgzgH391Ga1._1mw6.si, recovery.AXddJCueclgzgH391Ga1._1mw8.cfe, recovery.AXddJCueclgzgH391Ga1._1mw8.cfs, recovery.AXddJCueclgzgH391Ga1._1mw8.si, recovery.AXddJCueclgzgH391Ga1._1mw9.cfe, recovery.AXddJCueclgzgH391Ga1._1mw9.cfs, recovery.AXddJCueclgzgH391Ga1._1mw9.si, recovery.AXddJCueclgzgH391Ga1._1mwj.cfe, recovery.AXddJCueclgzgH391Ga1._1mwj.cfs, recovery.AXddJCueclgzgH391Ga1._1mwj.si, 
recovery.AXddJCueclgzgH391Ga1._1mwk.cfe, recovery.AXddJCueclgzgH391Ga1._1mwk.cfs, recovery.AXddJCueclgzgH391Ga1._1mwk.si, recovery.AXddJCueclgzgH391Ga1._1mwl.cfe, recovery.AXddJCueclgzgH391Ga1._1mwl.cfs, recovery.AXddJCueclgzgH391Ga1._1mwl.si, recovery.AXddJCueclgzgH391Ga1._1mwm.cfe, recovery.AXddJCueclgzgH391Ga1._1mwm.cfs, recovery.AXddJCueclgzgH391Ga1._1mwm.si, recovery.AXddJCueclgzgH391Ga1._1mwx.cfe, recovery.AXddJCueclgzgH391Ga1._1mwx.cfs, recovery.AXddJCueclgzgH391Ga1._1mwx.si, recovery.AXddJCueclgzgH391Ga1._1mwy.cfe, recovery.AXddJCueclgzgH391Ga1._1mwy.cfs, recovery.AXddJCueclgzgH391Ga1._1mwy.si, recovery.AXddJCueclgzgH391Ga1._eej.dii, recovery.AXddJCueclgzgH391Ga1._eej.fdx, recovery.AXddJCueclgzgH391Ga1._eej.fnm, recovery.AXddJCueclgzgH391Ga1._eej.nvm, recovery.AXddJCueclgzgH391Ga1._eej.si, recovery.AXddJCueclgzgH391Ga1._eej_Lucene50_0.tip, recovery.AXddJCueclgzgH391Ga1._eej_Lucene54_0.dvm, recovery.AXddJCueclgzgH391Ga1._qyr.dii, recovery.AXddJCueclgzgH391Ga1._qyr.fdx, recovery.AXddJCueclgzgH391Ga1._qyr.fnm, recovery.AXddJCueclgzgH391Ga1._qyr.nvm, recovery.AXddJCueclgzgH391Ga1._qyr.si, recovery.AXddJCueclgzgH391Ga1._qyr_Lucene50_0.tip, recovery.AXddJCueclgzgH391Ga1._qyr_Lucene54_0.dvm, recovery.AXddJCueclgzgH391Ga1._z18.dii, recovery.AXddJCueclgzgH391Ga1._z18.fdx, recovery.AXddJCueclgzgH391Ga1._z18.fnm, recovery.AXddJCueclgzgH391Ga1._z18.nvd, recovery.AXddJCueclgzgH391Ga1._z18.nvm, recovery.AXddJCueclgzgH391Ga1._z18.si, recovery.AXddJCueclgzgH391Ga1._z18_Lucene50_0.tip, recovery.AXddJCueclgzgH391Ga1._z18_Lucene54_0.dvm, recovery.AXddJCueclgzgH391Ga1.segments_4g, write.lock]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:687) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]

And finally:

[2021-02-01T11:33:01,377][INFO ][o.e.c.r.a.AllocationService] [nodexxxx_master-adm_90] Cluster health status changed from [YELLOW] to [RED] (reason: [{nodexxxx_data_03}{lfkgmXXCS2an-9vGMwJgMw}{yYYj3CmvTjWtlCIuh5xDqA}{0.0.0.0}{0.0.0.0.140:9303}{rack_id=nodexxxx} failed to ping, tried [3] times, each with maximum [30s] timeout

What is the output from the _cluster/stats?pretty&human API?
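
For example, something like this (the host is an assumption):

curl -s 'http://localhost:9200/_cluster/stats?pretty&human'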

Here is the output from _cluster/stats:

HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 4807
{
"_nodes" : {
"total" : 42,
"successful" : 42,
"failed" : 0
},
"cluster_name" : "##########",
"timestamp" : 1612338624367,
"status" : "green",
"indices" : {
"count" : 3073,
"shards" : {
"total" : 28923,
"primaries" : 14445,
"replication" : 1.0022845275181724,
"index" : {
"shards" : {
"min" : 2,
"max" : 35,
"avg" : 9.411975268467296
},
"primaries" : {
"min" : 1,
"max" : 6,
"avg" : 4.700618288317605
},
"replication" : {
"min" : 1.0,
"max" : 34.0,
"avg" : 1.010738691832086
}
}
},
"docs" : {
"count" : 68461661389,
"deleted" : 54085877
},
"store" : {
"size" : "97.6tb",
"size_in_bytes" : 107351103384485,
"throttle_time" : "0s",
"throttle_time_in_millis" : 0
},
"fielddata" : {
"memory_size" : "56.1mb",
"memory_size_in_bytes" : 58883064,
"evictions" : 0
},
"query_cache" : {
"memory_size" : "35.3gb",
"memory_size_in_bytes" : 38002057402,
"total_count" : 902465343,
"hit_count" : 136761051,
"miss_count" : 765704292,
"cache_size" : 596074,
"cache_count" : 10200931,
"evictions" : 9604857
},
"completion" : {
"size" : "0b",
"size_in_bytes" : 0
},
"segments" : {
"count" : 436453,
"memory" : "188.9gb",
"memory_in_bytes" : 202936988938,
"terms_memory" : "148gb",
"terms_memory_in_bytes" : 158915678934,
"stored_fields_memory" : "30.4gb",
"stored_fields_memory_in_bytes" : 32734430952,
"term_vectors_memory" : "0b",
"term_vectors_memory_in_bytes" : 0,
"norms_memory" : "586.6mb",
"norms_memory_in_bytes" : 615113984,
"points_memory" : "5.6gb",
"points_memory_in_bytes" : 6102942496,
"doc_values_memory" : "4.2gb",
"doc_values_memory_in_bytes" : 4568822572,
"index_writer_memory" : "62.9mb",
"index_writer_memory_in_bytes" : 65983588,
"version_map_memory" : "1.6mb",
"version_map_memory_in_bytes" : 1701845,
"fixed_bit_set" : "0b",
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : 1612310410289,
"file_sizes" : { }
}
},
"nodes" : {
"count" : {
"total" : 42,
"data" : 35,
"coordinating_only" : 4,
"master" : 3,
"ingest" : 0
},
"versions" : [
"5.6.14"
],
"os" : {
"available_processors" : 1680,
"allocated_processors" : 1344,
"names" : [
{
"name" : "Linux",
"count" : 42
}
],
"mem" : {
"total" : "11.5tb",
"total_in_bytes" : 12703406112768,
"free" : "233.8gb",
"free_in_bytes" : 251059400704,
"used" : "11.3tb",
"used_in_bytes" : 12452346712064,
"free_percent" : 2,
"used_percent" : 98
}
},
"process" : {
"cpu" : {
"percent" : 15
},
"open_file_descriptors" : {
"min" : 1626,
"max" : 3524,
"avg" : 3126
}
},
"jvm" : {
"max_uptime" : "140.8d",
"max_uptime_in_millis" : 12169334519,
"versions" : [
{
"version" : "1.8.0_141",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "25.141-b16",
"vm_vendor" : "Oracle Corporation",
"count" : 42
}
],
"mem" : {
"heap_used" : "844.4gb",
"heap_used_in_bytes" : 906730397368,
"heap_max" : "1.2tb",
"heap_max_in_bytes" : 1343251021824
},
"threads" : 9945
},
"fs" : {
"total" : "579.7tb",
"total_in_bytes" : 637486114570240,
"free" : "482.1tb",
"free_in_bytes" : 530113496121344,
"available" : "453tb",
"available_in_bytes" : 498111363497984,
"spins" : "true"
},
"plugins" : [
{
"name" : "search-guard-5",
"version" : "5.6.14-19.2",
"description" : "Provide access control related features for Elasticsearch 5",
"classname" : "com.floragunn.searchguard.SearchGuardPlugin",
"has_native_controller" : false
},
{
"name" : "x-pack",
"version" : "5.6.14",
"description" : "Elasticsearch Expanded Pack Plugin",
"classname" : "org.elasticsearch.xpack.XPackPlugin",
"has_native_controller" : true
}
],
"network_types" : {
"transport_types" : {
"com.floragunn.searchguard.ssl.http.netty.SearchGuardSSLNettyTransport" : 42
},
"http_types" : {
"com.floragunn.searchguard.http.SearchGuardHttpServerTransport" : 42
}
}
}
}

That's the cause. 28,923 shards across 35 data nodes is nearly 830 shards per node, which is way too many.
And with roughly 97.6 TB of data spread over that many shards, the average shard is under 4 GB, which is way too small as well.

The short-term solution is to delete in small batches, or to add more nodes to the cluster to bring the per-node shard count down.
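
Batching could look roughly like this (a sketch: the host, index pattern, and month list are assumptions): delete one month's worth of daily indices at a time and wait for the cluster to settle before starting the next batch.

# Delete in small batches, waiting for the cluster to go green between batches.
for month in 2020-07 2020-08 2020-09; do
  curl -s -XDELETE "http://localhost:9200/my-index-${month}-*"
  curl -s "http://localhost:9200/_cluster/health?wait_for_status=green&timeout=60m" > /dev/null
done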

You're heading in the right direction though!

We will keep going then, thanks a lot.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.