Delete unassigned replica shards

After a cluster crash caused by filesystem corruption, I restarted the 5 Elasticsearch nodes.

I now have 30 unassigned replica shards that Elasticsearch keeps trying to reassign without success.
I want to delete them without losing the other primary shards.
Is it possible to delete only the unassigned replica shards? What happens if I delete them?

These are the logs from my old Elasticsearch 1.3:

[2019-05-27 11:15:53,058][WARN ][cluster.action.shard     ] [Node4] [.marvel-2019.05.20][0] sending failed shard for [.marvel-2019.05.20][0], node[UXetSPnwQIuchTbBlonOrA], [P], s[INITIALIZING], indexUUID [LAWodau7QsOPINKIAwFJDg], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[.marvel-2019.05.20][0] failed recovery]; nested: EngineCreationFailureException[[.marvel-2019.05.20][0] failed to create engine]; nested: EOFException[read past EOF: NIOFSIndexInput(path="/home/elastic/ELK/Node4/data/My-ELK/nodes/0/indices/.marvel-2019.05.20/0/index/_oap.cfs")]; ]]
[2019-05-27 11:15:53,342][WARN ][indices.cluster          ] [Node4] [logstash-2019.05.20][3] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [logstash-2019.05.20][3] failed to fetch index version after copying it over
        at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152)
        at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.index.CorruptIndexException: [logstash-2019.05.20][3] Corrupted index [corrupted_0Qn9G6RVSvqqDw401kcqNQ] caused by: CorruptIndexException[codec footer mismatch: actual footer=0 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/home/elastic/ELK/Node4/data/My-ELK/nodes/0/indices/logstash-2019.05.20/3/index/_6ou.cfs"))]
        at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:343)
        at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:328)
        at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119)
        ... 4 more
[2019-05-27 11:15:53,346][WARN ][cluster.action.shard     ] [Node4] [logstash-2019.05.20][3] sending failed shard for [logstash-2019.05.20][3], node[UXetSPnwQIuchTbBlonOrA], [P], s[INITIALIZING], indexUUID [9gVFPrm9T36C7U0-zfpT3w], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[logstash-2019.05.20][3] failed to fetch index version after copying it over]; nested: CorruptIndexException[[logstash-2019.05.20][3] Corrupted index [corrupted_0Qn9G6RVSvqqDw401kcqNQ] caused by: CorruptIndexException[codec footer mismatch: actual footer=0 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/home/elastic/ELK/Node4/data/My-ELK/nodes/0/indices/logstash-2019.05.20/3/index/_6ou.cfs"))]]; ]]
[2019-05-27 11:15:58,371][WARN ][indices.cluster          ] [Node4] [.marvel-2019.05.20][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [.marvel-2019.05.20][0] failed recovery
        at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:185)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [.marvel-2019.05.20][0] failed to create engine
        at org.elasticsearch.index.engine.internal.InternalEngine.start(InternalEngine.java:277)
        at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryPrepareForTranslog(InternalIndexShard.java:714)
        at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:225)
        at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
        ... 3 more
Caused by: java.io.EOFException: read past EOF: NIOFSIndexInput(path="/home/elastic/ELK/Node4/data/My-ELK/nodes/0/indices/.marvel-2019.05.20/0/index/_oap.cfs")
        at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:336)
        at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
        at org.apache.lucene.store.DataInput.readVInt(DataInput.java:120)
        at org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
        at org.apache.lucene.store.CompoundFileDirectory.readEntries(CompoundFileDirectory.java:139)
        at org.apache.lucene.store.CompoundFileDirectory.<init>(CompoundFileDirectory.java:105)
        at org.apache.lucene.index.SegmentReader.readFieldInfos(SegmentReader.java:280)
        at org.apache.lucene.index.IndexWriter.getFieldNumberMap(IndexWriter.java:835)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:787)
        at org.elasticsearch.index.engine.internal.InternalEngine.createWriter(InternalEngine.java:1407)
        at org.elasticsearch.index.engine.internal.InternalEngine.start(InternalEngine.java:271)
        ... 6 more
[2019-05-27 11:15:58,372][WARN ][cluster.action.shard     ] [Node4] [.marvel-2019.05.20][0] sending failed shard for [.marvel-2019.05.20][0], node[UXetSPnwQIuchTbBlonOrA], [P], s[INITIALIZING], indexUUID [LAWodau7QsOPINKIAwFJDg], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[.marvel-2019.05.20][0] failed recovery]; nested: EngineCreationFailureException[[.marvel-2019.05.20][0] failed to create engine]; nested: EOFException[read past EOF: NIOFSIndexInput(path="/home/elastic/ELK/Node4/data/My-ELK/nodes/0/indices/.marvel-2019.05.20/0/index/_oap.cfs")]; ]]

I want to delete them without losing the other primary shards.

Deleting replica shards has absolutely no impact on primary shards!
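If you want to reassure yourself while you work, you can watch the cluster health; the active_primary_shards count should stay the same while only the replica/unassigned counts change. A minimal check, assuming the default host and port from your other commands:

curl -s 'localhost:9200/_cluster/health?pretty'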

Is it possible to delete only the unassigned replica shards?

Yes. To achieve this, reduce the number_of_replicas setting, either across the entire cluster or for the desired index/indices. To update the number of replicas, follow this: Update Indices Settings | Elasticsearch Guide [6.4] | Elastic

(change the version number in the URL to match your Elasticsearch version)
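As a sketch, dropping all replicas across every index could look like the call below (the value 0 is only an example; pick the replica count you actually want):

curl -XPUT 'localhost:9200/_settings' -d '
{
  "index": {
    "number_of_replicas": 0
  }
}'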

Thanks elk11,

Is it possible to delete the shards without reducing number_of_replicas for the entire cluster?
I have a lot of documents, and I think reducing the replicas and then increasing them again would cause a lot of overhead.

I found the problem. The unassigned replica shards already exist on disk on some nodes. I think that while the cluster was coming back up, it recreated the replica shards on other nodes. Now that all nodes are up, it finds more replica copies than expected.

Is it possible to delete the shards without reducing number_of_replicas for the entire cluster?

You can reduce the number of replicas for a particular index or set of indices too.
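For a single index, a sketch could look like this (the index name is only an example taken from your logs; point it at whichever index actually holds the unassigned replicas):

curl -XPUT 'localhost:9200/logstash-2019.05.20/_settings' -d '
{
  "index": {
    "number_of_replicas": 0
  }
}'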

I have a lot of documents, and I think reducing the replicas and then increasing them again would cause a lot of overhead.

I don't exactly understand what you are trying to achieve with this.

Now that all nodes are up, it finds more replica copies than expected.

I don't think that is ever possible. You cannot get more replicas than you have configured!

Ok.

I thought I had to delete the replicas for the entire cluster, and after the crash I am worried the cluster won't support the load of recreating all the replicas and will crash again. But if I can remove the replicas of a particular index, I can do it without problems.
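A minimal sketch of that plan, assuming the affected index currently has 1 replica configured: drop the replicas for that index only, wait for the cluster to turn green, then raise the count back.

# drop replicas for the affected index only
curl -XPUT 'localhost:9200/logstash-2019.05.20/_settings' -d '{"index": {"number_of_replicas": 0}}'

# once the cluster is green again, restore them
curl -XPUT 'localhost:9200/logstash-2019.05.20/_settings' -d '{"index": {"number_of_replicas": 1}}'

Re-adding the replica copies each primary over the network once, so doing it one index at a time should keep the extra load small.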

I will show you:

-bash-4.1$ curl -s localhost:9200/_cat/shards | grep logstash-2019.05.27
logstash-2019.05.27  0 r STARTED      1739419    1.2gb 192.168.0.4 Node4
logstash-2019.05.27  0 p STARTED      1741863    1.2gb 192.168.0.3 Node3
logstash-2019.05.27  3 p STARTED      1737897    1.2gb 192.168.0.1 Node1
logstash-2019.05.27  3 r UNASSIGNED
logstash-2019.05.27  1 r STARTED      1740639    1.2gb 192.168.0.4 Node4
logstash-2019.05.27  1 p STARTED      1743116    1.2gb 192.168.0.3 Node3
logstash-2019.05.27  2 p STARTED      1737028    1.2gb 192.168.0.1 Node1
logstash-2019.05.27  2 r UNASSIGNED

Now, if I look on Node4, I see folders for shards 0, 1, 2 and 3, but Node4 only holds shards 0 and 1.

[root@Node4 /]$  ls -l /home/elastic/ELK/Node4/data/My-ELK/nodes/0/indices/logstash-2019.05.27/
total 20
drwxr-xr-x 5 root root 4096 May 27 15:39 0
drwxr-xr-x 5 root root 4096 May 27 15:40 1
drwxr-xr-x 5 root root 4096 May 27 01:26 2
drwxr-xr-x 5 root root 4096 May 27 01:26 3
drwxr-xr-x 2 root root 4096 May 27 16:27 _state

I think this is not normal, because in the other nodes' data folders I don't see all the shard folders.
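As a cross-check (a sketch using the node name and path from the listing above), you can compare what the cluster has actually allocated to Node4 with what is on disk; any shard folder that only shows up in the second command is a leftover copy from before the crash:

# shards the cluster has allocated to Node4 for this index
curl -s localhost:9200/_cat/shards | grep logstash-2019.05.27 | grep Node4

# shard folders physically present on Node4 for the same index
ls -l /home/elastic/ELK/Node4/data/My-ELK/nodes/0/indices/logstash-2019.05.27/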

I solved most of the unassigned replicas.
Now some primary shards still remain in INITIALIZING status. How can I solve that?

-bash-4.1$  curl -s localhost:9200/_cat/shards | grep INI
logstash-2019.05.20  3 p INITIALIZING   192.168.0.3 Node3
logstash-2019.05.20  2 p INITIALIZING   192.168.0.3 Node3
.marvel-2019.05.20   0 p INITIALIZING   192.168.0.1 Node1


[2019-05-29 13:52:32,735][WARN ][cluster.action.shard     ] [Node1] [.marvel-2019.05.20][0] sending failed shard for [.marvel-2019.05.20][0], node[fsazd6S2RzSnkYCrPxFFlQ], [P], s[INITIALIZING], indexUUID [LAWodau7QsOPINKIAwFJDg], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[.marvel-2019.05.20][0] failed to recover shard]; nested: IllegalArgumentException[No type mapped for [0]]; ]]
[2019-05-29 13:52:36,576][WARN ][indices.cluster          ] [Node1] [.marvel-2019.05.20][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [.marvel-2019.05.20][0] failed to recover shard
        at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:269)
        at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException: No type mapped for [0]
        at org.elasticsearch.index.translog.Translog$Operation$Type.fromId(Translog.java:224)
        at org.elasticsearch.index.translog.TranslogStreams.readTranslogOperation(TranslogStreams.java:34)
        at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:241)
        ... 4 more

Which version of Elasticsearch are you on?

Given that you are on Elasticsearch 1.3 you really, really should upgrade, at least to the latest 1.x release, but ideally further.

Thanks Christian. I know, but we can't upgrade for another 6 months due to administrative issues.

In the meantime, I would like to solve these issues without upgrading, if a solution exists.

I do not know as I have not used that version in many years.
