Hello,
An index folder has been deleted from a node: /data3/data/es/esdartyprd/nodes/0/indices/enc_idx/2/index
I can see that data is missing for the primary shard on this node (node1).
For this index, here is what the command curl -XGET http://localhost:9200/_cat/shards returns:
enc_idx 2 p STARTED 57824815 28.8gb 10.135.8.201 HDPESPRD1
enc_idx 2 r STARTED 57824815 36.5gb 10.135.8.202 HDPESPRD2
enc_idx 2 r STARTED 57824815 36.5gb 10.135.10.15 HDPESPRA1
enc_idx 2 r STARTED 57824815 36.5gb 10.135.8.203 HDPESPRD3
enc_idx 2 r UNASSIGNED
You can see that data is missing on node1, which is the master! And we still have a problem assigning the shard to node4.
Is there any way to recover or replicate the missing data from the replica shards? Is that done automatically? Are there any solutions to assign the unassigned shard? In the log I find an error saying that a file from this folder is corrupted:
[enc_idx][2] Corrupted index [corrupted_1ytPPpKkTZCGG_zQcbBG-w] caused by: CorruptIndexException[codec footer mismatch: actual footer=1063427 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/data3/data/es/esdartyprd/nodes/0/indices/enc_idx/2/index/_3gv6_es090_0.pos"))]
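If the unassigned replica just needs a fresh allocation attempt, a reroute call can sometimes help. This is only a sketch: the exact command names depend on your Elasticsearch version (older 1.x releases use a single allocate command, while 5.x+ splits it into allocate_replica and adds the retry_failed flag), and "node4" below is a placeholder for whatever name _cat/nodes reports for that node.

```shell
# Ask Elasticsearch to retry allocating failed shards (5.x+ only;
# re-attempts shards that exceeded the allocation retry limit).
curl -XPOST 'http://localhost:9200/_cluster/reroute?retry_failed=true'

# Or explicitly allocate the replica of shard 2 onto the empty node.
# "allocate_replica" is the 5.x+ command name; older releases use "allocate".
# Newer versions also require the Content-Type header shown here.
curl -XPOST 'http://localhost:9200/_cluster/reroute' \
  -H 'Content-Type: application/json' -d '
{
  "commands": [
    { "allocate_replica": { "index": "enc_idx", "shard": 2, "node": "node4" } }
  ]
}'
```

Note that a reroute cannot conjure up data: it only works if an intact copy of the shard exists somewhere, which is why the corruption error above matters.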
That index corruption seems really awful. Maybe it's not possible to recover.
You could try updating the index settings with number_of_replicas: 0, so that Elasticsearch deletes all replicas, including the problematic one. Then restore the setting to 4 replicas and check whether they all get assigned once replication completes.
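A minimal sketch of that replica drop-and-restore, assuming the index name enc_idx and port 9200 shown above (older 1.x clusters accept the request without the Content-Type header; newer versions require it):

```shell
# Drop all replicas for the index; Elasticsearch deletes the replica copies,
# including the corrupted one, keeping only the primaries.
curl -XPUT 'http://localhost:9200/enc_idx/_settings' \
  -H 'Content-Type: application/json' -d '
{ "index": { "number_of_replicas": 0 } }'

# Once the index is green again, restore the replica count so the replicas
# are rebuilt from the (healthy) primaries.
curl -XPUT 'http://localhost:9200/enc_idx/_settings' \
  -H 'Content-Type: application/json' -d '
{ "index": { "number_of_replicas": 4 } }'

# Verify that every shard ends up STARTED.
curl -XGET 'http://localhost:9200/_cat/shards/enc_idx'
```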
Thank you @thiago. In fact, the corrupted index is the one we deleted from the path /data3/data/es/esdartyprd/nodes/0/indices/enc_idx/2/index. That deletion was on the primary shard on the master node (node1), and even so we still have problems rerouting the shard, and we get the same index-corruption error.
My proposal:
1- Shut down the cluster.
2- Change the master node to node 2.
3- Restart Elasticsearch.
That should let node 2 hold the primary shard, so the other shards would be replicated and assigned to the other nodes.
What do you think? Do you have a comment or another solution?
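A rough sketch of step 2, using the standard node.master setting in elasticsearch.yml (the values below are an assumed illustration; adjust them to your actual node configuration before restarting):

```yaml
# elasticsearch.yml on node1: make this node ineligible for master election.
node.master: false
node.data: true

# elasticsearch.yml on node2: keep it master-eligible so it wins the
# election after the restart.
node.master: true
node.data: true
```

Keep in mind that the master role and primary-shard placement are configured independently, so it is worth verifying with _cat/shards after the restart that the primary really moved where you expect.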
Yes, that is what is expected. But, to be honest, I am not fully aware of the current state of your cluster. That would require running a diagnostics tool, and this is only done under a Support subscription.