Data is lost after elasticsearch restart

simplearebest · March 28, 2023, 3:43am

The cluster (ES 2.4.0) contains three nodes (node137, node138, node139). I created an index with 5 shards and 1 replica.

I follow these steps to test:

1. The cluster status is green.
2. Shut down node137.
3. Index a large amount of data.
4. Shut down node138 and node139 immediately.
5. Copy the ES data folder as a backup.
6. Start node137, node138 and node139.
7. Query the cluster: some data is lost.
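
The final check (querying the cluster and finding data missing) can be made precise by comparing the IDs written during the test with the IDs the cluster returns after the restart. Below is a minimal sketch, assuming the test keeps a record of every ID it indexed. The function name `find_missing_ids` and the sample IDs are hypothetical, and fetching the IDs back from the cluster is left out.

```python
def find_missing_ids(indexed_ids, found_ids):
    """Return the IDs that were indexed during the test but are absent afterwards."""
    return sorted(set(indexed_ids) - set(found_ids))


# Hypothetical sample data: IDs written while node137 was down,
# and IDs returned by the cluster after all three nodes were restarted.
indexed = ["doc-1", "doc-2", "doc-3", "doc-4"]
found = ["doc-1", "doc-3"]

missing = find_missing_ids(indexed, found)
print(missing)
```

Knowing exactly which documents are missing makes it easier to check them against the translog in the backup.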

elasticsearch.yml

 http.port: 9200
 transport.tcp.port: 9300
 discovery.zen.ping.unicast.hosts: ["192.168.59.137:9300","192.168.59.138:9300","192.168.59.139:9300"]
 discovery.zen.minimum_master_nodes: 2
 gateway.recover_after_nodes: 2

index.translog.durability uses the default value, which is request.

After several tests, I found that the lost data was in the translog on node138 (from the step 5 backup) but not on node137. After the restart and recovery, this data was lost.

When node137 is shut down, some shards have no replica, so the lost data exists only in the translog on node138. After the restart, why is the translog data on node138 not restored?

Why? Is data expected to be lost after ES restarts and recovers?

leandrojmp · March 28, 2023, 3:46am

You need to provide a lot more context about this issue and the tests you ran.

First, what version are you using? What is the configuration of your nodes? Please share the elasticsearch.yml from your three nodes.

What tests did you run? What do you have in the logs of those nodes?

warkolm · March 28, 2023, 4:02am

Welcome to our community!

This is positively ancient and you need to upgrade as a matter of serious urgency.

Christian_Dahlqvist · March 28, 2023, 5:57am

As already pointed out, you are running a very very old version of Elasticsearch. A lot of work has gone into improving resiliency since then so I would recommend upgrading to the latest version.

I have not used this version in many years, so any of my comments may well be wrong. I think the issue here is that you first shut down one node and allow it to fall behind. If you then brought it back while the other nodes were still running, I would not expect any data loss. Because you shut down all nodes and then restart them all at the same time, you do not know which node will be elected master on startup, nor which shard copies will be selected as primaries during recovery. If, instead of restarting all nodes at the same time, you first restarted the nodes that were active last and then, once a master has been elected, added the node that went down first, I suspect the result would be different.
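
The staged restart suggested above could be gated on the cluster health response before the lagging node is started. This is a rough sketch, not a verified fix for 2.4.0. It assumes `health` is the parsed JSON body of `GET /_cluster/health` from one of the nodes that were active last. The helper name `ready_for_lagging_node` is hypothetical, and waiting for yellow or green (all primaries assigned) is a more conservative condition than "a master has been elected".

```python
def ready_for_lagging_node(health, expected_nodes=2):
    """Decide whether the node that went down first can now be started.

    `health` is a parsed GET /_cluster/health response (a dict).
    Assumption: once the nodes that were active last have formed a cluster
    and every primary is assigned (status yellow or green), the stale node
    should join and recover as a replica rather than supply primaries.
    """
    return (
        not health.get("timed_out", False)
        and health.get("number_of_nodes", 0) >= expected_nodes
        and health.get("status") in ("yellow", "green")
    )


# Hypothetical responses from node138 after starting node138 and node139 only.
still_recovering = {"status": "red", "timed_out": False, "number_of_nodes": 2}
primaries_assigned = {"status": "yellow", "timed_out": False, "number_of_nodes": 2}

print(ready_for_lagging_node(still_recovering))
print(ready_for_lagging_node(primaries_assigned))
```

With this ordering, node137 would only be started after the second call returns true.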

system · April 25, 2023, 5:57am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
