Elastic node fetches all data from master on restart

Dekel · May 22, 2016, 12:46pm

I have a cluster with one master and several nodes; every node holds a complete replica of the data.
The status of the cluster is green.

When I stop and start the service on one of the nodes in the cluster, that node reloads all of the data from the master node, even though all of the data already existed there before the restart, no new data was added, and the status was green. Any ideas?

I have "auto_expand_replicas": "0-all" set for that specific index.
I'm using Elasticsearch 2.3.0 and Java version "1.8.0_77" (64-bit) on all nodes.

Will appreciate any help.

Glen_Smith · May 28, 2016, 2:28am

Dekel,

First, understand that in Elasticsearch, the concept of "master node" only relates to which node controls cluster state. It does not mean that node holds the authoritative copy of every index. There aren't any "slave nodes". (In fact, a best practice is to have dedicated masters - the nodes that are master eligible don't store any data. This is to increase cluster stability.) Which copy of a shard is the "primary" shard is entirely arbitrary, and nearly inconsequential. You likely have primary shards on each of your data nodes.

But to your question. What you are experiencing is that when you start a node, it recognizes that there are shards on the disk, but can't determine whether each of those shards is identical to its copies on other nodes. One reason for this is that each shard copy operates in isolation: the segments that make it up will most likely differ from the segments of another copy of the shard even when both copies hold the same documents, and a full document-by-document comparison of the two could be enormously costly.

So what do you do to avoid that painful wait? Enter Synced Flush [1]. The documentation explains nicely what it does, and how to do it, so I leave it there.
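For illustration, here is a minimal Python sketch of issuing a synced flush and checking the reply. It assumes a node reachable at http://localhost:9200 (an assumption, adjust to your setup); the helper names `synced_flush` and `indices_with_failures` are made up for this example. The reply layout it reads is the one shown in the synced flush documentation: a top-level "_shards" summary plus one entry per index, each with "total", "successful" and "failed" counts.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: default HTTP port on a local node


def synced_flush(index=None, base_url=ES_URL):
    """POST /_flush/synced (or /<index>/_flush/synced) and return the parsed reply.

    urlopen raises urllib.error.HTTPError if the server answers with a
    non-2xx status, so callers may want to catch that.
    """
    path = "/_flush/synced" if index is None else "/" + index + "/_flush/synced"
    request = urllib.request.Request(base_url + path, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def indices_with_failures(reply):
    """Return the names of indices where at least one shard copy was not sync-flushed."""
    return sorted(
        name
        for name, counts in reply.items()
        if name != "_shards" and counts.get("failed", 0) > 0
    )
```

An empty list from `indices_with_failures(synced_flush())` means every shard copy got its sync marker; any index named in the list still has copies that would be recovered the slow way after a restart.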

Hope that helps!

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-synced-flush.html

Dekel · June 2, 2016, 5:33pm

Hi Glen,
Thank you for your reply!

I'm aware of the master/slave/node states etc., but thanks for the clarifications there.
As for your suggestion to use synced flush, I'm afraid I had already done that. This is the reply I got:
{"_shards":{"total":30,"successful":30,"failed":0},"INDICE":{"total":30,"successful":30,"failed":0}}

In general this is the situation:

1. Index data.
2. Stop indexing (from now on no new data is getting into Elasticsearch).
3. Wait (a few minutes).
4. Run a synced flush (POST _flush/synced).
5. Stop the node that I just ran the synced flush on.
6. Wait (a few minutes); still no new data is getting into Elasticsearch.
7. Start the node we stopped a few minutes ago.
8. The node thinks its shards are "out of date" and copies the data (again) from the master (the primary shards). This doesn't make any sense to me, since the data on that node should have been consistent with the data on the primary shards (no new data got into Elasticsearch while that node was shut down).
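One way to double-check the state between the synced flush and the shutdown is to compare the sync_id marker on every copy of each shard. The sketch below assumes the shard-level stats reply (GET /INDICE/_stats?level=shards) is laid out as indices -> index name -> shards -> shard number -> list of copies, with the marker under commit.user_data.sync_id, as in the synced flush documentation; the function name is made up for this example.

```python
def shards_with_mismatched_sync_ids(stats, index):
    """Map shard number -> set of sync_ids seen, for shards whose copies disagree.

    A shard is reported when its copies carry different sync_ids or when
    any copy has no sync_id at all (recorded as None in the set).
    """
    mismatched = {}
    for shard_number, copies in stats["indices"][index]["shards"].items():
        sync_ids = {
            copy.get("commit", {}).get("user_data", {}).get("sync_id")
            for copy in copies
        }
        if len(sync_ids) != 1 or None in sync_ids:
            mismatched[shard_number] = sync_ids
    return mismatched
```

If this returns an empty dict right after the synced flush but the restarted node still copies everything, the markers were probably lost or invalidated somewhere between the flush and the restart.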

Any ideas?

Glen_Smith · June 14, 2016, 12:37am

Dekel,

Are you disabling allocation?

Basically, follow the steps for a rolling upgrade [1] without the upgrade.

You can drop the "wait a few minutes" steps.
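As a rough sketch, the restart sequence from the rolling upgrade guide could be scripted like this in Python. The host http://localhost:9200, the 5m timeout and the helper names are assumptions for illustration; the endpoints and the cluster.routing.allocation.enable setting are the ones the guide uses. Stopping and starting the Elasticsearch service itself happens outside the script.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: any reachable node in the cluster


def allocation_settings(mode):
    """Body for PUT /_cluster/settings; mode is "none" or "all"."""
    return {"transient": {"cluster.routing.allocation.enable": mode}}


def call(method, path, body=None, base_url=ES_URL):
    """Send one request to Elasticsearch and return the parsed JSON reply."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(base_url + path, data=data, method=method)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def before_stopping_node(send=call):
    """Disable shard allocation, then run a synced flush."""
    send("PUT", "/_cluster/settings", allocation_settings("none"))
    return send("POST", "/_flush/synced")


def after_starting_node(send=call):
    """Re-enable allocation once the node has rejoined, then wait for green."""
    send("PUT", "/_cluster/settings", allocation_settings("all"))
    return send("GET", "/_cluster/health?wait_for_status=green&timeout=5m")
```

Call `before_stopping_node()`, restart the service on the node, wait until it shows up in the cluster again, then call `after_starting_node()`. With allocation disabled, the cluster does not start rebuilding the stopped node's replicas elsewhere while it is down.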

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/rolling-upgrades.html
