Shard sync_id keeps changing in read-only cluster (ES 5.5)

larschri · December 8, 2017, 11:40am

I was hoping to use synced flush to speed up recovery after a node restart, but I have not been able to make it work. Is it possible to make sure that sync_id does not change for any shards, to avoid copying indices across the network?

I have tried the following without any success:

cluster.routing.allocation.enable is set to none
cluster.blocks.read_only is set to true
synced flush has been performed successfully on all indices
all external processes that index, delete or change anything have been shut down
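
For reference, the three API-level steps above could be expressed roughly as follows. This is a minimal sketch, not the exact commands used in this thread: the helper name `build_requests` and the base URL are hypothetical, the use of `transient` (rather than `persistent`) settings is an assumption, and the order simply mirrors the list above. It only builds the requests so they can be inspected before being sent with any HTTP client.

```python
import json


def build_requests(base_url="http://localhost:9200"):
    """Return (method, url, body) tuples mirroring the steps listed above.

    base_url is a placeholder; "transient" is an assumption, since the
    post does not say whether transient or persistent settings were used.
    """
    settings_url = base_url + "/_cluster/settings"
    return [
        # Disable shard allocation so shards are not reassigned during the restart.
        ("PUT", settings_url,
         json.dumps({"transient": {"cluster.routing.allocation.enable": "none"}})),
        # Put the whole cluster into read-only mode.
        ("PUT", settings_url,
         json.dumps({"transient": {"cluster.blocks.read_only": True}})),
        # Request a synced flush on all indices (no request body).
        ("POST", base_url + "/_flush/synced", None),
    ]


for method, url, body in build_requests():
    print(method, url, body if body is not None else "")
```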

Some properties of the cluster:

5TB of data
10 nodes
4000 indices of uneven size.
50-100 writes (indexing+delete) per second
Elasticsearch 5.5

Recovery of just one node takes several hours. I know that recovery speed can be tuned through various settings, but copying will always be much slower than reusing local files.

Is this working as expected? Is there anything else I can do to avoid copying indices across the network when a node is restarted?

Thanks!

larschri · December 11, 2017, 10:59am

Further debugging showed that Elasticsearch always sets new sync_ids when it marks indices as inactive (which by default happens after 5 minutes without indexing activity). It doesn't matter whether the index has already been synced-flushed manually and nothing needs to be synced.

So instead of performing a synced flush after shutting down indexing, it is much quicker to wait for 5 minutes (indices.memory.shard_inactive_time) before restarting a node.
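
One way to confirm that the sync_ids have settled before restarting is to compare two snapshots of the shard-level index stats, taken before and after the inactivity window. The sketch below assumes the parsed JSON of `GET /_stats?level=shards`, where each shard copy carries a `commit.user_data.sync_id` field and a `routing.node` field; the helper names `extract_sync_ids` and `changed_sync_ids` are hypothetical, and the sample data is made up for illustration.

```python
def extract_sync_ids(stats):
    """Map (index, shard, node) -> sync_id (or None if the copy has none).

    `stats` is assumed to be the parsed response of GET /_stats?level=shards.
    """
    result = {}
    for index_name, index_data in stats.get("indices", {}).items():
        for shard_num, copies in index_data.get("shards", {}).items():
            for copy in copies:
                node = copy.get("routing", {}).get("node")
                user_data = copy.get("commit", {}).get("user_data", {})
                result[(index_name, shard_num, node)] = user_data.get("sync_id")
    return result


def changed_sync_ids(before, after):
    """Return the set of shard copies whose sync_id differs between snapshots."""
    return {key for key in before.keys() | after.keys()
            if before.get(key) != after.get(key)}


# Made-up sample: one index, one shard, two copies on different nodes.
sample_before = {"indices": {"logs": {"shards": {"0": [
    {"routing": {"node": "n1"}, "commit": {"user_data": {"sync_id": "A"}}},
    {"routing": {"node": "n2"}, "commit": {"user_data": {"sync_id": "A"}}},
]}}}}
sample_after = {"indices": {"logs": {"shards": {"0": [
    {"routing": {"node": "n1"}, "commit": {"user_data": {"sync_id": "B"}}},
    {"routing": {"node": "n2"}, "commit": {"user_data": {"sync_id": "A"}}},
]}}}}

changed = changed_sync_ids(extract_sync_ids(sample_before),
                           extract_sync_ids(sample_after))
print(changed)
```

An empty result after the window has passed would suggest that the sync_ids are stable; any entry identifies a shard copy whose sync_id was rewritten in between.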

larschri · December 15, 2017, 12:20pm

See https://github.com/elastic/elasticsearch/issues/27838

