I have ES 2.3.3 running with 6 data nodes. Each index has one replica. ONE of the data nodes was down for a day. Now it's back to life and ES started relocating shards to it.
The thing is that it only relocates two shards at a time. Each node holds about 1 TB of data, so it looks like it will take many hours. How can I increase this number to speed up the process?
P.S. I've also set indices.recovery.max_bytes_per_sec to 200mb, though I see that the java process on the recovering node writes only 70-80 MB/s (and I've tested my disks to sustain 200+ MB/s).
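For reference, this throttle can be raised through the cluster settings API; a sketch like the following (the localhost:9200 endpoint is an assumption, adjust to your cluster):

```shell
# Raise the per-node recovery throttle to 200 MB/s.
# "transient" settings are lost on a full cluster restart; use "persistent" to keep them.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "200mb"
  }
}'
```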
Have a look at the allocation decider cluster.routing.allocation.node_concurrent_recoveries:
How many concurrent shard recoveries are allowed to happen on a node. Defaults to 2.
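This can be changed at runtime via the cluster settings API; a minimal sketch, assuming the cluster is reachable on localhost:9200:

```shell
# Allow up to 10 concurrent shard recoveries per node (default is 2).
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 10
  }
}'
```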
I changed it to 10; my cluster settings now look like the below. To test this, I evicted one node by excluding it through cluster.routing.allocation.exclude._ip. However, there were still only two relocating shards at a time.
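The eviction was done roughly like this (10.0.0.1 is a placeholder IP, not the actual node address):

```shell
# Exclude a node's IP from shard allocation, which drains its shards
# onto the remaining nodes (10.0.0.1 is a placeholder).
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "10.0.0.1"
  }
}'
```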
I think this is because the setting you've mentioned relates to recovery, while what I'm experiencing is shard relocation, i.e. I'm joining a new node to the cluster (to scale out) and only two shards at a time are being moved to it.
So my original question still stands - how to boost shard relocation speed?
On the page I linked, you can find the setting cluster.routing.allocation.cluster_concurrent_rebalance
Allow to control how many concurrent shard rebalances are allowed cluster wide. Defaults to 2.
Relocations are counted as recoveries as well, so you should still increase cluster.routing.allocation.node_concurrent_recoveries if all relocations target one node.
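So in your scale-out case you would raise both knobs together; a sketch, again assuming the API is reachable on localhost:9200:

```shell
# Raise both the cluster-wide rebalance cap and the per-node recovery cap
# (both default to 2). Keep in mind the recovery throttle
# (indices.recovery.max_bytes_per_sec) still limits total throughput.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.cluster_concurrent_rebalance": 10,
    "cluster.routing.allocation.node_concurrent_recoveries": 10
  }
}'
```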