Slow reindex operation on heavy index

viniciof · September 5, 2019, 10:26pm

Hi all,

How can I speed up a long-running reindex operation?

The source index is around 4.2TB, with 16 shards of roughly 300GB each, in a cluster with 10 data nodes.

The target index has 90 shards. I've set the number of replicas to 0 and the refresh interval to -1 to try to speed things up, but so far it has indexed only 1GB in the last 3 hours, which is very slow.
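For reference, a minimal sketch of how those two settings can be applied to the target index before the reindex starts. The index name `my-target-index` is a placeholder, not the real one, and the index is assumed to exist already with its 90 primary shards:

```console
# Placeholder index name (hypothetical); replace with the actual target index.
PUT /my-target-index/_settings
{
  "index": {
    "number_of_replicas": 0,
    "refresh_interval": "-1"
  }
}
```

Both are dynamic index settings, so they can be changed on a live index and restored once the reindex finishes.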

Here are the monitoring statistics:

What else can be done to speed this up?

Regards,

Christian_Dahlqvist · September 7, 2019, 10:41am

Have you tried slicing the reindex operation?

viniciof · September 7, 2019, 9:05pm

That was the solution, thanks! Slicing helped a lot. I also force-merged the source index into a single segment, since I don't expect any further writes to it anytime soon, and disabled all shard allocation across the cluster. My reindex now averages ~15,000 docs/sec, which is the best indexing rate I've ever had in this cluster.

POST _reindex?wait_for_completion=false&slices=20&refresh
{
  "source": {
    "index": "puma.compilation.pipeline.96f19f5b-bc84-4d4b-8694-b80a293e78e4-latest",
    "size": 500,
    "query": {
      "range": {
        "ibi_logtime": {
          "gte": "now-9M/M"
        }
      }
    }
  },
  "dest": {
    "index": "puma.compilation.pipeline.96f19f5b-bc84-4d4b-8694-b80a293e78e4-optimized"
  }
}
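
The other two steps mentioned above (force-merging the source index and disabling shard allocation) could look roughly like the sketch below. Using a `transient` cluster setting is an assumption, since the post does not say how allocation was disabled. The last request is one way to check on the reindex, which runs as a background task because of `wait_for_completion=false`.

```console
# Merge the source index down to one segment per shard.
# Only sensible because no further writes to this index are expected.
POST /puma.compilation.pipeline.96f19f5b-bc84-4d4b-8694-b80a293e78e4-latest/_forcemerge?max_num_segments=1

# Disable all shard allocation cluster-wide (assumed here to be a transient setting).
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}

# List running reindex tasks, including their progress counters.
GET /_tasks?detailed=true&actions=*reindex
```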

viniciof · September 8, 2019, 2:02am

A disclaimer: because my reindex operations run for a long time, I would not recommend disabling allocation at the cluster level if new indices are being created in the cluster in the meantime (their shards cannot be allocated, which puts the cluster in a red state).
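
One possible way to soften that risk, sketched under assumptions the thread does not confirm: `cluster.routing.allocation.enable` also accepts `new_primaries`, which still allows primary shards of newly created indices to be allocated (their replicas stay unassigned, so those indices would be yellow rather than red). Whether this keeps the same reindex speed-up was not tested in this thread. The replica count of 1 in the last request is likewise an assumption; use whatever the index normally runs with.

```console
# Alternative to "none": only primaries of newly created indices may be allocated.
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "new_primaries"
  }
}

# After the reindex completes: reset allocation to its default ("all").
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": null
  }
}

# Restore the target index settings; null resets refresh_interval to its default.
# number_of_replicas: 1 is an assumed value, not taken from the thread.
PUT /puma.compilation.pipeline.96f19f5b-bc84-4d4b-8694-b80a293e78e4-optimized/_settings
{
  "index": {
    "refresh_interval": null,
    "number_of_replicas": 1
  }
}
```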

system · October 6, 2019, 2:02am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
