Reindexing throughput degrades over time

My reindex operation has slowed considerably over time. What could explain this decrease in throughput? Please see attached.

While reindexing is running, search latency has also gone up. See attached.

CPU utilization is fairly constant.

Other information:

  • Total number of documents to index is about 300 million
  • Index is configured with 15 shards
  • Total number of data nodes is 3
  • Refresh interval is set to 10s
  • Replica count is set to 2

Here is the Reindex request:

        ReindexRequest request = new ReindexRequest();
        request.setScript(new Script(ScriptType.INLINE, "painless", "ctx._source.sortId = ctx._id", Collections.emptyMap()));

I would recommend looking at disk I/O and iowait, as this could very well be the bottleneck. I believe reindexing retains the original document ID, which means each indexing operation is effectively an update (a read followed by a write), because Elasticsearch needs to check whether the document already exists. This is a lot more expensive than indexing a new document and tends to get slower as the destination index grows. If disk performance looks like the limiting factor, I would recommend temporarily setting the number of replicas to 0 for the destination index, as that will reduce disk I/O. Also check the logs for signs of long or frequent GC.
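
As a rough sketch of the replica suggestion, the snippet below builds the update index settings request (PUT /<index>/_settings) using only the JDK's HTTP classes. The host http://localhost:9200 and the index name dest-index are placeholders I made up, not values from this thread, and the request is only constructed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ReplicaSettings {

    // JSON body for the index settings update that changes the replica count.
    static String settingsBody(int replicas) {
        if (replicas < 0) {
            throw new IllegalArgumentException("replicas must be >= 0");
        }
        return "{\"index\":{\"number_of_replicas\":" + replicas + "}}";
    }

    // Builds (but does not send) PUT <baseUrl>/<index>/_settings.
    // baseUrl and index are supplied by the caller; the values used in main
    // are hypothetical placeholders.
    static HttpRequest replicaRequest(String baseUrl, String index, int replicas) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(replicas)))
                .build();
    }

    public static void main(String[] args) {
        // Assumed host and destination index name; replace with your own.
        HttpRequest dropReplicas = replicaRequest("http://localhost:9200", "dest-index", 0);
        System.out.println(dropReplicas.method() + " " + dropReplicas.uri());
        System.out.println(settingsBody(0));
        // To actually send it:
        // HttpClient.newHttpClient().send(dropReplicas, HttpResponse.BodyHandlers.ofString());
    }
}
```

Once the reindex has finished, the same helper with replicas set back to 2 would restore the original configuration. Expect some extra I/O at that point while the replica shards are rebuilt.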
