Reindex API - Need to improve performance


The goal is to speed up the reindexing of indexes from an ELK cluster (version 7.6.2) made up of 3 nodes, each with all the roles, into a datastream in an ELK cluster (version 7.11) installed in a container system with a total of 11 nodes: 2 coordinators, 3 masters, 3 data_hot nodes, and 3 data_warm nodes.

Some information about the indexes to be transferred: 189 indexes with 1 replica, a little bit more than 5 TB of data, 6 532 237 567 docs.

With the reindex function, I have already done several tests (changing the refresh interval, with / without replica, several reindexes in parallel to the same datastream, ...) and it works correctly but it takes time!
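For reference, here is a minimal sketch of the kind of request body involved, assuming the transfer uses reindex-from-remote (the post does not say so explicitly). The host URL, index name, datastream name, and batch size are hypothetical placeholders, not values from my setup. The one fixed point is that writing into a datastream requires `op_type` set to `create` on the destination.

```python
import json


def build_reindex_body(remote_host, source_index, dest_stream, batch_size=1000):
    """Build a body for POST _reindex that pulls from a remote cluster.

    All argument values are placeholders chosen for illustration.
    """
    return {
        "source": {
            # Remote cluster to read from; it must be allowed by the
            # reindex.remote.whitelist setting on the destination cluster.
            "remote": {"host": remote_host},
            "index": source_index,
            # Number of documents fetched per scroll batch.
            "size": batch_size,
        },
        "dest": {
            "index": dest_stream,
            # Datastreams are append-only, so op_type must be "create".
            "op_type": "create",
        },
    }


if __name__ == "__main__":
    body = build_reindex_body(
        "http://old-cluster.example:9200",  # hypothetical host
        "my-old-index",                     # hypothetical source index
        "my-datastream",                    # hypothetical datastream
        batch_size=5000,
    )
    print(json.dumps(body, indent=2))
```

The batch size (`source.size`) is one of the few knobs available here, so it is worth including in any further tests.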

Concretely, reindexing 1 index of 30 GB (34 731 806 docs) into a datastream without replicas and with a refresh_interval of 300s takes about 12h.
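To put that in perspective, here is a rough back-of-the-envelope calculation using only the figures from this post. It assumes every index reindexes at the same document rate as the one tested and that the indexes are processed one after another, which is a simplification.

```python
# Figures taken from the post.
test_docs = 34_731_806       # docs in the test index
test_hours = 12              # observed duration of the test reindex
total_docs = 6_532_237_567   # docs across all 189 indexes

# Observed throughput of the single test reindex.
docs_per_sec = test_docs / (test_hours * 3600)

# Naive extrapolation: same rate, indexes reindexed serially.
est_days = total_docs / docs_per_sec / 86_400

print(f"throughput: {docs_per_sec:.0f} docs/s")
print(f"estimated serial duration: {est_days:.0f} days")
```

That works out to roughly 800 docs/s, or about 94 days for the full data set at that rate, which is why I need a faster approach.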

I am looking for ideas to improve the performance of this transfer.

Thanks in advance, :wink:

Welcome to our community! :smiley:

If you've done all of that so far and are unable to improve speeds, can you increase the size of your cluster?


Yes, I can increase the size of the cluster if really necessary, but since it's production, it has to be justified... :wink:
In addition, I forgot to mention this, but I need to re-index into an existing datastream that already contains production data. So I have to be careful what I do.