Reindex API: parallel reindex requests, reindexing while still indexing to source indices


(Murilo Pereira) #1

We upgraded ES from 2 to 5 and want to do a full reindex so that our indices are upgradable to ES6.

Given clusters running on AWS D2 nodes, each with hundreds of 100 GB+ indices configured with 1 primary shard and 0 replicas, what would be an optimal reindexing strategy?

Nodes don't have a lot of extra disk space (~80% full).

What we're currently thinking of is reindexing and then deleting 1 index at a time with wait_for_completion=true, but initial tests show that this takes a long time: we're seeing an average throughput of 4.5 MB/s.
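To put that throughput in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a sustained 4.5 MB/s, an index of exactly 100 GB, and 1 GB = 1000 MB. The count of 200 indices is a hypothetical stand-in for "hundreds", not a measured figure.

```python
def reindex_hours(index_size_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy one index at a steady throughput (1 GB = 1000 MB)."""
    seconds = index_size_gb * 1000 / throughput_mb_per_s
    return seconds / 3600


def total_days(index_count: int, index_size_gb: float, throughput_mb_per_s: float) -> float:
    """Days needed to reindex all indices strictly one after another."""
    return index_count * reindex_hours(index_size_gb, throughput_mb_per_s) / 24


# 100 GB per index and 4.5 MB/s come from the post above.
hours_per_index = reindex_hours(100, 4.5)
print(f"{hours_per_index:.1f} hours per 100 GB index")  # prints "6.2 hours per 100 GB index"

# 200 is an assumed index count, used only for illustration.
print(f"{total_days(200, 100, 4.5):.0f} days for 200 indices, sequentially")
```

At roughly 6 hours per index, a strictly sequential pass over a few hundred indices would take on the order of weeks, which is why we are looking at parallelizing.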

Would it make sense to drop wait_for_completion=true and let the cluster parallelize reindex tasks? Would the cluster retry reindexes that failed due to a temporary lack of disk space?

Does it parallelize wait_for_completion=false reindex requests without specifying slicing?
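For concreteness, this is a minimal sketch of the two request shapes being compared, written in Python. It only builds the method, path, and JSON body and does not send anything. The index names `logs-old` and `logs-new` and the helper `build_reindex_request` are hypothetical, and the `slices` URL parameter assumes a 5.x release that supports it.

```python
import json
from urllib.parse import urlencode


def build_reindex_request(source, dest, slices=None, wait_for_completion=False):
    """Return (method, path, body) for a _reindex call without sending it."""
    # Elasticsearch expects lowercase "true"/"false" in the query string.
    params = {"wait_for_completion": str(wait_for_completion).lower()}
    if slices is not None:
        # Ask Elasticsearch to split the reindex into this many sub-tasks.
        params["slices"] = slices
    path = "/_reindex?" + urlencode(params)
    body = {"source": {"index": source}, "dest": {"index": dest}}
    return "POST", path, json.dumps(body)


# Hypothetical index names, for illustration only.
method, path, body = build_reindex_request("logs-old", "logs-new")
print(method, path)  # prints "POST /_reindex?wait_for_completion=false"

method, path, body = build_reindex_request("logs-old", "logs-new", slices=4)
print(method, path)  # prints "POST /_reindex?wait_for_completion=false&slices=4"
print(body)
```

With wait_for_completion=false the response carries a task ID instead of the final result, so progress would have to be polled through the tasks API (GET _tasks/<task_id>).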

What happens when reindexing from indices which are still being written to? Would the destination index only get documents that were in the source at the point in time when the reindex request was sent?


(Murilo Pereira) #2

Would you expect a reindex on a single-node cluster to have an average throughput of only 4 MB/s, given these node I/O metrics?

$ mount | grep elasticsearch
/dev/mapper/ephemeral-elasticsearch on /mnt/elasticsearch type xfs (rw)
$ df -h | grep elasticsearch
/dev/mapper/ephemeral-elasticsearch                    5.4T  4.8T  678G  88% /mnt/elasticsearch
$ sudo hdparm -Tt /dev/mapper/ephemeral-elasticsearch 
/dev/mapper/ephemeral-elasticsearch:
 Timing cached reads:   18780 MB in  2.00 seconds = 9400.56 MB/sec
 Timing buffered disk reads: 920 MB in  3.01 seconds = 306.07 MB/sec
$ sudo dd if=/dev/zero of=/mnt/elasticsearch/output bs=8k count=100k; sudo rm -f /mnt/elasticsearch/output
102400+0 records in
102400+0 records out
838860800 bytes (839 MB) copied, 2.17104 s, 386 MB/s

(Murilo Pereira) #3

After some experiments, the answer to this seems to be yes.


(Murilo Pereira) #4

I hate to bump this, but could someone at least opine on whether this throughput rate is normal? Do you need more information?


(Lee Hinman) #5

@nik9000 this sounds like something you might be able to help with?

