Reindex ES 2.3 is taking forever


(Veerendra Kumar Balla) #1

Hi,

We are running Elasticsearch on Azure with DS4 machines for the data nodes (3 data nodes): 28 GB of memory and 8 TB of premium HDDs (5000 IOPS per disk). We have around 10 indexes, and each index holds its own type of document. Across all indexes we have 1.4 TB of data. We recently had to change the index mapping settings, which required reindexing. To minimize the impact on users, we created new indexes and are reindexing from the old ones into the new ones. We started the reindex 5 days ago and it is still nowhere close to completion. We cannot even tell when it will finish. We have tried all the best practices suggested by the community:

We removed all replicas from the indexes.
We set the refresh interval to -1.
We increased the heap memory available for indexing.

There is still no improvement at all. We need to complete this as soon as possible, and we cannot continue like this without a timeline. Any pointers would be appreciated.


(Nik Everett) #2

Fetch the status of the reindex with the tasks API. It should have a total and a few fields like created and updated. Dividing the documents processed so far by the total gives you the fraction complete, and from that and the elapsed time you can estimate the completion time.
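To make the division concrete, here is a minimal Python sketch. It assumes you have already copied the status object out of the tasks API response and that you know how long the reindex has been running. The function name, the sample numbers, and the choice to count created, updated, deleted, noops, and version_conflicts as "processed" are illustrative assumptions, not something the API prescribes. It also assumes the indexing rate stays roughly constant.

```python
def estimate_remaining_seconds(status, elapsed_seconds):
    """Estimate seconds left for a reindex task, assuming a constant rate.

    `status` is the status object copied from the tasks API response.
    Returns None when no progress has been recorded yet.
    """
    total = status["total"]
    # Assumption: every document counted in these fields has been handled.
    processed_fields = ("created", "updated", "deleted", "noops", "version_conflicts")
    done = sum(status.get(field, 0) for field in processed_fields)
    if done <= 0 or elapsed_seconds <= 0:
        return None
    remaining_docs = max(total - done, 0)
    # remaining time = remaining docs / (docs done per second)
    return remaining_docs * elapsed_seconds / done


# Hypothetical numbers for illustration only.
sample_status = {"total": 1000000, "created": 250000, "updated": 0, "deleted": 0}
remaining = estimate_remaining_seconds(sample_status, elapsed_seconds=3600)
print(f"Estimated time remaining: {remaining / 3600:.1f} hours")
```

If the rate changes a lot over time (for example because segment merges kick in), taking two status samples a few minutes apart and using the difference between them gives a better rate than the average since the start.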

Reindex's default batch size is very, very small in 2.3. It is fixed in 2.4. You can bump the batch size without upgrading like this:

curl -XPOST localhost:9200/_reindex -d'{
  "source": {
    "index": "foo",
    "size": 1000
  },
  "dest": {
    "index": "bar"
  }
}'

I suggest seeing what kind of performance you get if you bump the size up and doing the math to figure out if it makes sense to just let what you have running finish or to start over.
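The math for that decision can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the function name and the numbers are made up, the rates are whatever you measure yourself (documents per second for the running reindex and for a trial run with the larger batch size), and it assumes a restart has to copy every document again from the beginning.

```python
def should_restart(total_docs, docs_done, current_rate, new_rate):
    """Return True if restarting at new_rate would finish sooner than
    letting the current reindex continue at current_rate.

    Rates are in documents per second and must be positive.
    Assumption: a restart reprocesses all total_docs from scratch.
    """
    time_to_continue = (total_docs - docs_done) / current_rate
    time_to_restart = total_docs / new_rate
    return time_to_restart < time_to_continue


# Hypothetical numbers: 1M documents, 200k already copied,
# 50 docs/s with the running job, 500 docs/s measured with size 1000.
print(should_restart(1000000, 200000, current_rate=50, new_rate=500))
```

With these made-up numbers, continuing would take 16000 seconds and restarting 2000 seconds, so restarting wins. When the running job is already close to done, the comparison tips the other way.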


(Veerendra Kumar Balla) #3

Hi,

Thanks for the quick help, Nik. I typed the version wrong; we are actually using ES 2.4. Even so, increasing the batch size improved indexing speed. We also scaled up our servers, thanks to Microsoft Azure for making that so easy.

