Reindex all documents merge configuration

Hey,

We need to support reindexing all the documents in an index.
We ran scan/index over an existing Elasticsearch index of about
1 million documents, and the index appears to double in size.
That is a concern because we could run out of disk space (production has over
100 million documents).
The reindex ran for about 1-1.5 hours with 1 thread.
Once indexing was done, the merge policy kicked in and cleaned up the old
segments. Is there a way to make the merge policy kick in more often?

Thanks,
Itay

--

Hello Itay,

I'd try increasing the thread count on ConcurrentMergeScheduler:
http://www.elasticsearch.org/guide/reference/index-modules/merge.html
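
As a rough sketch, the thread count could be expressed as an index settings
body like the one built below. The setting name
index.merge.scheduler.max_thread_count is taken from the merge docs linked
above; the helper name merge_scheduler_settings is hypothetical, and whether
the value can be applied to a live index or has to go into elasticsearch.yml
depends on the version, so check that before relying on it.

```python
import json


def merge_scheduler_settings(max_thread_count: int) -> str:
    """Build a JSON settings body for the merge scheduler thread count.

    Hypothetical helper: it only builds the body and sends nothing.
    """
    if max_thread_count < 1:
        raise ValueError("max_thread_count must be at least 1")
    # Setting name as documented for ConcurrentMergeScheduler.
    body = {"index.merge.scheduler.max_thread_count": max_thread_count}
    return json.dumps(body)


if __name__ == "__main__":
    print(merge_scheduler_settings(4))
```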

If that doesn't help, I think you'd have to either:

  • throttle reindexing, to give merging more time
  • add more disks
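
Throttling could look roughly like the sketch below: index in small batches
and pause after each one so merges get a chance to catch up. This assumes you
already have something that sends a batch to the cluster (for example a bulk
request); index_batch, batch_size and pause_seconds are hypothetical names,
and the right values would have to be found by watching disk usage.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def throttled_reindex(
    docs: Iterable[dict],
    index_batch: Callable[[List[dict]], None],
    batch_size: int = 500,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Index docs batch by batch, pausing after every batch.

    index_batch is supplied by the caller (assumption: it wraps a bulk
    index request). Returns the number of documents handed to it.
    """
    total = 0
    for batch in batched(docs, batch_size):
        index_batch(batch)
        total += len(batch)
        # Give background merging time to reclaim old segments.
        sleep(pause_seconds)
    return total
```

A longer pause or smaller batch slows the reindex down but leaves more room
for merges between writes.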

Best regards,
Radu

http://sematext.com/ -- ElasticSearch -- Solr -- Lucene

--

Thanks for the advice, we will try it out.

-Itay

--