Will multi-threaded indexing lead to a smaller index?

Shellbye_Bai · April 11, 2017, 12:05pm

I am indexing about 170,000 (17w) sentences using the bulk API. When I indexed with a single thread, the index ended up at about 150 MB, but when I tried multiple threads, it came to 110 MB. Is this possible? I checked with the count API, and the total number of documents is the same.

Is there any way I can find the difference between these two indices?

Christian_Dahlqvist · April 11, 2017, 12:10pm

The size of the index can vary depending on how segments have or have not merged. If you are indexing at a higher speed using multiple threads it is possible that the initial segments will be larger and therefore merge differently.
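
One way to see the difference is to compare the segment layout of the two indices with the `_cat/segments` API. Below is a minimal sketch, not a definitive tool: it assumes the API is called with `format=json&bytes=b` so that sizes come back as plain byte counts, and the host and index names in the usage note are hypothetical placeholders.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def summarize_segments(rows):
    """Aggregate _cat/segments rows (a list of dicts) into per-index totals."""
    summary = defaultdict(
        lambda: {"segments": 0, "docs": 0, "deleted": 0, "bytes": 0}
    )
    for row in rows:
        # Count primary shards only, so replicas do not inflate the totals.
        if row.get("prirep") != "p":
            continue
        stats = summary[row["index"]]
        stats["segments"] += 1
        stats["docs"] += int(row["docs.count"])
        stats["deleted"] += int(row["docs.deleted"])
        stats["bytes"] += int(row["size"])
    return dict(summary)


def fetch_segments(base_url, indices):
    """Fetch segment rows for a comma-separated list of index names.

    Assumption: bytes=b makes the "size" column a plain number of bytes.
    """
    url = f"{base_url}/_cat/segments/{indices}?format=json&bytes=b"
    with urlopen(url) as response:
        return json.load(response)
```

For example, `summarize_segments(fetch_segments("http://localhost:9200", "sentences_single,sentences_multi"))` would return one entry per index with its segment count, live and deleted document counts, and total size. If the document counts match but the segment counts differ, the size gap most likely comes from merging. Running `_forcemerge` with `max_num_segments=1` on both indices and comparing again should bring the sizes much closer together.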

Shellbye_Bai · April 11, 2017, 12:16pm

Thanks for your reply.

