I wrote an application that does the reindexing. It reads from one index with scan and scroll and sends the documents to another index. I observed two things on the new index:
I have exactly the same number of documents as the base index
The index is 50% larger than the base index (I have the same number of shards and replicas)
Do you have any idea why the index is larger while having the same number of documents, and is this a problem?
EDIT: I looked at the segments settings; the only difference is here
I have tried reindexing many indexes. I got the same document count every time, and nearly the same size for the other indexes. But I still get the difference for one of my indexes (50% size difference). What else could be causing it?
Are you adding the content to that index for the first time, or are you rewriting the same documents? What is the delete percentage? If the delete percentage differs, try an optimize (expunge deletes) and compare the sizes again. They won't be exactly the same, but the difference shouldn't be huge either.
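To make the comparison concrete, here is a rough Python sketch of how the delete percentage and size ratio could be computed from the index stats of the two indexes. It assumes the stats have already been fetched (for example from the `_stats` endpoint) and parsed into dicts with the usual `indices -> <name> -> primaries -> docs/store` layout. The index names `base_index` and `new_index`, the helper names, and all numbers in the sample data are made up for illustration, and the delete percentage is defined here as deleted / (live + deleted), which is my assumption about what is meant.

```python
def summarize_index(stats, index_name):
    """Extract doc count, deleted docs and store size for the primaries of one index."""
    primaries = stats["indices"][index_name]["primaries"]
    count = primaries["docs"]["count"]
    deleted = primaries["docs"]["deleted"]
    size = primaries["store"]["size_in_bytes"]
    total = count + deleted
    # Assumed definition: share of deleted docs among all docs still held in segments.
    deleted_pct = 100.0 * deleted / total if total else 0.0
    return {
        "count": count,
        "deleted": deleted,
        "deleted_pct": deleted_pct,
        "size_in_bytes": size,
    }


def compare_indexes(base, new):
    """Compare two summaries produced by summarize_index."""
    return {
        "same_doc_count": base["count"] == new["count"],
        "size_ratio": new["size_in_bytes"] / base["size_in_bytes"],
        "deleted_pct_diff": new["deleted_pct"] - base["deleted_pct"],
    }


# Hypothetical sample data, shaped like a parsed stats response.
sample_stats = {
    "indices": {
        "base_index": {
            "primaries": {
                "docs": {"count": 1000, "deleted": 0},
                "store": {"size_in_bytes": 2000000},
            }
        },
        "new_index": {
            "primaries": {
                "docs": {"count": 1000, "deleted": 500},
                "store": {"size_in_bytes": 3000000},
            }
        },
    }
}

base_summary = summarize_index(sample_stats, "base_index")
new_summary = summarize_index(sample_stats, "new_index")
report = compare_indexes(base_summary, new_summary)
print(report)
```

If the document counts match but the delete percentage is clearly higher on the larger index, deleted documents that have not been merged away yet are a likely explanation, and the size gap should shrink after expunging deletes.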