Hi,
I'm struggling with an index in which some fields are updated frequently and documents are regularly deleted. I want to avoid ballooning the index for obvious reasons. I tried freezing the index and then reindexing it, but it seems you can't reindex a frozen (read-only) index, which is weird.
So I need a solution that lets me get rid of the unused documents while the index stays continuously available for queries. Thankfully, the index isn't huge; it's less than 1 GB.
I don't have much experience with this type of index. I usually go with daily or monthly ones. As I remember it, you can't really delete from an index, and updates create new versions of the documents. Also, that you can't force merge active indices. I may have overthought this, but I have to be sure it won't be an issue in production. @warkolm could you point me to documentation or a blog post that explains automatic merging, so I know what to expect? When I googled it, I mostly found the force merge docs and GitHub issues about automatic merging.
Thank you!
My inexperience. As I mentioned, I don't have any experience with continuously updated indices, and I always assume the worst. For example, if every doc in the index is updated X times (4-10) a day, the index will double in size very quickly and performance will drop.
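One way to check whether that worry actually materializes is to watch how many deleted (superseded) documents the index is carrying compared to live ones. Below is a minimal sketch, assuming the response of the index stats API (`GET /<index>/_stats/docs`) has already been fetched and parsed into a dict; the helper name `deleted_ratio` and the sample numbers are made up for illustration.

```python
def deleted_ratio(stats: dict) -> float:
    """Return the share of deleted docs among all docs held in primary shards.

    `stats` is assumed to be the parsed JSON body of an index stats call,
    which reports `docs.count` (live) and `docs.deleted` (not yet merged away).
    """
    docs = stats["_all"]["primaries"]["docs"]
    live = docs["count"]
    deleted = docs["deleted"]
    total = live + deleted
    # An empty index has no deleted docs to worry about.
    return deleted / total if total else 0.0


# Hypothetical numbers, not taken from a real cluster.
sample = {"_all": {"primaries": {"docs": {"count": 1000, "deleted": 250}}}}
print(f"deleted share: {deleted_ratio(sample):.0%}")
```

If the ratio stays roughly flat over a few days of updates, background merging is keeping up; if it keeps climbing, that's the point to dig further.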