I read in the Elastic documentation that the warm phase can reduce index size,
but it doesn't state clearly how much an index will be compressed in the warm phase.
It says the warm phase also has shrink and force merge actions. Should I use them to get my indices "compressed"? My indices already have 1 primary shard in the hot phase, so I don't think shrinking them after rollover would be effective.
So can I conclude that my index won't be compressed unless I force merge it in the warm phase?
As for shrink, I don't think I can shrink my index because each index already has only 1 shard.
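For readers following along, here is a rough sketch of what an ILM policy with a warm-phase force merge and no shrink could look like, written as a Python dict that can be sent as the body of `PUT _ilm/policy/<name>`. The function name, the 50gb rollover threshold, and the `0d` warm `min_age` are illustrative assumptions, not values from this thread. `max_primary_shard_size` and the optional `index_codec: best_compression` setting on the `forcemerge` action are only available in more recent Elasticsearch versions, so check the documentation for the version you run.

```python
import json


def build_ilm_policy(max_shard_size="50gb", warm_min_age="0d", best_compression=False):
    """Build an ILM policy body with rollover in hot and force merge in warm.

    All default values are illustrative assumptions, not recommendations.
    """
    forcemerge = {"max_num_segments": 1}
    if best_compression:
        # Assumption: the cluster version supports index_codec on forcemerge.
        forcemerge["index_codec"] = "best_compression"

    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {"max_primary_shard_size": max_shard_size}
                    }
                },
                "warm": {
                    "min_age": warm_min_age,
                    # No shrink action: the index already has 1 primary shard.
                    "actions": {"forcemerge": forcemerge},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(best_compression=True), indent=2))
```

Leaving out `shrink` matches the situation above, where each index already has a single primary shard.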
Sorry for the follow-up question; I don't know whether I should start a new thread or ask here.
I just tried a force merge on a 3GB index, and its size increased to around 8GB during the process.
And from the documentation I've read:
Force merge should only be called against an index after you have finished writing to it. Force merge can cause very large (>5GB) segments to be produced
Does that mean that if I force merge an index of about 500GB, the resulting segment could be 100GB or more?
I checked the index stats API frequently. The size that grew during the force merge has since returned to normal (about 100MB smaller than the original size), and the segment count dropped from 37 to 1.
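A small sketch of the kind of before/after check described above, assuming the response of `GET <index>/_stats` has already been parsed into a Python dict. The helper name and the sample dicts are hypothetical; the sample numbers only approximate the figures mentioned in this post (3GB, about 100MB smaller afterwards, 37 segments down to 1).

```python
def summarize_primaries(stats):
    """Pull segment count and primary store size (GiB) out of an index stats response."""
    primaries = stats["_all"]["primaries"]
    return {
        "segments": primaries["segments"]["count"],
        "size_gb": round(primaries["store"]["size_in_bytes"] / 1024**3, 2),
    }


# Hypothetical, trimmed-down stats responses for illustration only.
before = {"_all": {"primaries": {
    "segments": {"count": 37},
    "store": {"size_in_bytes": 3 * 1024**3},
}}}
after = {"_all": {"primaries": {
    "segments": {"count": 1},
    "store": {"size_in_bytes": 3 * 1024**3 - 100 * 1024**2},
}}}

if __name__ == "__main__":
    print("before:", summarize_primaries(before))
    print("after: ", summarize_primaries(after))
```

Polling this while the merge is still running would be expected to show the temporary growth, since the old segments stay on disk until the merged segment is complete.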
Yes I have; it has 3 primary shards with no replicas.
I am trying to set up a normal ILM cycle. That's why I asked for details on force merge: I want a full ILM cycle that includes the force merge action.
I'm afraid that once ILM is in place, force merging a 500GB index could badly degrade cluster performance.
First of all, if you had a 500 GB index you would hopefully have at least 10 shards. And only one force merge happens at a time unless you purposely tell it otherwise.
Of course, my other question would be why you want 500 GB indices. You can, but why not make 150 GB indices:
3 shards at 50 GB apiece.
We have hundreds of customers that implement force merge on very large clusters and very large data sets under very large volumes. It's all about properly configuring your cluster and your data, all of it together.
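The sizing arithmetic in the reply above (a 500 GB index at roughly 50 GB per shard gives 10 shards; 3 shards at 50 GB gives a 150 GB index) can be sketched as a small helper. The function name is made up for illustration, and the 50 GB target is simply the figure used in this thread, not a hard rule.

```python
import math


def primary_shards_needed(index_size_gb, target_shard_gb=50):
    """Return how many primary shards keep each shard at or below the target size."""
    if index_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    for size in (150, 500):
        print(f"{size} GB index -> {primary_shards_needed(size)} primary shards")
```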
I will keep this in mind and try to tune the shard size next time.
So 2 force merge tasks can't run at the same time, right? I ask because force merge should happen automatically on each index once its warm phase conditions are met.
By default there is only one force merge thread on any one node at any one time.
If you are very advanced you can change that setting.
If you want to know the exact behavior you would need to look at our code; I don't know it at that level. I do know there is a single thread. Whether it spreads that thread across more than one shard at a time I'm not certain, but I think it works its way through one shard at a time.
Actually, it is a relief that the code handles everything like that, so I can go ahead and set up force merge in my ILM policy now.
So the only thing I still need to sort out is the index sizing.
Again, thanks for explaining everything clearly, hope you have a nice day!