Forcemerge multiple indices = high fragmentation on disk?

I'm applying an ILM policy to older (daily) indices. In the warm phase I have enabled forcemerge to reduce the number of segments.
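For reference, a policy of that shape might look roughly like the sketch below. It only builds the request body that would be sent to `PUT _ilm/policy/<name>`; the `min_age` of `7d` and the `max_num_segments` of `1` are assumptions for illustration, not the values actually in use here.

```python
import json


def build_warm_forcemerge_policy(min_age="7d", max_num_segments=1):
    """Return an ILM policy body whose warm phase force-merges the index.

    min_age and max_num_segments are illustrative defaults (assumptions).
    """
    return {
        "policy": {
            "phases": {
                "warm": {
                    "min_age": min_age,
                    "actions": {
                        # Merge each shard down to at most this many segments.
                        "forcemerge": {"max_num_segments": max_num_segments}
                    },
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body; sending it to the cluster is left out on purpose.
    print(json.dumps(build_warm_forcemerge_policy(), indent=2))
```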

If I assign the ILM policy to multiple indices, say 30 days' worth, will the forcemerge run on all the indices at the same time, or does Elastic handle them one at a time?

The reason I'm asking is that if Elastic ran all the force merges at the same time, I figure this would result in high fragmentation of the files on disk.

(ELK 7.4.2 on Windows 2012R2)

From the Force merge API page in the Elasticsearch Guide [7.11]:

Multi-index operations are executed one shard at a time per node

Ah yes, thanks! :+1:

I did read that page, but apparently overlooked the multi-index operations bit.

That's not quite the full story:

  • if you run two separate force-merges then they might run in parallel; in particular ILM-requested force merges are all individual requests
  • in future even a single force-merge request will be parallelised
  • ... but only if you increase the number of force-merge threads from the default of 1.
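The effect of the thread count can be illustrated with a toy model. This is a sketch of a generic fixed-size worker pool, not Elasticsearch's actual scheduling code; treating the force-merge threads as such a pool is an assumption, and `peak_concurrency` and `fake_merge` are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(num_requests, pool_size, work_seconds=0.05):
    """Submit independent tasks to a fixed-size pool; return the peak number running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def fake_merge():
        # Stand-in for one force-merge request; it only sleeps.
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(work_seconds)
        with lock:
            running -= 1

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(fake_merge) for _ in range(num_requests)]
        for future in futures:
            future.result()
    return peak


if __name__ == "__main__":
    print("pool of 1:", peak_concurrency(5, pool_size=1))
    print("pool of 3:", peak_concurrency(6, pool_size=3))
```

With a single worker the separate requests can only run one after another, however many are submitted; with more workers they may overlap, up to the pool size.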

All this is kind of irrelevant, however, because force merges will be interleaved with other write operations (including automatic merges) so there's definitely no guarantee that serial force-merges imply no fragmentation. It's even more irrelevant because you will likely only run into actual problems related to fragmentation if your disks are nearly full and/or you're using some ancient filesystem that doesn't control fragmentation properly.

Hmm, is Windows 2012R2 ancient? :sweat_smile:

2 TB of data on 4 TB spinning disks with fragmentation at 92%. The weekly defrag task didn't work that well, I guess... After all the force-merges were finished I let the customer run a manual defrag, so it's no longer a problem.

It's >7 years old and had mainstream support withdrawn 2½ years ago, so it's certainly no spring chicken...

Was it a problem before? How did that problem manifest? I get that fragmentation was reported, but I don't understand how that actually affected anything important.

Like the OS, the server and its disks are no spring chickens either :smiley:
Fragmentation can have a negative performance impact on spinning disks. So to keep performance as good as possible, less fragmentation is better.

With newer servers with more memory and SSD storage this is no longer an issue...

Yes in theory it can have a performance impact - my question is whether it really did. I would expect the difference to be lost in the noise in most cases. Can you quantify the improvement that you observed via manual defragmentation in terms of its effect on actual performance? If so, that's unexpected and interesting to me. If not, why worry about it?

I did not do any performance tests before and after, so unfortunately I cannot tell you if there is any noticeable difference. Maybe I should have done so, but at the time I wrote my question, I had already started with the force-merge actions and de-fragmentation.
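A before/after comparison of that kind could be as simple as timing the same representative operation several times and comparing medians. A minimal sketch, assuming a hypothetical `workload` callable that stands in for whatever search or read is considered typical (nothing like this was run for this thread):

```python
import statistics
import time


def median_latency(workload, repeats=20):
    """Run workload() repeatedly and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def relative_change(before, after):
    """Fractional change from before to after; negative means it got faster."""
    return (after - before) / before


if __name__ == "__main__":
    # Placeholder workload (an assumption); replace with a real query or file read.
    def workload():
        sum(range(100_000))

    before = median_latency(workload)
    # ... defragment, then measure again under the same conditions ...
    after = median_latency(workload)
    print(f"relative change: {relative_change(before, after):+.1%}")
```

Using the median rather than the mean keeps a single slow outlier, such as a cold cache on the first run, from dominating the result.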

This question was mainly based on past experience with highly fragmented drives, where Windows and some disk-intensive applications would benefit from low file fragmentation (at that time disks were a lot slower than currently available drives, though).
