Cluster upgrade from 5.4 -> 5.6 doubled disk usage

I was asked to start with a discussion topic here rather than a GitHub issue.

Original link: https://github.com/elastic/elasticsearch/issues/50323

The problem, as described in the issue, is that after following the standard cluster upgrade guide to go from 5.4 to 5.6, disk usage on each node doubled. That disk usage has not dropped even after the entire cluster finished the upgrade.

Some of the indices were originally created under version 5.3, some under version 5.4.

The question is: how can I force Elasticsearch to clean up this disk usage? I assume there wasn't some atrocious change in the underlying data format that ballooned disk usage in 5.6.

I also don't know very much about the ES data structure, so I don't know how to tell which shards are "leftovers".

Are you also the poster of this similar-looking (but more detailed) question on Stack Overflow?

As mentioned there, I recommend using the indices stats API to work out the details of what's going on.
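For example, something like this (a rough sketch in Python with `requests`, assuming the cluster answers on localhost:9200 without security enabled; adjust the host to your setup) lists the store and translog size of every index, largest store first:

```python
# Sketch: list store and translog size per index, sorted by store size.
# Assumes a locally reachable cluster on port 9200 with no authentication.
import requests

resp = requests.get("http://localhost:9200/_stats/store,translog")
resp.raise_for_status()
indices = resp.json()["indices"]

rows = []
for name, data in indices.items():
    total = data["total"]  # primaries + replicas for this index
    rows.append((name,
                 total["store"]["size_in_bytes"],
                 total["translog"]["size_in_bytes"]))

for name, store, translog in sorted(rows, key=lambda r: r[1], reverse=True):
    print("%-40s store=%14d  translog=%12d" % (name, store, translog))
```

That should at least show which indices are the big consumers and whether any of them look out of line.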

Yes, that's correct.

I appreciate the advice not to manually delete the generated directories.

If you could clarify exactly what I'm looking for in the index stats, that would help. I have a lot of indices, there are a lot of stats, and I don't really know what I should be looking for.

And in case this needs more clarification: the cluster is green, disk usage doubled after the upgrade and has not gone down, and it has been days since the cluster was fully upgraded and green.

The two main components of each shard's disk usage are the store and the translog, whose sizes are reported in the indices stats. I would check that these numbers seem reasonable to you and, if possible, compare them to any stats you might have from before the upgrade to help pinpoint exactly what has got larger. I would expect the translog of indices that haven't recently seen write activity to be pretty tiny, and that the total of all this disk usage corresponds closely with the amount of space actually consumed on disk.
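To check that last point, something along these lines (same assumptions as before: localhost:9200, no auth) pulls the cluster-wide totals from the same stats call, which you can then compare against `du` on each node's data path:

```python
# Sketch: cluster-wide store and translog totals, to compare with what the
# disks actually report as used. Assumes localhost:9200, no authentication.
import requests

stats = requests.get("http://localhost:9200/_stats/store,translog").json()
total = stats["_all"]["total"]

gib = 1024 ** 3
print("store total:    %.1f GiB" % (total["store"]["size_in_bytes"] / gib))
print("translog total: %.1f GiB" % (total["translog"]["size_in_bytes"] / gib))
# If this sum is roughly half of what the disks show as used, the extra space
# is not being attributed to the shards Elasticsearch knows about; if it
# matches the disk usage, then the shard data itself really has grown.
```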

Sizes vary across indices from a few MB to hundreds of GB, so unfortunately I don't really have a benchmark for what a "reasonable" store size would be.

Translog values are low.

I guess maybe there's an underlying miscommunication about this: I'm not really trying to prove to myself that disk usage doubled after the upgrade. I know that's what happened; I saw it happen on eight nodes. Assuming this is abnormal behavior, how do I fix it? How do I get Elasticsearch to clean up the old data? If this is not abnormal behavior, that means Elasticsearch ballooned the index size to double using the 5.6 data format, which seems really unlikely.

Another way to put it: I can spend time searching through index stats for hundreds of indices and looking at their store and translog values, but why? What's the endgame here?

Yes, this behaviour sounds abnormal to me. Unfortunately without knowing any detail about what it is that's consuming the extra disk space it isn't really possible to offer any advice about what action you should take. Total disk usage is a very coarse measure. Is each shard twice as large as before or are they all the same size? Maybe they're the same size but there's twice as many of them? Maybe there's something that only affects a small subset of the shards in a very severe way? Maybe Elasticsearch thinks it's using the same disk usage as before and the extra disk usage is something else entirely?
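A couple of quick checks along those lines, again as a sketch under the same assumptions (localhost:9200, no auth), using the standard `_cat/shards` and `_cat/allocation` endpoints:

```python
# Sketch: (1) per-shard sizes and counts, (2) per-node comparison of the disk
# usage Elasticsearch attributes to its indices vs. total used disk.
# Assumes localhost:9200, no authentication; adjust to your setup.
import requests

BASE = "http://localhost:9200"

# 1. Has the number of shards doubled, the size of each shard, or only a
#    subset of them? Print the 20 largest shards and the total shard count.
shards = requests.get(
    BASE + "/_cat/shards",
    params={"format": "json", "bytes": "b",
            "h": "index,shard,prirep,store,node"},
).json()
print("total shards:", len(shards))
for s in sorted(shards, key=lambda s: int(s["store"] or 0), reverse=True)[:20]:
    print(s["index"], s["shard"], s["prirep"], s["store"], s["node"])

# 2. Does the space Elasticsearch attributes to indices (disk.indices)
#    account for the overall used space (disk.used) on each node, or is the
#    growth outside Elasticsearch's own accounting?
alloc = requests.get(
    BASE + "/_cat/allocation",
    params={"format": "json", "bytes": "b",
            "h": "node,shards,disk.indices,disk.used,disk.avail"},
).json()
for row in alloc:
    print(row["node"], row["shards"], row["disk.indices"],
          row["disk.used"], row["disk.avail"])
```

The answers to those questions would narrow down where the extra space actually lives and therefore what, if anything, is safe to clean up.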
