Higher disk usage after upgrade to ES 2.4.3


We are running a 9-machine Elasticsearch cluster at the company I work at. This cluster is running Elasticsearch 1.7.3. We have two indices in this cluster, one using 4.3GB and one using 312GB of disk.

Last week we started a task force to upgrade our cluster to Elasticsearch 2.4.3. We deployed a new cluster running the new version and configured a Logstash instance to consume from our S3 backup and index the documents into the new cluster. On the new cluster, the larger index is using 950GB and the smaller one is using 6.9GB. The document counts of both indices remain very similar: they differ by fewer than 100,000 documents on a 10-billion-document index.
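To put numbers on the growth, here is a quick back-of-the-envelope calculation that uses only the figures quoted above. The variable and function names are illustrative, nothing more.

```python
# Sizes in GB as reported above: (old 1.7.3 cluster, new 2.4.3 cluster).
sizes_gb = {
    "larger index": (312.0, 950.0),
    "smaller index": (4.3, 6.9),
}


def growth_factor(old_gb, new_gb):
    """Return how many times larger the new index is than the old one."""
    return new_gb / old_gb


for name, (old_gb, new_gb) in sizes_gb.items():
    print(f"{name}: {growth_factor(old_gb, new_gb):.2f}x")

# Upper bound on the document-count difference, relative to ~10 billion docs.
doc_diff_fraction = 100_000 / 10_000_000_000
print(f"document count difference: under {doc_diff_fraction:.3%}")
```

So the larger index roughly tripled in size while the document count changed by a tiny fraction of a percent, which is why the extra documents alone cannot explain the difference.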

The only change we've made to the index mappings was switching from "index_analyzer" to "analyzer" in some fields, because "index_analyzer" is no longer a valid setting.

Does this increase in disk usage make any sense? Do newer versions of ES consume more disk than before?

Thanks in advance.


Probably a stupid question: have you diffed your mappings? Are all the differences as expected? No _all field accidentally enabled, etc.?

Are you using doc values? Are the stored values very sparse?
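In case it helps, the mapping diff suggested above could be sketched roughly like this. It assumes the two mappings have already been fetched (for example from each cluster's `GET /<index>/_mapping` endpoint) and loaded as Python dicts. The sample mappings, the field name `title`, and the helpers `flatten` and `diff_mappings` are all hypothetical.

```python
def flatten(mapping, prefix=""):
    """Flatten a nested mapping dict into {"dotted.path": value} pairs."""
    flat = {}
    for key, value in mapping.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_mappings(old, new):
    """Return {path: (old_value, new_value)} for every setting that differs."""
    old_flat, new_flat = flatten(old), flatten(new)
    return {
        path: (old_flat.get(path), new_flat.get(path))
        for path in sorted(set(old_flat) | set(new_flat))
        if old_flat.get(path) != new_flat.get(path)
    }


# Hypothetical sample mappings, standing in for the real ones.
old_mapping = {
    "properties": {"title": {"type": "string", "index_analyzer": "standard"}}
}
new_mapping = {
    "properties": {"title": {"type": "string", "analyzer": "standard"}}
}

for path, (before, after) in diff_mappings(old_mapping, new_mapping).items():
    print(f"{path}: {before!r} -> {after!r}")
```

Keep in mind that a diff like this only covers settings that appear in the mapping output. A default that changed between versions may not show up at all.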


Hi, thank you for your answer.

You were right about doc_values: they weren't enabled by default in the version I was using.
The disk consumption is back to what it was before.
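For anyone landing here later, this is a minimal sketch of what turning doc values off for a field can look like in a 2.x index-creation body. It assumes the fix was to disable them on fields that are never sorted or aggregated on; the thread does not say which fields were actually changed. The type name `doc` and the field name `status` are hypothetical.

```python
import json

# Hypothetical index-creation body for Elasticsearch 2.x.
# "doc" (type name) and "status" (field name) are placeholders.
create_index_body = {
    "mappings": {
        "doc": {
            "properties": {
                "status": {
                    "type": "string",
                    "index": "not_analyzed",
                    # In 2.x, doc values are on by default for this kind of
                    # field; this turns them off explicitly.
                    "doc_values": False,
                }
            }
        }
    }
}

# Print the JSON that would be sent when creating the index.
print(json.dumps(create_index_body, indent=2))
```

The trade-off is that sorting or aggregating on a field without doc values falls back to in-memory fielddata, so this only makes sense for fields where that is acceptable.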

Thank you very much!
