Disk Usage Elasticsearch 5.6 compared to Elasticsearch 7.9

We are currently moving from Elasticsearch 5.6.4 to Elasticsearch 7.9.1 and are seeing some concerning differences in disk usage and CPU usage. We suspect the higher CPU usage may be a symptom of the larger indices.

The same data with largely the same mappings and analysis settings is in both clusters. Same number of shards and replicas. An index with 81.6k docs in ES7 is 15.6GB while an index in ES5 with the same docs in it is 869MB. An ~18x increase in disk usage doesn't seem like something we would expect.
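
For reference, the ratio quoted above can be checked with a couple of lines of Python. This is only a sanity check on the two sizes in this post; it assumes binary units (1 GB = 1024 MB), and the helper name is made up for illustration.

```python
# Sanity check of the size ratio quoted in the post.
# Assumption: binary units (1 GB = 1024 MB); growth_factor is an illustrative helper.
MB = 1024 ** 2
GB = 1024 ** 3


def growth_factor(new_bytes: float, old_bytes: float) -> float:
    """Return how many times larger new_bytes is than old_bytes."""
    return new_bytes / old_bytes


es7_bytes = 15.6 * GB  # ES7 index size from the post
es5_bytes = 869 * MB   # ES5 index size from the post

ratio = growth_factor(es7_bytes, es5_bytes)
print(f"ES7 index is about {ratio:.1f}x the size of the ES5 index")
```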

Looking through other disk usage threads I've checked a few things:

  • That the store size, not the transaction log size, is what's eating up space.
  • That the store and doc_values defaults have not changed between the two versions.
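
The first check can be scripted. The sketch below assumes the response of GET /<index>/_stats/store,translog has already been fetched and parsed into a dict. The index name and byte counts in the sample are invented for illustration and are not from our cluster.

```python
# Compare store size with translog size from an index stats response.
# Assumption: `stats` is the parsed JSON body of GET /<index>/_stats/store,translog.

def store_vs_translog(stats: dict, index: str) -> dict:
    """Return primary-shard store and translog sizes in bytes, plus the translog share."""
    primaries = stats["indices"][index]["primaries"]
    store = primaries["store"]["size_in_bytes"]
    translog = primaries["translog"]["size_in_bytes"]
    total = store + translog
    return {
        "store_bytes": store,
        "translog_bytes": translog,
        "translog_share": translog / total if total else 0.0,
    }


# Hypothetical sample response (made-up index name and numbers).
sample_stats = {
    "indices": {
        "my-index": {
            "primaries": {
                "store": {"size_in_bytes": 16_000_000_000},
                "translog": {"size_in_bytes": 50_000_000},
            }
        }
    }
}

result = store_vs_translog(sample_stats, "my-index")
print(result)
```

A translog share close to zero, as in this made-up sample, points at the store rather than the transaction log.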

I've started re-reading release notes for all the versions in between with an eye towards disk usage related things but thought I'd see if anyone had any ideas or information that might help in the meantime.

Here is the mapping template and analysis settings:

{
  "mappings": {
    "properties": {
      "field1": {
        "type": "keyword"
      }, 
      "field2": {
        "type": "integer"
      }, 
      "field3": {
        "type": "integer"
      }, 
      "field4": {
        "type": "keyword"
      }, 
      "field5": {
        "analyzer": "ngram_suggest", 
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }, 
        "type": "text"
      }, 
      "field6": {
        "type": "keyword"
      }, 
      "field7": {
        "type": "keyword"
      }, 
      "field8": {
        "type": "boolean"
      }, 
      "field9": {
        "type": "boolean"
      }, 
      "field10": {
        "type": "text"
      }, 
      "field11": {
        "type": "boolean"
      }, 
      "field12": {
        "type": "boolean"
      }, 
      "field13": {
        "type": "boolean"
      }, 
      "field14": {
        "type": "boolean"
      }, 
      "field15": {
        "type": "boolean"
      }, 
      "field16": {
        "type": "boolean"
      }, 
      "field17": {
        "type": "boolean"
      }, 
      "field18": {
        "type": "boolean"
      }, 
      "field19": {
        "type": "boolean"
      }, 
      "field20": {
        "type": "boolean"
      }, 
      "field21": {
        "type": "boolean"
      }, 
      "field22": {
        "type": "boolean"
      }, 
      "field23": {
        "type": "boolean"
      }, 
      "field24": {
        "type": "boolean"
      }, 
      "field25": {
        "type": "boolean"
      }, 
      "field26": {
        "type": "boolean"
      }, 
      "field27": {
        "analyzer": "normalized", 
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }, 
        "type": "text"
      }, 
      "field28": {
        "type": "keyword"
      }, 
      "field29": {
        "type": "integer"
      }, 
      "field30": {
        "type": "integer"
      }, 
      "field31": {
        "analyzer": "normalized", 
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }, 
        "type": "text"
      }, 
      "field32": {
        "analyzer": "ngram_partial", 
        "type": "text"
      }, 
      "field33": {
        "analyzer": "ngram_suggest", 
        "type": "text"
      }, 
      "field34": {
        "type": "boolean"
      }, 
      "field35": {
        "type": "boolean"
      }, 
      "field36": {
        "type": "keyword"
      }, 
      "field37": {
        "type": "keyword"
      }, 
      "field38": {
        "type": "integer"
      }, 
      "field39": {
        "type": "keyword"
      }, 
      "field40": {
        "type": "boolean"
      }, 
      "field41": {
        "type": "boolean"
      }, 
      "field42": {
        "scaling_factor": 100000, 
        "type": "scaled_float"
      }, 
      "field43": {
        "type": "integer"
      }, 
      "field44": {
        "type": "keyword"
      }, 
      "field45": {
        "analyzer": "normalized", 
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }, 
        "type": "text"
      }, 
      "field46": {
        "analyzer": "normalized", 
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }, 
        "type": "text"
      }, 
      "field47": {
        "analyzer": "ngram_suggest", 
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }, 
        "type": "text"
      }, 
      "field48": {
        "type": "boolean"
      }, 
      "field49": {
        "enabled": false
      }, 
      "field50": {
        "type": "long"
      }, 
      "field51": {
        "type": "boolean"
      }
    }
  }, 
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "ngram_partial": {
            "filter": [
              "asciifolding", 
              "lowercase"
            ], 
            "tokenizer": "ngram"
          }, 
          "ngram_suggest": {
            "filter": [
              "asciifolding", 
              "lowercase"
            ], 
            "tokenizer": "edge_ngram"
          }, 
          "normalized": {
            "filter": [
              "asciifolding", 
              "lowercase", 
              "english_stemmer"
            ], 
            "tokenizer": "standard", 
            "type": "custom"
          }
        }, 
        "filter": {
          "english_stemmer": {
            "language": "english", 
            "type": "stemmer"
          }
        }, 
        "tokenizer": {
          "edge_ngram": {
            "max_gram": "20", 
            "min_gram": "1", 
            "token_chars": [
              "letter", 
              "digit", 
              "punctuation"
            ], 
            "type": "edge_ngram"
          }, 
          "ngram": {
            "max_gram": "3", 
            "min_gram": "3", 
            "token_chars": [
              "letter", 
              "digit", 
              "punctuation"
            ], 
            "type": "ngram"
          }
        }
      }
    }
  }
}
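
To get a feel for how many terms the two n-gram tokenizers above emit per word, here is a rough sketch. It is a simplification, not Elasticsearch's actual implementation: it handles a single word made only of token_chars characters, applies lowercasing, and skips asciifolding. The function names are made up for illustration.

```python
# Simplified model of the edge_ngram (1..20) and ngram (3..3) tokenizers above.
# Assumption: input is one word with no separator characters; asciifolding is skipped.

def edge_ngrams(word: str, min_gram: int = 1, max_gram: int = 20) -> list:
    """Prefixes of the word with lengths min_gram..max_gram, lowercased."""
    word = word.lower()
    longest = min(len(word), max_gram)
    return [word[:n] for n in range(min_gram, longest + 1)]


def ngrams(word: str, size: int = 3) -> list:
    """All contiguous substrings of exactly `size` characters, lowercased."""
    word = word.lower()
    return [word[i:i + size] for i in range(len(word) - size + 1)]


print(edge_ngrams("Elastic"))
print(ngrams("Elastic"))
```

A 7-character word becomes 7 terms under ngram_suggest and 5 under ngram_partial, so these fields are expected to take more space than their source text in both versions.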

Update:
We identified that soft deletes were added and enabled by default somewhere between 5.6 and 7.9. I applied the setting below to our indices to see if that would solve our problem, but the indices are still growing far beyond what they do in ES5.

"soft_deletes" : {
  "retention_lease" : {
    "period" : "1m"
  }
}
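
For anyone searching for the flat setting name, the nested fragment above corresponds to index.soft_deletes.retention_lease.period once it sits under the index settings object. A small sketch of that mapping, where flatten_settings is a made-up helper and not part of any Elasticsearch client:

```python
# Flatten a nested settings fragment into dotted setting names.
# Assumption: flatten_settings is an illustrative helper, not an Elasticsearch API.

def flatten_settings(nested: dict, prefix: str = "") -> dict:
    """Turn {"a": {"b": 1}} into {"a.b": 1}, optionally under a prefix."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_settings(value, name))
        else:
            flat[name] = value
    return flat


fragment = {"soft_deletes": {"retention_lease": {"period": "1m"}}}
flat = flatten_settings(fragment, prefix="index")
print(flat)
```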

The growth does seem to be related to writes. For the last 20 hours we've been running with no search traffic other than some testing with Kibana. However our full production write volume is occurring.

There are a few interesting differences between the ES5 and ES7 versions of the index mentioned in the original post:

  • Merges vs Store disk usage
      ◦ ES5: low Store, high Merges
      ◦ ES7: low Merges, high Store (merges constant, store growing)
  • Index Memory - Index Writer
      ◦ ES5: hovers around 38 MB
      ◦ ES7: seems to follow what looks like a 'garbage collection' pattern, where it increases steadily and then plummets

I believe I have identified the cause of the disk usage.
As a test I set the deprecated index.soft_deletes.enabled setting to false.
The new index I created with that setting seems to be holding steady at ~40 MB, whereas the one I created a few days ago with the retention_lease period of '1m' has grown to 10 GB.

So my new question is...
Is this sort of massive disk usage just the new reality going forward? Since disabling soft_deletes is deprecated, we need to be able to live with them enabled. However, 10 GB of disk usage for a 40 MB index seems ludicrous.
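
One way to put a number on this is to compare live and deleted document counts from GET /<index>/_stats/docs. The sketch below assumes the response has already been parsed into a dict. The index name and the deleted count are invented; only the live count mirrors the ~81.6k docs mentioned earlier. Whether docs.deleted fully reflects soft-deleted documents retained for history is an assumption here, so treat the result as a hint rather than proof.

```python
# Estimate what fraction of documents in an index are deleted.
# Assumption: `stats` is the parsed JSON body of GET /<index>/_stats/docs,
# and docs.deleted includes soft-deleted documents (not verified here).

def deleted_ratio(stats: dict, index: str) -> float:
    """Return deleted / (live + deleted) for the primary shards of an index."""
    docs = stats["indices"][index]["primaries"]["docs"]
    live, deleted = docs["count"], docs["deleted"]
    total = live + deleted
    return deleted / total if total else 0.0


# Hypothetical sample: 81,600 live docs (as in the post), made-up deleted count.
sample_docs_stats = {
    "indices": {
        "my-index": {"primaries": {"docs": {"count": 81_600, "deleted": 1_000_000}}}
    }
}

ratio_deleted = deleted_ratio(sample_docs_stats, "my-index")
print(f"{ratio_deleted:.1%} of documents are deleted")
```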

Wrapping up the original problem.

In a scenario like ours, with decently high write volume, ES 5.6.4 was able to keep our indices pretty small (under 500 MB). ES 7.9.1, however, grew that same index to about 17 GB.

We also saw that the same query traffic against that index caused about 9x the CPU usage we had seen in ES 5.6.

This fixed the issue:

{
  "template": {
    "settings": {
      "index": {
        "soft_deletes": {
          "enabled": "false"
        }
      }
    }
  }
}
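
For completeness, here is a sketch of how that body might be sent as a composable index template (PUT _index_template/<name>, available since 7.8). The template name and index pattern are hypothetical. As far as I know, index.soft_deletes.enabled can only be set when an index is created, so this affects newly created indices only.

```python
import json

# Build a composable index template body that disables soft deletes.
# Assumptions: the template name and index pattern below are placeholders.

def soft_deletes_template(index_patterns: list) -> dict:
    """Return a template body with index.soft_deletes.enabled set to false."""
    return {
        "index_patterns": index_patterns,
        "template": {
            "settings": {"index": {"soft_deletes": {"enabled": "false"}}}
        },
    }


body = soft_deletes_template(["my-index-*"])
print("PUT _index_template/disable-soft-deletes")
print(json.dumps(body, indent=2))
```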

With soft_deletes disabled our CPU usage dropped down to only a little more than what we saw in ES 5.6.4 for query traffic and our indices stayed around the same size as they had before rather than growing to over 34 times the size on disk.

We're happy to have found a solution, though the fact that it comes in the form of a deprecated setting is concerning. I plan to file an issue about this, and if anyone asks I will update this thread with a link to it.

