Index size with Elasticsearch 8 increasing?

Hi,

Before filing a bug report, I would like to ask whether anyone else is seeing an increase in index sizes after upgrading to Elasticsearch 8. Below are the sizes of our daily logstash indices from one cluster. Size and document count were nearly steady up until the upgrade (March 14/15). After that, index sizes increased even on days with fewer documents. Can someone confirm similar behaviour?

green open logstash-2022.03.01                           M6b9NC2bTkq6Z1uSxnvjwQ 15 2   5320524       0   1.8gb 629.6mb
green open logstash-2022.03.02                           Q2D0acQSSyW5ljoA7ZpjRA 15 2   5219499       0   1.8gb 619.1mb
green open logstash-2022.03.03                           -jKjQAnYQdCs-HtHTEtOrA 15 2   5638363       0   1.9gb 665.1mb
green open logstash-2022.03.04                           5bCEQw9XTR6lHSyaIu1gDg 15 2   5200039       0   1.7gb 611.8mb
green open logstash-2022.03.05                           HplGtbwcR-O5_qYhXGPCbg 15 2   5172135       0   1.7gb 603.3mb
green open logstash-2022.03.06                           dyHaEEP8QnCO1dL8lWQDuw 15 2   5176126       0   1.7gb 603.8mb
green open logstash-2022.03.07                           mNzUCtXGRqG84et3LPAMLQ 15 2   5221957       0   1.7gb 613.8mb
green open logstash-2022.03.08                           PqCFfT68QdmdFWSzw2EGWw 15 2   5170401       0   1.7gb 613.8mb
green open logstash-2022.03.09                           w8kjFuaaRMqk51HzstR-Jw 15 2   5067126       0   1.7gb   596mb
green open logstash-2022.03.10                           WbIQOdSuQJGRg2qcAN89oA 15 2   5217776       0   1.8gb 620.9mb
green open logstash-2022.03.11                           xLqf2zqkSsS05h_6ZZTYqg 15 2   5250895       0   1.8gb   629mb
green open logstash-2022.03.12                           _3Hn2cBzTPerGrqc8gyynQ 15 2   5231920       0   1.8gb 619.9mb
green open logstash-2022.03.13                           tZe9-g5dR1KckVnkbhLWKQ 15 2   5234610       0   1.8gb 619.8mb
green open logstash-2022.03.14                           OzEYnRLtRwK9bFsZdPoIfw 15 2   5292696       0   1.8gb   635mb
green open logstash-2022.03.15                           UHa9diC6RqK-awjbxxRSRQ 15 2   6729563       0   3.5gb   1.1gb
green open logstash-2022.03.16                           Pm4F9hK8QbahxBDzWapr6g 15 2   3793259       0   2.2gb 775.3mb
green open logstash-2022.03.17                           pUsvHuolQVmVTVxrPLAxSw 15 2   6346073       0   3.8gb   1.2gb
green open logstash-2022.03.18                           LsIwkyeWQgSJ3m-8gIkoKw 15 2   5282288       0   3.1gb     1gb
green open logstash-2022.03.19                           QjXw0NjkRv6-SZhIvxYyUg 15 2   4425355       0   2.5gb 868.1mb
green open logstash-2022.03.20                           2DxEXL-lSu2aipiShFwFAg 15 2   4425410       0   2.5gb 871.3mb
green open logstash-2022.03.21                           xCjBW8YIQ6q1gIs_kP8pUQ 15 2   4455392       0   2.6gb 886.2mb
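
One thing worth ruling out before suspecting the upgrade itself is a difference in merge state: the newer indices may simply still contain more, smaller segments. A quick check, assuming these daily indices are no longer being written to, is to compare segment counts, or to force merge one of the grown indices and re-check its size. The index names below are just examples taken from the listing above:

GET _cat/segments/logstash-2022.03.14,logstash-2022.03.17?v&h=index,shard,segment,docs.count,size

POST logstash-2022.03.17/_forcemerge?max_num_segments=1
GET _cat/indices/logstash-2022.03.17?h=index,docs.count,pri.store.size,store.size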

Why do you have 15 primary shards for such small indices, where even a single primary shard would result in shards smaller than recommended?

The shard configuration is a leftover from when this was a bigger production cluster; it is now just a testing cluster and the settings have not been adjusted. They have not changed in a while, though, so they should have no impact on the increased index size, right?
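
As an aside, if the shard count for future daily indices should ever be reduced, a composable index template along the following lines would do it. The template name and priority here are made up, and it only affects indices created after it is put in place:

PUT _index_template/logstash-daily
{
  "index_patterns": ["logstash-*"],
  "priority": 200,
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 2
    }
  }
}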

Difficult to tell without knowing the actual mapping. Are you using fields with binary doc values, for example wildcard or geo_shape fields?

This is the simple mapping that is in use here:

{
  "dynamic_templates": [
    {
      "text_fields": {
        "mapping": {
          "index": "true",
          "type": "text"
        },
        "match_mapping_type": "string",
        "match": "*"
      }
    },
    {
      "long_fields": {
        "mapping": {
          "type": "long"
        },
        "match_mapping_type": "long",
        "match": "*"
      }
    }
  ],
  "date_detection": false,
  "properties": {
    "request": {
      "index": true,
      "type": "keyword"
    },
    "agent": {
      "index": "true",
      "type": "keyword"
    },
    "geoip": {
      "dynamic": true,
      "type": "object",
      "properties": {
        "location": {
          "type": "geo_point"
        }
      }
    },
    "auth": {
      "index": "true",
      "type": "keyword"
    },
    "ident": {
      "index": "true",
      "type": "keyword"
    },
    "verb": {
      "index": "true",
      "type": "keyword"
    },
    "req_time": {
      "type": "double"
    },
    "referrer": {
      "index": "true",
      "type": "text"
    },
    "@timestamp": {
      "format": "date_optional_time",
      "index": true,
      "ignore_malformed": false,
      "store": false,
      "type": "date",
      "doc_values": true
    },
    "srchost": {
      "index": "true",
      "type": "keyword"
    },
    "bytes": {
      "type": "long"
    },
    "response": {
      "type": "integer"
    },
    "clientip": {
      "type": "ip"
    },
    "@version": {
      "type": "integer"
    },
    "host": {
      "index": "true",
      "type": "keyword"
    },
    "httpversion": {
      "index": "true",
      "type": "keyword"
    }
  }
}
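
To see which of these fields actually accounts for the growth, the analyze index disk usage API (available since Elasticsearch 7.15) breaks the on-disk size down per field and per data structure (inverted index, stored fields, doc values, points). Running it against one index from before the upgrade and one from after should show where the difference sits; the index name is again just an example, and the run_expensive_tasks flag is required because the analysis is costly:

POST logstash-2022.03.17/_disk_usage?run_expensive_tasks=true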
