ILM not rolling over at correct size

Hi

I have created an ILM policy and applied it to my index:

{
  "policy": "my_logs_policy",
  "phase_definition": {
    "min_age": "0ms",
    "actions": {
      "rollover": {
        "max_size": "30gb"
      }
    }
  },

I have also set the below -
PUT /_cluster/settings
{
  "persistent": {
    "indices.lifecycle.poll_interval": "5s"
  }
}

However, my index is rolling over at 60GB rather than 30GB. Can anyone advise why this happens and how to resolve it?

The ILM policy is definitely applied, as the index does roll over, just not when it should.

Is the older index still 60GB after refreshing and flushing it, and ensuring that there aren't any ongoing merges? Is that 60GB the size of the store or are you including the translog size too? Looking through the code, it seems we ignore "transient" data like the translog and invisible segments when computing the size for rollover purposes, because those transient data are eventually discarded.


Hi

Yes, I have refreshed and flushed it, and it is still 60GB.

ILM seems to be rolling over at 60GB instead of 30GB for some reason: over the weekend it created 2 more indices at 60GB.

How are you measuring the size of these rolled-over indices? Can you share the exact API call you're using as well as its output?

Hi

I am just looking in Kibana under Index Management, not using any sort of API.

Ok, I'm not quite sure where that UI gets its data. Can you share the output of GET /<index>/_stats for the index in question?

{
  "_shards" : {
    "total" : 6,
    "successful" : 6,
    "failed" : 0
  },
  "_all" : {
    "primaries" : {
      "docs" : {
        "count" : 32112036,
        "deleted" : 0
      },
      "store" : {
        "size_in_bytes" : 32410447708
      },
      "indexing" : {
        "index_total" : 32112036,
        "index_time_in_millis" : 16358145,
        "index_current" : 0,
        "index_failed" : 0,
        "delete_total" : 0,
        "delete_time_in_millis" : 0,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0
      },
      "get" : {
        "total" : 0,
        "time_in_millis" : 0,
        "exists_total" : 0,
        "exists_time_in_millis" : 0,
        "missing_total" : 0,
        "missing_time_in_millis" : 0,
        "current" : 0
      },
      "search" : {
        "open_contexts" : 0,
        "query_total" : 308,
        "query_time_in_millis" : 182324,
        "query_current" : 0,
        "fetch_total" : 3,
        "fetch_time_in_millis" : 425,
        "fetch_current" : 0,
        "scroll_total" : 0,
        "scroll_time_in_millis" : 0,
        "scroll_current" : 0,
        "suggest_total" : 0,
        "suggest_time_in_millis" : 0,
        "suggest_current" : 0
      },
      "merges" : {
        "current" : 0,
        "current_docs" : 0,
        "current_size_in_bytes" : 0,
        "total" : 4682,
        "total_time_in_millis" : 21228980,
        "total_docs" : 76385733,
        "total_size_in_bytes" : 98041573499,
        "total_stopped_time_in_millis" : 0,
        "total_throttled_time_in_millis" : 12440347,
        "total_auto_throttle_in_bytes" : 15728640
      },
      "refresh" : {
        "total" : 13865,
        "total_time_in_millis" : 1719787,
        "listeners" : 0
      },
      "flush" : {
        "total" : 140,
        "periodic" : 131,
        "total_time_in_millis" : 817375
      },
      "warmer" : {
        "current" : 0,
        "total" : 13671,
        "total_time_in_millis" : 296
      },
      "query_cache" : {
        "memory_size_in_bytes" : 3332620,
        "total_count" : 622,
        "hit_count" : 129,
        "miss_count" : 493,
        "cache_size" : 22,
        "cache_count" : 22,
        "evictions" : 0
      },
      "fielddata" : {
        "memory_size_in_bytes" : 67497800,
        "evictions" : 0
      },

The size of the primaries looks to be around 30GB, which is what we expect:

    "size_in_bytes" : 32410447708

The stats are truncated, but does this index have 1 replica? If so, I'm guessing the UI you're looking at is counting the sizes of the replicas too, but index rollover does not do this:

max_size: The maximum estimated size of the primary shard of the index
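To make the comparison concrete, here is a minimal Python sketch of the arithmetic. It assumes the index has 1 replica (my guess above, not confirmed by the truncated output) and that the parsed `_stats` response has both `_all.primaries` and `_all.total` sections. The helper names `store_sizes_gib` and `expected_total_bytes` are hypothetical, and the `total` figure in the example is derived from the replica assumption rather than taken from your cluster.

```python
GIB = 1024 ** 3


def store_sizes_gib(stats):
    """Return (primaries_gib, total_gib) from a parsed GET /<index>/_stats body."""
    all_stats = stats["_all"]
    primaries = all_stats["primaries"]["store"]["size_in_bytes"]
    total = all_stats["total"]["store"]["size_in_bytes"]
    return primaries / GIB, total / GIB


def expected_total_bytes(primary_bytes, number_of_replicas):
    """Each replica is a full copy of the primaries, so the store size scales with copies."""
    return primary_bytes * (1 + number_of_replicas)


# Primaries size taken from the stats output shared earlier in the thread.
primary_bytes = 32410447708
# ASSUMPTION: 1 replica; this total is computed, not measured.
assumed_total = expected_total_bytes(primary_bytes, number_of_replicas=1)

example_stats = {
    "_all": {
        "primaries": {"store": {"size_in_bytes": primary_bytes}},
        "total": {"store": {"size_in_bytes": assumed_total}},
    }
}

pri_gib, total_gib = store_sizes_gib(example_stats)
print(f"primaries: {pri_gib:.1f} GiB, primaries + replicas: {total_gib:.1f} GiB")
```

Under that assumption the primaries come to roughly 30GB and the total to roughly 60GB, which would match what the UI shows. Rollover compares `max_size` against the primaries figure only.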


Ah, OK, so the dashboard is showing the size of the primaries plus the replicas.

Which would be 60GB.

Thank you.

While I've got you, can I ask a question about sharding?

I have 6 data nodes, and all my indices are set to 3 shards. Would increasing this to 6 improve performance? Some of my queries time out.

Your shards, at ~10GB each, are already a little smaller than recommended.

That said, predicting the performance of different configurations is very hard. You can normally only answer questions like this with careful benchmarking.
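For reference, the ~10GB figure is just the primaries size divided by the number of primary shards. A rough Python sketch, assuming documents are spread evenly across shards (the function name `average_primary_shard_gib` is hypothetical):

```python
GIB = 1024 ** 3


def average_primary_shard_gib(primary_store_bytes, primary_shard_count):
    """Average size of one primary shard, assuming an even spread of data."""
    if primary_shard_count <= 0:
        raise ValueError("primary_shard_count must be positive")
    return primary_store_bytes / primary_shard_count / GIB


# Primaries size from the stats output earlier in the thread.
primary_bytes = 32410447708

current = average_primary_shard_gib(primary_bytes, 3)   # today's 3 primaries
proposed = average_primary_shard_gib(primary_bytes, 6)  # the proposed 6 primaries

print(f"3 shards: ~{current:.1f} GiB each, 6 shards: ~{proposed:.1f} GiB each")
```

With the same 30gb rollover threshold, doubling the shard count would roughly halve the size of each shard, so it moves further below the recommendation rather than closer to it. Whether that helps or hurts query latency is still something only a benchmark can tell you.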
