DataStream does not roll over on primary shard max size

Hello,
I'm new to Elastic. I'm trying to configure a data stream that must roll over when the primary shard size reaches 1 MB (it's a functional test), but it doesn't.

Here is my ILM policy:

PUT _ilm/policy/short_life_policy2
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_size": "1mb",
            "max_age": "1m"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "0ms",
        "actions": {
          "readonly": {},
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          },
          "allocate": {
            "require": {
              "data": "warm"
            }
          }
        }
      }
    }
  }
}
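
As a side note on the condition itself: max_size matches the total size of all primary shards in the index, while the per-shard condition is max_primary_shard_size (available since Elasticsearch 7.13). A minimal sketch of a hot phase using it, under a hypothetical policy name and with the warm phase omitted for brevity:

PUT _ilm/policy/short_life_policy2_per_shard
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_primary_shard_size": "1mb",
            "max_age": "1m"
          },
          "set_priority": {
            "priority": 100
          }
        }
      }
    }
  }
}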

The component template used by the index template:

PUT _component_template/short_life_component
{
  "template": {
    "settings": {
      "index": {
        "number_of_shards": "2",
        "number_of_replicas": "1",
        "lifecycle.name": "short_life_policy2"
      }
    }
  }
}
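
To double-check that the component template was stored as expected, it can be read back with:

GET _component_template/short_life_component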

The index template:

PUT _index_template/short_life_template2
{
  "index_patterns": ["logs-webdav.access*"],
  "data_stream": {},
  "composed_of": ["short_life_component"],
  "priority": 500
}
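
To confirm that an index created for this data stream would actually pick up these settings (in particular lifecycle.name), the simulate index API can help; logs-webdav.access-test is just a hypothetical name matching the pattern:

POST _index_template/_simulate_index/logs-webdav.access-test

The response shows the resolved settings and mappings, along with the matching templates.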

My logs are then ingested into the cluster after being parsed by Logstash. This creates a data stream that is correctly matched to the index template and the associated ILM policy. However, it does not roll over at the configured primary shard max size, nor after the configured maximum age of 1 minute. The backing index just keeps growing.
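
When a rollover doesn't happen, the ILM explain API is usually the first thing to check, as it shows which lifecycle step each backing index is on and reports any errors. Assuming the data stream name matches the template pattern:

GET logs-webdav.access*/_ilm/explain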

I've looked at some tutorials and topics, but I can't find the mistake I made.

I can provide more information if needed.

I've found my mistake.

As I'm testing the rollover behaviour, I'm ingesting a large amount of data almost instantaneously and expecting the rollover to happen quickly. But in real-world cases, the rollover thresholds are not reached instantaneously; it takes more time.

Default cluster settings are tuned for such real-world cases: the cluster only checks whether indices need to roll over every 10 minutes, which is controlled by the setting "indices.lifecycle.poll_interval" (default: 10m).

If you want to test rollover on data streams that you are filling quickly, you need to temporarily lower this setting so that the cluster checks whether a rollover is needed at much shorter intervals.

For example:

PUT _cluster/settings
{
  "persistent": {
    "indices.lifecycle.poll_interval": "10m"
  },
  "transient": {
    "indices.lifecycle.poll_interval": "5s"
  }
}
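
Two details worth noting: a transient setting takes precedence over a persistent one, so the effective poll interval above is 5s (the persistent 10m just pins the default value). And once testing is done, remember to clear the override so the cluster falls back to the normal interval; setting it to null removes it:

PUT _cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval": null
  }
}

(On recent Elasticsearch versions, transient cluster settings are deprecated in favour of persistent ones.)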
