Hot/Warm/Cold index assignment

Callahan · October 11, 2019, 12:33pm

Hi,

I've set up a 3-node cluster for a testing stack. Each node runs in Docker with one of node.attr.data=hot, node.attr.data=warm, or node.attr.data=cold respectively.
I have created a policy in Kibana as follows:

PUT _ilm/policy/cold_policy_3_MB
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_size": "3mb"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "cold": {
        "min_age": "1h",
        "actions": {
          "allocate": {
            "number_of_replicas": 1,
            "include": {},
            "exclude": {},
            "require": {
              "data": "cold"
            }
          },
          "set_priority": {
            "priority": 0
          }
        }
      }
    }
  }
}

I have assigned this policy to an auditbeat index. I expected the auditbeat index to start on the HOT node and, once it reaches 3MB, to roll over and create another index on the HOT node (it's set up for 1 primary, no replicas). Once the rolled-over indices are an hour old, they should be moved to the COLD node. This isn't happening. The issues I'm seeing are that the index initially ends up on the WARM node and never seems to touch the HOT node. Also, the index is allowed to climb above the specified 3MB without rolling over.
I assumed that Elasticsearch would place a new index on the HOT node by default. I also assumed that, since I have assigned this policy to this particular index, its settings would be applied to the index. I can see that the ILM policy has been assigned to the index, I'm just not sure why the settings are not being adhered to.

Any suggestions would be appreciated.

ariemenschneider · October 14, 2019, 9:52am

Hi,

You'll need to set index.routing.allocation.require when the index is created in your cluster. Here we're using a template to ensure that all indices are created on a hot node (our node attribute is called temp; in your setup it would be data):

      "routing": {
        "allocation": {
          "require": {
            "temp": "hot"
          }
        }
      },
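
For reference, here is a rough sketch of how that fragment could look in a complete template for the setup described in the question. It uses the asker's attribute name (data) rather than temp, so that the cold-phase allocate action later replaces the same require key instead of adding a second, conflicting requirement. The template name auditbeat_hot, the auditbeat-* pattern, the order value, and the alias auditbeat are assumptions for illustration only; Auditbeat normally manages its own template and alias, so adjust the names to whatever the cluster actually uses.

```console
# Hypothetical template: pins new auditbeat indices to the hot node
# and attaches the ILM policy from the question.
PUT _template/auditbeat_hot
{
  "index_patterns": ["auditbeat-*"],
  "order": 10,
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0,
    "index.routing.allocation.require.data": "hot",
    "index.lifecycle.name": "cold_policy_3_MB",
    "index.lifecycle.rollover_alias": "auditbeat"
  }
}

# The rollover action only works if the index sits behind a write alias,
# so the first index is bootstrapped by hand (alias name is an assumption).
PUT auditbeat-000001
{
  "aliases": {
    "auditbeat": { "is_write_index": true }
  }
}
```

Without index.lifecycle.rollover_alias and a matching write alias, the policy can be attached to the index but the rollover step will not succeed, which would match the symptom of the index growing past 3MB.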

For debugging why the rollover is not working, I'd suggest using the Rollover API with the dry_run parameter. Note that the rollover target has to be the write alias, not the concrete index name:

POST /<alias>/_rollover?dry_run
{
  "conditions" : {
    "max_age": "1d",
    "max_size": "3mb"
  }
}
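
Another thing that may help, as a sketch rather than a definitive checklist: the ILM explain API reports which phase, action, and step each index is in, along with any error ILM has recorded. ILM also only checks rollover conditions periodically (indices.lifecycle.poll_interval, 10 minutes by default), so a small index can overshoot a 3mb limit before the next check. The auditbeat-* pattern below is an assumption about the index naming, and the 1m interval is only meant for a test cluster.

```console
# Show the current ILM phase/action/step (and any step error) per index
GET auditbeat-*/_ilm/explain

# On a test cluster, let ILM evaluate its conditions more often
PUT _cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval": "1m"
  }
}
```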

Regards,
Alex

coudenysj · October 25, 2019, 6:57am

Hi @ariemenschneider, I have a follow-up question.

Do we really need 3 data node types (hot, warm, cold)? We ingest quite a lot of data, so the hot nodes are only used for 2 hours, the rest is moved to warm nodes.

It looks like the hot phase is very short, so those hot boxes aren't doing that much work.

Would it be a better approach to just combine the hot and the warm nodes on SSD nodes (as the hot->warm phase is only used to roll over the indices)?

ariemenschneider · October 25, 2019, 8:27am

Hi,

That depends on your needs (and the available hardware/budget). In our installation (log ingestion, ~10k events/s) we're using 3 nodes with fast storage as hot nodes and move the indices after some days (depending on type) to 2 warm nodes with more but slower storage. That setup lets us ingest the logs fast, with room for logging peaks and faster searches on hot data, but also lets us store more historical data, which is queried less often.

I've found that this is an area where all advice gets a bit fuzzy, as it depends heavily on various individual parameters. But it is usually no real problem to start with one setup and, if you find it suboptimal, change later to a different architecture without losing any availability.
HTH,
Alex

coudenysj · October 25, 2019, 8:43am

Thanks for the information.

So basically, you let the hot phase roll over every 50GB, and only move the indices to warm "3 days after rollover"?

coudenysj · October 25, 2019, 8:44am

Oh, and do you use cold nodes?

ariemenschneider · October 25, 2019, 11:12am

We're using neither rollover nor cold nodes. This is an example of one of our ILM policies:

{
    "policy": {
        "phases": {
            "hot": {
                "min_age": "0ms",
                "actions": {}
            },
            "warm": {
                "min_age": "6d",
                "actions": {
                    "allocate": {
                        "include": {},
                        "exclude": {},
                        "require": {
                            "temp": "warm"
                        }
                    },
                    "forcemerge": {
                        "max_num_segments": 1
                    }
                }
            },
            "delete": {
                "min_age": "24d",
                "actions": {
                    "delete": {}
                }
            }
        }
    }
}
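
As a sketch of how a policy like this might be wired up without rollover: a template can attach the policy and the initial hot requirement to every new index. Because there is no rollover action, ILM measures min_age from the index creation date, so an index would move to the warm nodes 6 days after it was created and be deleted after 24 days. The template name, the logs-* pattern, and the policy name logs_policy are hypothetical placeholders, not the names used in the installation described above.

```console
# Hypothetical template: new indices start on nodes with temp=hot
# and are managed by the policy above (assumed to be stored as logs_policy).
PUT _template/logs_hot_then_warm
{
  "index_patterns": ["logs-*"],
  "settings": {
    "index.routing.allocation.require.temp": "hot",
    "index.lifecycle.name": "logs_policy"
  }
}
```

No index.lifecycle.rollover_alias is needed here, since the policy never rolls over.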

system · November 22, 2019, 11:12am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
