ILM Warm Phase Executes on Hot Node

Hi All,

I recently noticed some CPU usage issues on my hot data nodes. When looking at the hot threads (GET /_nodes/<node_id>/hot_threads), I noticed that the main cause was force merging during the rollover from hot -> warm.

Example thread:

::: {<node_id>}{<snipped>}{<snipped>}{10.42.0.201}{10.42.0.201:9300}{hs}{k8s_node_name=<snipped>, xpack.installed=true, zone=rack1, transform.node=false}
   Hot threads at 2022-01-28T17:52:23.277Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
   100.0% [cpu=38.4%, other=61.6%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[<node_id>][[filebeat-7.16.3-2022.01.28-000003][0]: Lucene Merge Thread #734]'
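For reference, the hot threads request I used looks roughly like this (the thread count and interval match the defaults shown in the output above):

GET /_nodes/<node_id>/hot_threads?threads=3&interval=500ms&ignore_idle_threads=true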

What confuses me, and something that I haven't been able to find in the docs, is why a warm action of my ILM policy (force merge) is being executed on a hot node. I'd expect warm actions to be executed on warm nodes, not hot nodes.

Would anyone have an explanation for why this is, or point me in the direction of the docs which would explain this?

ILM Policy in question:

PUT _ilm/policy/filebeat
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "readonly": {},
          "rollover": {
            "max_age": "30d",
            "max_primary_shard_size": "50gb"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "5d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1,
            "index_codec": "best_compression"
          },
          "readonly": {},
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "readonly": {},
          "searchable_snapshot": {
            "snapshot_repository": "es-prod-snapshots",
            "force_merge_index": true
          },
          "set_priority": {
            "priority": 0
          },
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "delete": {
        "min_age": "500d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          },
          "wait_for_snapshot": {
            "policy": "snap_all"
          }
        }
      }
    }
  }
}

Output of GET filebeat-7.16.3-2022.01.28-000003/_ilm/explain

{
  "indices" : {
    "filebeat-7.16.3-2022.01.28-000003" : {
      "index" : "filebeat-7.16.3-2022.01.28-000003",
      "managed" : true,
      "policy" : "filebeat",
      "lifecycle_date_millis" : 1643389546068,
      "age" : "55.64m",
      "phase" : "hot",
      "phase_time_millis" : 1643389548826,
      "action" : "rollover",
      "action_time_millis" : 1643389551427,
      "step" : "check-rollover-ready",
      "step_time_millis" : 1643389551427,
      "phase_execution" : {
        "policy" : "filebeat",
        "phase_definition" : {
          "min_age" : "0ms",
          "actions" : {
            "readonly" : { },
            "rollover" : {
              "max_primary_shard_size" : "50gb",
              "max_age" : "30d"
            },
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "version" : 11,
        "modified_date_in_millis" : 1630510914564
      }
    }
  }
}

Hi @BenB196 ,

Looks like your Filebeat is version 7.16; is your ES cluster also on that version?

You would then be using the ILM migrate action to move data to the warm nodes. This sets the index.routing.allocation.include._tier_preference setting, which requires you to use dedicated data tier roles in your node role configuration.

If node.roles is not specified, data nodes will have all of the data roles by default, which I think could match the symptom you are seeing.
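As a rough check (the index name here is just taken from your hot threads output), you could compare the tier preference that the migrate action sets on the index with the roles actually configured on the nodes:

GET filebeat-7.16.3-2022.01.28-000003/_settings?filter_path=*.settings.index.routing.allocation.include._tier_preference

GET _nodes?filter_path=nodes.*.name,nodes.*.roles

On a dedicated warm node you would expect a restricted role list, e.g. node.roles: [ data_warm ] in elasticsearch.yml, rather than the full default set of data roles.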

That is one guess at the issue. If that turns out not to be it, I wonder if you can share the output of GET _nodes as well as the output of GET <index>/_settings for a hot and a warm index?

Hi @HenningAndersen,

Yes, the Elasticsearch cluster is 7.16.2.

Here is the output of GET _cat/nodes:

10.42.4.216 61  96 35 15.34 13.20 14.43 hs  - <snipped>-rack2-data-hot-0
10.42.0.203 41 100 35 18.83 17.64 16.98 w   - <snipped>-rack1-data-warm-1
10.42.0.200 35 100 20 18.83 17.64 16.98 cr  - <snipped>-rack1-data-cold-1
10.42.1.47  31  65 16 13.45 13.48 13.45 irt - <snipped>-rack2-ingest-0
10.42.5.182 69 100 13 11.60 10.79  8.25 w   - <snipped>-rack5-data-warm-0
10.42.3.12  42 100 15 31.40 32.76 30.04 cr  - <snipped>-rack1-data-cold-0
10.42.3.13  64 100 39 31.40 32.76 30.04 hs  - <snipped>-rack1-data-hot-0
10.42.0.202 64  64 16 18.83 17.64 16.98 irt - <snipped>-rack1-ingest-0
10.42.2.220 56  65 17 20.71 21.81 21.68 irt - <snipped>-rack5-ingest-0
10.42.4.215 48  87 37 15.34 13.20 14.43 r   - <snipped>-rack2-coord-0
10.42.5.181 27 100 10 11.60 10.79  8.25 cr  - <snipped>-rack5-data-cold-0
10.42.0.201 60  97 49 19.40 17.78 17.03 hs  - <snipped>-rack1-data-hot-1
10.42.1.45  56 100  5 13.45 13.48 13.45 cr  - <snipped>-rack2-data-cold-1
10.42.1.49  64 100 47 13.45 13.48 13.45 hs  - <snipped>-rack2-data-hot-1
10.42.4.213 58 100  5 15.34 13.20 14.43 w   - <snipped>-rack2-data-warm-0
10.42.2.212 49 100 16 20.71 21.81 21.68 cr  - <snipped>-rack5-data-cold-1
10.42.5.180 75  86 12 11.60 10.79  8.25 r   - <snipped>-rack5-coord-0
10.42.3.11  66  88 26 31.40 32.76 30.04 r   - <snipped>-rack1-coord-0
10.42.1.44  42 100  5 13.45 13.48 13.45 w   - <snipped>-rack2-data-warm-1
10.42.3.14  43 100  3 31.40 32.76 30.04 w   - <snipped>-rack1-data-warm-0
10.42.2.215 25  79  1 20.71 21.81 21.68 m   - <snipped>-rack5-controller-0
10.42.4.211 68  78  4 15.34 13.20 14.43 cr  - <snipped>-rack2-data-cold-0
10.42.2.211 61 100 86 20.71 21.81 21.68 hs  - <snipped>-rack5-data-hot-1
10.42.5.183 54  95 18 11.60 10.79  8.25 hs  - <snipped>-rack5-data-hot-0
10.42.1.48  44  87 12 13.45 13.48 13.45 m   * <snipped>-rack2-controller-0
10.42.3.9   45  55 22 31.40 32.76 30.04 lr  - <snipped>-rack1-ml-0
10.42.5.184 58  25  0 11.60 10.79  8.25 lr  - <snipped>-rack5-ml-0
10.42.0.199 36  78  1 18.83 17.64 16.98 m   - <snipped>-rack1-controller-0
10.42.2.219 54 100 15 20.71 21.81 21.68 w   - <snipped>-rack5-data-warm-1
10.42.4.214 34  30 35 15.34 13.20 14.43 lr  - <snipped>-rack2-ml-0

The hot nodes (data_content & data_hot roles) and warm nodes (data_warm role) are both dedicated node types within the cluster.
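To double check where the shards for this index are actually sitting, something like this _cat/shards request (columns trimmed for readability) could also be used:

GET _cat/shards/filebeat-7.16.3-2022.01.28-000003?v&h=index,shard,prirep,state,node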

The settings for the index are mainly just the base Filebeat ones:
GET filebeat-7.16.3-2022.01.28-000003/_settings

{
  "filebeat-7.16.3-2022.01.28-000003" : {
    "settings" : {
      "index" : {
        "mapping" : {
          "total_fields" : {
            "limit" : "10000"
          }
        },
        "refresh_interval" : "5s",
        "blocks" : {
          "write" : "true"
        },
        "provided_name" : "<filebeat-7.16.3-{now/d}-000003>",
        "query" : {
          "default_field" : [
            <truncated>
            "fields.*"
          ]
        },
        "creation_date" : "1643389546068",
        "priority" : "100",
        "number_of_replicas" : "1",
        "uuid" : "-zDb6hvnSt-bROXFEsNGWg",
        "version" : {
          "created" : "7160299"
        },
        "lifecycle" : {
          "name" : "filebeat",
          "rollover_alias" : "filebeat-7.16.3",
          "indexing_complete" : "true"
        },
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_shards" : "1",
        "max_docvalue_fields_search" : "200"
      }
    }
  }
}

Looking at this further, it actually seems like this wasn't a move from hot to warm but from content to hot. Given that my ILM policy doesn't have a force merge when moving to hot, I'm not sure why a merge would occur there, unless I'm missing something fundamental about how ES handles rollover + merging.
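To keep an eye on whether this is just normal background merging, I can watch the segment counts on the index with something along these lines:

GET _cat/segments/filebeat-7.16.3-2022.01.28-000003?v&h=index,shard,prirep,segment,size,docs.count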

Hi @BenB196,

Given this additional info, I would think that the merge you see in hot threads is simply a normal merge and not a force merge. The name of the thread could fit that, though I am not 100% sure whether a force merge would use a similarly named thread.

Elasticsearch automatically merges smaller segments into bigger segments when it deems it necessary. Having too many segments is costly for search, indexing, and storage, so Elasticsearch keeps the segment count in check by merging in the background.
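If you want to confirm that it is just this regular background merging, the merge stats for the index should reflect it, for example:

GET filebeat-7.16.3-2022.01.28-000003/_stats/merge?filter_path=indices.*.primaries.merges

A steadily increasing merge total there, while ILM explain still shows the index in the hot phase, would point to normal merging rather than the forcemerge action from the warm phase.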

Thanks @HenningAndersen, that's along the same lines as what I'm thinking now as well.

Since this seems to be a slightly different issue, I'll open a new thread to not cause confusion.
