Index management and rollover

{
  "index" : "packetbeat-7.3.0-000001",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2019-11-06T20:31:04.071Z",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "0j54YASqTNOnOKXpr4Nz5A",
      "node_name" : "INDY-LOGSRV01",
      "transport_address" : "10.3.200.45:9300",
      "node_attributes" : {
        "ml.machine_memory" : "68718940160",
        "xpack.installed" : "true",
        "box_type" : "hot",
        "ml.max_open_jobs" : "20"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "YES",
          "explanation" : "shard has no previous failures"
        },
        {
          "decider" : "replica_after_primary_active",
          "decision" : "YES",
          "explanation" : "primary shard for this replica is already active"
        },
        {
          "decider" : "enable",
          "decision" : "YES",
          "explanation" : "all allocations are allowed"
        },
        {
          "decider" : "node_version",
          "decision" : "YES",
          "explanation" : "can allocate replica shard to a node with version [7.3.0] since this is equal-or-newer than the primary version [7.3.0]"
        },
        {
          "decider" : "snapshot_in_progress",
          "decision" : "YES",
          "explanation" : "the shard is not being snapshotted"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "YES",
          "explanation" : "ignored as shard is not being recovered from a snapshot"
        },
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : """node does not match index setting [index.routing.allocation.require] filters [box_type:"warm"]"""
        },
        {
          "decider" : "same_shard",
          "decision" : "YES",
          "explanation" : "the shard does not exist on the same node"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "YES",
          "explanation" : "enough disk for shard on node, free: [283.8gb], shard size: [0b], free after allocating shard: [283.8gb]"
        },
        {
          "decider" : "throttling",
          "decision" : "YES",
          "explanation" : "below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2]"
        },
        {
          "decider" : "shards_limit",
          "decision" : "YES",
          "explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
        },
        {
          "decider" : "awareness",
          "decision" : "YES",
          "explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
        }
      ]
    },
    {
      "node_id" : "GqP1krHiTX6Iqe0t9bzvaQ",
      "node_name" : "INDY-LOGSRV06",
      "transport_address" : "10.3.200.50:9300",
      "node_attributes" : {
        "ml.machine_memory" : "68718940160",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true",
        "box_type" : "hot"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "YES",
          "explanation" : "shard has no previous failures"
        },
        {
          "decider" : "replica_after_primary_active",
          "decision" : "YES",
          "explanation" : "primary shard for this replica is already active"
        },
        {
          "decider" : "enable",
          "decision" : "YES",
          "explanation" : "all allocations are allowed"
        },
        {
          "decider" : "node_version",
          "decision" : "YES",
          "explanation" : "can allocate replica shard to a node with version [7.3.0] since this is equal-or-newer than the primary version [7.3.0]"
        },
        {
          "decider" : "snapshot_in_progress",
          "decision" : "YES",
          "explanation" : "the shard is not being snapshotted"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "YES",
          "explanation" : "ignored as shard is not being recovered from a snapshot"
        },
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : """node does not match index setting [index.routing.allocation.require] filters [box_type:"warm"]"""
        },
        {
          "decider" : "same_shard",
          "decision" : "YES",
          "explanation" : "the shard does not exist on the same node"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "YES",
          "explanation" : "enough disk for shard on node, free: [317.7gb], shard size: [0b], free after allocating shard: [317.7gb]"
        },
        {
          "decider" : "throttling",
          "decision" : "YES",
          "explanation" : "below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2]"
        },
        {
          "decider" : "shards_limit",
          "decision" : "YES",
          "explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
        },
        {
          "decider" : "awareness",
          "decision" : "YES",
          "explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
        }
      ]
    },
    {
      "node_id" : "Xzmh5yzfRhOPwRK0gBiujw",
      "node_name" : "INDY-LOGSRV02",
      "transport_address" : "10.3.200.46:9300",
      "node_attributes" : {
        "ml.machine_memory" : "68718940160",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true",
        "box_type" : "warm"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "YES",
          "explanation" : "shard has no previous failures"
        },
        {
          "decider" : "replica_after_primary_active",
          "decision" : "YES",
          "explanation" : "primary shard for this replica is already active"
        },
        {
          "decider" : "enable",
          "decision" : "YES",
          "explanation" : "all allocations are allowed"
        },
        {
          "decider" : "node_version",
          "decision" : "YES",
          "explanation" : "can allocate replica shard to a node with version [7.3.0] since this is equal-or-newer than the primary version [7.3.0]"
        },
        {
          "decider" : "snapshot_in_progress",
          "decision" : "YES",
          "explanation" : "the shard is not being snapshotted"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "YES",
          "explanation" : "ignored as shard is not being recovered from a snapshot"
        },
        {
          "decider" : "filter",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require filters"
        },
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[packetbeat-7.3.0-000001][0], node[Xzmh5yzfRhOPwRK0gBiujw], [P], s[STARTED], a[id=DnG9RX2FSTmurLu91iTK5Q]]"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "YES",
          "explanation" : "enough disk for shard on node, free: [231.4gb], shard size: [0b], free after allocating shard: [231.4gb]"
        },
        {
          "decider" : "throttling",
          "decision" : "YES",
          "explanation" : "below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2]"
        },
        {
          "decider" : "shards_limit",
          "decision" : "YES",
          "explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
        },
        {
          "decider" : "awareness",
          "decision" : "YES",
          "explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
        }
      ]
    },
    {
      "node_id" : "oIyIKckYTAia9O8L8DYFRw",
      "node_name" : "INDY-LOGSRV05",
      "transport_address" : "10.3.200.49:9300",
      "node_attributes" : {
        "ml.machine_memory" : "25756258304",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true",
        "box_type" : "cold"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "YES",
          "explanation" : "shard has no previous failures"
        },
        {
          "decider" : "replica_after_primary_active",
          "decision" : "YES",
          "explanation" : "primary shard for this replica is already active"
        },
        {
          "decider" : "enable",
          "decision" : "YES",
          "explanation" : "all allocations are allowed"
        },
        {
          "decider" : "node_version",
          "decision" : "YES",
          "explanation" : "can allocate replica shard to a node with version [7.3.0] since this is equal-or-newer than the primary version [7.3.0]"
        },
        {
          "decider" : "snapshot_in_progress",
          "decision" : "YES",
          "explanation" : "the shard is not being snapshotted"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "YES",
          "explanation" : "ignored as shard is not being recovered from a snapshot"
        },
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : """node does not match index setting [index.routing.allocation.require] filters [box_type:"warm"]"""
        },
        {
          "decider" : "same_shard",
          "decision" : "YES",
          "explanation" : "the shard does not exist on the same node"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "YES",
          "explanation" : "enough disk for shard on node, free: [4.3tb], shard size: [0b], free after allocating shard: [4.3tb]"
        },
        {
          "decider" : "throttling",
          "decision" : "YES",
          "explanation" : "below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2]"
        },
        {
          "decider" : "shards_limit",
          "decision" : "YES",
          "explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
        },
        {
          "decider" : "awareness",
          "decision" : "YES",
          "explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
        }
      ]
    }
  ]
}
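
(The output above is the response from the cluster allocation explain API. For reference, a request along these lines should reproduce it; the exact body is an assumption, since calling the API without one simply explains the first unassigned shard it finds:)

GET _cluster/allocation/explain
{
    "index" : "packetbeat-7.3.0-000001",
    "shard" : 0,
    "primary" : false
}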

Sorry for the 3 posts. I didn't want to leave out something important again.

Thank you

I'm only seeing one "warm" node, which is where the primary is. Do you have at least 2 warm nodes?
What does
GET _cat/nodeattrs?h=name,attr,value
return?

And what is the size of the index?
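
(For reference, one quick way to check the size is something like the following; the column list is just a suggestion:)

GET _cat/indices/packetbeat-7.3.0-000001?v&h=index,pri,rep,docs.count,store.size,pri.store.size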

The command returns the following:

INDY-LOGSRV02 ml.machine_memory 68718940160
INDY-LOGSRV02 ml.max_open_jobs  20
INDY-LOGSRV02 xpack.installed   true
INDY-LOGSRV02 box_type          warm
INDY-LOGSRV06 ml.machine_memory 68718940160
INDY-LOGSRV06 ml.max_open_jobs  20
INDY-LOGSRV06 xpack.installed   true
INDY-LOGSRV06 box_type          hot
INDY-LOGSRV01 ml.machine_memory 68718940160
INDY-LOGSRV01 xpack.installed   true
INDY-LOGSRV01 box_type          hot
INDY-LOGSRV01 ml.max_open_jobs  20
INDY-LOGSRV05 ml.machine_memory 25756258304
INDY-LOGSRV05 ml.max_open_jobs  20
INDY-LOGSRV05 xpack.installed   true
INDY-LOGSRV05 box_type          cold

I only have 1 warm node. I was shrinking the shards down to 1 on the move to the warm node.

The index size is just under 53 GB.

Could the problem be that the index was created before I assigned the Hot, Warm, Cold settings?

Thanks

I was shrinking the shards down to 1 on the move to the warm node.

OK. That means reducing to a single primary shard from however many primary shards the index has in its hot stage. It does not mean no replicas: if the index is configured with a single replica per shard, you'll still have one primary and one replica.
If you want no replicas, you need to set number_of_replicas to 0 in your ILM allocation action.
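
For example, a warm phase along these lines would do it. (This is only a sketch: the policy name, rollover thresholds, and box_type attribute are placeholders standing in for whatever your actual policy uses, with the allocate action dropping replicas and the shrink action reducing to one primary shard.)

PUT _ilm/policy/packetbeat-policy
{
    "policy" : {
        "phases" : {
            "hot" : {
                "actions" : {
                    "rollover" : {
                        "max_size" : "50gb",
                        "max_age" : "1d"
                    }
                }
            },
            "warm" : {
                "actions" : {
                    "allocate" : {
                        "number_of_replicas" : 0,
                        "require" : {
                            "box_type" : "warm"
                        }
                    },
                    "shrink" : {
                        "number_of_shards" : 1
                    }
                }
            }
        }
    }
}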

In the meantime, you can drop the replica(s) on any offending index directly.

PUT /packetbeat-7.3.0-000001/_settings
{
    "index" : {
        "number_of_replicas" : 0
    }
}

To make sure I understand correctly: in the warm phase of ILM I set the replicas to 0, and that should solve my issue. I will need to run the command from above on all my indexes that have already rolled over, and I will have to do the same when it rolls over again, but after that rollover it will be automatic. The ILM change won't take effect until index 003.

Thank you

Yes, that sounds correct.
If you only have 2 unallocated shards, though, it sounds like you have at most 2 warm indices to fix manually.
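
If it helps, the settings API accepts wildcards in the index name, so one request along these lines (the index pattern here is an assumption) should cover every rolled-over packetbeat index at once. Note it would also drop replicas on any hot indices matching the pattern, so narrow it if that matters:

PUT /packetbeat-7.3.0-*/_settings
{
    "index" : {
        "number_of_replicas" : 0
    }
}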

Just to belabor the obvious, this means you will not have any replicas on your warm indices. If you lose your single warm node, you will lose all those indices.

You should consider systematic snapshotting of indices, if you are not already doing so. As yet, this isn't a capability of ILM, so an external agent must be used. Curator is a tool built for exactly this purpose.
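
As a minimal sketch, a one-off snapshot against a shared-filesystem repository would look something like the following. The repository name, path, and index pattern are placeholders, and the fs repository type also needs path.repo set in elasticsearch.yml on each node; Curator can then run an equivalent snapshot action on a schedule.

PUT _snapshot/my_backup_repo
{
    "type" : "fs",
    "settings" : {
        "location" : "/mnt/elasticsearch_backups"
    }
}

PUT _snapshot/my_backup_repo/snapshot_1?wait_for_completion=true
{
    "indices" : "packetbeat-*",
    "include_global_state" : false
}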

I have done what you suggested. Nothing is being written to the new indexes. Will this take some time, or should it happen fairly quickly?

Thank you for all your help and helping me understand.

I don't understand. What are the "new indices"? From the daily rollover, or something else?

Is your cluster now green (i.e. no more unallocated shards)?

Maybe I am using the wrong term, but here is a screenshot of what I am talking about.

That appears to be a completely separate issue from the shard allocation issue.

Is your cluster now green, all shards allocated?

If so, I'd recommend opening a new topic.

Yes my cluster is Green.

Thank you for your help.