Increase number of shards for system indices

Hi

We have 3 data nodes, where 1 is in datacenter A and 2 are in datacenter B. We tried testing the failover resiliency of our cluster, which resulted in Kibana being completely unable to do anything and a red cluster. I noticed that the system indices only have 1 primary and 1 replica shard, which could in theory mean that both shards end up allocated to datacenter B.

Is there a way to update the number of primary shards to, say, 2 for all system indices?
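One way to check where those system index shards actually sit (a read-only check; the index name `.kibana` is the one mentioned later in this thread) is the cat shards API:

```
GET _cat/shards/.kibana?v
```

The `node` column shows which node holds each primary and replica, so you can confirm whether both copies have landed in the same datacenter.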

Anyone? I cannot seem to find any information about increasing the number of shards for existing system indices like ".kibana" and such.

Read this and specifically the "Also be patient" part.

It's fine to answer on your own thread after 2 or 3 days (not including weekends) if you don't have an answer.

IMO it's better to tell Elasticsearch that you have multiple DCs so it can allocate primaries and replicas accordingly.

See https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html
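A minimal sketch of the approach from that page, assuming a custom attribute named `datacenter` (the attribute *name* is the same on every node; only the *value* differs per datacenter):

```yaml
# elasticsearch.yml on every node in datacenter A
node.attr.datacenter: dc_a

# elasticsearch.yml on every node in datacenter B
node.attr.datacenter: dc_b

# on all nodes (can also be set dynamically via the cluster settings API)
cluster.routing.allocation.awareness.attributes: datacenter
```

With this in place, Elasticsearch avoids putting a primary and its replica in the same datacenter when possible.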

Hi David, Thank you for the reply.

So we have three data nodes: two are "hot" nodes which store data that is up to 1 week old, and one "archive" node which gets the indices that are older than 1 week.

Node 1 is in Datacenter A while Node 2 and Node 3 are in datacenter B.
By setting Node 1 to use:

"cluster.routing.allocation.awareness.attributes: datacenter_A"

and Node 2 & 3 to use:

"cluster.routing.allocation.awareness.attributes: datacenter_B"

would this allocate the primary and replica shards automatically, without the need to modify the existing "normal" and system indices?

Thank you.

Yes. That's a cluster level setting which applies to all indices.
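As a sketch of applying it cluster-wide (assuming every node carries an attribute named `datacenter`, which is an example name, not something from this thread), the awareness setting can be set dynamically without a restart:

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "datacenter"
  }
}
```

Note that the value here is the attribute *name* shared by all nodes, not a per-node value.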

I applied what I wrote above, where node 1 is using:

"cluster.routing.allocation.awareness.attributes: datacenter_A"

And nodes 2 and 3 are using:

"cluster.routing.allocation.awareness.attributes: datacenter_B"

And my cluster health turned RED. So I used the allocation explain API call, and this is what I got:

"can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "045e5WjzQCCjHqj8g_VA2Q",
      "node_name": "sealikreela04-archive",
      "transport_address": "10.229.1.14:9300",
      "node_attributes": {
        "ml.machine_memory": "101352407040",
        "ml.max_open_jobs": "20",
        "xpack.installed": "true",
        "box_type": "archive",
        "ml.enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "awareness",
          "decision": "NO",
          "explanation": "node does not contain the awareness attribute [JVB]; required attributes cluster setting [cluster.routing.allocation.awareness.attributes=JVB]"

Where did it go wrong? I have a total of 5 nodes: 3 are data nodes, 1 is acting only as a master node, and the last one is hosting Kibana.

Do I need to tag all the nodes, even if they're not data nodes?

Apparently you need to.

So I changed all 5 nodes to use an allocation awareness attribute depending on which datacentre they're located in, but the cluster health is now yellow, with the following explained by the explain API:

"current_state": "unassigned",
  "unassigned_info": {
    "reason": "NODE_LEFT",
    "at": "2018-07-04T11:47:09.221Z",
    "details": "node_left[8MvsALY9RpSEPqJVXz3qxQ]",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "045e5WjzQCCjHqj8g_VA2Q",
      "node_name": "sealikreela04-archive",
      "transport_address": "10.229.1.14:9300",
      "node_attributes": {
        "ml.machine_memory": "101352407040",
        "ml.max_open_jobs": "20",
        "xpack.installed": "true",
        "box_type": "archive",
        "ml.enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "filter",
          "decision": "NO",
          "explanation": """node does not match index setting [index.routing.allocation.require] filters [box_type:"hot"]"""
        },
        {
          "decider": "awareness",
          "decision": "NO",
          "explanation": "node does not contain the awareness attribute [KRE]; required attributes cluster setting [cluster.routing.allocation.awareness.attributes=KRE]"
        }

However, the nodes do have the attribute configured in their elasticsearch.yml.

EDIT: It looks like the cluster doesn't want to allocate shards to the nodes because it doesn't think the nodes are tagged with datacentre A or datacentre B. If I run "GET /_cluster/settings" I get this:

{
  "persistent": {
    "xpack": {
      "monitoring": {
        "collection": {
          "enabled": "true"
        }
      }
    }
  },
  "transient": {
    "logger": {
      "_root": "info",
      "org": {
        "elasticsearch": {
          "transport": "info"
        }
      }
    }
  }
}

How would I go about updating via the cluster update settings API (https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-update-settings.html) to set node-specific attributes?

Okay, I totally figured it out; I was coming at it all wrong. In case anyone finds this in the future, here's what I did wrong.

I never set the node attributes correctly. This:

cluster.routing.allocation.awareness.attributes: rack_id,zone

is for the cluster; it belongs in the cluster settings. But in order to use it you have to give the nodes the attributes rack_id, zone, or both, e.g.:

node.attr.rack_id: rack2
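Applied to the two-datacenter setup in this thread, the fix can be sketched like this (the attribute name `datacenter` and the values `A`/`B` are illustrative; any name works as long as the cluster setting references the same attribute name):

```yaml
# Node 1 (datacenter A) - elasticsearch.yml
node.attr.datacenter: A

# Nodes 2 and 3 (datacenter B) - elasticsearch.yml
node.attr.datacenter: B

# cluster-wide: the awareness setting names the ATTRIBUTE, not a value
cluster.routing.allocation.awareness.attributes: datacenter
```

The earlier mistake was putting a per-node value (`datacenter_A`, `datacenter_B`) into `cluster.routing.allocation.awareness.attributes`, which made Elasticsearch look for attributes with those names that no node actually had.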

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.