Increase number of shards for system indices


#1

Hi

We have 3 data nodes, where 1 is in datacenter A and 2 are in datacenter B. We tried testing the failover resiliency of our cluster, which resulted in Kibana being completely unable to do anything and a red cluster. I noticed that the system indices only have 1 primary and 1 replica shard, which in theory means that both shards could be allocated to datacenter B.

Is there a way to increase the number of primary shards to, say, 2 for all system indices?


#2

Anyone? I cannot seem to find any information about increasing the number of shards for existing system indices such as ".kibana".


(David Pilato) #3

Read this and specifically the "Also be patient" part.

It's fine to answer on your own thread after 2 or 3 days (not including weekends) if you don't have an answer.


(David Pilato) #4

IMO it's better to tell Elasticsearch that you have multiple DCs so it can allocate primaries and replicas accordingly.

See https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html
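Roughly, the linked page describes a two-part setup: each node declares a custom attribute in its elasticsearch.yml, and the cluster is then told to use that attribute for awareness. A minimal sketch (the attribute name `datacenter` and the values `A`/`B` are illustrative, not from this thread):

```yaml
# elasticsearch.yml on the node in datacenter A
node.attr.datacenter: A

# elasticsearch.yml on each node in datacenter B
node.attr.datacenter: B

# on every node (or set dynamically via the cluster settings API):
# the value names the *attribute key*, not a datacenter
cluster.routing.allocation.awareness.attributes: datacenter
```

With this in place, Elasticsearch avoids putting a primary and its replica on nodes that share the same `datacenter` value.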


#5

Hi David, Thank you for the reply.

So we have three data nodes: two are "hot" nodes which store data up to 1 week old, and one "archive" node which receives the indices older than 1 week.

Node 1 is in datacenter A, while nodes 2 and 3 are in datacenter B.
By setting Node 1 to use:

"cluster.routing.allocation.awareness.attributes: datacenter_A"

and Node 2 & 3 to use:

"cluster.routing.allocation.awareness.attributes: datacenter_B"

would this allocate the primary and replica shards automatically, without the need to modify the existing "normal" and system indices?

Thank you.


(David Pilato) #6

Yes. That's a cluster-level setting which applies to all indices.


#7

I applied what I wrote above, where node 1 is using:

"cluster.routing.allocation.awareness.attributes: datacenter_A"

and nodes 2 and 3 are using:

"cluster.routing.allocation.awareness.attributes: datacenter_B"

And my cluster health turned RED. So I used the Explain API, and this is what I got:

"can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "045e5WjzQCCjHqj8g_VA2Q",
      "node_name": "sealikreela04-archive",
      "transport_address": "10.229.1.14:9300",
      "node_attributes": {
        "ml.machine_memory": "101352407040",
        "ml.max_open_jobs": "20",
        "xpack.installed": "true",
        "box_type": "archive",
        "ml.enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "awareness",
          "decision": "NO",
          "explanation": "node does not contain the awareness attribute [JVB]; required attributes cluster setting [cluster.routing.allocation.awareness.attributes=JVB]"

Where did it go wrong? I have a total of 5 nodes: 3 are data nodes, 1 acts only as a master node, and the last one hosts Kibana.

Do I need to tag all the nodes, even the ones that are not data nodes?


(David Pilato) #8

Apparently you need to.


#9

So I changed all 5 nodes to use an allocation awareness attribute matching the datacenter they're located in, but the cluster health is now yellow, with the following explained by the Explain API:

"current_state": "unassigned",
  "unassigned_info": {
    "reason": "NODE_LEFT",
    "at": "2018-07-04T11:47:09.221Z",
    "details": "node_left[8MvsALY9RpSEPqJVXz3qxQ]",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "045e5WjzQCCjHqj8g_VA2Q",
      "node_name": "sealikreela04-archive",
      "transport_address": "10.229.1.14:9300",
      "node_attributes": {
        "ml.machine_memory": "101352407040",
        "ml.max_open_jobs": "20",
        "xpack.installed": "true",
        "box_type": "archive",
        "ml.enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "filter",
          "decision": "NO",
          "explanation": """node does not match index setting [index.routing.allocation.require] filters [box_type:"hot"]"""
        },
        {
          "decider": "awareness",
          "decision": "NO",
          "explanation": "node does not contain the awareness attribute [KRE]; required attributes cluster setting [cluster.routing.allocation.awareness.attributes=KRE]"
        }

However, the nodes do have the attribute configured in their elasticsearch.yml.

EDIT: It looks like the cluster doesn't want to allocate shards to the nodes because it doesn't think the nodes are tagged with datacenter A or datacenter B. If I run "GET /_cluster/settings", I get this:

{
  "persistent": {
    "xpack": {
      "monitoring": {
        "collection": {
          "enabled": "true"
        }
      }
    }
  },
  "transient": {
    "logger": {
      "_root": "info",
      "org": {
        "elasticsearch": {
          "transport": "info"
        }
      }
    }
  }
}

How would I go about using the update settings API (https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-update-settings.html) to set node-specific attributes?


#10

Okay, I totally figured it out; I was coming at it all wrong. In case anyone finds this in the future, here's what I did wrong.

I never set the node attributes correctly. This:

cluster.routing.allocation.awareness.attributes: rack_id,zone

is a cluster-level setting; it belongs in the cluster settings. But in order to use it, you have to give the nodes the attributes rack_id, zone, or both, e.g.:

node.attr.rack_id: rack2
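To spell out the cluster-level half: node attributes like `node.attr.rack_id` can only go in each node's elasticsearch.yml (they cannot be set through the cluster settings API), while the awareness setting itself can be applied dynamically. A sketch, assuming the attribute is named `datacenter` (illustrative, matching the scenario above):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "datacenter"
  }
}
```

Note the value is the attribute *name* shared by all nodes, not a per-datacenter value; each node's own elasticsearch.yml carries the value (e.g. `node.attr.datacenter: A`), which is exactly the mistake described above.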


(system) #11

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.