ES Unassigned shards not assigning

We were recently moving an Elasticsearch cluster to a new environment. I'm not certain why, but at some point half of the nodes in the old environment shut off. This caused a bunch of shards to go unassigned, which I didn't initially think was a problem. However, I'd like the current shards to just distribute themselves among the REMAINING nodes, as we are decommissioning the old ones.

I have been trying the cluster allocation and rebalance settings, and for some reason the shards STILL won't assign. Here is what my cluster health and cluster settings look like right now:

{
    "cluster_name" : "non-prod-management",
    "status" : "red",
    "timed_out" : false,
    "number_of_nodes" : 8,
    "number_of_data_nodes" : 5,
    "active_primary_shards" : 4772,
    "active_shards" : 9612,
    "relocating_shards" : 2,
    "initializing_shards" : 0,
    "unassigned_shards" : 2994,
    "delayed_unassigned_shards" : 0,
    "number_of_pending_tasks" : 0,
    "number_of_in_flight_fetch" : 0,
    "task_max_waiting_in_queue_millis" : 0,
    "active_shards_percent_as_number" : 76.24940504521656
}

{
    "persistent" : {
      "cluster" : {
        "routing" : {
          "allocation" : {
            "enable" : "all"
          }
        }
      }
    },
    "transient" : {
      "cluster" : {
        "routing" : {
          "rebalance" : {
            "enable" : "all"
          },
          "allocation" : {
            "enable" : "all",
            "allow_rebalance" : "always"
          }
        }
      }
    }
}
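
For reference, those transient values were set through the cluster settings API, roughly like this (newer ES versions also want a Content-Type: application/json header on the request):

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "cluster.routing.allocation.enable" : "all",
    "cluster.routing.rebalance.enable" : "all",
    "cluster.routing.allocation.allow_rebalance" : "always"
  }
}'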

I am considering using the reroute API in a loop across the unassigned shards, but only really as a last resort. Are there any other ways I can possibly get these to re-allocate? I cannot figure out why they seem to be stuck.
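
For the record, the kind of reroute call I'd be looping over would look roughly like this. The index, shard number, and node name below are placeholders; on ES 5.x and later the command is allocate_replica / allocate_empty_primary rather than plain allocate, and allow_primary can lose data if the primary's copy really is gone:

curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
  "commands" : [
    {
      "allocate" : {
        "index" : "some-unassigned-index",
        "shard" : 0,
        "node" : "remaining-node-name",
        "allow_primary" : true
      }
    }
  ]
}'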

Any help appreciated... Thank you!!

Your cluster is RED because some of the unassigned shards are primaries; the 2 relocating shards may be primaries that are in the process of being assigned. Are these large shards?

How many indices do you have and what is your number of replicas setting for each?

Lastly, that's a lot of shards to distribute over just 5 nodes. Is it possible your nodes are dying from out of memory errors or something similar causing the cluster chaos?
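
Also, to see exactly which shards are stuck (and which indices they belong to), something like this should do it:

curl -s -XGET 'http://localhost:9200/_cat/shards' | grep UNASSIGNED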

We have 3282 indices, SOME of which range up to 70 GB in size, but the vast majority are under a gig. It looks like 200 of them are over 1 GB, and only 28 of those are over 10 GB. We have 5 shards per index with 2 replicas. The servers are not MASSIVE, but when I check the memory and CPU on the machines it seems reasonable, around 1.5 GB of memory free, so I don't think that is the problem...
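
For reference, per-index sizes and shard counts like these are easy to pull from the cat API, e.g.:

curl -XGET 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,store.size'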

Oh, it seems that number was on the master, not the slaves; one of those is hanging at 223 MB of free memory... I am going to try adding another node to see if that helps alleviate the load.

Yeah, in general that's a pretty high number of shards for just 5 data nodes; prefer fewer shards with more data per shard (up to a limit). Is your one node with 223 MB free memory doing a lot of GC?

I am not sure; how can I check that?

curl -XGET 'http://localhost:9200/_nodes/stats'

There is a JVM section in there with GC stats (as well as other JVM-related stats).
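
If you only care about the JVM numbers, you can also hit the filtered version of that endpoint:

curl -XGET 'http://localhost:9200/_nodes/stats/jvm?pretty'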

On just one of the slaves, I see this:

"gc" : {
      "collectors" : {
        "young" : {
          "collection_count" : 405843,
          "collection_time_in_millis" : 22200921
        },
        "old" : {
          "collection_count" : 627,
          "collection_time_in_millis" : 75766
        }
      }
}
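
If I'm reading those numbers right, old-gen collections average about 75766 / 627 ≈ 121 ms each, and the young-gen total of 22200921 ms works out to roughly 6.2 hours of GC over however long the node has been up, so nothing there looks obviously pathological on its own.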

Since then I have added two more nodes to the cluster and it has had basically no effect... The free memory on the slaves ranges from 500 MB to 3 GB, so I no longer think it's a memory problem...

Is there info in the logs that might give some clues as to what's happening? Anything around shards not being allocated, rerouting, or recovery?
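
For example, on a standard package install the log file is usually /var/log/elasticsearch/<cluster name>.log (so non-prod-management.log here, though your paths may differ), and something like this would pull out the relevant lines:

grep -iE 'unassigned|allocat|reroute|recover' /var/log/elasticsearch/non-prod-management.log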