Problem with shards when upgrading to 2.3.3

Hi,

I have upgraded a 3-node cluster from 2.1 to 2.3.3, and now I have a problem with shard replicas not being allocated on all of my nodes.

I upgraded each node this way:
1. Turned off Elasticsearch
2. Upgraded Elasticsearch from the yum repositories
3. Updated the node with yum
4. Checked configurations and permissions
5. Started Elasticsearch
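For what it's worth, the 2.x rolling-upgrade documentation also recommends disabling shard allocation before stopping each node and re-enabling it after the node rejoins; skipping this step can leave the cluster shuffling shards mid-upgrade. A sketch of that documented step (not something done in this thread):

```shell
# Before stopping a node: stop the master from reallocating its shards elsewhere
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "none" }
}'

# ... stop the node, upgrade the package, start it and wait for it to rejoin ...

# After the node rejoins: re-enable allocation so replicas can recover
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "all" }
}'
```

Note this is the transient cluster-level setting; it is independent of any per-index allocation settings.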

When the first node restarted, it joined the cluster and everything became green.
After the second node was upgraded, replicas would not allocate on it. I thought the problem could be the last non-upgraded node, which had become master, so after a while I upgraded it too.
While it was upgrading, replicas started allocating on the second node (the one on which they would not allocate before); then, after the upgrade of the third node finished, I had the same problem there: replicas would not allocate.
I decided to wait over the weekend because I read that on 2.3.3 indexing can be very slow. Now the state is what you can see in the image: new indices allocate on all nodes, old indices don't.

The nodes are CentOS 7 with Logstash 2.3.3 and JDK 1.8.

Does anyone have an idea of what causes this problem?

Thank you,
Miso

Please try a reroute command with the explain parameter turned on:

(see https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html )

curl -XPOST 'localhost:9200/_cluster/reroute?explain' -d '{
    "commands" : [
        {
          "allocate" : {
              "index" : "winlogbeat-2016.07.01", "shard" : 0, "node" : "ith-grs-sec-centos03"
          }
        }
    ]
}'

Hi,

I launched the command; here's the output:

{
  "acknowledged": true,
  "explanations": [
    {
      "command": "allocate",
      "decisions": [
        { "decider": "filter", "decision": "YES", "explanation": "node passes include/exclude/require filters" },
        { "decider": "enable", "decision": "YES", "explanation": "allocation disabling is ignored" },
        { "decider": "shards_limit", "decision": "YES", "explanation": "total shard limit disabled: [index: -1, cluster: -1] <= 0" },
        { "decider": "same_shard", "decision": "YES", "explanation": "shard is not allocated to same node or host" },
        { "decider": "awareness", "decision": "YES", "explanation": "no allocation awareness enabled" },
        { "decider": "disable", "decision": "YES", "explanation": "allocation disabling is ignored" },
        { "decider": "node_version", "decision": "YES", "explanation": "target node version [2.3.3] is same or newer than source node version [2.3.3]" },
        { "decider": "disk_threshold", "decision": "YES", "explanation": "enough disk for shard on node, free: [22.4gb]" },
        { "decider": "throttling", "decision": "YES", "explanation": "below shard recovery limit of [2]" },
        { "decider": "replica_after_primary_active", "decision": "YES", "explanation": "primary is already active" },
        { "decider": "snapshot_in_progress", "decision": "YES", "explanation": "shard not primary or relocation disabled" }
      ],
      "parameters": {
        "allow_primary": false,
        "index": "winlogbeat-2016.07.01",
        "node": "ith-grs-sec-centos03",
        "shard": 1
      }
    }
  ],
  "state": {
    "blocks": {},
    "master_node": "KMqX1LEhR1ubRj5ixdiFLw",
    "nodes": {
      "KMqX1LEhR1ubRj5ixdiFLw": {
        "attributes": { "master": "true", "rack": "ith-grs-sec-centos08" },
        "name": "ith-grs-sec-centos08",
        "transport_address": "10.200.144.25:9300"
      },
      "W4MbeuAOTC-UbLovyqi4QA": {
        "attributes": { "master": "true", "rack": "ith-grs-sec-centos03" },
        "name": "ith-grs-sec-centos03",
        "transport_address": "10.200.144.23:9300"
      },
      "j017l3miS6iLYlD4fLyHng": {
        "attributes": { "master": "true", "rack": "ith-grs-sec-centos04" },
        "name": "ith-grs-sec-centos04",
        "transport_address": "10.200.144.27:9300"
      }
    },
    "routing_nodes": {

Then there's info on all of my nodes and shards. I removed that part because it is repetitive and the output was very big; the entry for every unassigned replica looks like this:

            {
                "index": "winlogbeat-2016.07.01",
                "node": null,
                "primary": false,
                "relocating_node": null,
                "shard": 2,
                "state": "UNASSIGNED",
                "unassigned_info": {
                    "at": "2016-07-01T15:03:18.945Z",
                    "reason": "REPLICA_ADDED"
                },
                "version": 14
            },

After the command, the shard was allocated.

If you need more info or the complete output, let me know.

Are all the shards now allocated or just this one? Are shards allocated if you issue an empty reroute command?

curl -XPOST 'localhost:9200/_cluster/reroute'

Only one shard was allocated.

No, shards are not allocated. I launched the command with explain but got no explanations:

{
  "acknowledged": true,
  "explanations": [],
  "state": {
    "blocks": {},
    "master_node": "KMqX1LEhR1ubRj5ixdiFLw",
    "nodes": {
      "KMqX1LEhR1ubRj5ixdiFLw": {
        "attributes": { "master": "true", "rack": "ith-grs-sec-centos08" },
        "name": "ith-grs-sec-centos08",
        "transport_address": "10.200.144.25:9300"
      },
      "W4MbeuAOTC-UbLovyqi4QA": {
        "attributes": { "master": "true", "rack": "ith-grs-sec-centos03" },
        "name": "ith-grs-sec-centos03",
        "transport_address": "10.200.144.23:9300"
      },
      "j017l3miS6iLYlD4fLyHng": {
        "attributes": { "master": "true", "rack": "ith-grs-sec-centos04" },
        "name": "ith-grs-sec-centos04",
        "transport_address": "10.200.144.27:9300"
      }
    },
    "routing_nodes": {

Maybe allocation is just disabled?

Can you show me the cluster settings?

curl -XGET 'http://localhost:9200/_cluster/settings?pretty'

I had already tried that and got:

{
"persistent" : { },
"transient" : { }
}

I tried to enable with:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "cluster.routing.allocation.enable" : "all"
  }
}'

and now I get:

{
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "enable" : "all"
        }
      }
    }
  }
}

but after waiting a while, the shards are still not allocated :confused:

I'm running out of ideas here. Can you check that all three nodes are indeed on v2.3.3?

Also output of

curl localhost:9200/_cluster/health?pretty

and

curl localhost:9200/_cat/shards

I checked with _nodes?pretty and all versions are 2.3.3:

"KMqX1LEhR1ubRj5ixdiFLw" : {
"name" : "ith-grs-sec-centos08",
"transport_address" : "10.200.144.25:9300",
"host" : "10.200.144.25",
"ip" : "10.200.144.25",
"version" : "2.3.3",
"build" : "218bdf1",
"http_address" : "10.200.144.25:9200",
"attributes" : {
"rack" : "ith-grs-sec-centos08",
"master" : "true"
},
...
"W4MbeuAOTC-UbLovyqi4QA" : {
"name" : "ith-grs-sec-centos03",
"transport_address" : "10.200.144.23:9300",
"host" : "10.200.144.23",
"ip" : "10.200.144.23",
"version" : "2.3.3",
"build" : "218bdf1",
"http_address" : "10.200.144.23:9200",
"attributes" : {
"rack" : "ith-grs-sec-centos03",
"master" : "true"
},
...
"j017l3miS6iLYlD4fLyHng" : {
"name" : "ith-grs-sec-centos04",
"transport_address" : "10.200.144.27:9300",
"host" : "10.200.144.27",
"ip" : "10.200.144.27",
"version" : "2.3.3",
"build" : "218bdf1",
"http_address" : "10.200.144.27:9200",
"attributes" : {
"rack" : "ith-grs-sec-centos04",
"master" : "true"
},

If you need the entire output, just let me know.

Here it is:

{
  "cluster_name" : "security-test-kibana",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 495,
  "active_shards" : 595,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 395,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 60.1010101010101
}

and finally:

The output is too big for this form, so I'm copying some sample lines here:

winlogbeat-2016.06.30 3 p STARTED 6531 4.4mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.06.30 3 r UNASSIGNED
winlogbeat-2016.06.30 2 p STARTED 6372 4.4mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.06.30 2 r UNASSIGNED
winlogbeat-2016.06.30 1 p STARTED 6397 4.3mb 10.200.144.23 ith-grs-sec-centos03
winlogbeat-2016.06.30 1 r UNASSIGNED
winlogbeat-2016.06.30 0 p STARTED 6235 4.3mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.06.30 0 r UNASSIGNED
winlogbeat-2016.07.01 3 p STARTED 5802 4.1mb 10.200.144.23 ith-grs-sec-centos03
winlogbeat-2016.07.01 3 r UNASSIGNED
winlogbeat-2016.07.01 2 p STARTED 4394 2.9mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.07.01 2 r STARTED 4394 2.9mb 10.200.144.27 ith-grs-sec-centos04
winlogbeat-2016.07.01 1 r STARTED 4517 3.1mb 10.200.144.23 ith-grs-sec-centos03
winlogbeat-2016.07.01 1 p STARTED 4517 3.1mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.07.01 0 r STARTED 4461 3mb 10.200.144.23 ith-grs-sec-centos03
winlogbeat-2016.07.01 0 p STARTED 4461 3mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.07.02 3 r STARTED 3085 2mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.07.02 3 p STARTED 3085 2mb 10.200.144.27 ith-grs-sec-centos04
winlogbeat-2016.07.02 2 r STARTED 3108 2.1mb 10.200.144.23 ith-grs-sec-centos03
winlogbeat-2016.07.02 2 p STARTED 3108 2.1mb 10.200.144.27 ith-grs-sec-centos04
winlogbeat-2016.07.02 1 r STARTED 3165 2.1mb 10.200.144.23 ith-grs-sec-centos03
winlogbeat-2016.07.02 1 p STARTED 3165 2.1mb 10.200.144.27 ith-grs-sec-centos04
winlogbeat-2016.07.02 0 r STARTED 3194 2.1mb 10.200.144.23 ith-grs-sec-centos03
winlogbeat-2016.07.02 0 p STARTED 3194 2.1mb 10.200.144.27 ith-grs-sec-centos04

The shards from 07.02 onwards are those allocated after the upgrade.
The started shards from 07.01 are those which I allocated with the command @ywelsch suggested.
The shards from 07.01 and before are all like these: primary allocated, replicas unassigned.
If you need the complete output, let me know.
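As a side note, a listing this big can be summarised instead of pasted: `_cat/shards` output pipes nicely through `awk` to count unassigned shards per index. A sketch (the here-doc holds sample lines mimicking the listing above; against a live cluster you would pipe `curl -s localhost:9200/_cat/shards` instead):

```shell
# Count UNASSIGNED shards per index from `_cat/shards` output.
# Column 1 is the index name, column 4 is the shard state.
awk '$4 == "UNASSIGNED" { count[$1]++ }
     END { for (i in count) print i, count[i] }' <<'EOF'
winlogbeat-2016.06.30 3 p STARTED 6531 4.4mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.06.30 3 r UNASSIGNED
winlogbeat-2016.06.30 2 p STARTED 6372 4.4mb 10.200.144.25 ith-grs-sec-centos08
winlogbeat-2016.06.30 2 r UNASSIGNED
winlogbeat-2016.07.01 3 p STARTED 5802 4.1mb 10.200.144.23 ith-grs-sec-centos03
winlogbeat-2016.07.01 3 r UNASSIGNED
EOF
```

For the sample above this prints one line per index with its unassigned count (2 for 06.30, 1 for 07.01); the order of the lines is whatever `awk`'s array iteration yields.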

Maybe it's easier to provide me with the full output of curl 'http://localhost:9200/_cluster/state?pretty' (upload on http://pastebin.com for example). If it contains sensitive information, send me the link via private message here.

OK, I uploaded the result to my Google Drive because it was too big even for pastebin.

https://drive.google.com/open?id=0B5Vf-vwufarxdU5jQXI0YXJGT3c

Ok, this shows the issue clearly. For many of the indices you have the index setting index.routing.allocation.disable_allocation set to true, disabling allocation for shards of those indices. Please set this to false and shards should be allocating again.

curl -XPUT http://localhost:9200/*/_settings -d '{"index.routing.allocation.disable_allocation": false}'
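After running this, a quick way to confirm the fix took effect is to re-check the settings of one affected index and count the remaining unassigned shards (a sketch; the index name is taken from the listings above):

```shell
# Confirm disable_allocation no longer appears for an affected index
curl -s 'localhost:9200/winlogbeat-2016.07.01/_settings?pretty'

# Count shards still unassigned; this should drop to 0 as replicas recover
curl -s 'localhost:9200/_cat/shards' | grep -c UNASSIGNED
```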

OK, now it is slowly allocating all the shards. It hasn't finished yet, but the problem seems resolved.
I can't understand why the setting was like this, since I told the cluster to enable allocation.

Anyway, thank you very much!