Rolling upgrade issue - ES 2.1.1 to ES 2.4.1


(Imran Siddique) #1

Hi there,
We are trying a rolling upgrade of our clusters from ES 2.1.1 to ES 2.4.1. What we noticed is that once we move the 2nd data node to ES 2.4.1, not all shards on that node come back up; some remain unallocated, and the cluster stays in yellow state. Since our logic is to not move on to the next node until the cluster is green, we remain stuck in yellow for days.
Any inputs?
Thanks
Imran


(Yannick Welsch) #2

Once a primary shard is allocated to a 2.4.1 node, a replica for that shard cannot be allocated to a 2.1.1 node anymore as the primary on the new node might have already written segments that use a new postings format or codec that is not available on the lower-version node. Can you check if this is the scenario you're seeing?
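The constraint described above can be sketched as a tiny predicate. This is a hypothetical illustration, not Elasticsearch source; the function names are mine:

```python
def version_tuple(v):
    """Parse a version string like "2.4.1" into a comparable tuple."""
    return tuple(int(x) for x in v.split("."))

def replica_allocatable(replica_node_version, primary_node_version):
    # A replica may only be allocated to a node at least as new as the node
    # holding the primary, since the primary may already have written segments
    # in a postings format/codec the older node cannot read.
    return version_tuple(replica_node_version) >= version_tuple(primary_node_version)

print(replica_allocatable("2.1.1", "2.4.1"))  # False: a 2.1.1 node can't take the replica
print(replica_allocatable("2.4.1", "2.1.1"))  # True: a newer node can
```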


(Imran Siddique) #3

Yup.. We know the issue now!


(Imran Siddique) #4

Closing the thread with the reason why we ran into this issue -

We have 3 data nodes and 2 replicas for each shard, so each data node holds one copy of every shard. To explain our issue with an example:

  • Assume the data nodes start out like this (P means primary, R means replica; the number after P/R is the shard number)
    • D1 – P1, R2, R3
    • D2 – R1, P2, R3
    • D3 – R1, R2, P3
  • D1 is taken out of the cluster and upgraded
    • P1, R2, R3 become unavailable
    • D3 is asked to promote its R1 to P1
  • D1 comes back (now on 2.4.1)
    • R2 and R3 were replicas and get initialized again
    • D1's old P1 is initialized as R1, since D3 now holds P1
    • Both of the above are possible because a primary on an old-version node with replicas on new-version nodes is fine
  • D2 is taken out of the cluster and upgraded
    • R1, P2, R3 become unavailable
    • D1 is asked to promote its R2 to P2
    • D3's R2 can no longer be allocated, because D3's ES version (2.1.1) is lower than that of D1 (2.4.1), where the primary now lives
  • D2 comes back (now on 2.4.1)
    • R1 and R3 were replicas and get initialized again. This is possible because their primaries are on a node (D3) with a lower ES version.
    • P2 is initialized as R2, since D1 now holds P2 and both nodes run the same ES version

So the cluster stays yellow because D3's copy of shard 2 cannot be allocated until D3 itself is upgraded, but our automation waits for green before upgrading the next node. This will not happen when the number of nodes in the cluster differs from the number of shard copies.
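The stuck state in the walkthrough above can be simulated with a few lines. This is a hypothetical sketch, not Elasticsearch code; it encodes the allocation rule from this thread against the cluster state right after D2 rejoins (D1 and D2 on 2.4.1, D3 still on 2.1.1, P2 promoted onto D1 while D2 was down):

```python
def version_tuple(v):
    return tuple(int(x) for x in v.split("."))

def replica_allocatable(replica_node_version, primary_node_version):
    # Rule from this thread: a replica's node must not be older than the
    # node holding the primary.
    return version_tuple(replica_node_version) >= version_tuple(primary_node_version)

node_versions = {"D1": "2.4.1", "D2": "2.4.1", "D3": "2.1.1"}
primary_node = {"1": "D3", "2": "D1", "3": "D3"}  # where each shard's primary sits

unassigned = []
for shard, pnode in primary_node.items():
    for node, version in node_versions.items():
        if node == pnode:
            continue  # the primary's node holds the primary, not a replica
        if not replica_allocatable(version, node_versions[pnode]):
            unassigned.append((shard, node))

print(unassigned)  # [('2', 'D3')]: R2 can't go to D3, so the cluster stays yellow
```

Upgrading D3 (i.e. setting its version to "2.4.1") empties the `unassigned` list, which is why the deadlock only exists while the automation insists on green before touching D3.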


(Yannick Welsch) #5

Thank you for taking the time to write up the explanation.


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.