Multiple ES versions in cluster support


(Srividhya Umashanker) #1

I am trying out rolling upgrades, which are expected to work as of the ES
1.0.0 release.

While doing so,

  • I have 3 Nodes running ES 1.1.1

  • I shut down one of them and installed ES 1.0.3

  • Unlike with earlier problems, when the node running ES 1.0.3 starts, it
    is seen in the cluster

  • But it does not receive any shards/replicas from the other nodes

  • I then reinstalled ES 1.1.1 on all appliances, and now the shards are
    shared across the nodes.

Is this because I am installing ES 1.0.3 in a cluster running nodes with ES
1.1.1?

Are multiple versions of ES (1.0.0 and beyond) expected to work
together? Please help.

https://lh4.googleusercontent.com/-XM4g7LuYMNw/U1qF0p-b4yI/AAAAAAAAAkU/NAgSmmIsGsI/s1600/es_multi_support.png

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f7047a98-07a0-489f-9567-d9106b7b86be%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jilles van Gurp) #2

You should only roll forward with releases. I don't think downgrades are
supported. 1.1 moved to a new version of Lucene (4.7), and indices are not
compatible with older versions of Elasticsearch once you upgrade them. You
shouldn't run clusters with mixed versions other than for the purpose of
performing a rolling restart of the cluster during an upgrade.

In any case, when you perform an upgrade, you will want to keep an eye on
the cluster state with:

curl 'localhost:9200/_cluster/health?pretty=true'

Basically as soon as you start or stop a node, the cluster will start
rebalancing (depending on your replica settings) and you want to wait for
that activity to stop before proceeding. Alternatively, you can temporarily
change the balancing settings via the API.

My update process is as follows:

  • Verify that the current cluster status is green

  • Shut down a node

  • Wait for activity in the cluster to stabilize (no shards
    initializing/relocating)

  • Upgrade the node

  • Start the node

  • Wait for activity to stabilize (no shards initializing/relocating)

  • Only proceed to the next node when the cluster status is green again

If you do it like this, you should be able to do rolling restarts and
upgrade. I've upgraded this way from 1.0.0 to 1.1.0 and recently to 1.1.1.
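
The "wait for activity to stabilize" and "wait for green" checks above could be scripted roughly as follows. This is only a sketch: it assumes the `status`, `initializing_shards` and `relocating_shards` fields of the `_cluster/health` response, and the helper names, the default `http://localhost:9200` address, and the polling parameters are made up for illustration.

```python
import json
import time
import urllib.request


def is_quiet(health):
    """True when no shards are initializing or relocating."""
    return (
        health.get("initializing_shards", 0) == 0
        and health.get("relocating_shards", 0) == 0
    )


def is_green_and_quiet(health):
    """True when the cluster is green and shard activity has stopped."""
    return health.get("status") == "green" and is_quiet(health)


def fetch_health(base_url="http://localhost:9200"):
    """GET /_cluster/health and decode the JSON body (address is an assumption)."""
    with urllib.request.urlopen(base_url + "/_cluster/health") as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until(predicate, fetch=fetch_health, poll_seconds=5.0,
               max_polls=120, sleep=time.sleep):
    """Poll the health endpoint until predicate(health) holds.

    Returns True on success, False if max_polls is exhausted first.
    """
    for _ in range(max_polls):
        if predicate(fetch()):
            return True
        sleep(poll_seconds)
    return False
```

A possible use: call `wait_until(is_quiet)` after shutting a node down, and `wait_until(is_green_and_quiet)` after starting the upgraded node, before moving on to the next one.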

I have a three-node cluster with 2 replicas, so all nodes have all the data
and no rebalancing happens when I remove a node and then re-add it. Also,
if I take out a node, I still have two instances of each shard. The
cluster goes yellow when I take a node down for upgrading, and as soon as I
start the upgraded node it initializes and upgrades its shards in a few
minutes and then the cluster goes green.

Jilles


(Srividhya Umashanker) #3

Jilles -

Thanks for your quick response.

I think I tried doing a graceful shutdown and then an upgrade. I made
sure I did the graceful upgrade as you described.

I now see the same problem. I had 3 nodes running 1.0.3 and upgraded two of
them to 1.1.1 (graceful upgrade). As you can see, the indexes are
sharded/replicated among the 1.1.1 nodes alone.
Later, I shut down one of the nodes running 1.1.1. I expected the "test1"
index to be allocated on "victor strange" (1.0.3), but you can see it is
unassigned.
When I bring back another node running 1.1.1, the indexes/shards are
allocated properly.

Does that mean that once indexes are placed on higher-version nodes, they can
never be allocated on lower-version nodes? I know the reverse is possible.
I am trying to relate this to a federation topology (where nodes run different
versions and work together).

https://lh4.googleusercontent.com/-20qvisNaQ28/U1qe83USfaI/AAAAAAAAAks/44U6xqRIQY4/s1600/problem+2.png


(Brian Flad) #4

The unassigned shard behavior you witnessed is correct even between 1.1.0
and 1.1.1 nodes due to a Lucene upgrade (among other changes). If a primary
shard ends up on a newer node (1.1.1 in this case), it cannot create a
replica on any older node (such as 1.1.0), leaving it unassigned until you
have more nodes at the same version.

You can actually see the reasoning behind this behavior if you use the
cluster rerouting API to try to move an unassigned shard to a node:

curl -X POST http://hostname:9200/_cluster/reroute -d \
'{"commands":[{"allocate":{"index":"my_index","shard":1,"node":"some_node"}}]}'

The output of that command will helpfully tell you about the above version
conflict between 1.1.0 and 1.1.1 for example.
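
To make the rule above concrete, here is a deliberately simplified model of it. This is not Elasticsearch's actual allocation code, only an illustration of the behavior as described in this thread: a replica can go to a node whose version is the same as or newer than the node holding the primary. The function names and the "node-b" node name are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as "1.1.1" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def can_hold_replica(primary_node_version, candidate_node_version):
    """Simplified rule: the candidate must not be older than the primary's node."""
    return parse_version(candidate_node_version) >= parse_version(primary_node_version)


def eligible_nodes(primary_node_version, nodes):
    """Return the names of nodes (name -> version) that could take the replica."""
    return [
        name
        for name, version in nodes.items()
        if can_hold_replica(primary_node_version, version)
    ]
```

With a primary on a 1.1.1 node, `eligible_nodes("1.1.1", {"victor strange": "1.0.3", "node-b": "1.1.1"})` leaves out "victor strange", which matches the unassigned replica seen in the screenshot.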

Also, not specifically mentioned in this thread, but if you're just
restarting nodes across a cluster (to upgrade, for example), you can
temporarily disable the "rebalancing" of shards so it's not needlessly
shuffling data around via disable_allocation:

curl -X PUT http://hostname:9200/_cluster/settings -d \
'{"transient":{"cluster.routing.allocation.disable_allocation": true}}'

When you're done:

curl -X PUT http://hostname:9200/_cluster/settings -d \
'{"transient":{"cluster.routing.allocation.disable_allocation": false}}'

We use this to keep Elasticsearch up to date without downtime across tens of
clusters.

Hope this helps,
Brian


(Mark Walkom) #5

cluster.routing.allocation.disable_allocation was deprecated in early v1;
it's now cluster.routing.allocation.enable, as per
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html
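
As a rough sketch of what the request body looks like with the newer setting: the helper below builds the JSON for a PUT to `_cluster/settings`. The function name is made up, and the accepted values listed (`all`, `primaries`, `new_primaries`, `none`) are what I recall from the 1.x reference, so please check them against the page linked above.

```python
import json

# Values assumed from the 1.x cluster settings reference; verify before use.
ALLOWED_MODES = ("all", "primaries", "new_primaries", "none")


def allocation_enable_body(mode):
    """Build the transient-settings JSON body for cluster.routing.allocation.enable."""
    if mode not in ALLOWED_MODES:
        raise ValueError("unsupported allocation mode: %r" % (mode,))
    return json.dumps({"transient": {"cluster.routing.allocation.enable": mode}})
```

`allocation_enable_body("none")` would be sent before restarting nodes and `allocation_enable_body("all")` once the upgrade is done, in place of the `disable_allocation` true/false bodies quoted earlier in the thread.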

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com


(Srividhya Umashanker) #6

That seems to work fine. Thank you.
