So I'm about to upgrade to 1.3.4, but due to some unfortunate circumstances I need to migrate my ES cluster to new VMs.

The environment is fairly simple. At the top I have a Logstash agent pulling messages off a Redis server and feeding them to my 2-node cluster (1 replica, 2 shards per index). So, for what it's worth, I can stop Logstash and the cluster will essentially stop indexing data, allowing me to shut it down without issue. Once the old cluster is shut down, I intend to rsync its data over to the new cluster, which has 3 nodes (2 replicas, 3 shards per index).
What is the best approach here? I was thinking that I could rsync the data folder from one of my two VMs on the old cluster, but then I realized that the primary shard for each index might not be on that VM. Can I manually set the primary shard somehow? (The kind of copy I have in mind is sketched below.)
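For concreteness, this is roughly the copy I have in mind; the data path is just where my packages put things, and the target host name is a placeholder, so adjust both to taste:

    # on old node 1, with Logstash and both old ES nodes stopped
    # /var/lib/elasticsearch and new-node-1 are placeholders for my setup
    rsync -av /var/lib/elasticsearch/ new-node-1:/var/lib/elasticsearch/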
On 24 October 2014 17:54, Ivan Brusic wrote:
Unless you are moving to new hardware, there is no need to rsync your data. Both Elasticsearch 0.90.x and 1.3.x are based on Lucene 4, so the underlying data is compatible. Of course, you should back up your data before such an upgrade.
After restarting your new cluster with your old data, I would run an optimize on your indices so that Lucene can rewrite all your segments in the new format. There have been some issues with Lucene format incompatibilities, but they usually involve indices created with beta Lucene versions. A sketch of the call is below.
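Something like this should do it (untested; scope it to specific indices if you prefer):

    # merge down to one segment per shard, rewriting everything in the
    # current Lucene format; heavy on I/O, so run it off-peak
    curl -XPOST 'http://localhost:9200/_optimize?max_num_segments=1'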
You cannot bring up a mixed cluster between 0.90 and 1.x, so you would need to stop all your VMs. Why are you interested in primary shards? Elasticsearch is not like most databases, where the primary node has an extra-special connotation. I have not played around with shard allocation much, but here is an old article: ElasticSearch Shard Placement Control - Sematext
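If you do want to place shards by hand, the cluster reroute API can move a shard copy between nodes. This is just the shape of the call (the index and node names below are made up):

    # index and node names are placeholders
    curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
      "commands" : [ {
        "move" : {
          "index" : "logstash-2014.10.23",
          "shard" : 0,
          "from_node" : "node1",
          "to_node" : "node3"
        }
      } ]
    }'

As far as I know it only moves copies around; it does not let you declare which copy is the primary.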
On Fri, Oct 24, 2014 at 8:03 PM, Magnus Persson wrote:
Oh, I didn't know about optimize, so I'll definitely keep that in mind.

The reason I was asking about primary shards is that when I started from a data folder rsync'd off one of the nodes, I saw double the number of documents. It wasn't immediately apparent, but when I later tried two rsyncs, matching old node 1 with new node 1 and old node 2 with new node 2, the "duplicates" went away... and the cluster recovered significantly faster. But reading this, it seems to be sufficient to rsync the data folder from any one node in the old cluster and things will just work? Is there a way to verify the consistency of my cluster? Something like index checksums, or some such?
On Monday, October 27, 2014 12:37:48 PM UTC+1, Magnus Persson wrote:
This is very strange. I shut down the old cluster while copying the files, but for some reason I'm seeing duplicate docs again, with ~3.2M docs on the old cluster and ~6.3M docs on the new cluster (using Kopf to compare). Am I missing something obvious? At one point I think I got the document counts to match up, but I'm obviously not able to reach that state again. (One way I can at least inspect per-shard counts is sketched below.)
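For what it's worth, on the new (1.3) cluster I can see per-shard document counts with the cat API; something like:

    # the docs column shows the doc count for each shard copy
    curl -s 'http://localhost:9200/_cat/shards?v'

The 0.90 cluster predates the _cat APIs, though, so there I'm stuck with Kopf.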
On Friday, October 24, 2014 11:42:27 PM UTC+2, Jörg Prante wrote:
The plan to move from a 2-node to a 3-node cluster is as follows:
- back up your old data files (in case you want to go back; once upgraded, there is no way back)
- shut down the old cluster
- move the data folder of each old cluster node to a new cluster node's data folder. One node gets no data folder. No rsync required.
- check minimum_master_nodes = 2. This is essential for 3 nodes (see the sketch below).
- start up the cluster, all nodes. Watch the shards rebalance. No need to worry about primary shards.
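A minimal sketch of that setting, assuming the stock package location for the config file:

    # on each of the three new nodes; the config path is an assumption
    echo 'discovery.zen.minimum_master_nodes: 2' >> /etc/elasticsearch/elasticsearch.yml

With 3 master-eligible nodes, a quorum of 2 keeps a partitioned single node from electing itself master (split brain).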
Jörg
On Monday, October 27, 2014 1:16:40 PM UTC+1, Magnus Persson wrote:
https://gist.github.com/magnusp/515a5c3debed12802d1f is the configuration I'm running on the new cluster. The old cluster uses the defaults that came with 0.90.3 (replicas and shards were set via templates, I guess).
On Monday, October 27, 2014 3:21:24 PM UTC+1, Magnus Persson wrote:
When using the count API, the document counts match up much more closely. It might be that Kopf is counting documents differently on 0.90 than on 1.3 (perhaps including replica copies in its totals), but that seems far-fetched. The comparison I ran was along the lines of the sketch below.
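Roughly this, with the host names as placeholders for one node in each cluster:

    # _count counts each logical document once, regardless of replicas
    curl -s 'http://old-node:9200/_count?pretty'
    curl -s 'http://new-node:9200/_count?pretty'

Comparing the "count" fields from the two responses is what lined up for me.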