I thought I would share an experience I had recently in case it helps
anybody having a similar problem:
We are in the process of upgrading from es 0.16.2 to the latest 0.17.7. We
did this by shutting down the cluster
and starting up the 0.17.7 instances pointing to our existing data
directories (we use local gateway).
We're not sure how it happened, but when the new cluster came up, it never
left the red state and showed no indices at all.
Switching back to 0.16.2, and even restoring the data directory from a
backup, didn't help.
So, here's what we did:
1) Delete the data directories completely (we had a backup elsewhere)
2) Start up a clean es 0.17.7 and wait for green (no indices to wait for,
of course)
3) Issue create index / put mapping commands to recreate all our index
definitions and mappings (same # of shards, same mappings, etc. as before)
-- see the sketch after this list
4) Shut down the cluster and copy the data index directories only (not the
_state directories) over from the backup
5) Start up the cluster -- all indices came up green and had all our data!
* Note that es seems to delete any index directories that don't match
up with existing indices, so make sure you hang on to the backup until you
are sure
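
For reference, step 3 was just a handful of REST calls. Here's a minimal
sketch in Python -- the index name, shard/replica counts, and mapping below
are placeholders, not our real definitions:

# Hypothetical sketch of step 3: recreate each index with the same settings
# and mappings it had before. "myindex", the shard counts, and the mapping
# are placeholders.
import json
import urllib.request

ES = "http://localhost:9200"

def put(path, body):
    req = urllib.request.Request(ES + path, data=json.dumps(body).encode(),
                                 headers={"Content-Type": "application/json"},
                                 method="PUT")
    with urllib.request.urlopen(req) as resp:
        print(path, resp.status)

# Create the index with the same number of shards/replicas as before.
put("/myindex", {"settings": {"number_of_shards": 5, "number_of_replicas": 2}})

# Re-apply the mapping for each type exactly as it existed before.
put("/myindex/mytype/_mapping",
    {"mytype": {"properties": {"title": {"type": "string"}}}})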
On the next environment we tried this on, we first flushed the transaction
logs before shutting down the cluster and upgrading. Everything went
smoothly.
I don't know if flushing had anything to do with it, or if the first problem
was kind of a freak occurrence, but I thought I would mention it.
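
In case it's useful, the flush itself is a single REST call issued before
shutting the nodes down -- a rough sketch, with host and port assumed:

# Rough sketch: flush all indices so the transaction logs are committed and
# cleared before the cluster is shut down for the upgrade.
import urllib.request

ES = "http://localhost:9200"

req = urllib.request.Request(ES + "/_flush", data=b"", method="POST")
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())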
This method will only work if it ends up with the same shard distribution
across the cluster. Did any upgrade you tried from 0.16.2 to 0.17.7 cause
missing data?
That's a good point. In our case, we are starting with three nodes and two
replicas, so every node has all the shards.
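
One way to double-check that would be to dump the routing table from the
cluster state before and after the upgrade -- a rough sketch, assuming the
usual host/port and that the response layout hasn't changed between versions:

# Rough sketch: print which node holds each shard copy, so the layout can be
# compared before and after the upgrade. The JSON structure of the cluster
# state response is an assumption here.
import json
import urllib.request

ES = "http://localhost:9200"

state = json.load(urllib.request.urlopen(ES + "/_cluster/state"))
nodes = {nid: n["name"] for nid, n in state["nodes"].items()}

for index, table in state["routing_table"]["indices"].items():
    for shard_id, copies in table["shards"].items():
        for copy in copies:
            role = "primary" if copy["primary"] else "replica"
            print(index, shard_id, role, nodes.get(copy["node"], "unassigned"))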
To answer your question:
The first one we tried was a single-node dev instance. It failed and we had
to rebuild the metadata.
The second one we tried was also single-node, and it succeeded without a
problem.
The third one we tried was the three-node production instance. It had the
same failure as the first dev instance.
In the failure cases, there didn't seem to be anything in the logs, even at
trace level, except a "starting" message.
Curtis
Strange regarding the failure... Can you recreate it in some way? I will try
running an upgrade from 0.16.2 to 0.17.7 in different scenarios myself; it
would help if you could pinpoint the steps taken to recreate it.
Might this not be the bug that was incorrectly adding delete-by-query to
the translogs?
clint