Lost the data


(Amit Mohan) #1

I am running 34 shards on 17 heavy-duty machines. Due to network maintenance
we had to restart the cluster (0.19.1). After the restart, one of the
machines lost all its shard data for some reason. There is nothing in the
logs. No disk/memory/hardware issue was detected. The data is just gone. Now I
can't bring the cluster to Green.

Is there a way to tell the cluster that it should ignore those 2 shards'
data and come up happily and start indexing again? I am ready to lose
those 2 shards' data since it's only logs that I am indexing there.

Thanks for any help!
-Amit Mohan


(Shay Banon) #2

Hi,

Elasticsearch won't delete the data explicitly. I wonder, though: do you
have replicas set up for the shards? How many indices do you have where
those 2 shards are missing now?


(Amit Mohan) #3

I had no replicas, in order to save some space. I have 17 machines and 34
shards, so that machine had 2 shards on it (0 and 24). All other 32
shards are good though. I had no gateway set up either, which I intend to do
now, but I am not sure it would have helped in this case.


(Amit Mohan) #4

Sorry, did not read your question properly. I have only 1 index.


(Shay Banon) #5

No need to set up a gateway; the local gateway is good, and the way you get
high availability is by using replicas. You can't bring back those 2
shards, or easily force them to be created. What you can do is create
another index and index new data into it, and use the old index just for
search (you can use aliases to search over both indexes). Search on the
index that does not have all its shards will still work; indexing operations
that end up being routed to the 2 missing shards will fail.
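
As a rough sketch of the alias idea (not tested against 0.19.1): the snippet below builds a request body for the `_aliases` endpoint that puts one alias over both indexes. The names `logs_old`, `logs_new` and `logs_all` are hypothetical, as is the helper `apply_aliases`, which is defined but never called here.

```python
import json
import urllib.request


def build_alias_actions(alias, indices):
    """Return an _aliases request body that adds `alias` to each index."""
    return {
        "actions": [{"add": {"index": name, "alias": alias}} for name in indices]
    }


def apply_aliases(base_url, body):
    """POST the body to <base_url>/_aliases (hypothetical helper, not called here)."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/_aliases",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Assumed names: "logs_old" is the index with the 2 missing shards,
# "logs_new" is the freshly created index that receives all new documents.
body = build_alias_actions("logs_all", ["logs_old", "logs_new"])
print(json.dumps(body, indent=2))
```

With something like this in place, searches would go to `logs_all`, while new documents would be indexed into `logs_new` directly rather than through the alias, since the alias points at more than one index.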


(Amit Mohan) #6

Thanks for your help. That is what I am going to do, then, until the old data
becomes irrelevant. Thanks again!


(james0tempe) #7

If there's a "hard way" to replace a failed shard, it would be very useful to us.

Scenario: a five-node cluster with 5 shards and 5 replicas loses shards 2 and 3 and cannot recover.
We would like to create two new empty shards to replace the lost ones, and accept the data loss for the sake of cluster integrity.

Would the following idea work?

Create another 5 shard index with the same name, somewhere else,
and then copy the directory and metadata of the replacement shards in place of the broken shards.

Is there something we can do to the metadata that would make that a workable plan?

- James

