Data loss with 0.19.8

Hi,

I've just experienced a scenario that caused data loss with our small ES cluster, and I wanted to share what happened in case the behaviour is regarded as a bug.

The short version is that it looks like you can lose all the data in an index if the node a shard is being relocated to runs out of heap memory part-way through. Read on for more detail.

We were running a pair of ES nodes on the same machine (basically, we were just starting out with ES and weren't sure about stability, so we ran 2 nodes to survive one crashing). We have around 35 indices, each with approximately 10GB of data (around 3 million documents), although that does vary a bit. Each index had a single shard with 1 replica.

ES having proven itself to be very stable, we decided to move to a single node on that machine, in preparation for two more nodes being deployed on new hardware. We couldn't find an 'approved' way to do this, but simply turning off one of the nodes (cluster state turns yellow) and then updating all the indices to have 0 replicas (cluster goes green again) seemed to work well.
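
For reference, the replica change itself is just an update to the index settings API - something along the lines of this rough sketch (Python standard library; the localhost:9200 address and the use of _all are placeholders for your own setup):

    # Sketch: drop number_of_replicas to 0 on every index via the index
    # settings API. Host and index selection are placeholders - adjust
    # for your own cluster.
    import json
    import urllib.request

    ES = "http://localhost:9200"

    def set_replicas(index, count):
        body = json.dumps({"index": {"number_of_replicas": count}}).encode("utf-8")
        req = urllib.request.Request(
            "%s/%s/_settings" % (ES, index),
            data=body,
            headers={"Content-Type": "application/json"},
            method="PUT",
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())

    # "_all" applies the change to every index in one call.
    print(set_replicas("_all", 0))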

We went ahead and did this, but something - probably human error, probably me! - caused us to accidentally restart the node we'd just shut down, after we'd set replicas to 0. The cluster of course started rebalancing some indices (single shard, remember) to this new node. We figured we'd best leave it until the rebalance had completed, then repeat the above procedure - set replicas to 1, allow replication to finish, then turn off the node again and set replicas to 0.

Unfortunately, the accidentally-restarted node came up with the default Java heap settings - a max of 1GB. Given we normally run with 6GB, this node quickly crashed with an out of memory error under normal application activity. However, it crashed while it was relocating one of the indices.
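
In hindsight, a quick check of the heap each node had actually started with would have caught this before the rebalance began. Something like the sketch below against the nodes info API would do it - the exact endpoint and response layout vary a little between ES versions, so treat it as illustrative:

    # Illustrative: report the max heap each node actually started with,
    # via the nodes info API. localhost:9200 is a placeholder, and the
    # endpoint/fields may differ slightly between ES versions.
    import json
    import urllib.request

    ES = "http://localhost:9200"

    with urllib.request.urlopen("%s/_nodes/jvm" % ES) as resp:
        info = json.loads(resp.read())

    for node in info.get("nodes", {}).values():
        heap_max = node.get("jvm", {}).get("mem", {}).get("heap_max_in_bytes")
        if heap_max is not None:
            print("%s: max heap %d MB" % (node.get("name"), heap_max // (1024 * 1024)))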

We restarted the spare node again, this time with enough RAM allocated, and the cluster rebalance eventually finished. However, one shard - the one that was being relocated - remains unassigned, and poking around in the data directories on each node, there's only a _state directory left. The data appears to be gone.
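
For anyone checking the same thing: the index-level cluster health output is where the unassigned shard shows up. A rough sketch, with the host again a placeholder:

    # Sketch: list indices that still have unassigned shards, using the
    # cluster health API at index-level detail. localhost:9200 is a placeholder.
    import json
    import urllib.request

    ES = "http://localhost:9200"

    with urllib.request.urlopen("%s/_cluster/health?level=indices" % ES) as resp:
        health = json.loads(resp.read())

    for name, idx in health.get("indices", {}).items():
        if idx.get("unassigned_shards", 0) > 0:
            print(name, idx["status"], "unassigned shards:", idx["unassigned_shards"])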

Fortunately this isn't a problem - once I've got the cluster stuff sorted out, I can just reindex that data from our source.

I don't know if this counts as a bug, but I thought I'd report it regardless. Let me know if you need me to raise a ticket, or if any other information would be useful.

Versions:
Elasticsearch: 0.19.8
OS: Ubuntu 12.04 LTS
JVM: java version "1.6.0_24" (OpenJDK)

All config is default, except:
ES_HEAP_SIZE set to 6GB
index.cache.field.type: sort
Custom native script jar for an app-specific filtering type

Server has 16GB RAM.

Cheers,
Dan

Dan Fairs | dan.fairs@gmail.com | @danfairs | secondsync.com

--

I fixed a similar bug in 0.19.10 where, with 0 replicas, if a shard is relocating and a "critical" failure (such as an OOM) happens, you might lose data. This has been further hardened in the upcoming 0.19.11. Just to verify things, can you gist the log with the OOM failure?

--

Unfortunately I don't have it - the OOM was in an ES instance running in the foreground, and I was too focussed on Just Getting It Working again to keep the log! My fault.

Sorry about that.

I've got a traceback from the log of the other node in the cluster, which I've gisted, if that's any help:

Traceback · GitHub

Cheers,
Dan

Dan Fairs | dan.fairs@gmail.com | @danfairs | secondsync.com

--