Lost data due to out of memory error

I encountered lost data as a result of a java.lang.OutOfMemoryError
condition. I am using:

  • 0.17.4
  • 4g min/max heap size
  • mlockall
  • single node, 1 shard per index, 0 replicas
  • indexed 300m documents across 6 indexes

config:

index:
  number_of_shards: 1
  number_of_replicas: 0
  refresh_interval: 20s

bootstrap:
  mlockall: true

I had been indexing about 3,000 messages per second continuously for
over 80 hours. At around the 300m-document mark, I ran this query and
it hung waiting for a response:

curl -XGET http://localhost:9200/_search -d '{"query":
{"query_string":{"query":"someword"}}}'

The term "someword" is contained in all 300m documents. Prior to this,
other queries had returned results fine. Now queries to the health
endpoint, and any other query, do not return a result (they just
hang). The only evidence of a problem, apart from the hanging queries,
was this in the log file:

failed engine java.lang.OutOfMemoryError: Java heap space

A normal "kill PID" command did not initiate a graceful shudown and I
had to perform a "kill -9".
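Next time I will start the JVM with the standard HotSpot heap-dump-on-OOM
flags so there is something to inspect afterwards; a minimal sketch,
assuming bin/elasticsearch honors ES_JAVA_OPTS:

# generic HotSpot options, not ES-specific: write a heap dump when an OOM is thrown
export ES_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/elasticsearch"
./bin/elasticsearch -f    # -f keeps the process in the foreground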

"lsof" showed 26k files open (vs. 1300 without ES running), and ulimit
is set at 100000.
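For reference, this is roughly how I counted the descriptors, assuming
the java process pgrep finds is the ES one:

ES_PID=$(pgrep -f elasticsearch)          # assumes a single ES java process on the box
ls /proc/$ES_PID/fd | wc -l               # descriptors currently held by the process
grep "open files" /proc/$ES_PID/limits    # limit actually applied, may differ from the shell's ulimit -n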

Upon restart, with the min/max heap size raised to 8g, all of the files
in the ./index and ./translog directories were deleted. The cluster
won't start up and emits a continuous stream of errors like this:

[2011-08-11 01:07:58,378][WARN ][indices.cluster ] [Unuscione, Angelo] [index0006][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [index0006][0] shard allocated for local recovery (post api), should exists, but doesn't

Is there a way to predict when we might encounter out-of-memory
errors? I can enforce better queries in the user interface if there
are certain queries that we should not be executing.
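For example, I could poll the JVM heap numbers from the node stats API
and warn (or block expensive queries) once usage crosses a threshold; a
rough sketch, assuming the 0.17.x stats endpoint lives under
_cluster/nodes/stats and that jvm stats need to be requested explicitly
(the path and flags have moved around between versions):

# crude heap check: pull JVM stats for all nodes and compare used vs. max heap
curl -s 'http://localhost:9200/_cluster/nodes/stats?jvm=true&pretty=true' | grep -E 'heap_used|heap_max'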

Is there a better way to start up ES if I know I previously had an
unclean shutdown?

I am expecting each node to have at least 10 billion documents across
100 to 300 indexes eventually. The document sizes are small, ranging
from 200 to 800 bytes, plus an additional 200 bytes for the fields.
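Back of the envelope, assuming roughly 400 to 1,000 bytes per document
including fields, 10 billion documents works out to somewhere around
4 to 10 TB of raw data per node before any index overhead, which is why
I want to understand the memory behavior now.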

I think you might have hit this one: "Failed shard recovery can cause
shard data to be deleted (replicas will still work)" (elastic/elasticsearch
issue #1227 on GitHub), which I fixed yesterday. Did shards fail before
you shut down the node?


The logs don't show anything before the OOM error; upon restart there
was a "failed to start shard".

That issue looks very similar, so it's likely the cause. I will test
some OOM conditions with the new version and gist any problems. The 3
main scenarios I want to ensure reliable recovery from are (a rough
sketch of how I plan to force them follows below):

  • OOM
  • too many open files
  • file system full

In theory, proper health monitoring would mean we don't encounter these
scenarios. I know you mentioned previously that both Lucene and ES go
to great lengths to recover from various failure states.
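Here is a rough sketch of how I plan to force each of the three
failures; paths, sizes, and limits are just placeholders:

# OOM: re-run the heavy query with a deliberately small heap (ES_MIN_MEM/ES_MAX_MEM set low)

# "too many open files": start ES in the foreground with a deliberately low fd limit
(ulimit -n 256 && ./bin/elasticsearch -f)

# "file system full": put path.data on a small loopback filesystem and let indexing fill it
dd if=/dev/zero of=/tmp/es-small.img bs=1M count=512
mkfs.ext4 -F /tmp/es-small.img
sudo mount -o loop /tmp/es-small.img /mnt/es-data    # then point path.data at /mnt/es-data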


Yes, even with a no-replicas index (single node or several), you should
not lose data in those failure cases.


Is this problem fixed in the 0.17.5 release? If so, I'll upgrade to it
ASAP :wink:

Thanks for the hard work, Shay.

-- Ben.


Yes, in 0.17.5 there is a fix for a case where data could be lost (in
the single node / no replicas case). I've run several long-running,
failure-driven tests simulating those failures and could not get to a
state where data was lost.
