Here is my setup:
Elasticsearch version 0.19.4
4 ES nodes on 4 machines
hosting 3 different indices:
index 1: 10 shards and 1 replica
indices 2 and 3: 5 shards and 1 replica
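For completeness, the indices are created with plain create-index settings, roughly like the sketch below (the index names, the node URL, and the use of Python requests are placeholders, not our real scripts):

# Sketch only: creating indices with the shard/replica layout described above.
import requests

ES = "http://localhost:9200"  # any node in the cluster

# index 1: 10 shards, 1 replica
requests.put(ES + "/index1",
             json={"settings": {"number_of_shards": 10, "number_of_replicas": 1}})

# indices 2 and 3: 5 shards, 1 replica each
for name in ("index2", "index3"):
    requests.put(ES + "/" + name,
                 json={"settings": {"number_of_shards": 5, "number_of_replicas": 1}})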
The 4-node cluster is continuously indexing, and I submit a normal search
request with a terms facet. It turns out that building the terms facet
triggers an OutOfMemory error:
[2012-06-28 09:43:44,053][WARN ][index.cache.field.data.resident] [Kid Colt] [i3_product] loading field [deptIds] caused out of memory failure
java.lang.OutOfMemoryError: Java heap space
    at org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:61)
    at org.elasticsearch.index.field.data.longs.LongFieldData.load(LongFieldData.java:166)
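To make the request concrete, here is a minimal sketch of such a search against the index named in the log (the match_all query and the facet size are placeholders; the essential part is the terms facet on deptIds, which is what loads the field data):

# Minimal sketch of the kind of search that triggers the OOM.
import requests

body = {
    "query": {"match_all": {}},
    "facets": {
        "deptIds": {"terms": {"field": "deptIds", "size": 10}}
    }
}
resp = requests.post("http://localhost:9200/i3_product/_search", json=body)
print(resp.status_code)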
This search request is issued from multiple threads at the same time, i.e.
about 200 search requests are submitted to the cluster concurrently, and
eventually all nodes continuously show the above OOM errors.
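The load pattern is roughly the following (a sketch with a placeholder URL and client code, not what we actually run):

# Sketch of the load pattern: ~200 of the above facet searches submitted concurrently.
from concurrent.futures import ThreadPoolExecutor
import requests

SEARCH_URL = "http://localhost:9200/i3_product/_search"
BODY = {"query": {"match_all": {}},
        "facets": {"deptIds": {"terms": {"field": "deptIds", "size": 10}}}}

def one_search(_):
    return requests.post(SEARCH_URL, json=BODY).status_code

with ThreadPoolExecutor(max_workers=200) as pool:
    codes = list(pool.map(one_search, range(200)))

print(sum(1 for c in codes if c == 200), "of 200 requests returned HTTP 200")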
Then I restarted all 4 ES nodes one by one, and all the indices were "lost":
the whole cluster metadata seems to have been cleared out, even though the
data files still exist in each node's data directory.
"dangling index" message appears when I restarted the nodes:
dangling index, exists on local file system, but not in cluster
metadata, scheduling to delete in [2h]
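From reading around, the local gateway settings below look related to this message; this is only my guess at the relevant elasticsearch.yml entries, and I have not verified them:

# elasticsearch.yml (guess, unverified): presumably the [2h] above comes from
# gateway.local.dangling_timeout, and auto_import_dangled controls whether such
# indices are re-imported into the cluster metadata instead of being deleted.
gateway.local.auto_import_dangled: yes
gateway.local.dangling_timeout: 2h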
So my questions are:
Is the above behavior expected? How can I recover the cluster data?
Can ES do something to prevent this kind of OOM caused by a search query? We
may not be able to tell in advance whether an incoming search query will
cause an OOM and thus bring disaster to the system.
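On the last question, the only knob I have found so far is the field data cache. Below is a sketch of the elasticsearch.yml settings from my reading of the 0.19 docs (the values are placeholders), though I am not sure this actually prevents the OOM rather than just trading it for cache evictions:

# elasticsearch.yml (sketch, values are placeholders):
index.cache.field.type: soft        # soft references instead of the resident cache
index.cache.field.max_size: 50000   # cap on cached entries per field per segment
index.cache.field.expire: 10m       # expire field data not recently used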