I created an Elasticsearch cluster with 4 instances, all running Elasticsearch 0.90.10. The heap size is 6 GB for each instance, so the total heap size is 24 GB. Each index has 5 shards and each shard has 1 replica. A new index is created every day, so all indices have nearly the same size.
When the total data size reaches around 100 GB (replicas included), the cluster begins to fail to allocate some of the shards (status yellow). After I delete some old indices and restart all the nodes, everything is fine (status green). If I do not delete any data, the status eventually turns red.
So I am wondering: is there any relationship between heap size and total data size? Is there a formula to determine heap size based on data size?
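A minimal sketch of how the yellow/red status can be watched via the cluster health API (Python 3; the localhost:9200 address is an assumption and may need adjusting):

# Minimal sketch: query the cluster health API and report shard counts.
# Assumes a node is reachable on localhost:9200; adjust HOST for your setup.
import json
import urllib.request

HOST = "http://localhost:9200"

with urllib.request.urlopen(HOST + "/_cluster/health") as resp:
    health = json.load(resp)

print("status:           ", health["status"])  # green / yellow / red
print("active shards:    ", health["active_shards"])
print("unassigned shards:", health["unassigned_shards"])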
There is enough space on every machine. I looked in the logs and found that "org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/ebs/elasticsearch/elasticsearch-0.90.10/data/elasticsearch/nodes/0/indices/logstash-2014.02.26/0/index/write.lock" is what causes the shard to fail to start.
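A minimal sketch of scanning for leftover write.lock files under a data path like the one in the exception above (Python 3; the directory is taken from that error message and may differ per node):

# Sketch: list leftover Lucene write.lock files under the Elasticsearch data path.
# DATA_DIR mirrors the path in the exception above; adjust it for your nodes.
import glob
import os

DATA_DIR = "/ebs/elasticsearch/elasticsearch-0.90.10/data/elasticsearch/nodes"

# Each shard keeps its Lucene index under <node>/indices/<index>/<shard>/index/
pattern = os.path.join(DATA_DIR, "*", "indices", "*", "*", "index", "write.lock")

for lock_file in sorted(glob.glob(pattern)):
    print(lock_file)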
On 02/25/2014 05:29 PM, Randy wrote:
Probably low on disk on at least one machine. Monitor disk usage. Also look in the logs and find out what error you are getting. Report back.
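A minimal sketch of such a disk check (Python 3; the data path below is only an example taken from the error message above, so run it against each node's real data path):

# Sketch: report disk usage for the Elasticsearch data path on this node.
# DATA_PATH is only an example; run this on every node in the cluster.
import shutil

DATA_PATH = "/ebs/elasticsearch/elasticsearch-0.90.10/data"

usage = shutil.disk_usage(DATA_PATH)
print("total: %.1f GB" % (usage.total / 1e9))
print("used:  %.1f GB" % (usage.used / 1e9))
print("free:  %.1f GB" % (usage.free / 1e9))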
You might want to check that you're not running out of file handles:
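One way to check this at the operating-system level is via /proc on Linux; a minimal sketch (Python 3; assumes you pass the Elasticsearch process PID yourself, and that you can read that process's /proc entries):

# Sketch: show the open-file limit and current open-file count for a process.
# Pass the Elasticsearch PID as the first argument; Linux /proc only, and
# listing /proc/<pid>/fd may require running as the same user (or root).
import os
import sys

pid = sys.argv[1]

# The kernel's per-process limits table contains a "Max open files" row.
with open("/proc/%s/limits" % pid) as limits:
    for line in limits:
        if line.startswith("Max open files"):
            print(line.rstrip())

# Each entry in /proc/<pid>/fd is one currently open file descriptor.
print("open file descriptors:", len(os.listdir("/proc/%s/fd" % pid)))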