Relation Between Heap Size and Total Data Size


(Umutcan Onal) #1

Hi,

I created an Elasticsearch cluster with 4 instances, all running
Elasticsearch 0.90.10. The heap size is 6 GB for each instance, so the
total heap size is 24 GB. I have 5 shards per index, and each shard
has 1 replica. A new index is created every day, so all indices are
nearly the same size.

When the total data size reaches around 100 GB (replicas included), my
cluster begins to fail to allocate some of the shards (status yellow).
After I delete some old indices and restart all the nodes, everything is
fine again (status green). If I do not delete any data, the status
eventually turns red.

So, I am wondering: is there any relationship between heap size and
total data size? Is there a formula to determine heap size based on
data size?

Thanks,
Umutcan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/530CB5FE.80203%40gamegos.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Randall McRee) #2

Probably low on disk on at least one machine. Monitor disk usage. Also look in the logs, find out what error you are getting, and report back.
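For anyone following along, free space on the data volume can be checked with a short stdlib script (a minimal sketch; point it at your own Elasticsearch data path rather than the placeholder used here):

```python
import shutil

def free_gb(path):
    """Return the free space at `path` in gigabytes."""
    return shutil.disk_usage(path).free / (1024 ** 3)

# Point this at the Elasticsearch data path (for the node in this
# thread that would be /ebs/elasticsearch); "/" is used here only
# so the example runs anywhere.
print(f"free: {free_gb('/'):.1f} GB")
```

Running this on each node (or wiring it into a cron job) makes it easy to spot the machine that is filling up before shard allocation starts failing.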


On Feb 25, 2014, at 7:25 AM, Umutcan umutcan@gamegos.com wrote:



(Umutcan Onal) #3

There is enough space on every machine. I looked in the logs and found
that "org.apache.lucene.store.LockObtainFailedException: Lock obtain
timed out:
NativeFSLock@/ebs/elasticsearch/elasticsearch-0.90.10/data/elasticsearch/nodes/0/indices/logstash-2014.02.26/0/index/write.lock"
is what causes the shard to fail to start.

On 02/25/2014 05:29 PM, Randy wrote:

Probably low on disc on at least one machine. Monitor disc usage. Also look in the logs and find out what error you are getting. Report back.



(Dan Fairs) #4

So, I am wondering that is there any relationship between heap size and total data size? Is there any formula to determine heap size based on data size?

You might want to check that you're not running out of file handles:

http://www.elasticsearch.org/tutorials/too-many-open-files/
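On a Unix machine, the per-process limit can be read directly with Python's stdlib (a minimal sketch; the linked tutorial explains how to raise the limit, since common defaults like 1024 are far too low for a node with many shards):

```python
import resource

def fd_limits():
    """Return the (soft, hard) limits on open file descriptors
    for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

soft, hard = fd_limits()
print(f"open-file soft limit: {soft}, hard limit: {hard}")
```

If the soft limit reported for the Elasticsearch process is in the low thousands, hitting it is a plausible cause of shards failing to allocate as the index count grows.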

Cheers,
Dan

--
Dan Fairs | dan.fairs@gmail.com | @danfairs | secondsync.com



(Umutcan Onal) #5

So, I am wondering that is there any relationship between heap size and total data size? Is there any formula to determine heap size based on data size?

You might want to check that you're not running out of file handles:

http://www.elasticsearch.org/tutorials/too-many-open-files/

Thanks, Dan. This article solved my problem.



(system) #6