Relation Between Heap Size and Total Data Size


(Umutcan Onal) #1

Hi,

I created an Elasticsearch cluster with 4 instances, all running
Elasticsearch 0.90.10. The heap size is 6 GB for each instance, so the
total heap size is 24 GB. I have 5 shards per index, and each shard
has 1 replica. A new index is created every day, so all indices are
nearly the same size.

When the total data size reaches around 100 GB (replicas included), my
cluster begins to fail to allocate some of the shards (status yellow).
After I delete some old indices and restart all the nodes, everything is
fine again (status green). If I do not delete any data, the status
eventually turns red.

So, I am wondering: is there any relationship between heap size and
total data size? Is there a formula to determine heap size based on
data size?

Thanks,
Umutcan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/530CB5FE.80203%40gamegos.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Randall McRee) #2

Probably low on disk on at least one machine. Monitor disk usage. Also look in the logs, find out what error you are getting, and report back.
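For anyone following along, free space on the data volume can be checked with a short stdlib script (a minimal sketch; point it at your own Elasticsearch data path rather than the placeholder used here):

```python
import shutil

def free_gb(path):
    """Return the free space at `path` in gigabytes."""
    return shutil.disk_usage(path).free / (1024 ** 3)

# Point this at the Elasticsearch data path (for the node in this
# thread that would be /ebs/elasticsearch); "/" is used here only
# so the example runs anywhere.
print(f"free: {free_gb('/'):.1f} GB")
```

Running this on each node (or wiring it into a cron job) makes it easy to spot the machine that is filling up before shard allocation starts failing.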


On Feb 25, 2014, at 7:25 AM, Umutcan umutcan@gamegos.com wrote:



(Umutcan Onal) #3

There is enough space on every machine. I looked in the logs and found
that "org.apache.lucene.store.LockObtainFailedException: Lock obtain
timed out:
NativeFSLock@/ebs/elasticsearch/elasticsearch-0.90.10/data/elasticsearch/nodes/0/indices/logstash-2014.02.26/0/index/write.lock"
is what causes the shard to fail to start.

On 02/25/2014 05:29 PM, Randy wrote:

Probably low on disc on at least one machine. Monitor disc usage. Also look in the logs and find out what error you are getting. Report back.



(Dan Fairs) #4

So, I am wondering that is there any relationship between heap size and total data size? Is there any formula to determine heap size based on data size?

You might want to check that you're not running out of file handles:

http://www.elasticsearch.org/tutorials/too-many-open-files/
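On a Unix machine, the per-process limit can be read directly with Python's stdlib (a minimal sketch; the linked tutorial explains how to raise the limit, since common defaults like 1024 are far too low for a node with many shards):

```python
import resource

def fd_limits():
    """Return the (soft, hard) limits on open file descriptors
    for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

soft, hard = fd_limits()
print(f"open-file soft limit: {soft}, hard limit: {hard}")
```

If the soft limit reported for the Elasticsearch process is in the low thousands, hitting it is a plausible cause of shards failing to allocate as the index count grows.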

Cheers,
Dan

--
Dan Fairs | dan.fairs@gmail.com | @danfairs | secondsync.com



(Umutcan Onal) #5

So, I am wondering that is there any relationship between heap size and total data size? Is there any formula to determine heap size based on data size?

You might want to check that you're not running out of file handles:

http://www.elasticsearch.org/tutorials/too-many-open-files/

Thanks, Dan. This article solved my problem.



(system) #6