I am fairly new to Elasticsearch. While testing Elasticsearch's aggregation
feature, I keep hitting a "data too large" error. I understand that
aggregations are very memory intensive, so is there a way in ES to feed one
query's output into an aggregation, so that the number of inputs to the
aggregation is limited? I have already tried filtering and querying before
the aggregations.
It sounds like your heap size might be too low. How much memory have you
assigned to your heap (i.e. what have you set as ES_HEAP_SIZE)? To perform
aggregations, Elasticsearch has to load the values of a field for every
document into memory, in a data structure called field data. It sounds like
you are hitting the circuit breaker, which prevents this data structure
from using too much of the heap and causing an OOM error.
Colin
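To see how much heap field data is actually using on each node, the node stats API can be queried. The following is a minimal sketch, not something posted in the thread: it assumes a node reachable at http://localhost:9200 and assumes the response has the usual nodes -> indices -> fielddata -> memory_size_in_bytes layout. The helper names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fielddata_stats_url(host="http://localhost:9200", fields="*"):
    """Build the node stats URL that reports field data memory per field."""
    return f"{host}/_nodes/stats/indices/fielddata?fields={fields}"


def fielddata_bytes_by_node(stats):
    """Map node name -> field data size in bytes from a node stats response.

    Assumes the response shape described in the lead-in above.
    """
    return {
        node["name"]: node["indices"]["fielddata"]["memory_size_in_bytes"]
        for node in stats["nodes"].values()
    }


def fetch_fielddata_bytes(host="http://localhost:9200"):
    """Query a live cluster (the host is an assumption) and summarise usage."""
    with urlopen(fielddata_stats_url(host)) as resp:
        return fielddata_bytes_by_node(json.load(resp))
```

Calling fetch_fielddata_bytes() against a running node and comparing the result with the heap size gives a rough idea of how close the cluster is to the breaker limit.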
Thank you for the reply. My heap size is 8 GB for a 74 GB index, and yes, I
am hitting the circuit breaker.
So when I query or filter before the aggregations, are the aggregations
passed only the filtered/query results?
I am asking because my output after querying or filtering contains very few
entries (in the hundreds). So if it is still hitting the OOM error, then the
aggregation must be loading everything into the cache regardless of the
preceding query or filter.
What version of ES are you using? AFAIK, in later versions you can control
the percentage of heap space that field data may use through the update
settings API. Try increasing it a bit and see what happens. The default is
60%; increase it, for example, to 70%.
T.
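The example that originally followed this suggestion is not preserved in the thread. As a hedged sketch of what such a call could look like: the request below targets the cluster update settings endpoint (PUT /_cluster/settings). The setting name is an assumption that depends on the version: indices.breaker.fielddata.limit is the name in Elasticsearch 1.4 and later, while earlier 1.x releases used indices.fielddata.breaker.limit. The host and helper names are also assumptions.

```python
import json
from urllib.request import Request, urlopen


def breaker_settings_request(limit="70%",
                             host="http://localhost:9200",
                             setting="indices.breaker.fielddata.limit"):
    """Build a PUT /_cluster/settings request that changes the breaker limit.

    The setting name is version dependent (see the note above); "transient"
    means the change does not survive a full cluster restart.
    """
    body = json.dumps({"transient": {setting: limit}}).encode("utf-8")
    return Request(
        f"{host}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def apply_breaker_limit(limit="70%", **kwargs):
    """Send the request to a live cluster and return the decoded response."""
    with urlopen(breaker_settings_request(limit, **kwargs)) as resp:
        return json.load(resp)
```

A transient setting is used here because the advice is to try a higher value and see what happens; a "persistent" key would keep the change across restarts.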
Sorry for the delayed response.
I am using version 1.3. I was able to change the field data circuit breaker
limit; I changed it to 80. This is a nice setting to know about.
But it doesn't work. Maybe heap size is my problem, but I have very limited
heap space.
Thank you.
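For a rough sense of the numbers in this thread: the breaker limit is a percentage of the JVM heap, so raising the percentage only buys a fixed amount of extra room. The sketch below is plain arithmetic on the 8 GB heap reported above; the function name is made up for illustration and nothing here talks to Elasticsearch.

```python
def breaker_limit_gb(heap_gb, limit_percent):
    """Return the field data budget in GB for a heap size and breaker percentage."""
    return heap_gb * limit_percent / 100


HEAP_GB = 8  # heap size reported in the thread

# Compare the default mentioned in the thread (60%) with the raised limits.
for percent in (60, 70, 80):
    budget = breaker_limit_gb(HEAP_GB, percent)
    print(f"{percent}% of a {HEAP_GB} GB heap allows {budget} GB of field data")
```

If the field being aggregated needs more than that budget once loaded, the request will still trip the breaker regardless of how few documents the query matches.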
Field data does indeed load all the values for a field into memory,
irrespective of the query and filter. This is how aggregations achieve
fast lookups on the values of a field for a particular document. Field data
is loaded the first time it is needed and then kept in a cache.
Heap size is almost certainly your problem here. There are 2 options I can
see for you:
Increase your heap size to allow enough space to load the field cache
into memory
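To make the distinction concrete, here is a sketch of a search body that combines a filter with a terms aggregation, written in the 1.x query DSL (the "filtered" query). The field names and values are hypothetical. The filter limits which documents end up in the buckets, but as described above, field data for the aggregated field is still loaded for all documents, so the filter does not reduce the memory needed.

```python
import json


def filtered_terms_agg(filter_field, filter_value, agg_field, size=10):
    """Build a search body: a term filter plus a terms aggregation.

    Only documents matching the filter are counted in the buckets.
    Field data for agg_field is loaded regardless of the filter.
    """
    return {
        "size": 0,  # return only aggregation results, no hits
        "query": {
            "filtered": {
                "filter": {"term": {filter_field: filter_value}}
            }
        },
        "aggs": {
            "by_" + agg_field: {
                "terms": {"field": agg_field, "size": size}
            }
        },
    }


# Hypothetical field names, for illustration only.
body = filtered_terms_agg("status", "error", "host")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request against the index in question.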