I'm having trouble getting terms_stats facets to run on documents that
contain nested objects in the parent. The index is about 120GB in
size, and there are around 50-100 nested objects per parent. I
consistently get a Java heap space error, even when the query by
itself only returns results in the low teens. I can run statistical
facets on root object properties, but not on the nested objects.
There are around 300k unique process names, and I'm usually winnowing
that down to around 100k.
Elasticsearch 0.17.4 is set to use 40GB of memory and runs as a single node.
My questions are:
1. Am I misunderstanding how facets run? I thought a facet would only
   execute on the results produced by the query it's attached to.
2. I should be able to get around this with a script_value in the
   facet, but what would the syntax for selecting nested document
   properties be?
Gist of the general document and the types of queries I'm trying to
run:
Can you try not mapping the nested object to be included in the parent?
Including it causes more memory to be used when loading the values of the
field for the terms stats.
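To make the suggestion concrete, here is a rough sketch of what such a mapping body might look like, built as a Python dict so it can be printed as JSON. The type name `host` and the fields `processes`, `name`, and `cpu` are hypothetical, since the actual document structure is not shown in this thread. The sketch assumes the nested object is mapped with `"type": "nested"` and that leaving out `include_in_parent` keeps the nested fields from being copied into the parent document.

```python
import json


def build_nested_mapping(include_in_parent=False):
    """Return a hypothetical mapping body with a nested `processes` field.

    All field and type names here are illustrative assumptions, not the
    poster's real schema.
    """
    nested = {
        "type": "nested",
        "properties": {
            # Process name kept as a single unanalyzed term so it can act as a facet key.
            "name": {"type": "string", "index": "not_analyzed"},
            # Numeric value the stats would be computed over.
            "cpu": {"type": "double"},
        },
    }
    if include_in_parent:
        # Copies the nested fields into the parent document as well;
        # this is the setting the reply above suggests leaving out.
        nested["include_in_parent"] = True
    return {"host": {"properties": {"processes": nested}}}


mapping = build_nested_mapping()
print(json.dumps(mapping, indent=2))
```

Changing a mapping like this means reindexing, which matches the follow-up below about waiting for loading to finish.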
I'll try that, thanks for the suggestion Shay. I'll let you know how
it works after loading finishes.
Did the nested facet query look right? No obvious errors? I guess
I'm still confused about how facets are executed. I thought they would
run against just the result set returned by the attached query. Is
that correct? Thanks!
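For readers without access to the gist, a nested terms_stats facet request of the kind discussed here might be shaped roughly like the sketch below. It is not the poster's actual query: the nested path `processes`, the fields `processes.name` and `processes.cpu`, and the facet name `process_stats` are all hypothetical, and the `match_all` query only stands in for whatever query narrows the results. The sketch assumes the facet is pointed at the nested documents through a `nested` option that sits beside the facet type.

```python
import json


def build_nested_terms_stats_request(nested_path, key_field, value_field, size=10):
    """Build a search body with one terms_stats facet scoped to a nested path.

    The arguments are hypothetical field names; substitute the real ones
    from your own mapping.
    """
    return {
        # Placeholder query; the real one would narrow the result set.
        "query": {"match_all": {}},
        "facets": {
            "process_stats": {
                "terms_stats": {
                    "key_field": key_field,      # term to group by
                    "value_field": value_field,  # numeric field to aggregate
                    "size": size,                # number of terms to return
                },
                # Assumed option: run the facet against the nested documents.
                "nested": nested_path,
            }
        },
    }


body = build_nested_terms_stats_request("processes", "processes.name", "processes.cpu")
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent to the index's `_search` endpoint with any HTTP client.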