Hi all,
I've been trying to query facets on nested structures on a very small index
of around 14,000 documents, each containing around 100 nested docs.
The indexed documents look similar to this, with many elements in the color
array.
{
"color": [
{
"name": "n3",
"value": "v3",
"type": "t3"
},
{
"name": "n4",
"value": "v4",
"type": "t4"
}
]
}
My index mapping specifies that color is nested and all strings are
not_analyzed.
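A minimal sketch of a mapping of this shape, for reference. The type name
"doc" is a placeholder and the actual mapping may differ in other details:

```json
{
  "mappings": {
    "doc": {
      "properties": {
        "color": {
          "type": "nested",
          "properties": {
            "name":  { "type": "string", "index": "not_analyzed" },
            "value": { "type": "string", "index": "not_analyzed" },
            "type":  { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}
```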
Running a simple facet query like the one below on around 4,000 documents
uses around 700 MB of heap space (it spikes from 300 MB to 700 MB, with a lot
of garbage collection activity). With the same query on 14,000 documents,
heap usage spikes to 3 GB (my current maximum) and Elasticsearch throws an
OutOfMemoryError. I've tried setting index.cache.field.type: soft to
no avail.
{
"size": 0,
"facets": {
"fname": {
"terms": {
"field": "color.name"
}
}
}
}
Another interesting behavior is that if I don't specify a nested mapping
for the index, the facets return with little memory usage.
So, is this the typical behavior for 14,000 indexed documents with nested
docs?
Note: index has five shards and one replica.
I would greatly appreciate your help.