There is a way to clear the field data cache (there is an API for that
called clear cache), but not for one specific field. Open an
issue for that one; it's a good idea to have it.
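For reference, a rough sketch of the clear cache call mentioned above, as I recall it from the REST API (the exact endpoint and the `field_data` parameter may differ per version, so treat this as an assumption and check the docs for your release):

```
POST /{index}/_cache/clear?field_data=true
```

This drops the field data cache for the whole index; there is no per-field variant, which is exactly the gap discussed above.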
Regarding the slower impl, I am guessing that it's implemented either by
going to stored fields, or by extracting the stored source, parsing it, and
fetching the value. That's going to be expensive, but for a small result set
it might make sense. You can actually do that (for some facets) by using the
script option, since you can do both _source.obj.field (loads the source and
parses it automatically) or _fields.field_name (fetches a stored field).
On Sun, Oct 2, 2011 at 8:38 PM, Stéphane Raux firstname.lastname@example.org:
Thank you for the plugin, I hope you will find some time to make it public!
Anyway, would it be possible to provide a way to free the memory taken
by the values of the facets, maybe with an explicit call on a given
field or by providing an optional timeout?
Another solution might be to provide a slower implementation for
requesting facets on small subsets of documents?
Should I open a feature request or an issue?
---------- Forwarded message ----------
From: Jürgen kartnaller email@example.com
Subject: Re: terms facet explodes memory
It is implemented as a plugin but is not yet publicly available.
I also made a simple distinct facet, also for small data sets.
I will try to make it public if I find the time.
On Fri, Sep 30, 2011 at 11:14 AM, Stéphane Raux firstname.lastname@example.org
It seems to be a good solution for my use case; I am also doing facets
with small subsets of my documents.
Did you implement it with the Java API? Is it available somewhere?
2011/9/30 Jürgen kartnaller email@example.com:
To solve this problem we now have our own facet implementations which
avoid using the field cache.
For us this is possible because we always have a small query result set
as input for the facets.
The query filters about 100k documents out of 8G.
With the 100K docs the facet is still fast enough without a field cache.
We did this only for fields containing strings; we still use the cache for
date and numerical fields.
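Their plugin is not public, but for small result sets a similar effect can be approximated from the client side: fetch only a stored field for the filtered hits and count the distinct values yourself, instead of letting a terms facet pull the whole field into the cache. A hedged sketch of such a request (index, field, and filter names here are placeholders; `category` must be a stored field):

```json
{
  "query": { "term": { "status": "active" } },
  "size": 1000,
  "fields": ["category"]
}
```

The counting then happens in the client over the returned hits, which is only reasonable when the result set is small, as it is in their setup.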
On Fri, Sep 30, 2011 at 10:19 AM, Stéphane Raux <
I have the same problem.
The point is that once all the fields are loaded in memory for a terms
facet, the memory is never released, so if I do several terms facets on
several fields, I end up with an OutOfMemoryError.
Would it be possible to provide a mechanism to free the
memory taken by the fields?
Or to check whether the node has enough memory before loading the fields?
2011/8/17 Jürgen kartnaller firstname.lastname@example.org:
We are now using m2.xlarge with 30GB for ES. Will see tomorrow how
We will have 5.5T documents as a start, and will have a lot of facet
queries. We also implement our own specific facets
to fulfill customer requirements.
On Wed, Aug 17, 2011 at 1:45 PM, Shay Banon email@example.com
Yea :). Though, I do want to try and allow for other "cache"
implementations that would not require having all values in memory, but still
perform well when doing facets, but it's down the road...
On Wed, Aug 17, 2011 at 8:26 AM, Jürgen kartnaller
This basically means I need more memory.
On Wed, Aug 17, 2011 at 3:57 AM, Shay Banon firstname.lastname@example.org
Facets cause fields to be completely loaded into memory (loaded once and
then reused by each facet). The reason for that is performance: you don't want
to go to disk for each hit you potentially have in order to fetch the value.
On Tue, Aug 16, 2011 at 5:15 PM, Jürgen kartnaller
The terms facet seems to read the terms field from ALL documents into
the field cache, not only the fields from the query result.
This also happens if the query returns no results for the facet.
In our case this results in:
java.lang.OutOfMemoryError: Java heap space
which then leads to a no longer responding cluster (we need to restart
all ES instances).
To my understanding, the facet should only read fields contained in the
result of the query.
Is there a way to avoid this problem?