On Tuesday, March 22, 2011 at 1:25 AM, Sebastian Gavarini wrote:
Agreed about the difficulty of clearing a filter in an API-friendly way.
I think selecting the size of the filter cache would be fine.
Just one more thing: how do you know (in advance for sizing, and at runtime
for monitoring) the number of filters used? I know the ones I request from my
searches, but are others created implicitly inside ES? Is there an API
call to find out the size of that cache (in number of items, not bytes)?
The most common other place where elasticsearch uses filters is in
type-level searches (the type is used as a term filter). But I agree, we
can enhance the stats API to include counts as well. Care to open a feature
request for that?
On Mon, Mar 21, 2011 at 7:30 PM, Shay Banon shay.banon@elasticsearch.com wrote:
If it is filters, then they will get cleared out when memory gets scarce.
There are other options (not documented yet; I should really document them)
to control the maximum number of filters that are allowed to be cached:
Issues · elastic/elasticsearch · GitHub.
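To make the idea of capping the number of cached filters concrete, here is a minimal sketch of an entry-count-bounded cache with least-recently-used eviction. It is an illustration only, not how elasticsearch implements its filter cache: the class `BoundedFilterCache`, its methods, and the `(field, from, to)` key shape are all hypothetical names chosen for this example.

```python
from collections import OrderedDict


class BoundedFilterCache:
    """Hypothetical cache that keeps at most max_entries computed filters."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._entries = OrderedDict()  # oldest use first, newest use last

    def get_or_compute(self, key, compute):
        """Return the cached value for key, computing and storing it on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        value = compute()
        self._entries[key] = value
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used
        return value

    def __len__(self):
        """Number of cached filters (items, not bytes)."""
        return len(self._entries)

    def count_for_field(self, field):
        """Count entries whose key starts with the given field name.

        Assumes keys are tuples shaped like (field, from_value, to_value).
        """
        return sum(1 for key in self._entries if key[0] == field)
```

With a cap like this, a range filter whose bounds change at midnight simply gets a new key, and the previous day's entry is evicted once it has gone unused long enough, so no explicit clear is needed.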
The reason it's harder to say something like "clear all filters that match X"
is that filters are very different from one another, so it's more
difficult to know how to group them and then expose an API to do that, even if
you specify the field that they are used on.
I don't think you will need this option, but if you see that you do, we
can think harder about how to do that.
On Tuesday, March 22, 2011 at 12:24 AM, Sebastian Gavarini wrote:
Hi Shay,
Sorry for the confusion, I'll try to clarify it.
In my case I was planning to use "rangeFilter" with the epoch milliseconds
of the interval the user selected, e.g., if the user selects "yesterday", I
would calculate, based on today's date, the milliseconds that correspond
to "yesterday". I think a cached filter would be appropriate because until
the day changes at 24:00, I could keep using the cached filters.
I understand that there would be many filters associated with my
field "publish_date", one for each possible value {today, yesterday, past
week, past month}.
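A minimal sketch of the boundary calculation described above, under some loudly stated assumptions: days roll over at midnight UTC, "past week" and "past month" mean the last 7 and 30 days including today, and ranges are half-open `[from, to)`. The function names are hypothetical and nothing here calls elasticsearch; the resulting millisecond pairs are what would be passed to a range filter on "publish_date".

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MS = timedelta(milliseconds=1)


def to_millis(moment):
    """Convert a timezone-aware datetime to integer epoch milliseconds."""
    return (moment - EPOCH) // ONE_MS


def recency_ranges(now):
    """Map each recency label to a (from_ms, to_ms) half-open range.

    Assumption: `now` is timezone-aware and the day boundary is midnight
    in that timezone. "past month" is approximated as 30 days.
    """
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    tomorrow = today + timedelta(days=1)
    bounds = {
        "today": (today, tomorrow),
        "yesterday": (today - timedelta(days=1), today),
        "past week": (today - timedelta(days=7), tomorrow),
        "past month": (today - timedelta(days=30), tomorrow),
    }
    return {
        label: (to_millis(start), to_millis(end))
        for label, (start, end) in bounds.items()
    }


if __name__ == "__main__":
    for label, (start_ms, end_ms) in recency_ranges(datetime.now(timezone.utc)).items():
        print(label, start_ms, end_ms)
```

Because the bounds are snapped to midnight, every search during the same day produces identical values and can reuse the same cached filter; the values only change when the day does.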
I would like to clean all the range filters associated with the
field "publish_date".
Specific filters make less sense, since they are so discrete you would
want to have some sort of way to group them possibly.
Is this covered with my explanation of publish_date and the possible values
as a way to group them, or why do you think this won't be useful?
Thanks,
Sebastian.
On Mon, Mar 21, 2011 at 7:11 PM, Shay Banon shay.banon@elasticsearch.com wrote:
I think there is some confusion here. There is the filter cache, which
is used when using certain filters in different places when searching, and
the field cache, which is used for things like range faceting, date histogram
and the like. I did not understand which one you are going to use in order
to implement it (not enough info to guess).
There is no API to clear specific field/fields from the field level cache,
but one can be added. Though, in your case (if you are going to use it), it does
not make sense to clear it, since you will still use it next time around.
Specific filters make less sense, since they are so discrete you would want
to have some sort of way to group them possibly. They do get cleared when
memory becomes scarce.
-shay.banon
On Tuesday, March 22, 2011 at 12:04 AM, Sebastian wrote:
Hi all,
I have a use case where a user can search by date recency, in fixed
intervals, e.g., {today, yesterday, past week, past month}.
I was planning to use the field cache, as the filter is going to be used
repeatedly, but I would need to call cache clear for that specific
date filter alone, once every new day (at 24:00).
The problem is that the clear cache API, as I understood from the docs
and Java code of ClearIndicesCacheRequest, clears all the caches of a
certain type and index as a minimum granularity. I would like to clear
just one field cache, for example "publish_date". I could of course
clear all the caches, but I would lose a lot of other important
fields that won't have changed, just because one field did change.
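For reference, the index-wide clear mentioned above can be sketched as follows. This only builds the HTTP request against the `_cache/clear` REST endpoint and does not send it; the host, port, and the index name "articles" are assumptions for illustration, and no query parameters are shown because the supported ones depend on the elasticsearch version.

```python
from urllib.parse import quote
from urllib.request import Request


def clear_cache_request(index, base_url="http://localhost:9200"):
    """Build (but do not send) a POST request that clears caches for one index.

    Assumptions: base_url points at an elasticsearch node and `index`
    is an existing index name. The whole index is the smallest unit
    addressed by this request.
    """
    url = "{}/{}/_cache/clear".format(base_url.rstrip("/"), quote(index, safe=""))
    return Request(url, method="POST")


if __name__ == "__main__":
    request = clear_cache_request("articles")  # hypothetical index name
    print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`, which affects the caches of every field in that index, not just "publish_date".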
Is clearing the cache of a single field (or list of fields) possible
somehow today? Can I add a feature request for it? Any other ideas?
Thanks,
Sebastian.